The Rust I used to know

I don’t remember where I first heard about Rust, but I know it was a long time ago, likely back in 2012.

At the time, I was the unofficial project lead and a heavy contributor to Pioneer, a space exploration game in the spirit of Frontier: Elite II. I had landed there in my usual way: spending my free time bouncing from open-source project to open-source project, messing with whatever took my interest, and invariably ending up deeply entrenched. As it was, this was my first outing with any kind of game engine, and with C++.

I made a lot of mistakes there, but I learned a lot. Although C++ is perhaps not the greatest example, I had learned to appreciate the power of strong types, generics, and letting the compiler do the work for you. I had also really begun to appreciate the difficulty of managing a complex graph of objects appearing, disappearing, and moving between threads, and had tried a lot of different techniques to make it easier to avoid data races and memory corruption, with limited success.

So when Rust appeared in front of me, I was in the right place to understand its promises. It had many of the tools from C++ that I’d learned to appreciate, but it also knew where all the data was at all times and was able to make decisions from that knowledge, rather than papering over the problem with a garbage collector. Bold claims, I thought, but my after-hours programming work has always been about learning and understanding things, so I resolved to keep an eye on it. I guessed Rust probably wouldn’t make it, consigned to the fate of so many good ideas that came before it. That doesn’t mean it couldn’t be learned from!

Happily, that hasn’t been the case. Rust has spent the last decade delivering and building on that promise. As my career moved more into system and platform operations, I would look at Rust longingly while I debugged the latest crash or data loss, understanding that whole classes of problems wouldn’t exist if this had been written in Rust.

Thinking on it recently caused me to look back over the Rust I’ve written in the last decade, and it’s been so interesting to see how the language has changed in that time, and also which problems I was dabbling with. I never turned anything into a proper finished product, because, again, this is all experimenting and learning in my own time. There are still lots of interesting things though!

This is not intended to be a well researched or cross-referenced piece, rather just an old man noting a few things while reading some old code. So I will be vague, and that’s a feature today :)

2013 - game engine experiments

Some of the earliest Rust code I have is in a repo called wrench. As I said, I was working on a game engine at the time, so it seemed useful to fiddle with things I already knew about in this new language. I expect this was targeting Rust 0.4, which was released late in 2012; I feel reasonably certain I remember that version number drifting by.

First observation: this was either before, or very early in, Cargo’s lifetime. I had opted for a traditional Makefile calling rustc. I think it likely that I wasn’t yet sure where Cargo would fit into the ecosystem, and my experience with other languages was that it was definitely useful to understand how to use the language without its package manager. It does look amusingly childish these days!

This was long before Rust’s syntax and memory model were firmly pinned down. It still had green threads and managed memory, or maybe had just started to remove them; I don’t remember. Those two never interested me, but I see in this code that I used another thing that is long gone: the ~ operator. Consider:

struct Mesh {
    name: ~str,
    vertices: ~[Vector3],
    texcoords: ~[Vector2],
    normals: ~[Vector3],
    faces: ~[Face],
}

~ is roughly equivalent to Box<T> these days, or possibly the box operator. I don’t know if I understood what I was doing here or if I just added them everywhere to make the code compile. It’s definitely everywhere in this code though.
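For comparison, this is roughly what that struct looks like in modern Rust, with owned heap types in place of ~ (the vector and face types as in the original snippet):

struct Mesh {
    name: String,           // ~str became the owned String
    vertices: Vec<Vector3>, // ~[T] became Vec<T>
    texcoords: Vec<Vector2>,
    normals: Vec<Vector3>,
    faces: Vec<Face>,
}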

Another interesting snippet:

impl Vertex {
    fn from_str(v: &str) -> Vertex {
        let mut idx = ~[];
        for str::each_split_char(v, '/') |s| {
            idx.push(match u16::from_str(s) {
                Some(i) => i-1,
                None => 0,
            });
        }
        Vertex::from_vec(idx)
    }
}

I remember being delighted that pattern matching and sum types existed right from the beginning (of course they had existed elsewhere, and I had used them myself in Scala, but they were still fairly novel to me). Curiously, this code calls functions in the primitive type namespaces directly, rather than calling methods on those types, or using .into() as we’d be more likely to do now. Also the use of for instead of an iterator chain. Of course, that doesn’t mean those things weren’t in the language yet; maybe this is just me not knowing what I was doing and writing in a more C++-ish style. On the other hand, I had to have got this from somewhere.
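For contrast, a sketch of how that parsing might read today, with an iterator chain and parse() (Vertex and from_vec assumed from the original snippet):

impl Vertex {
    // split on '/', parse each piece, fall back to 0 on failure
    // (indices in the file are 1-based, hence the -1)
    fn from_str(v: &str) -> Vertex {
        let idx: Vec<u16> = v
            .split('/')
            .map(|s| s.parse::<u16>().map(|i| i.saturating_sub(1)).unwrap_or(0))
            .collect();
        Vertex::from_vec(idx)
    }
}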

fn main() {
    #[main];

Remember when we had to tell the compiler explicitly where the main entry point was?

    do sdl::start {
        sdl::init([sdl::InitVideo]);

I barely remember this, but I think it was a kind of shorthand for passing a closure to a function. I do know that this would spawn a thread to do the work, but maybe that was specific to SDL. (I tried to search for it, but finding out details of “do” from 2013 is a tricky search.)
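As best I can reconstruct it, do f { ... } was just sugar for calling a function whose final argument was a closure. A minimal sketch of the equivalence, with a stand-in start function:

// stand-in for something like sdl::start: takes a closure and runs it
fn start<F: FnOnce()>(f: F) {
    f();
}

fn main() {
    // what `do start { ... }` desugared to:
    start(|| {
        println!("inside the closure");
    });
}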

2013 - graphics experiments

I never did write the game, though I did and do dabble more with graphics and SDL. Around the same time I ported a small OpenGL “hello world” program to SDL and GLES, and from there to Rust: rust-hello-gl.

The early history seems to suggest that there wasn’t anything especially interesting here, Rust-wise. OpenGL has its own types, and it appears the bindings mostly surfaced those; I didn’t have any need for more complex types. This is definitely low-level though: mostly just raw OpenGL calls.

Of interest here and in the previous example is the ceremony required to load in a shader (a GPU program):

fn make_shader(ty: GLenum, filename: &str) -> GLuint {
    let path = Path(filename);
    let r = match io::file_reader(&path) {
        Err(err) => fail!(fmt!("couldn't open %s for read: %s", filename, err)),
        Ok(r) => r,
    };
    let source = ~[r.read_whole_stream()];

It’s a little different to how you’d write it these days, but it’s not especially controversial. The thing of note to me is that these days you’d be more likely to just use include_str!() to bake the file directly into the binary as text, since it’s not the kind of thing that changes much.
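Something like this, with the path being whatever your shader is actually called (illustrative only):

// modern sketch: the shader source is resolved at compile time,
// relative to this source file, and baked into the binary
const VERTEX_SHADER_SOURCE: &str = include_str!("shaders/vertex.glsl");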

This might have also been the time of one of my first contributions to an upstream library. servo/rust-opengles was where the GLES bindings lived at the time, and I sent a small patch to allow fully-buffered drawing.

Particularly amusing to me though is this PR I sent in 2014, to update the library for a breaking change in the Rust release of the moment (see rust-lang/rust@7d3b0bf for the details; it’s the ongoing removal of the aforementioned ~ boxing operator). At the time Rust was moving too quickly for Servo to keep up with, so they were pinning to a specific Rust commit and moving it along periodically. I was trying to update a library they maintained to a version of Rust more recent than the one they were using, which would have broken Servo. So I was at least in small part responsible for the Servo project changing the way they managed the libraries they maintained. That makes me smile.

From stuff on my filesystem, it looks like I was still dabbling with improving SDL and GL bindings in 2014, though I can’t see any obvious purpose. I did have other stuff to do at the time, so Rust had probably lost a bit of its shine for me at that point, because I didn’t have anything I needed it for right then.

2015 - JMAP and jmap-rs

By this point I’m working at Fastmail, and we’re embarking on a major project to turn the protocol used by our web client for exchanging mail, calendar, contacts, etc. with the server into a true internet standard (as of 2022, it’s going great).

Rust had its 1.0 release in May, and I would guess the lead up to that drew my eye, so the timing was good and I started work on jmap-rs.

This was the point in my Rust career when I really got into metaprogramming. JMAP is, at its core, a set of types for representing emails, or contact cards, or calendar events, or whatever. Rust’s whole thing is types, of course, so it made sense as a first pass to simply map those to Rust types.

But then more requirements surfaced. JMAP’s wire format is JSON, and so it became necessary to convert to and from the Rust versions. This was before Serde existed as more than an idea and some experiments; back then, rustc_serialize was the only vaguely standard way to do JSON encoding and decoding, and it didn’t have any of the magic we’re now used to. Instead, you got a Json type which you could walk through, pulling out the things you needed, and a ToJson trait you could implement to convert your type into a Json object, and that was it.

This naturally led to a stupendous amount of work with generics. The very first commit already had hints of the nightmare to come. I will try to present how it was all built up here using simplified types, because there’s kind of a lot going on.

Consider a very simple version of a contact card (at the time, the real one had 19 fields):

#[derive(Clone, PartialEq, Debug)]
pub struct Contact {
    id: String,
    name: String,
    birthday: Option<Date>,
    phone_numbers: Vec<PhoneNumber>,
}

To turn it into JSON, it needs to implement ToJson:

impl ToJson for Contact {
    fn to_json(&self) -> Json {
        let mut d = BTreeMap::<String,Json>::new();
        to_json_field(&mut d, "id", &self.id);
        to_json_field(&mut d, "name", &self.name);
        to_json_field_opt(&mut d, "birthday", &self.birthday);
        to_json_field(&mut d, "phoneNumbers", &self.phone_numbers);
        Json::Object(d)
    }
}
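
The to_json_field helpers aren’t shown above, but they were simple; a minimal sketch of my reconstruction (not the exact jmap-rs code):

// insert a field into the JSON object under its wire name
fn to_json_field<T: ToJson>(d: &mut BTreeMap<String,Json>, name: &str, v: &T) {
    d.insert(name.to_string(), v.to_json());
}

// as above, but skip the field entirely if it's None
fn to_json_field_opt<T: ToJson>(d: &mut BTreeMap<String,Json>, name: &str, v: &Option<T>) {
    if let Some(ref v) = *v {
        d.insert(name.to_string(), v.to_json());
    }
}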

To go the other way, I actually had to define a FromJson trait, as rustc_serialize doesn’t provide one; I assume because a supporting error infrastructure would be too heavy for its purposes. So:

// trait for things that can be created from a JSON fragment
pub trait FromJson {
    fn from_json(json: &Json) -> Result<Self,ParseError>;
}

And then we can implement that for Contact:

impl FromJson for Contact {
    fn from_json(json: &Json) -> Result<Contact,ParseError> {
        match *json {
            Json::Object(ref o) => {
                let mut contact = Contact::default();
                contact.id            = try!(from_json_field(o, "id"));
                contact.name          = try!(from_json_field(o, "name"));
                contact.birthday      = try!(from_json_field_opt(o, "birthday"));
                contact.phone_numbers = try!(from_json_field(o, "phoneNumbers"));
                Ok(contact)
            },
            _ => Err(ParseError::InvalidJsonType("Contact".to_string())),
        }
    }
}

(aside: still using try!; the ? shorthand didn’t exist yet).
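With ?, each of those assignments would collapse to something like:

contact.id = from_json_field(o, "id")?;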

As you can already see, that’s a lot of infrastructure. And then further on, we need implementations of ToJson and FromJson for all the primitive types we use, and for Option, and even for Vec, which is chaos:

impl<T> FromJson for Vec<T> where T: FromJson {
    fn from_json(json: &Json) -> Result<Vec<T>,ParseError> {
        match *json {
            Json::Array(ref a) => {
                let (ok, mut err): (Vec<_>, Vec<_>) = a.iter()
                    .map(|ref j| T::from_json(j))
                    .partition(|ref r| match **r { Ok(_) => true, Err(_) => false });
                match err.len() {
                    0 => Ok(ok.into_iter().map(|r| r.ok().unwrap()).collect()),
                    _ => Err(err.remove(0).err().unwrap()),
                }
            }
            _ => Err(ParseError::InvalidJsonType("Vec".to_string())),
        }
    }
}

Of course, there are subtypes like Date and PhoneNumber that also need their own ToJson and FromJson.

So that was fine, but then I had to start thinking about how to do partial types. JMAP at the time would update an object by sending only the fields that had changed, which necessitated a version of each object where all fields were optional:

#[derive(Clone, PartialEq, Debug)]
pub struct PartialContact {
    id: Option<String>,
    name: Option<String>,
    birthday: Option<Option<Date>>,
    phone_numbers: Option<Vec<PhoneNumber>>,
}

And of course there was then all the serialisation and deserialisation ceremony for that, as well as an apply function that would return a new Foo with the changes from a PartialFoo applied.
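That apply was mechanical too. A minimal sketch of the idea, reconstructed field by field (not the original code): every Some field in the partial replaces the corresponding field in the original.

impl Contact {
    fn apply(&self, p: &PartialContact) -> Contact {
        Contact {
            id:            p.id.clone().unwrap_or_else(|| self.id.clone()),
            name:          p.name.clone().unwrap_or_else(|| self.name.clone()),
            birthday:      p.birthday.clone().unwrap_or_else(|| self.birthday.clone()),
            phone_numbers: p.phone_numbers.clone().unwrap_or_else(|| self.phone_numbers.clone()),
        }
    }
}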

The intent was always that once I’d figured out the right way to do all this for a relatively “simple” type, I’d build some kind of code generation setup to magic it all away. And that did come to pass:

make_record_type!(Contact, PartialContact, "Contact",
    name:          String           => "name",
    birthday:      Date             => "birthday",
    phone_numbers: Vec<PhoneNumber> => "phoneNumbers",
);

In the end, I abandoned this work. I forget the exact details, but there was some relationship between types that I couldn’t safely represent, and at this point JMAP had started its journey through the IETF, where it changed into something much better. These days it has a much more rigid model that isn’t as reliant on, for example, how JavaScript handles null values in a dictionary, and its patch structure is more like a set of instructions on how to modify the original, rather than a weird version of the original type where all the data is optional.

And of course, Rust has progressed. Its macros are more powerful these days, but also, Serde now exists, and you just don’t need to go near most of this. Every time I use it I remember the past and am grateful.
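For a sense of how much disappeared: the whole apparatus above reduces to something like this with Serde (a sketch; Date and PhoneNumber swapped for plain String to keep it self-contained):

use serde::{Deserialize, Serialize};

// all of the ToJson/FromJson machinery above, derived, with the
// camelCase wire names handled by the rename attribute
#[derive(Clone, PartialEq, Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Contact {
    id: String,
    name: String,
    birthday: Option<String>,    // stand-in for Date
    phone_numbers: Vec<String>,  // stand-in for PhoneNumber
}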

And funnily enough, here in 2022 I am actually contributing to a JMAP mail client written in Rust, mujmap. It’s pleasant and fun work, and I barely have to think about JMAP, because it’s all pretty sensible now and Serde does most of the lifting.

The wild west is an interesting place.

2016 - Borderlands

2016 was one of my busier years with Rust, where it seems like I started reaching for it more often.

I really like the game Borderlands 2. I have played it a lot since its release in 2012, in fact, I just finished another playthrough a couple of weeks ago. Why I like it is for another time.

So during 2016 I had a particular weapons and equipment setup that I really enjoyed playing, but one of the things with the game is that as you and the enemies level up, your gear stays at the same level, so eventually it’s not powerful enough to get anything done and you have to discard it. For this particular game though, I just wanted to keep the same gear, levelled with me. So I started to look for ways to do that.

There are tools around, and descriptions of the save file format, and it was complex enough that it caught in my brain in that particular way and I wanted to explore it a lot more deeply. So this was my first outing with Rust for exploring and reverse-engineering a data format.

The code is pretty chaotic, and I never really did much with it other than print out some parts of the save file, but it was really interesting just how much stuff had to be done to get into it. It’s like an onion, full of layers, all of them making you cry.

Somewhere in the item description is a “level” field. So all you have to do is bump that, then reverse all of this to generate a new save file. Hah!

This is bonkers stuff. I can’t really explain why you’d deliberately construct it this way, other than to make it hard for people to take apart. I don’t know enough about the economics of the games industry to know if that’s the kind of thing you’d do, but it wouldn’t surprise me. I suppose the other possibility might be that it’s the result of some dysfunctional and siloed org structure, each team choosing its own flavour of madness to wrap the output it got from the previous team.

In any case, I remember Rust being a good language for this kind of exploration. The Huffman decoder, for example, was very pleasant (though reading it today, it’s a little less safe than I might like):

#[derive(Debug)]
enum HuffmanNode {
    Internal(Box<HuffmanNode>, Box<HuffmanNode>),
    Leaf(u8),
}

fn read_huffman_tree<R: Read>(brdr: &mut BitReader<R,MSB>) -> HuffmanNode {
    let is_leaf = brdr.read_bit().unwrap();
    match is_leaf {
        true  => HuffmanNode::Leaf(brdr.read_byte().unwrap()),
        false => {
            let left  = Box::new(read_huffman_tree(brdr));
            let right = Box::new(read_huffman_tree(brdr));
            HuffmanNode::Internal(left, right)
        },
    }
}

fn decompress_huffman<R: Read>(brdr: &mut BitReader<R,MSB>, tree: HuffmanNode, outsize: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(outsize);
    while out.len() < outsize {
        let mut node = &tree;
        while let HuffmanNode::Internal(ref left, ref right) = *node {
            let b = brdr.read_bit().unwrap();
            node = match b {
                false => left,
                true  => right,
            };
        }
        if let HuffmanNode::Leaf(ref c) = *node {
            out.push(*c);
        }
    }
    out
}

Another interesting thing is that much of the save file is packed on uneven bit boundaries, so I needed a reader that could read N bits at a time but was also endian-aware, as PC savefiles and PlayStation savefiles have different endianness (not that I had any non-PC savefiles, but I was vaguely aware that this might go somewhere and I should be ready). So I actually wrote and released a whole crate, bitbit, that provides bit-at-a-time readers and writers over another Reader or Writer.
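A quick usage sketch, using only the calls that appear in the decoder above:

use std::io::Cursor;
use bitbit::{BitReader, MSB};

fn main() {
    // wrap any Read and pull bits off the front, most significant first
    let data = Cursor::new(vec![0b1011_0010u8, 0b0100_0001]);
    let mut brdr: BitReader<_, MSB> = BitReader::new(data);

    let first = brdr.read_bit().unwrap(); // true: the top bit of the first byte
    let next = brdr.read_byte().unwrap(); // the next 8 bits, crossing a byte boundary
    println!("{} {:08b}", first, next);
}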

I think I like Rust for this sort of work. In the past I might have used C, but been frustrated by the lack of tools, or Perl, which has good tools (its pack and unpack builtins are great) but doesn’t actually know anything about the data it’s looking at. I should use it more.

2016 - Sieve

This was the year I started looking at real places to use Rust. I figured the way forward in my own work was to show a place where we could potentially make use of it in an existing piece of software. Fastmail looks after the Cyrus mail system, and at the time one of its more awkward pieces was its mail filtering library, an implementation of the Sieve language. So I figured maybe it’d be worth seeing what it would take to write a Sieve parser and processor of my own.

I had some success with sieve-rs. Mostly, it’s an application of the impressive nom parser library. The basic gist is to write your entire parser in macros, and in the process shed much of the runtime overhead you’d find in a more traditional language.

I actually can’t remember why I abandoned this. I remember running into a nom shortcoming that was going to be fixed in 2.0. It also looks like I was moving to a two-stage compiler, presumably meaning there was something that couldn’t be expressed in the grammar itself.

I think I’d like to revisit this one sometime. I don’t do a lot of parser work, but I really like this:

/*
   alpha          =  %x41-5A / %x61-7A   ; A-Z / a-z
*/
named!(alpha<&str,char>,
  alt!(
    char_between_s!('A', 'Z') |
    char_between_s!('a', 'z')
  )
);
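nom has long since moved from macros to plain function combinators; the same rule translated against a recent nom would look something like this (my translation, not from the original repo):

use nom::{branch::alt, character::complete::satisfy, IResult};

// alpha = %x41-5A / %x61-7A ; A-Z / a-z
fn alpha(input: &str) -> IResult<&str, char> {
    alt((
        satisfy(|c| c.is_ascii_uppercase()),
        satisfy(|c| c.is_ascii_lowercase()),
    ))(input)
}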

I am sorta surprised there isn’t a set of nom combinators for ABNF though, since that’s so common in implementing standards protocols.

2016 - not very futuristic

I tried my hand at the nascent futures support around this time, and it did not go well.

My idea was to see if I could write a simple IMAP proxy (in the style of the nginx mail module) using futures, to get a feel for it. Alas, I failed spectacularly.

I don’t have any code to show here, just this tweet from my despair at the time:

As I remember it, I had got the main loop working, and then I tried to lift some of the code out into its own function, but I couldn’t name the types for the return.

This is likely improved now, but I am still wary of async Rust because of this experience. I have had more success since with a more conventional IO event loop, unsurprising perhaps since I know a thing or two about them. I probably won’t try again until I actually have a thing that will seriously benefit from it.

And beyond

There’s been a bit of play since then, of course:

I come back to these every now and then and tweak and try things. ECS in particular is a very interesting data management model for Rust, where trees full of object pointers are no longer possible as they were in C++.

In fact, it’s kinda funny that the original problems I was struggling with way back at the start are ones that I’m still trying to find a way through. I don’t mind that - I like to play and learn things.

The thing is, at this point, none of these programs are particularly interesting examples of Rust. The language is actually stable. The vast majority of things now are application and library things; the language has faded into the background somewhat, which is as it should be.

So here we are, ten years and 4000 words later, and I think Rust has clearly made it. I’m still waiting for an opportunity to build something and put it in production, but that’s more about inspiration and opportunity. I look forward to Rust being there when I’m ready.