Getting to Rust

Wed 28 August 2019

A few years ago, I decided to hop on the FOSS development train and contribute to a project which seemed pretty neat: the reference compiler for the Rust programming language. Having poked a little at C++ and Arduino from the safety of garbage-collected languages, I thought Rust seemed to be full of good ideas. Even as a wee novice programmer, I knew the danger of buffer overflows and use-after-free bugs. I also appreciated Java's type system checking my work for me, and Rust's familiar syntax and functional programming idioms made it seem like *a cool thing* to try out - a programming language which let me use all the power of my computer like C or C++, but with all the fancy "modern" stuff that came from functional languages and strong type systems.

As most Rust programmers will warn you to expect, I had a difficult time getting comfortable with Rust. A lot of the language is very familiar - structs, objects, variable declaration, and so on. But Rust's main selling point is its strictness about data ownership, and the complexity of the ownership system required me to think hard about exactly where the data was going, something I never really had to do before. Thankfully, the issue I latched onto in the compiler's GitHub repo was pretty simple - after getting my head around the relevant parts of the vast code base, it mostly amounted to changing some of the types used and settling on good message text and behaviour.

Feeling validated (but a little exhausted) by my accepted pull request, I tried some more complex projects. But my will was utterly broken by another aspect of the Rust language: lifetimes. There comes a time in every nontrivial Rust program's life where the compiler can't figure out on its own how long a reference stays valid. More or less, it can't keep good enough track of objects to be able to guarantee memory isn't used after the corresponding objects have been destroyed. In these instances, the compiler whips the ball back into the programmer's side of the court and requires some references to carry annotations describing how long the data they point to must stay alive. Using these annotations, the compiler can properly track the entire lifecycle of a block of memory and make sure it is not used after it is freed or otherwise rendered unusable.
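For readers who have not seen one, here is a minimal sketch of such an annotation. The function `longest` and the strings are hypothetical, made up purely for illustration and not taken from any real project.

```rust
// Hypothetical example: returns whichever of the two string slices is longer.
// The annotation 'a tells the compiler that the returned reference is only
// valid while BOTH inputs are valid. Without it, this signature is rejected,
// because the compiler cannot tell which input the result borrows from.
fn longest<'a>(left: &'a str, right: &'a str) -> &'a str {
    if left.len() >= right.len() {
        left
    } else {
        right
    }
}

fn main() {
    let outer = String::from("tetromino");
    {
        let inner = String::from("block");
        // `result` may borrow from either string, so it must not outlive
        // the shorter-lived one, which is `inner`.
        let result = longest(&outer, &inner);
        println!("{result}");
    }
    // If `result` were declared before the inner block and read down here,
    // the compiler would reject the program: `inner` is dropped at the
    // closing brace above, and `result` might still point into it.
}
```

The annotation does not change how long anything lives. It only describes a relationship between the references so the compiler can check every caller against it.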

To me, this was a brick wall: the syntax was completely alien, I didn't understand how or why the compiler decided where annotations were needed, and when tasked with actually annotating a function it took me ages to figure out what my objects were actually doing and how to describe them in the language of lifetimes. Trying to work through one particular compiler error, I abandoned my fledgeling Tetris clone and moved on to gentler pastures, never to return in earnest. I've kept up with new releases and some of the big libraries, but I've never felt comfortable enough with Rust to start any projects with it.

When I took it upon myself to try to get a solid grasp of C and modern C++, some of those gears started turning again. Working through some toy C and C++ programs, I was reminded of my struggles getting up to speed with Rust. Some things were familiar: it turns out the basic syntax and semantics of taking and dereferencing pointers aren't just similar in Rust and C; they're basically identical. But the heart of my problem was that while I understood what Rust was doing and which problems it tries to solve, I didn't really understand those problems themselves. In my time writing software in Python and Java, I never had to worry about when an object would be freed or about pointers hanging around in data structures far away from the original object. While I understood that attempting to use memory that has already been freed, or files which have already been closed, was quite a bad thing, I didn't understand how much those issues interfered with the day-to-day software development process. The moment I understood that you didn't have to be an idiot to put a use-after-free into your code was when I finally understood what Rust was doing. References are just pointers; but when you're designing software, a pointer should never just be a pointer. That's why Rust has the totally separate & and &mut (and, not so coincidentally, why C++11 introduced smart pointers like std::unique_ptr and std::shared_ptr).
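A small sketch of the difference between & and &mut, under the assumption of a made-up score list (the names `total`, `record`, and `scores` are hypothetical and exist only for this example):

```rust
// Hypothetical helper: a shared reference (&) grants read-only access,
// and any number of shared references may exist at the same time.
fn total(scores: &[u32]) -> u32 {
    scores.iter().sum()
}

// Hypothetical helper: an exclusive reference (&mut) is the only live
// access to the data, which is what makes mutating through it safe.
fn record(scores: &mut Vec<u32>, points: u32) {
    scores.push(points);
}

fn main() {
    let mut scores = vec![100, 300];

    // A shared borrow pointing into the vector's storage.
    let first = &scores[0];

    // Calling record(&mut scores, 500) at this point would not compile:
    // `first` is still used below, and push may reallocate the vector,
    // leaving `first` pointing at freed memory. In C, the equivalent
    // code compiles without complaint.

    println!("first = {first}");

    // `first` is no longer used, so the exclusive borrow is now allowed.
    record(&mut scores, 500);
    println!("total = {}", total(&scores));
}
```

Nothing in this program looks careless, which is the point: the dangling pointer would come from an ordinary push, not from an obvious mistake.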

I'm not entirely sure where I'm going with this, but my experience with learning Rust was the first time that I recognized myself being blocked on a topic by a lack of experiential understanding of its background. The novelty of that experience alone was enough to warrant a closer inspection, and after a lot of thought I think it is the programming equivalent of learning a simplified mathematical model before the more accurate (but complex) one. There is only so much new information we can absorb before it needs to be cemented.

In my case, I am an extreme "learn by doing" person, so upgrading from some toy Rust programs to a medium-sized self-guided one seemed like the logical way to get the basics of Rust down. Unfortunately, Rust does not present a way to navigate a larger undertaking like that without wading into its more advanced topics. This is a place where garbage-collected languages - and, to their credit, C and archaic C++ - shine. You can write a naïve Python or Go program without worrying about what's happening with your memory or the shape of your call graph, and it'll do something. You'll hit bugs big and small, and you shouldn't yet work on software used by other people, but you'll feel good about the foundations of how to actually get stuff done in those languages. Rust is insistent about hitting you over the head with a sack of bricks until you understand all the things that make Rust uniquely Rust. After that, and only after that, can you get up off the ground and do what you originally wanted. At least they have the courtesy to warn you up front - "fighting with the borrow-checker" didn't become a meme for nothing.