On variable naming when teaching

One of the hardest things a programmer has to do on a daily basis is naming things. Anything that we name will stay with us for a while and it's very likely that other programmers will have to use the thing we just named as well. So naming something properly is very important. It's often said that the two hardest problems in programming are naming things and cache invalidation. I tend to agree with that statement.

A lot of times when I struggled with a piece of code, better naming could have made my struggle easier. A function name like fetchData might make sense at the time of writing. But a few weeks later you look at that code and you start wondering: what data does that function fetch, exactly? Couldn't it have been named fetch just as well? I mean, the fact that it fetches data is implied. Or is it? Almost always there's room for discussion on the naming of things in code. However, that's not the point I want to make in this post. This post is about naming things in the context of tutorials and books. In other words, code snippets that have a teaching purpose.

When you write code that is intended for teaching, you should keep two things in mind:

You want to get a certain point or concept across to the reader. The code should make this point as clear as possible.
You want to stick to best practices so you're not teaching bad habits.

I have found that these two goals can easily be in conflict.

An example

Let me show you an example I have found in the Functional Swift book by the folks over at objc.io.

struct Times<T, U> {
    let fst: T
    let snd: U
}

These four lines won't look very scary to somebody who is familiar with generics in Swift. They will know that T and U are just placeholders for types. Any name could go there; we could have defined them as A and B or Hello and World just as well. However, the best practice is that generic naming starts at T and seems to work its way up through the alphabet from there. That's why this snippet uses T and U as the type names for the generic portion of this struct. For beginners this might be confusing, so you could argue that more descriptive type names would be better. For example, FirstType and SecondType are a lot clearer. They don't follow best practices though, so picking between the two can prove to be quite tough, and in my opinion it depends on the point you're trying to get across. In the above snippet the concept of generics has already been explained in previous chapters, so T and U are just fine here. If this snippet were about explaining generics, it might have been better to help the reader out a little bit by breaking best practices for the sake of readability and introducing the conventional names after explaining how generics work.
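To make the comparison concrete, here is a minimal sketch of the same struct written both ways. DescriptiveTimes is a name I made up for this illustration and the sample values are arbitrary; neither comes from the book.

```swift
// Conventional placeholder names, as in the snippet from the book.
struct Times<T, U> {
    let fst: T
    let snd: U
}

// Hypothetical teaching variant: the same struct, but the generic
// parameters say what they stand for.
struct DescriptiveTimes<FirstType, SecondType> {
    let fst: FirstType
    let snd: SecondType
}

// Both are used in exactly the same way. Swift infers the concrete
// types (Int and String) from the arguments.
let conventional = Times(fst: 1, snd: "one")
let descriptive = DescriptiveTimes(fst: 1, snd: "one")
```

The two structs behave identically. The only difference is how much the reader has to already know about generics to read the declaration.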

What does bother me about the way the Times struct is defined is the way fst and snd are named. The author chose to sacrifice readability in order to save a few keystrokes. In production code this happens all the time. Loops like for u in users or comprehensions like [obj.name for obj in res] are not uncommon in Swift and Python. One might even argue that using short names like this is some sort of convention. That might be true, but if you're explaining something in code, you do not want the reader to have a single doubt about what something does because of its name. For example, fst and snd in the Times struct could have been named first and second or left and right. A loop like for u in users could be for user in users. [obj.name for obj in res] could be clarified by writing [user.name for user in fetched_users]. These more verbose versions might not be fully in line with best practices or common in production code, but when you're using code to explain something you need to make sure that your code is as readable as possible.
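Here is a rough sketch of what those renames could look like in Swift. ReadableTimes, the User type and the sample data are all made up for this illustration; they are not from the book.

```swift
// Hypothetical variant of the Times struct with spelled-out property names.
struct ReadableTimes<T, U> {
    let first: T
    let second: U
}

// Hypothetical User type and sample data, invented for this example.
struct User {
    let name: String
}

let users = [User(name: "Ada"), User(name: "Grace")]

// Terse: the reader has to work out what `u` stands for.
var terseNames: [String] = []
for u in users {
    terseNames.append(u.name)
}

// Descriptive: the loop reads almost like a sentence.
var readableNames: [String] = []
for user in users {
    readableNames.append(user.name)
}

// The spelled-out property names also make the call site self-explanatory.
let pair = ReadableTimes(first: users[0].name, second: users.count)
```

Both loops produce the same array, so the longer names cost nothing at runtime. They only cost a few extra keystrokes for the author.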

In conclusion

Naming things is hard; there's no doubt about it. What might be clear one day could look like gibberish the next. What might be obvious to me could be nonsense to you. Conventions help ease the pain. If everybody uses the same rules for naming things, it becomes a little bit easier for programmers to come up with good names. However, we should not forget that people who read our code to learn more about a certain topic might not be fully aware of certain conventions. Or they might not be very good at understanding what i, j and k mean when we're nesting loops. And let's be honest, those single-letter variables carry very little meaning. Even though it's convention and we all do it, it's just not a good convention to follow when teaching. At least not all of the time.
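As a small illustration of the nested loop case, here is a sketch that walks a two-dimensional array twice, once with single-letter indices and once with descriptive ones. The seating chart and all of the names in it are invented for this example.

```swift
// Hypothetical seating chart: each inner array is one row of seats.
let seatingChart = [
    ["Ada", "Grace"],
    ["Alan", "Edsger"],
]

// Conventional single-letter indices.
var terseLabels: [String] = []
for i in 0..<seatingChart.count {
    for j in 0..<seatingChart[i].count {
        terseLabels.append("\(i)-\(j): \(seatingChart[i][j])")
    }
}

// Descriptive indices: each name says which dimension it walks.
var readableLabels: [String] = []
for rowIndex in 0..<seatingChart.count {
    for seatIndex in 0..<seatingChart[rowIndex].count {
        let occupant = seatingChart[rowIndex][seatIndex]
        readableLabels.append("\(rowIndex)-\(seatIndex): \(occupant)")
    }
}
```

Both versions build the same list of labels. In the second one a reader never has to stop and ask which of the two letters was the row.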

Next time you write code that's intended for explaining something, ask yourself if breaking a convention will make your snippet simpler or easier to follow. If the answer is yes, it just might be a good idea to break the convention and save your readers some brainpower.