Thursday, August 16, 2018

Stop describing me!

Anyone who's ever worked in software knows how difficult it is to keep the code and the documentation in agreement. It's all too common to find errors in documentation, and not just typos, but misleading or even just false statements. This is (generally) not the result of malice, laziness, or stupidity, but is rather a natural consequence of trying to keep two unrelated things in sync with each other. Programming and natural languages are just not similar enough to make this a simple or automatic task. It's one reason experienced programmers are prone to say that all code comments are lies.

Additionally, documentation is far too often a crutch used to make up for the fact that the design is not intuitive. As Chip Camden once said, "Documentation helps, but documentation is not the solution - simplicity is." That doesn't mean that software can't be doing complex things - doing complex things is the whole point! What it means is that the interfaces to software, whether they be APIs or GUIs, have to make the complex thing simple.

Documentation, then, is typically a symptom of poor design, and a difficult symptom to manage at that. But how, then, do we communicate the what and why of a software system?

The answer is self-documenting systems. "Self-documenting system" sounds like the tagline of some new tool somebody's trying to sell, but it's actually a design philosophy. It's an approach that says: make the endpoints, buttons, menu options, links, namespaces, class, method, parameter, and property names so obvious as to their purpose & function that, for the majority of them, documentation would be redundant.

The term "redundant" is used very deliberately here - it's not that documentation is never warranted, it's that we should seek to make it unnecessary, for the reasons mentioned above. There will still need to be a few bits of explanation here and there, but they'll be few, so maintaining them won't be a big burden, and they'll be very targeted, so it'll be very obvious when they need to be updated. (By targeted, I mean they'll be either high-level descriptions of the unit as a whole, or notes on some edge cases.)
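To make that concrete, here is a minimal sketch in Python. Every name in it (proc, add_days, add_weeks) is made up for illustration and not taken from any real library. The first function needs a paragraph of documentation before anyone can call it safely; the other two say what they do in their names and parameters.

```python
from datetime import date, timedelta


# Opaque version: a caller has to look up what "d", "n", and "f" mean,
# and that f=True silently switches the unit from days to weeks.
def proc(d, n, f=False):
    return d + timedelta(days=n * (7 if f else 1))


# Self-describing versions (hypothetical names): the signature is the
# explanation, so a docstring would mostly repeat it.
def add_days(start_date: date, days: int) -> date:
    return start_date + timedelta(days=days)


def add_weeks(start_date: date, weeks: int) -> date:
    return start_date + timedelta(weeks=weeks)


two_weeks_later = add_weeks(date(2018, 8, 16), weeks=2)
print(two_weeks_later)  # 2018-08-30
```

Both versions compute the same thing; the only difference is how much a reader has to be told. The one note that might still be worth writing is the targeted kind, such as what happens when a negative number is passed.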

Self-documenting systems lead us to the Principle of Minimum Necessary Documentation, which states:

There is an inverse relationship between how well designed something is and how much documentation it needs.

The term "inverse" is also used very deliberately. The inverse of infinity is not zero - it's darn close to it, but it's not actually nothing. A perfect system would need no documentation, but there are no perfect systems, so no system can get by with zero documentation, no matter how well designed it is. But for a well-designed system, a README file or some tooltips will probably suffice.

But do not assume from this that a project with no documentation except a README is well-designed, or that you can get away with just throwing together a README and taking no other thought for what's going on. Note the principle says "how much documentation it needs" not how much it has. If users have cause to say "the documentation has been skimped on", then the design is poor and no crutch was provided, which is even worse than a bad design band-aided with documentation. Intuitive, self-documenting design is the critical factor, not the documentation itself.

Now, there are a few caveats & clarifications that should be mentioned:
  • 'Self-documenting' is often used in connection with test-driven development. There is a vein of thought that says the tests should be the mechanism for documenting the code. This is a very good approach to take, but tests cannot serve as the user documentation. If the project is a public one, expecting people to read the tests is not realistic, and even with an internal project, understanding the guts of how a component works is often beyond the scope of the task at hand. Tests can and should be the developer documentation, and complete test coverage is critically important to developing robust software, but a test suite is typically too large and specialized to be the sole documentation.
  • Most systems have a fairly obvious surface. It may not be instantly obvious what a class structure or API or SDK or GUI does, but it's usually pretty easy to see what it has. There are interfaces though that don't have an easily reflectable surface, such as command-line tools. At first blush, it might seem that this is a place where more documentation is a necessary evil. Certainly there are a lot of tools that believe extensive documentation is preferable to abstraction (git, I'm looking at you). But the principle does still apply here. It is still incumbent on the developer to make the behavior intuitive (typically by following conventions), and to make the options easily discoverable. There are a myriad of ways to approach this, and it depends a lot on how flexible the tool needs to be. It requires more work, but that's our job: to make it easy to use.
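As one possible illustration of the command-line point, here is a small sketch using Python's standard argparse module. The tool itself ("thumbs") and all of its options are hypothetical; what matters is that it follows common conventions (short and long flags, a positional source argument, a dry-run switch) and that every option explains itself in the generated --help output.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "thumbs" and its options are invented for this example.
    parser = argparse.ArgumentParser(
        prog="thumbs",
        description="Create thumbnails for every image in a directory.",
    )
    parser.add_argument(
        "source_dir",
        help="directory containing the original images",
    )
    parser.add_argument(
        "-o", "--output-dir",
        default="thumbnails",
        help="where to write the thumbnails (default: %(default)s)",
    )
    parser.add_argument(
        "-w", "--max-width",
        type=int,
        default=200,
        help="maximum thumbnail width in pixels (default: %(default)s)",
    )
    parser.add_argument(
        "-n", "--dry-run",
        action="store_true",
        help="list what would be written without writing anything",
    )
    return parser


# Parse an explicit argument list so the example runs anywhere.
args = build_parser().parse_args(["photos", "--max-width", "320", "--dry-run"])
print(args.source_dir, args.output_dir, args.max_width, args.dry_run)
```

Running the tool with -h or --help prints the description and every option with its default, so the surface is discoverable from the tool itself rather than from a separate manual that can drift out of date.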
It's as I've said before: deal with things directly. Don't build hedges around them, because it just takes more work and isn't as effective. Don't worry about documentation unless there truly is no other way to get the point across. Focus instead on making your code as intuitive and straightforward as possible, as this is the greatest programming good.