Why we have code | Simon Dobson

Coding is an under-rated skill, even for non-programmers. Computer science undergraduates spend a lot of time learning to program. While one can argue convincingly that computer science is about more than programming, it’s without doubt a central pillar of the craft: one can’t reasonably claim to be a computer scientist without demonstrating a strong ability to work with code. (What this says about the many senior computer science academics who can no longer program effectively is another story.) The reason is that coding helps one to think about process, and some of the best illustrations of that come from teaching.

Firstly, why is code important? One can argue that programming languages and the discipline of code itself are two of the main contributions computer science has made to knowledge. (To this list I would add the fine structuring of big data and the improved understanding of human modes of interaction: the former is about programming, the latter an area in which the programming structures are still very weak.) They’re so important because they force an understanding of a process at its most basic level. When you write computer software you’re effectively explaining a process to a computer in perfect detail. You often get a choice about the level of abstraction. You can exploit the low-level details of the machine using assembler or C, or use the power of the machine to handle the low-level details and write in Haskell, Perl, or some other high-level language. But this doesn’t alter the need to express precisely all that the machine needs to know to complete the task at hand.

But that’s not all. Most software is intended to be used by someone other than the programmer, and generally will be written or maintained in part by more than one person, either directly as part of the programming team or indirectly through the use of third-party compilers and libraries.
This implies that, as well as explaining a purpose to the computer, the code also has to explain a purpose to other programmers. So code, and programming languages more generally, are about communication: from humans to machines, and to other humans. More importantly, code is the communication of process reduced to its purest form: there is no clearer description of how a process works than well-written, properly abstracted code. I sometimes think (rather tenuously, I admit) this is an unexpected consequence of the halting problem, which essentially says that, in general, the only way to decide what a program does is to run it. By analogy, the simplest way to understand a process is to express it in a form as close to executable as possible.

“You think you know when you learn, are more sure when you can write, even more when you can teach, but certain only when you can program.” (Alan Perlis)

There are caveats here, of course, the most important of which is that the code be well-written and properly abstracted: it needs to separate out the details so that there’s a clear process description that calls into, but is separate from, the details of exactly what each stage of the process does. Code that doesn’t do this, for whatever reason, obfuscates rather than explains. A good programming education will aim to impart this skill of separation of concerns, and moreover will do so in a way that’s independent of the language being used.

Once you adopt this perspective, certain things that are otherwise slightly confusing become clear. Why do programmers always find documentation so awful? Because the code is a clearer explanation of what’s going on: it’s a fundamentally better description of process than natural language. This comes through clearly when marking student assessments and exams. When faced with a question of the form “explain this algorithm”, some students try to explain it in words without reference to code, because they think explanation requires text. As indeed it does, but a better approach is to sketch the algorithm as code or pseudo-code and then explain with reference to that code, because the code is the clearest description it’s possible to have, and any explanation is just clearing up the details.

Some of the other consequences of the discipline of programming are slightly more surprising. Every few years some computer science academic will look at the messy, unstructured, ill-defined rules that govern the processes of a university, especially those around module choices and student assessment, and decide that they will be immensely improved by being written in Haskell/Prolog/Perl/whatever. Often they’ll actually go to the trouble of writing some or all of the rules in their code of choice, discover inconsistencies and ambiguities, and proclaim that the rules need to be rewritten.
It never works out, not least because the typical university administrator has not the slightest clue what’s being proposed or why, but also because the process always highlights grey areas and boundary cases that can’t be codified. This could be seen as a failure, but can also be regarded as a success: coding successfully distinguishes between those parts of an organisation that are structured and those parts that require human judgement, and by doing so makes clear the limits of individual intervention and authority in the processes.

The important point is that, by thinking about a non-programming problem within a programming idiom, you clarify and simplify the problem and deepen your understanding of it. So programming has an impact not only on computers, but on everything to which one can bring a description of process; or, put another way, once you can describe processes easily and precisely you’re free to spend more time on the motivations and cultural factors that surround those processes without them dominating your thinking. Programmers think differently to other people, and often in a good way that should be encouraged and explored.
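
To make the point about university rules concrete, here is a minimal sketch in Python of what such an encoding might look like. Everything in it is invented for illustration: the `Student` fields, the thresholds, and the three outcomes are assumptions, not any real institution's regulations. The top-level `decide` function is meant to read like the rule book itself, while the small helper functions hold the details, and the grey areas are not hidden but returned explicitly as a referral to human judgement.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PROGRESS = "progress to next year"
    RESIT = "resit required"
    REFER_TO_BOARD = "refer to exam board"  # the part that needs human judgement


@dataclass(frozen=True)
class Student:
    # Hypothetical fields, chosen only for this sketch.
    credits_passed: int
    average_mark: float
    has_special_circumstances: bool = False


# Hypothetical thresholds; real regulations would differ.
REQUIRED_CREDITS = 120
PASS_MARK = 40.0
BORDERLINE_MARGIN = 2.0


def has_enough_credits(student: Student) -> bool:
    return student.credits_passed >= REQUIRED_CREDITS


def has_passing_average(student: Student) -> bool:
    return student.average_mark >= PASS_MARK


def is_borderline(student: Student) -> bool:
    # Just below the pass mark: a boundary case the rules do not settle.
    return PASS_MARK - BORDERLINE_MARGIN <= student.average_mark < PASS_MARK


def decide(student: Student) -> Outcome:
    """The process description: it calls into the details but stays separate from them."""
    # Grey areas are made explicit instead of being forced into a rule.
    if student.has_special_circumstances or is_borderline(student):
        return Outcome.REFER_TO_BOARD
    if has_enough_credits(student) and has_passing_average(student):
        return Outcome.PROGRESS
    return Outcome.RESIT
```

Under these assumed thresholds, `decide(Student(120, 55.0))` gives `Outcome.PROGRESS`, while `decide(Student(120, 39.0))` gives `Outcome.REFER_TO_BOARD`. The structured part of the process ends up in code, and the limit of what can be codified shows up as a single, visible branch.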