
The only computer science book worth reading twice?

I was talking to one of my students earlier, and lent him a book to read over the summer. It was only after he’d left that I realised that — for me at any rate — the book I’d given him is probably the most seminal work in the whole of computer science, and certainly the book that’s most influenced my career and research interests.

So what’s the book? Structure and Interpretation of Computer Programs by Hal Abelson and Gerry Sussman (MIT Press, 1985, ISBN 0-262-01077-1), also known as SICP. The book’s still in print, but — even better — is available online in its entirety.

OK, everyone has their favourite book: why’s this one so special to me? The first reason is the time I first encountered it: in Newcastle upon Tyne in the second year of my first degree. I was still finding my way in computer science, and this book was a recommended text after you’d finished the first programming courses. It’s the book that introduced me to programming as it could be (rather than programming as it was, in Pascal at the time). What I mean by that is that SICP starts out by introducing the elements of programming — values, names, binding, control and so on — and then runs with them to explore a quite dazzling breadth of issues including:

  • lambda-abstraction and higher-order computation
  • complex data structures, including structures with embedded computational content
  • modularity and mutability
  • streams
  • lazy evaluation
  • interpreter and compiler construction
  • storage management, garbage collection and virtual memory
  • machine code
  • domain-specific languages

…and so forth. The list of concepts is bewildering, and only stays coherent because the authors are skilled writers devoted to their craft. But it’s also a remarkable achievement to handle all these concepts within a single language framework — Scheme — in such a way that each builds on what’s gone before.
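
To give a flavour of how these ideas interlock, here’s a minimal sketch in Scheme, the book’s own language, of two items from the list: higher-order procedures and lazy streams. It’s written in the spirit of SICP’s streams chapter rather than copied from it, and the procedure names are mine, not the book’s.

    ;; Streams: a value paired with an explicitly delayed tail. (In the
    ;; book, cons-stream is a special form that does the delaying; here
    ;; the caller wraps the tail in a lambda.)
    (define (stream-cons a delayed-tail) (cons a delayed-tail))
    (define (stream-car s) (car s))
    (define (stream-cdr s) ((cdr s)))   ; force the tail by calling it

    ;; The infinite stream of integers from n: only the head exists
    ;; until someone asks for the rest.
    (define (integers-from n)
      (stream-cons n (lambda () (integers-from (+ n 1)))))

    ;; A higher-order procedure over streams.
    (define (stream-map f s)
      (stream-cons (f (stream-car s))
                   (lambda () (stream-map f (stream-cdr s)))))

    ;; Realise the first n elements as an ordinary list.
    (define (stream-take s n)
      (if (= n 0)
          '()
          (cons (stream-car s) (stream-take (stream-cdr s) (- n 1)))))

    (stream-take (stream-map (lambda (x) (* x x)) (integers-from 1)) 5)
    ;; => (1 4 9 16 25)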

The second reason is the way in which Hal and Gerry view everything as an exercise in language design:

We have also obtained a glimpse of another crucial idea about languages and program design. This is the approach of stratified design, the notion that a complex system should be structured as a sequence of levels that are described using a sequence of languages. Each level is constructed by combining parts that are regarded as primitive at that level, and the parts constructed at each level are used as primitives at the next level. The language used at each level of a stratified design has primitives, means of combination, and means of abstraction appropriate to that level of detail.

Layered abstraction, of course, is second nature to all computer scientists. What’s novel in this view is that each level should be programmable: that the layers are all about computation and transformation, and not simply about hiding information. We don’t see that in the mainstream of programming languages, because layering doesn’t extend the language at all: Java is Java from top to bottom, with classes and libraries but no new control structures. If a particular domain has concepts that would benefit from dedicated language constructs, that’s just tough. Conversely (and this is something that very much interests me) if there are constructs it’d be desirable not to have in some domain, they can’t be removed. (Within the language, anyway: Java ME dumps some capabilities in the interests of running on small devices, but that’s not something you can do without re-writing the compiler.)
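
To make that concrete, here’s a small sketch of stratified design in Scheme. It’s my own toy example, loosely in the spirit of the book’s signal-flow and picture-language material, not taken from its text: signals are represented as procedures from time to value, and each level’s procedures become the primitives of the level above.

    ;; Level 1: primitives. A signal is a procedure from time to value.
    (define (constant k) (lambda (t) k))
    (define (ramp)       (lambda (t) t))

    ;; Level 2: means of combination. These operate on whole signals,
    ;; never on individual samples.
    (define (add-signals s1 s2) (lambda (t) (+ (s1 t) (s2 t))))
    (define (scale-signal k s)  (lambda (t) (* k (s t))))

    ;; Level 3: means of abstraction. Named patterns built entirely from
    ;; the level below, and themselves primitives for any level above.
    (define (line slope intercept)
      (add-signals (scale-signal slope (ramp)) (constant intercept)))

    ((line 2 3) 10)   ; => 23, i.e. 2*10 + 3

The point is that nothing is sealed off: a user of line can still drop down and program directly with add-signals, or with raw procedures, which is exactly the programmability at every level that mainstream layering denies you.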

The third influential feature is the clear-sighted view of what computer science is actually about:

The computer revolution is a revolution in the way we think and in the way we express what we think. The essence of this change is the emergence of what might best be called procedural epistemology — the study of the structure of knowledge from an imperative point of view, as opposed to the more declarative point of view taken by classical mathematical subjects. Mathematics provides a framework for dealing precisely with notions of “what is.” Computation provides a framework for dealing precisely with notions of “how to.”
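
The book’s own first illustration of this contrast is the square root: “what is” says that √x is the non-negative y with y² = x, a perfectly precise definition that offers no method; “how to” is a procedure. Here’s a version of the Newton’s-method sketch along the lines of SICP’s opening chapter (lightly paraphrased, so details may differ from the printed listing).

    ;; "How to" compute a square root: start with a guess and improve it
    ;; by averaging it with x/guess until its square is close enough.
    (define (average a b) (/ (+ a b) 2))

    (define (improve guess x)
      (average guess (/ x guess)))

    (define (good-enough? guess x)
      (< (abs (- (* guess guess) x)) 0.001))

    (define (sqrt-iter guess x)
      (if (good-enough? guess x)
          guess
          (sqrt-iter (improve guess x) x)))

    (define (sqrt x) (sqrt-iter 1.0 x))

    (sqrt 9)   ; => 3.00009155413138, close enough to 3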

I’ve taken a view before about computers being the new microscopes, opening up new science on their own as well as facilitating existing approaches. The “how to” aspect of computer science re-appears everywhere in this: in describing the behaviours of sensor networks that can adapt while continuing to reflect the phenomena they’ve been deployed to sense; in the interpretation of large-scale data mined and mashed-up across the web; in capturing scientific methods and processes for automation; and so forth. The richness of these domains militates against packaged software and encourages integration through programming languages like R, so that the interfaces and structures remain “soft” and open to experimentation.

When I looked at my copy, the date I’d written on the inside was September 1988. So a book I bought nearly 22 years ago is still relevant. In fact, I’d go further and say that it’s the only computer science book of that age that I’d happily and usefully read again without it being just for historical interest: the content has barely aged at all. That’s not all that unusual for mathematics books, but it’s almost unheard of in computer science, where the ideas move so quickly and where much of what’s written about is ephemeral rather than foundational. It goes to show how well SICP nailed the core concepts. In this sense, it’s certainly one of the very few books on computer science that it’s worth reading twice (or more). SICP is to computer science what Feynman’s Lectures on Physics are to physics: an accessible distillation of the essence of the subject that’s stood the test of time.


6 Comments

  1. Unfortunately, the layered abstract machine view of the world doesn’t work well for real complex systems as they simply aren’t like that. The interactions and the coupling in real systems just aren’t organised in this neat way (see Perrow’s ‘Normal Accidents’). It can work for some artificial systems (such as an operating system) but trying to force real systems into this framework leads to systems that don’t support the world as it is, rather than the world as computer scientists think it should be.

    However, I agree it’s still a good book.

    • True — of course 🙂 — but I suspect that’s not the crux of the argument. Even if we move to a more realistic and complex architecture (distributed objects, SOA, components, whatever), I think the main point is to keep the system programmable in all its aspects: in other words, allow programmatic extension and scripting everywhere and not just at the highest level. As you point out, layering and information hiding don’t work as cleanly as might naively be supposed, and what we need is the ability to tailor and add functionality (and remove — which gets overlooked all too often) at all the appropriate abstraction levels. There are some serious issues of exposure and permissions, of course, but that’s a slightly separate argument.

  2. Even though the computer science curriculum has evolved a lot in the past few years, the concepts remain unchanged. What students need to learn is the concept behind a computer language. Once you get hold of the concept, it can be applied to any computer language in the world. It doesn’t matter whether you study C++ or Java programming; the concepts are basically the same. As a result, even though the book was written 22 years ago, it is still very much relevant to today’s technology.
