The Death of Grass

John Christopher (1956)

Post-apocalyptic fiction of a determinedly British kind. It’s not a bad book, and has a certain complexity to it in exploring how people’s attitudes might change when faced with the destruction of normal civilisation.

A deadly virus destroys all grass-like plants, thereby eliminating almost all food crops and the cattle that they feed. Worried the government might atom-bomb the major population centres, a small group leave London to make their way to a Lake District valley. On the way they encounter looters, towns gone feral to protect themselves – and become feral themselves, willing to kill to survive. In some ways it reads like Lord of the Flies with adults and a more pervading sense of long-term doom.

But it’s also very much a novel of its time, full of racism, sexism, classism, deference, stiff upper lips, and a sense of self-justification wrapped-up as duty. That makes it a hard read, and it doesn’t really have enough force to balance that out. Many similar points are made elsewhere, for example in Earth Abides, without the 1950s baggage.

2/5. Finished Sunday 11 February, 2024.

(Originally published on Goodreads.)

Trying to refute some criticisms of Lisp

I recently had a discussion with someone on Mastodon about Lisp and its perceived (by them) deficiencies as a language. There were some interesting points, but I felt I had to try to refute them, at least partially.

I should say from the start that I’m not blind to Lisp’s many inadequacies and anachronisms, merely pointing out that it has a context like everything else.

There seemed to be two main issues:

  • Poor design decisions throughout, and especially a lack of static typing
  • The shadows of really early machines in car and cdr

These points are tied together, but let’s try to unpack them.

Design

Let’s start with design. Lisp is over half a century old. I’d argue it was exceptionally well-designed – when it was designed. It lacks most modern advances in types because … well, they didn’t exist, many of them arose as solutions to perceived problems in Lisp (and Fortran), and many of those “solutions” still aren’t universally accepted, such as static typing itself.

What we’ve actually learned is that many aspects of programming lack any really universal solutions. If static typing were such an obvious and unarguable route to efficiency and quality, all new software would already be written in Haskell.

Typing and features

And the lack of modern types isn’t really as clear-cut as it appears. The argument about the lack of features in Lisp also ignores the presence of other features that are absent from almost all other languages.

Lisp’s numeric types are surprisingly flexible. Indeed, Common Lisp is still, in the 21st century, just about the only language in which one can write modern crypto algorithms like Diffie-Hellman key exchange without recourse to additional libraries, because it has arbitrary-precision integer arithmetic built into the standard operators. It also has rational numbers, so there’s no loss of precision on division either.
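
To make that concrete, here’s a minimal sketch (my illustration, not from the original discussion) of the naive-but-exact core of a Diffie-Hellman-style computation, using nothing beyond the standard operators:

   ;; Arbitrary-precision integers: these values overflow 64-bit
   ;; machine words, but Lisp's integers simply grow as needed.
   ;; (The modulus and base are illustrative, not a vetted group.)
   (defparameter *p* (- (expt 2 127) 1))   ; a Mersenne prime
   (defparameter *g* 5)

   (defun dh-public-key (secret)
     "Compute g^secret mod p with plain EXPT and MOD (naive but exact)."
     (mod (expt *g* secret) *p*))

   ;; Rationals mean division loses no precision:
   ;; (+ 1/3 1/6) => 1/2, not 0.49999...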

The Common Lisp Object System (CLOS) is vastly more flexible than any modern object-oriented language. Sub-class methods can specify their relationship with the methods they override, such as being called after or just filtering the return values. Methods themselves are multiple-dispatch and so can be selected based on the types of their arguments as well as their target. The basic mechanisms can be overridden or extended using a meta-object protocol.
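
A hedged sketch (mine, not from the discussion) of both features: the method chosen depends on the classes of both arguments, and an :after method lets a definition state its relationship to the behaviour it extends:

   ;; Multiple dispatch: COLLIDE selects a method on the types of
   ;; BOTH arguments, not just a single receiver object.
   (defgeneric collide (a b))

   (defclass asteroid () ())
   (defclass ship () ())

   (defmethod collide ((a asteroid) (b ship))
     (format t "the ship is destroyed~%"))

   (defmethod collide ((a asteroid) (b asteroid))
     (format t "the asteroids shatter~%"))

   ;; An :AFTER method runs after whichever primary method fires,
   ;; one of several standard method combinations.
   (defmethod collide :after (a b)
     (format t "collision logged~%"))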

Then there are macros. It’s easy to underestimate these: after all, C has macros, doesn’t it? Well, yes – and no. A C macro is a function from strings to strings that can do literal string substitution of its arguments. A Lisp macro is a function from code to code that can perform arbitrary computation. They’re really not the same things at all, and it’s misleading that the same word is used for both. (C++ templates are a closer analogy, but still limited in comparison.)
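
As a minimal sketch of the difference (my example): a Lisp macro receives its arguments as s-expressions and can compute over them before handing new code to the compiler:

   ;; MY-WHILE takes the test and body as unevaluated code and
   ;; returns new code; the computation happens at macro-expansion
   ;; time, not at run time.
   (defmacro my-while (test &body body)
     `(loop (unless ,test (return))
            ,@body))

   ;; (my-while (< i 10) (print i) (incf i))
   ;; expands into (LOOP (UNLESS (< I 10) (RETURN)) (PRINT I) (INCF I))
   ;; before compilation, something textual substitution can't express.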

The persistence of hardware 1: Stupid operation names

The complaints about car and cdr are long established: they were originally derived from machine-language instructions on the IBM 704 that was used for the first Lisp implementations. They’re a terrible hold-over from that terrible decision … aren’t they?

Well, yes – and no. Of course they’re terrible in one sense. But car and cdr are basically nouns as far as Lisp programmers are concerned. One could replace them with more modern usages like head and tail (and indeed many Lisps define these using macros).
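
For example (a sketch, not a standard library), the renaming is a one-liner each:

   ;; Replacing the IBM 704 names is trivial; Common Lisp already
   ;; provides FIRST and REST as standard synonyms.
   (defmacro head (l) `(car ,l))
   (defmacro tail (l) `(cdr ,l))

   ;; (head '(1 2 3)) => 1
   ;; (tail '(1 2 3)) => (2 3)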

But it’s important to remember that even “head” and “tail” are analogies, sanctified by familiarity in the computer science literature but still inexplicable to anyone outside. (If you doubt that, try explaining to someone who isn’t a programmer that a shopping list has a “head” consisting of the first entry and a “tail” consisting of another, shorter, shopping list; that it is “in fact” a recursive type; and that you have to acquire each item of shopping sequentially by working your way down the list from the head.) car and cdr are artificial nouns, and cons is an artificial verb – but really no more artificial than head, tail, and append, their rough equivalents in other languages.

One can argue that the persistence of car and cdr drives the persistence of compounds like caaddr. But those are unnecessary and seldom used: barely anyone would mind if they were removed.

The persistence of hardware 2: It happens a lot

The suggestion that Lisp has hardware holdovers that should be removed also neglects these holdovers in other languages.

As an example, check the definition of std::memcpy in C++. It doesn’t work with overlapping memory areas. Why is that? – why is it so fast, but so dangerous? Does it relate to underlying machine features, such as machine code move instructions on particular machines with particular restrictions? Doesn’t this introduce the risk of security flaws like buffer overruns?

Languages with more abstracted machine models don’t have these issues. I struggle to think of how one could even introduce the concept of a buffer overrun into Lisp, other than by using some external raw-memory-access library: the language itself is immune, as far as I know.
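
To make that concrete (my illustration, assuming typical safety settings): an out-of-range access signals an error instead of touching adjacent memory:

   ;; Vector accesses are bounds-checked in safe code, so there is
   ;; nothing to overrun: the bad index is rejected outright.
   (let ((buffer (make-array 5 :initial-element 0)))
     (handler-case
         (setf (aref buffer 10) 99)    ; index out of range
       (error (e)
         (format t "rejected: ~a~%" e))))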

The different choices

For the sake of argument, let’s turn the question around and ask: given that early Lisps had proper macros, arbitrary-precision integers, and so on, why did these features disappear from what we now consider to be “the mainstream” of programming language design?

Lisp’s designers had a goal of building a powerful machine in which to think: indeed, they intended it to eventually have its own hardware designed specifically for it to run on. They therefore didn’t buy into the necessity of immediate performance, and as their applications were largely symbolic AI they didn’t need numerical performance at all. They chose instead to create high-level constructs even if these couldn’t be compiled efficiently, and explored using these to create more code as they identified more and more abstract patterns whose details could be automated away. (Paul Graham has a great essay on this.)

Other language designers had other priorities. Often they needed to do numerical simulation, and needed both performance and scale. So they chose a different design pathway, emphasising efficient compilation to the hardware they had available, and made the compromises needed to get it. These have persisted, and that’s why we have languages with fixed-width integers scaled to fit into a single machine register, and compilers that generate – but don’t directly execute – the code of programs, which limits our ability to abstract and automate code generation without recourse to complicated external tools.

It’s interesting to explore these choices. They’re at one level “just” historical: accidents that shaped the present. But at another level they’re still very much present in the hardware and software landscape we inhabit. I think it’s important that we remind ourselves, continuously, that much of that landscape is a choice, not a given, and one we can question and change as we wish.

Local email from Office365 using OAUTH2 with mbsync

I decided recently I wanted to have a more controlled email setup, with a local archive rather than relying on remote servers to keep everything. The point of this is twofold:

  1. To have a local archive of email, separate from the corporate servers in case I need to change provider etc
  2. To use different MUAs locally, rather than being stuck with only a few that will work with all the providers and that are clunky and not well-integrated with my workflow

There’s a lot of outdated information on the web about how to set this up and it took some time for me to get a working setup, so I thought I’d share my experience. Specifically this involves interfacing command-line email receiving and sending to a Microsoft Office365 server using IMAP and SMTP with corporate-grade OAUTH2 2FA authentication: it’s the last part that’s tricky. As a bonus the same approach also works for OAUTH2 and Gmail, dispensing with insecure application passwords.

In case it’s not obvious by now, this is a hacker set-up that requires quite a lot of technical manual configuration.

How the internet email architecture works

The old-school approach to email involves several elements, each potentially provided by a different tool:

  • a client program or mail user agent (MUA) that presents email to you and lets you search, delete, store, etc;
  • a retrieval program or mail delivery agent (MDA) that retrieves mail from the providers and manages local email directories; and
  • a sending program or mail transfer agent (MTA) that takes locally-created messages and transfers them to their intended recipients.

Modern GUI email clients like Thunderbird typically wrap-up all three services into one program that’s easier to deploy and manage, but that therefore forces certain choices on the user. By reverting to the older architecture we regain flexibility and choice, at the expense of making our lives harder.

All these tools need to authenticate against other services. Traditionally this used usernames and passwords, which are clearly inadequate for the modern web. Instead we need a system based around stronger encryption.

OAUTH2 is an authorisation delegation protocol that lets a site grant access to someone who’s authenticated against another, without getting sight of their credentials. The typical use case is for a web site to allow users to sign in using social media services like Facebook or Google, which reduces the number of passwords a user needs to remember or manage – and, totally incidentally I’m sure, improves the social media services’ ability to track users’ activities across the web.

In our case, the OAUTH2 “flow” interacts with the authentication provider and acquires a bearer token that can then be presented to authorise access to the various email services.
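
For the IMAP and SMTP uses below, that bearer token gets wrapped into a SASL XOAUTH2 string before being presented to the server. A hedged sketch of the format in Common Lisp (assuming the cl-base64 library; the helper name and values are mine):

   ;; XOAUTH2 initial response: "user=<email>\x01auth=Bearer <token>\x01\x01",
   ;; base64-encoded before being sent to the IMAP or SMTP server.
   ;; (ql:quickload "cl-base64")
   (defun xoauth2-string (user token)
     (let ((soh (code-char 1)))   ; the \x01 separator
       (cl-base64:string-to-base64-string
        (format nil "user=~a~cauth=Bearer ~a~c~c"
                user soh token soh soh))))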

Outline solution

In outline the solution is as follows:

  1. Install mbsync as MDA
  2. Set up OAUTH2 authentication for Office365
  3. Use the tokens to authenticate mbsync against Office365 to allow retrieval
  4. Install msmtp as MTA, using the same authentication scheme
  5. Install mu4e as MUA, since I want to read my email from inside Emacs

Packages

Under Arch Linux we need the isync package for synchronisation and the cyrus-sasl-xoauth2 provider for OAUTH2 authentication.

   sudo pacman -S isync
   yay -S cyrus-sasl-xoauth2

The same packages are available for other distros under similar names. Note that the actual synchronisation tool is called mbsync, even though the package that contains it is called isync.

OAUTH2 flow management

We want to use OAUTH2 to authenticate an IMAP transaction, so that no additional passwords are needed. To do this we need a script to manage the OAUTH2 flow.

Weirdly for an operation that’s becoming so common on the web, there doesn’t seem to be a package that offers OAUTH2 from the command line. However, there is a script that does it that’s included as an example with the mutt MUA, and we can use that. It can be found (in Arch) in the mutt package.

   sudo pacman -S mutt
   cp /usr/share/doc/mutt/samples/mutt_oauth2.py .

This puts a copy of the script into the current directory, which we can then edit in two ways:

  • add the internal application identification and client secrets for accessing Office365; and
  • set up the security for the OAUTH2 access tokens when they’re downloaded and held locally.

The client secret and app id need to be “proper”, in the sense that Office365 knows about them – but weirdly they don’t have to be related to your email domain or cloud tenancy. It’s perfectly fine to use credentials available in the public domain, for example those of Thunderbird:

   AppID = "08162f7c-0fd2-4200-a84a-f25a4db0b584"
   ClientSecret = "TxRBilcHdC6WGBee]fs?QR:SJ8nI[g82"

(I got these from here, but other than that have no idea where they come from: they’re not the same as those in the Thunderbird source code, as far as I can tell.)

The mutt_oauth2.py script stores the tokens it manages in a gpg-encrypted file. You therefore need to provide your gpg keypair identification, and I’m assuming anyone wanting to get local email has one of those! Mine is “simoninireland”.

   GPGKey = "simoninireland"

I edited the file to look like this, with some details elided:

   MSAppID = "08162f7c-0fd2-4200-a84a-f25a4db0b584"
   MSClientSecret = "TxRBilcHdC6WGBee]fs?QR:SJ8nI[g82"
   GPGKey = "simoninireland"

   ENCRYPTION_PIPE = ['gpg', '--encrypt', '--recipient', GPGKey]
   DECRYPTION_PIPE = ['gpg', '--decrypt']

   registrations = {
       'google': {
           ...
           'client_id': '',
           'client_secret': '',
       },
       'microsoft': {
           ...
           'client_id': MSAppID,
           'client_secret': MSClientSecret,
       },
   }

Put the resulting script into /usr/local/bin and make it executable. Then run it in “authorisation” mode. The token file can go anywhere: I put it in the directory used by pass to allow for an alternative access route:

   mutt_oauth2.py -t .password-store/email/work.gpg --authorize

This will ask some questions:

  • we want “microsoft” authentication
  • and a “localhostauthcode” flow
  • enter your email address (the actual user, not any alias)

and it prints out a URL to copy into a browser to authenticate against Office365’s web interface. In my case this involved interacting with the university’s single sign-on and two-factor authentication (2FA) system. Doing this successfully put the necessary OAUTH2 tokens, encrypted, into the specified file. Running:

   mutt_oauth2.py -t .password-store/email/work.gpg

will output the token, refreshing it automatically if it’s expired. This may ask for the GPG key’s passphrase, if it has one, and if it’s not available from a local key agent.

(All this security means that the bearer tokens are stored encrypted at rest. It’s a little inconvenient, though, as it means you need to enter a gpg passphrase periodically, and makes it hard to run mbsync in a cron job. This is fine if, like me, your level of security paranoia is such that you accept the minor inconvenience in exchange for not having plain-text access tokens lying around; on the other hand, you may decide that using, for example, a machine with full-disc encryption is secure enough, in which case you need to edit the ENCRYPTION_PIPE and DECRYPTION_PIPE commands in the script to not do encryption: they can basically just use cat to store and retrieve the token information.)

mbsync for Office365

We now have OAUTH2 tokens for accessing Office365, which we can integrate with our MDA. mbsync has four main concepts:

  • Accounts, typically using IMAP
  • IMAP message stores, which are remote
  • Maildir stores, which are local
  • Channels, which tie local and remote together

Maildir is a file format for storing email in a directory structure, and is a long-running standard that’s supported by lots of tools. A maildir is typically presented in the MUA to a user as a folder, and represented to the MDA as a directory.

For Office365 we have:

   IMAPAccount work
   Host outlook.office365.com
   Port 993
   User <<work-email>>
   PassCmd "mutt_oauth2.py -t ~/.password-store/email/work.gpg"
   AuthMechs XOAUTH2
   SSLType IMAPS

   IMAPStore work-remote
   Account work

   MaildirStore work-local
   Subfolders Verbatim
   Path ~/Maildir/Work/
   Inbox ~/Maildir/Work/Inbox

   Channel Work
   Far :work-remote:
   Near :work-local:
   Patterns * !"Conversation History" !Calendar !Archive !Archives !Clutter !Drafts
   Create Both
   Expunge Both
   SyncState *

(See the mbsync man pages for the details of its configuration. <<work-email>> should be a proper username, not an alias.) For our purposes the important line is the PassCmd that calls our edited script to retrieve the OAUTH2 bearer token. Email will be downloaded into a maildir tree rooted at ~/Maildir/Work: you need to create this before sync-ing.

   mkdir -p ~/Maildir/Work

Sync’ing the email

For a full sync of all maildirs just run:

   mbsync -a

That can be time-consuming, as all the maildirs (i.e., folders) have to be visited – and I have several hundred. A faster option for everyday use is to sync just (for example) the inbox:

   mbsync Work:INBOX

This will ignore everything else, which means the other folders will drift – but they can be re-sync’ed periodically by running a full sync. One could also set up a cron job to do a full sync early every morning, for example, as long as the access token was held unencrypted (see above).

Indexing email

You’ll almost certainly now want to index your newly-downloaded trove of messages. There are two common tools for this: mu and notmuch. Both do basically the same job of maintaining a structured and full-text index of messages that can be queried by an appropriate MUA. I chose mu, for no particular reason: some people swear by notmuch, which is based on extensive tagging of messages and so might be more familiar to Gmail users.

To install mu, we first grab the package:

   sudo pacman -S mu

We then initialise the index by running the indexer over the maildir. If we also provide our own email address (or more than one) it knows to index these differently.

   mu init -m ~/Maildir --my-address=<<work-email>>

Sending email

All of the above sets up the MDA to get mail: we now need to be able to send mail. Fortunately we’ve already done most of the hard work needed to get this working.

We need a local MTA, for which I chose msmtp. It understands OAUTH2 natively. Installation in Arch is easy:

   sudo pacman -S msmtp

It needs to be pointed at the Office365 SMTP server and provided with the OAUTH2 tokens, which are the same as we used above:

   defaults
   auth           on
   tls            on
   tls_starttls   on
   tls_trust_file /etc/ssl/certs/ca-certificates.crt
   logfile        ~/.msmtp.log

   account        work
   host           smtp.office365.com
   port           587
   auth           xoauth2
   user           <<work-email>>
   from           <<work-email>>
   passwordeval   "mutt_oauth2.py -t ~/.password-store/email/work.gpg"

   account default : work

Again, see the msmtp man pages for the details of this, and replace <<work-email>> as appropriate: the only interesting part from our current perspective is that the passwordeval line calls exactly the same script as we used above.

Reading and writing email

Finally we’re ready to read email. I’ll leave this to you: there are lots of text-based email clients around, notably mutt that we encountered earlier. There’s also mu4e for reading email in Emacs, making use of the mu index; and notmuch also has an Emacs interface.

I use mu4e. There’s a lot of documentation on the web for setting this up, all of which applies immediately to our new set-up: the MUA is entirely independent of the MDA and MTA, and simply needs to be pointed at the right directories and accounts.

Accessing Gmail using OAUTH2

Gmail lets one use “app passwords” for IMAP access, but it also supports OAUTH2, which is obviously more secure. The same approach as above works for Gmail too. The initial credentials are:

   GAppID = '406964657835-aq8lmia8j95dhl1a2bvharmfk3t1hgqj.apps.googleusercontent.com'
   GClientSecret = 'kSmqreRr0qwBWJgbf5Y-PjSU'

(Same source as above.) Edit these into the script and change the entries in the config files to call it to authenticate with an appropriate store, for example:

   mutt_oauth2.py -t .password-store/email/personal.gpg --authorize

and similarly in the configurations of mbsync and msmtp.

Conclusion

If you’re still with me: congratulations, but you must really want to read your email old-school!

For me, this has completely changed my relationship with email in ways I didn’t expect. Using Emacs means typically not having the client visible all the time, which reduces the temptation to keep checking. Instead I can adopt a more structured approach and only check my email when I want to, which often means only three or four times a day. It’s also made email easier to manage, for example by adding hyperlinks in my to-do list straight to messages that need attention, and adding some integrations with org mode to simplify email processing. Those are matters for another time, though.

Resources

There are many resources on using mbsync, mu, mu4e, and the rest on the web. I found these covered all the topics in great detail, with the exception of the OAUTH2 integration I’ve detailed here.

LISPcraft

Robert Wilensky. LISPcraft. W.W. Norton. ISBN 0-393-95442-0. 1984.

Hard to know whether to include this as an introduction or collection of applications, since it runs all the way from basic uses to pattern-matching and associative retrieval, by way of the non-list data types in Lisp, and includes discussion of the symbol table and other internals that definitely fall into the “advanced” category.

However, this was my second introduction to Lisp (after SICP), so it has a fond place in my memory. The fact that it deals with language internals isn’t a bad thing, because it deals with the basics so well. It’s very much a traditional programming introduction focusing on the “needed” parts of the language. It pre-dates the Common Lisp standard and doesn’t touch on CLOS, which perhaps makes it a less appropriate choice for newcomers these days than Practical Common Lisp.


There is also a second edition. I haven’t read it, but it seems that it addresses at least the concern about being non-standard:

Robert Wilensky. Common LISPcraft. W.W. Norton. ISBN 978-039395544-6. 1986.

The CONNIVER reference manual

Drew McDermott and Gerald Jay Sussman. The Conniver Reference Manual. Technical report AIM-259a. MIT AI Laboratory. 1974.

I think Conniver may have a claim to being the most influential language you’ve never heard of. It’s a mostly forgotten Lisp variant that was a laboratory for some radically different language design ideas, and a precursor to a surprising set of features – many of which are still uncommon in the mainstream.

Conniver was intended to manage knowledge databases. This does make the report slightly hard to read in places, as there are a lot of explicit references to planning techniques wrapped-up with language mechanisms that don’t really depend on them.

Conniver is (to the best of my knowledge) the first appearance of generators in a programming language. It is therefore a distant precursor of all the lazy functional languages and libraries, as well as the generators found in Python. Implementing generators within a language (rather than as a built-in part of one) requires control structures that can be exited and re-entered, and therefore needs more flexible frames for controlling executing code rather than conventional stack frames that are unwound destructively on return.
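
Common Lisp doesn’t expose Conniver’s re-entrant frames, but a closure gives the flavour of a generator: each call resumes with the state the previous call left behind (my sketch, not from the manual):

   ;; A counting "generator": the closure's captured state persists
   ;; between calls, the essence of a frame that survives its return.
   (defun make-counter (&optional (n 0))
     (lambda () (prog1 n (incf n))))

   (let ((next (make-counter)))
     (list (funcall next) (funcall next) (funcall next)))
   ;; => (0 1 2)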

The obvious (for Lisp, anyway) next step is to make these “hairy” control structures visible within the language, to allow them to be re-arranged in interesting ways. Conniver does this by exposing the structure of frames, consisting of:

  • the bound variables
  • the state of the ongoing computation within the frame (e.g., the program counter)
  • a link (the ALINK) to the frame within which free variables should be looked-up
  • a link (the CLINK) to the frame to which control should return on exit from the frame

This structure in turn mandates the use of spaghetti stacks (or parent-pointer trees), where frames are implemented using lists that can be combined in richer ways than actual, literal stacks. These are the underpinnings of several different common structures:

  • generators and continuations
  • closures
  • non-local transfers, like CATCH and THROW in Common Lisp (see the sketch after this list), and therefore probably encompassing the entire condition system
  • functions with access to extra state (as with object methods, but in this case used as callbacks for database updates)
  • symbolic debuggers (not mentioned in the text)
  • lexical versus dynamic variable scope (not mentioned in the text, and I think it’s a binary choice between one or the other depending on the ALINK, rather than accommodating lexical and “special” variable classes as Common Lisp does)
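
As a concrete instance of the non-local transfer item (the sketch promised above), Common Lisp’s CATCH and THROW unwind through any intervening frames, the kind of control rearrangement a CLINK makes expressible:

   ;; CATCH establishes a dynamic exit point; THROW transfers
   ;; control to it, abandoning the frames in between.
   (defun find-first-even (list)
     (catch 'found
       (dolist (x list)
         (when (evenp x)
           (throw 'found x)))
       nil))

   ;; (find-first-even '(1 3 4 5)) => 4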

So these features are constructed in Conniver from more basic mechanisms rather than being provided built-in. I’m fascinated by what other structures one might build when every frame has two independent super-frames (one for variable lookup, one for control return) instead of one, and both can be modified independently. This is radically different to most languages in which frames are hidden and their manipulation reserved for the compiler and run-time: it’s a set of ideas that re-surface at the object level in metaobject protocols.