Debugging Lisp

A four-part blog post series on debugging Lisp, covering:

  1. Live recompilation of code from inside a debugging session, together with re-executing changed code from within a running stack trace
  2. Inspecting objects and interacting with traces of function calls
  3. Redefining classes, and how to ensure that existing instances are upgraded to be compatible with the new definition
  4. Restarts, the neglected part of the condition system that controls how programs continue after encountering errors

The first two are essential, and show how different Lisp programming is from using other languages. In fact it requires a considerable mental shift to re-acquire the right reflexes for dealing with errors and debugging in a fully interactive environment: well, it did for me, anyway. We’re not used to interactivity in large development environments. There is seldom any need to close down a running Lisp session and start again, as everything can usually be changed and adapted within a session. This is very unlike the usual compile-edit-debug cycles we’ve become accustomed to.

The third post – on redefining classes – shows how one can upgrade a program that simply has to keep running, because its live state upgrade can be programmed too.

The most significant part of the mental shift is to realise that the debugger is written in Lisp itself, and makes use of restarts and other features to provide the interface. This is a consequence of the degree of exposure of the Lisp run-time structures into the language itself, where they can be examined and manipulated using the full power of the language – and then be re-started or discarded as required.
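
As a flavour of what restarts look like in code (a minimal sketch of my own, not taken from the series): the handler below chooses how the computation continues, rather than simply unwinding the stack.

    ;; PARSE-ENTRY establishes a USE-VALUE restart around the parse;
    ;; a handler further up the stack can invoke it to carry on with a
    ;; substitute value instead of aborting
    (defun parse-entry (s)
      (restart-case (parse-integer s)
        (use-value (v)
          :report "Supply a value to use instead."
          v)))

    (handler-bind ((error (lambda (c)
                            (declare (ignore c))
                            (invoke-restart 'use-value 0))))
      (parse-entry "seven"))   ; => 0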

Ultra-Processed People: The Science Behind Food That Isn’t Food

Chris van Tulleken (2023)

Food versus food-shaped industrial products.

Ultra-processed food (UPF) is pervasive in modern diets, and with it comes a litany of actual, conjectured, and supposed harms. UPF itself is a strange beast, taking food crops and turning them into pure components that can then be re-mixed to construct new products. This has some perverse consequences, such as flavours being removed from purified and “modified” oils and starches in order to make them more broadly usable – and then having those same flavours re-introduced later in the process, again in “modified” form. That sounds insane – how can it be cheaper than using the original oil? – but in fact it makes perfect commercial sense for companies wanting fungible raw materials to produce unvarying, known-ahead-of-time tastes and textures.

The resulting products (I’m now reluctant to call them foods), being created in labs, can be re-worked in pursuit of particular commercial goals, for example by adding powdered soya when looking to create a “high-protein” snack. They can also be re-engineered to be far more pleasurable and addictive for consumers, for example by hacking the body’s responses to food (which are themselves coming to be understood as far more subtle and complicated than we used to think).

van Tulleken has a science background, and it shows in the writing: most of the claims are carefully framed and evidenced. His background also saves him from falling for the industry’s faux-refutations about there not being definitive causal links to specific harms: randomised controlled trials aren’t the “scientific gold standard” in situations where they’re impossible to conduct in the real world, and epidemiological evidence coupled with some knowledge of the possible harm pathways can provide sufficient evidence. Having said that, he does sometimes deviate from this careful path, and there are a few instances of words like “may” and “could” doing a lot of heavy lifting.

One of the most powerful elements of the book is that it doesn’t preach or prescribe, yet still offers suggestions for ways forward. UPF is very difficult to define precisely, and is therefore difficult to legislate for or avoid. Does a single stabiliser in a product render it UPF? – because if so, literally anything in a packet would be included. van Tulleken also examines some of the wider social drivers of UPF, notably its cheapness and ease of preparation compared to “real” food, reinforcing poor diet as a consequence of poverty. He suggests some interesting policy options, while taking aim at a policy infrastructure that’s heavily co-opted by the UPF industry. Regulators and the food industry are not partners, and their goals are irreconcilable within the current framework of pre-eminent shareholder value.

5/5. Finished Thursday 4 July, 2024.

(Originally published on Goodreads.)

Class slots that work with classes and instances in CLOS

I recently had a use case where I wanted to associate a constant value with a class and its instances – but I needed to be able to get the value without having an instance to hand. This turns out to be solvable in CLOS.

In languages like Java you can associate class variables with classes, which can then be accessed without having an instance of the class. CLOS also has class-allocated slots, for example:

    (defclass A ()
      ((instance-slot
        :initform 1)
       (class-slot
        :allocation :class
        :initform 2))
      (:documentation "A class with instance- and class-allocated slots."))

An instance of A has two slots: instance-slot, stored per-instance, and class-slot, stored only once and shared amongst all instances. This is close to Java’s notion of class variables, but one still needs an instance in hand in order to access the slot. (Seibel makes this point in chapter 17 of “Practical Common Lisp”.)
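
A quick REPL check shows the sharing (a sketch, assuming the class definition above): setting the class slot through one instance changes it for every other instance.

    ;; class-slot is allocated once for the class, so a change made through
    ;; one instance is visible through any other
    (let ((a1 (make-instance 'A))
          (a2 (make-instance 'A)))
      (setf (slot-value a1 'class-slot) 99)
      (slot-value a2 'class-slot))   ; => 99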

One could just create a basic object and retrieve the slot:

    (slot-value (make-instance 'A) 'class-slot)

but that’s inelegant and could potentially trigger a lot of unnecessary execution (and errors) if there are constructors (overridden initialize-instance methods) for A. One could use the metaobject protocol to introspect on the slot, but that’s quite involved and still allows the slot to be changed, which isn’t part of this use case.

What I really want is to be able to define a generic function such as class-slot – but specialised against the class A rather than against the instances of A. I thought this would need a metaclass to define the method on, but it turned out that generic functions are powerful enough on their own.

The trick is to first define a generic method:

    (defgeneric class-slot (classname)
      (:documentation "Access the class slot on a class."))

As the argument name suggests, we’re planning on passing a class name to this method, not an instance. To set the value for A, we specialise the method so that it applies only to the name of the class A:

    (defmethod class-slot ((classname (eql 'A)))
      2)

The eql specialiser selects this method only when exactly this object is passed in – that is to say, the name of A.

But what if we have an instance of A? The same generic function can still be used, but instead we specialise it against objects of class A in the usual way:

    (defmethod class-slot ((a A))
      (class-slot (class-name (class-of a))))

If we now pass an instance of A, we extract its class name and then re-call the same generic function, passing it the class name instead of the object itself (which it doesn’t need, because the slot value is independent of the actual object). This will select the correct specialisation and return the slot value.
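
A quick check at the REPL (assuming the definitions above) shows that both routes give the same answer:

    (class-slot 'A)                  ; => 2
    (class-slot (make-instance 'A))  ; => 2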

This approach works if we create sub-classes of A: we just use eql to specialise the generic function on the class we’re interested in. It also works fine with packages, since the undecorated symbol passed to the specialiser is resolved in the usual way according to what symbols are in scope. However, the value is only associated with a single class, and isn’t inherited. That’s not a massive limitation for my current use case, but it would be in general, I think.
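
For example, a hypothetical sub-class B needs its own eql-specialised method – it doesn’t pick up A’s value automatically:

    (defclass B (A) ())

    ;; without this method, (class-slot 'B) would signal a no-applicable-method error
    (defmethod class-slot ((classname (eql 'B)))
      3)

    (class-slot 'B)                  ; => 3
    (class-slot (make-instance 'B))  ; => 3, via the instance method defined on A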

This approach critically relies on an easily-forgotten property of Lisp: values have types, but variables don’t, and we can specialise the same generic function against any value or type. The pattern makes use of this to avoid actually storing the value of class-slot anywhere, which as a side effect avoids the problem of someone accidentally assigning a new value to it. It’s an example of how powerful generic functions are: more so than the method tables and messages found in most O-O languages. And it’s sufficiently structured that it’s crying out for a couple of macros to define these kinds of class slots.
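
Such a macro might look something like this (a sketch only – define-class-slot is a name I’ve made up, not an existing facility):

    (defmacro define-class-slot (slot-name class value)
      "Define SLOT-NAME as a constant class-level slot on CLASS."
      `(progn
         (defgeneric ,slot-name (class-or-instance)
           (:documentation "Access a constant class-level slot."))
         ;; specialise on the class's name itself
         (defmethod ,slot-name ((classname (eql ',class)))
           ,value)
         ;; specialise on instances, delegating to the class-name method
         (defmethod ,slot-name ((obj ,class))
           (,slot-name (class-name (class-of obj))))))

    ;; (define-class-slot class-slot A 2) would then reproduce the definitions above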

UPDATED 2024-06-29: Fixed the typo in the class definition to use :initform and not :initarg. Thanks to @vindarel for pointing this out to me.

Sandworm: A New Era of Cyberwar and the Hunt for the Kremlin’s Most Dangerous Hackers

Andy Greenberg (2019)

A history of cybersecurity with an emphasis on attacks against critical infrastructure.

Rather than bomb a power plant or dam, why not have its own IT systems turn on it? That’s the premise of a lot of modern cyberattacks, and they can be run with all the same sophistication as more conventional attacks against purely digital targets – but with the proviso that a lot of the targets don’t think of themselves as IT organisations and haven’t internalised the importance of their digital control systems. Since they don’t emphasise security at board level, it gets neglected and becomes a weak spot that can be exploited from anywhere on the globe.

The remedies aren’t always trivial. Many attacks detailed in this book are almost incredibly elongated, involving the compromise of several servers and software packages on the way to the target. Keeping industrial control systems up-to-date can be difficult (or impossible): physical infrastructure exists on far longer timescales than the digital systems that control it. (I’ve had personal experience of laboratory equipment with control software that can only run on Windows 95, which isn’t being upgraded. Keeping that secure needs dedicated changes in the network architecture, and has a lot of knock-on consequences for efficiency and data management.)

The value of such hacking for criminals is easy to understand, but it’s also the ultimate technique of asymmetric warfare, letting an attacker deny responsibility and avoid counterattacks. Even the most powerful countries have an incentive to stop responses going kinetic, after all. But the asymmetry works both ways: the US is one of the primary developers of sophisticated cyberweapons, and so has an incentive not to push for international controls, even though its own critical infrastructure is more vulnerable to those weapons than most.

Greenberg is a long-time student of cyberwarfare, and writes with a lot of insight into both the politics and the technology. He highlights the impacts of many short-sighted decisions made in the interests of national security advantage, culminating in the Shadow Brokers’ release of a cache of NSA tools that form the basis for a new generation of cyberweapons. This is a great book to pair with This Is How They Tell Me the World Ends: The Cyberweapons Arms Race for a broad-ranging and highly technically literate exploration of the new arms frontier.

4/5. Finished Friday 28 June, 2024.

(Originally published on Goodreads.)

C++ template macroprogramming versus Lisp macros

Following on from Lisp macros versus Rust macros, I also want to compare C++ templates to Lisp macros.

Templates in C++ were designed as a way of generating typed versions of classes. The template declares some type variables that can be used as placeholders within a class declaration. When the template is instantiated and provided with actual type names, these are substituted for the type variables and the class is expanded. (It used to literally happen like this, so each use generated a completely new class. Modern compilers are smart enough to avoid the code repetition.) A classic example is a typed singly-linked list:

  template<typename A>
  struct List {
    A value;
    List<A> *next;   // a pointer: a struct can't contain itself by value
  };

However, the template system also allows values to be used in templates instead of (or as well as) type names. When these are encountered they are expanded at compile-time, and may cause further templates to be expanded. A classic example of this is to pre-compute some factorials:

  template<unsigned n>
  struct factorial {
    enum { value = n * factorial<n - 1>::value };
  };

  template <>
  struct factorial<0> {
    enum { value = 1 };
  };

In this code the first clause defines a template encoding the usual recursive factorial calculation. The second clause bottoms out the recursion by defining a specialised template that directly provides the factorial of zero. This can then be used in code such as:

  #include <iostream>

  template<unsigned n>
  struct factorial {
    enum { value = n * factorial<n - 1>::value };
  };

  template <>
  struct factorial<0> {
    enum { value = 1 };
  };

  int main() {
    std::cout << factorial<7>::value << std::endl;
  }
5040

which outputs the factorial of 7 as one might expect – but with the factorial having been computed at compile-time and inserted into the code as a literal, so no calculation at all happens at run-time.

There are some stringent limitations on the ways in which templates can be expanded. They can’t have mutable variables for a start (that’s why we needed to use the recursive factorial algorithm). Weirdly this makes the template language a functional programming sub-set of C++. Having said that, as with Lisp macros, it allows calculations that can be performed statically to be brought forward to compile-time. This makes it useful for building read-only tables, unrolling loops, and the like.

It’s claimed that templates are now so akin to “normal” C++ that they incur less of a readability penalty. That’s a subjective statement that may be true. But the template language isn’t C++. While one can write programs in it, they’re nothing like the C++ one would normally write. The template language is Turing complete, but that just means one can encode any computation, not that one can encode any particular program – and most template programs will require massive re-writing from the code one would write normally for execution at run-time. Template macroprogramming is therefore a non-trivial programming task to undertake.

Again as with Rust versus Lisp, C++ templates are an extension to the language rather than a core part of it, although they’re now used quite extensively in the standard library for generic typing. Also as with Rust, use of templates is semantically and syntactically distinct from “normal” C++ code or syntax, and it’s this that causes the programming load.

A Lisp macro for the factorial computation, by contrast, looks almost exactly like a normal factorial function that can access the entire language, both when defined and when used:

  (defmacro factorial (n)
    (labels ((fact (m)
               (if (= m 0)
                   1
                   (* m (fact (1- m))))))
      `,(fact n)))

  (princ (factorial 7))
5040

The choice of macro or function (defmacro or defun) has no further syntactic implications for the rest of the program, and no restrictions on the code that can be used within the definition; we could re-write the macro to use iteration, mutable variables, or any other code, and it would simply be executed at compile-time. The whole language is there, all the time. We can show this by taking a factorial function written in “normal” Lisp and macro-ifying it to be computed at compile-time:

  (defun fact (m)
    "Compute the factorial of M."
    (if (= m 0)
        1
        (* m (fact (1- m)))))

  (defmacro factorial (n)
    `,(fact n))

  (princ (factorial 7))
5040
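
To make the point about iteration and mutable variables concrete, here’s a sketch of the same macro written with a loop and an accumulator (my own example, not from the original post):

  ;; the macro body is ordinary Lisp: iteration and assignment are fine,
  ;; and still run entirely at compile-time, yielding a literal 5040
  (defmacro factorial (n)
    (let ((acc 1))
      (loop for i from 2 to n
            do (setf acc (* acc i)))
      acc))

  (princ (factorial 7))
5040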

More importantly, Lisp (and indeed Rust) macros can abstract over syntax as well as classes and values, and so allow the language to be extended with new first-class-at-compile-time structures. Templates are restricted to instantiating templates written with a fixed syntax; in Lisp the syntax has to be “Lisp-like”, although that’s a very light restriction; and in Rust a macro can use any syntax that Rust can tokenise.
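
In Lisp, for instance, a macro can introduce an entirely new control construct that reads like part of the language (a small sketch of my own):

  ;; REPEAT-UNTIL isn't built in: the macro adds it as new syntax,
  ;; which is expanded away before the program runs
  (defmacro repeat-until (test &body body)
    `(loop ,@body
           (when ,test (return))))

  (let ((i 0))
    (repeat-until (>= i 3)
      (print i)
      (incf i)))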

While C++ templates are sometimes described as macroprogramming (or metaprogramming), they’re addressing a substantially different use case to that addressed by Lisp or Rust macros, and doing so within a more restricted computational and syntactic envelope.