Parser-centric grammar complexity

We usually think about formal language grammars in terms of the complexity of the language, but it’s also possible — and potentially more useful, for a compiler-writer or other user — to think about them from the perspective of the parser, which is slightly different.

I’m teaching grammars and parsing to a second-year class at the moment, which involves getting to grips with the ideas of formal languages, their representation, and using them to recognise (“parse”) texts that conform to their rules. Simultaneously I’m also looking at programming language grammars and how they can be simplified. I’ve realised that these two activities don’t quite line up the way I expected.

Formal languages are important for computer scientists, as they let us describe the texts allowed for valid programs, data files and the like. Our understanding of formal languages derives from linguistics, chiefly the work of Noam Chomsky in recognising the hierarchy of languages induced by very small changes in the capabilities of the underlying grammar: regular languages, context-free and context-sensitive languages, and so forth. Regular and context-free languages have proven to be the most useful in practice, with everything else being a bit too complicated for most uses.

The typical use of grammars in (for example) compilers is that we define a syntax for the language we’re trying to recognise (for example Java) using a (typically context-free) grammar. The grammar states precisely what constitutes a syntactically correct Java program and, conversely, rejects syntactically incorrect programs. This parsing process is purely syntactic, and says nothing about whether the program makes sense semantically. The result of parsing is usually a parse tree, a more computer-friendly form of the program that’s then used to generate code after some further analysis and optimisation.

In this process we start from the grammar, which is generally quite complex: Java is a complicated language to learn. It’s important to realise that many computer languages aren’t like this at all: they’re a lot simpler, and consequently can have parsers that are extremely simple and compact, and that don’t need much in the way of a formal grammar: you can build them from scratch, which would be pretty much impossible for a Java parser. This leads to a different view of the complexity of languages, based on the difficulty of writing a parser for them — one which doesn’t quite track the Chomsky hierarchy. Instead you end up with something like this:

Isolated. In languages like Forth, programs are composed of symbols separated by whitespace and the language says nothing about the ways in which they can be put together. Put another way, each symbol is recognised and responded to on its own rather than as part of a larger construction. In Java terms, that’s like giving meaning to the { symbol independent of whether it’s part of a class definition or a for loop. It’s easy to see how simple this is for the compiler-writer: we simply look up each symbol extracted from the input and execute or compile it, without reference to anything else. Clearly this only works to some extent: you only expect to see an else branch somewhere after a corresponding if statement, for example. In Forth this isn’t a matter of grammar, but rather of compilation: there’s an extra-grammatical mechanism for testing the sanity of the control constructs as part of their execution.
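
As an illustration of just how little machinery this isolated style needs, here is a minimal sketch of such a dispatch loop in TypeScript. It isn’t real Forth (the dictionary, the words and their behaviour are all made up for the example), but it shows the essential shape: each whitespace-separated symbol is looked up and executed entirely on its own, with a stack carrying any shared state.

    // A minimal "isolated" interpreter: every symbol is looked up and executed
    // on its own, with no grammar relating one symbol to the next.
    // (Illustrative sketch only: the words and their behaviour are invented.)

    type Word = (stack: number[]) => void;

    const dictionary: Record<string, Word> = {
      "+":   (s) => { const b = s.pop()!, a = s.pop()!; s.push(a + b); },
      "dup": (s) => { s.push(s[s.length - 1]); },
      ".":   (s) => { console.log(s.pop()); },
    };

    function run(program: string): void {
      const stack: number[] = [];
      for (const symbol of program.trim().split(/\s+/)) {
        const word = dictionary[symbol];
        if (word) {
          word(stack);                  // execute the word in isolation
        } else if (/^-?\d+$/.test(symbol)) {
          stack.push(Number(symbol));   // numeric literals push themselves
        } else {
          throw new Error(`unknown word: ${symbol}`);
        }
      }
    }

    run("2 3 + dup + .");   // prints 10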

While this may sound strange (and is, really), it means that it’s easy to extend the syntax of the language — because it doesn’t really have one. Adding new control constructs means adding symbols to be parsed and defining the appropriate control structure to sanity-check their usage. The point is that we can simplify the parser well below the level that might be considered normal and still get a usable system.

Incremental. The next step up is to allow the execution of a symbol to take control of the input stream itself, and to define for itself what it consumes. A good example of this comes from Forth again, in its handling of strings. The Forth word introducing a literal string is S", as in S" hello world". What happens with the hello world" part of the text isn’t defined by a grammar, though: it’s defined by S", which takes control of the parsing process and consumes the text it wants before returning control to the parser. (In this case it consumes all the characters up to the closing quote.) Again this means we can define new language constructs, simply by having words that do their own parsing. Interestingly enough, these parser words could themselves use a grammar that’s different from that of the surrounding language — possibly using a standard parser generator. The main parser takes a back seat while the parser word does its job, allowing for arbitrary extension of the syntax of the language. The disadvantage of course is that there’s no central definition of what constitutes “a program” in the language — but that’s an advantage too, certainly for experimentation and extension. It’s the idea of dynamic languages extended to syntax, in a way.

Structured. Part of the subtlety of defining grammars is avoiding ambiguity, making sure that every program can be parsed in a well-defined and unique way. The simplest way to avoid ambiguity is to make everything structured and standard. Lisp and Scheme are the best illustrations of this. Every expression in the language takes the same form: an atom, or a list whose elements may be atoms or other lists. Lists are enclosed in brackets, so it’s always possible to find the scope of any particular construction. It’s extremely easy to write a parser for such a language, and the “grammar” fits onto about three lines of description. Interestingly, this sort of language is also highly extensible, as all constructs look the same. Adding a new control construct is straightforward as long as it follows the model, and again can be done extra-grammatically without adding new elements to the parser. One is more constrained than with the isolated or incremental models, but this constraint means that there’s also more scope for controlled expansion. Scheme, for example, provides a macro facility that allows new syntactic forms, within the scope and style of the overall parser, that nevertheless behave differently to “normal” constructs: they can capture and manipulate fragments of program text and re-combine them into new structures using quasi-quoting and other mechanisms. These facilities can be provided quite uniformly and without difficulty, and without explicit syntax. (This is even true for quasi-quoting, although there’s usually some syntactic sugar to make it more usable.)
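
To see how small such a parser really is, here is a minimal S-expression reader in TypeScript. It’s a sketch rather than Scheme’s actual reader, but the whole grammar amounts to “an expression is an atom or a bracketed list of expressions”, and the code is barely longer than that description.

    // A minimal S-expression reader: every expression is either an atom or a
    // parenthesised list of expressions. (Illustrative sketch only.)

    type SExpr = string | SExpr[];

    function parse(source: string): SExpr {
      const tokens = source.replace(/\(/g, " ( ").replace(/\)/g, " ) ")
                           .trim().split(/\s+/);
      let pos = 0;

      function read(): SExpr {
        const token = tokens[pos++];
        if (token === "(") {
          const list: SExpr[] = [];
          while (tokens[pos] !== ")") {
            if (pos >= tokens.length) throw new Error("unbalanced brackets");
            list.push(read());
          }
          pos++;               // consume the closing ")"
          return list;
        }
        if (token === ")") throw new Error("unexpected )");
        return token;          // anything else is an atom
      }

      return read();
    }

    // parse('(define (square x) (* x x))')
    // => ["define", ["square", "x"], ["*", "x", "x"]]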

The results will always “look like Lisp”, even though they might behave rather differently: again, we limit the scope of what is dealt with grammatically and what is dealt with programmatically, and get results that are somewhat different to those we would get with the standard route to compiler construction. This isn’t to say that we should give up on grammars and go back to more primitive definitions of languages, but just to observe that grammars evolved to suit a particular purpose that needs to be continually checked for relevance. Most programmers find Java (for example) easier to read than Scheme, despite the latter’s more straightforward and regular syntax: simplicity is in the eye of the beholder, not the parser. But a formal grammar may not be the best solution in situations where we want variable, loose and extensible syntax for whatever reason, so it’s as well to be aware of the alternatives that make the problem one of programming rather than of parsing.

The type wheel turns again

It’s slightly spooky when you’re discussing a topic and evidence for (or against) your position seems to spontaneously appear. The fashion for strong versus weak type systems seems to change on a cycle of about a decade. It might be turning again.

On Monday I was talking with a student who’s doing a project using node.js, looking at using it as the basis for elastic concurrent programs. It’s the sort of project that could underpin some kinds of low-end cloud computing, letting designers use JavaScript end to end. The discussion turned to type systems, and how JavaScript’s very weak view of types makes things more difficult for his project, as he will have to constantly protect against having the wrong methods called. On the other hand, it makes other things easier by letting him create proxies and other structures freely. The question is then whether strongly-typed languages are preferable to weakly-typed ones.

In a strongly-typed language, every denotable value has a type; the language ensures that all uses of those values are type-correct and generates a type error if not. A strongly statically-typed language does this purely at compile-time, and it’s generally recognised by those who advocate these approaches that it’s preferable to catch as many errors as possible as early as possible. It’s also recognised that this isn’t always possible (think Java class loaders), and so some run-time structures are also needed — but these can be provided so as to catch problems as early as possible (when code is loaded). (See Dobson and Matthews. Ionic types. In ECOOP 2000 – object-oriented programming, pages 296–312. Elisa Bertino (ed). Volume 1850 of LNCS. Springer-Verlag. 2000.)

For some people, static typing feels too rigid: the compiler will often prevent things that the programmer “knows” to be possible. In this case a looser type regime is often preferred. Strong dynamic typing checks at every operation to make sure that the values being manipulated are type-correct; weak dynamic typing does fewer checks, often only detecting problems very late; untyped or monotyped languages do few or no checks and will apply any operation to any piece of data at the programmer’s instruction. I tend to fall into the strong static typing camp — which is slightly ironic, given that I’m currently working on untyped extensible virtual machines. Programmers’ beliefs that they know better than the type-checker are often erroneous, the more so as code gets more complicated.

The fashion for type systems seems to follow a cycle. People are using a language with strong typing when a new kind of programming comes along, often driven by some new technology. The strong types are restrictive for this new domain (having been designed for a different world), so programmers invent or re-discover languages with dynamic typing that allow them to write the code they need to write without the difficulties of fighting a type system. In large-scale systems, programmers also like being able to evolve the data structures gradually, without having to update every user. (Doing so in the presence of strong types often doesn’t work, although with care it’s possible.) This leads to a widespread belief that type-checking is unnecessary, difficult, for losers, etc, and that dynamic languages are the only way to go. Then, as programs get bigger and more complicated, problems start to emerge.

Typically these revolve around different parts of the system not agreeing on the exact data representation, so everyone has to check the data they receive because the language offers no guarantees that it’ll be correct. (This is the down-side of being able to evolve the representation piecemeal.) Such checks rapidly become untenable, and so programmers start thinking about whether there are automated mechanisms to improve checking — and re-discover strong type systems.

Having been discussing this in the context of JavaScript, I then stumbled across TypeScript, a JavaScript extension that allows type annotations. These aren’t exactly a strong type system — they’re optional, for a start — but they definitely mark a change in the way JavaScript would be used: as a system with defined type structure rather than as a type free-for-all. Since JavaScript occurs in a lot of systems these days — on the front-end, but also increasingly server-side — this is a welcome departure.

I find it hard to believe that large, long-lived component-based systems can be built in a dependable fashion using only a dynamic approach to typing. It relies too much on programmers’ willingness and ability to check everything, every time. Actually there are strong arguments for the need for non-static run-time checks, most notably in distributed systems, where you can’t be sure the data you receive will be type-correct even if the compiler that generated the code thinks it is, since you generally don’t have complete control over all the components and their evolution. But this isn’t an argument against strong typing in general: it still helps, even if there are small holes. Instead one perhaps needs to check types at the component boundaries so that, once admitted, values can be trusted to be type-correct. This in turn places demands on the transport protocols to be self-describing in terms of their payloads’ types, and doing so supports other constructs (like type-driven computation) for free without sacrificing the benefits of the strong checks. Having some dynamism (and room for run-time failure) within a generally static regime seems like a decent compromise.
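
As a small sketch of the kind of compromise I have in mind (the type and field names are hypothetical, not any particular system’s API), TypeScript lets you keep static annotations inside a component while performing an explicit run-time check at the boundary where data arrives from outside:

    // Inside the component: statically-typed code, checked at compile time.
    interface Reading {
      sensor: string;
      value: number;
    }

    function average(readings: Reading[]): number {
      // The compiler guarantees every element has a numeric `value` field.
      return readings.reduce((sum, r) => sum + r.value, 0) / readings.length;
    }

    // At the component boundary: a run-time check on untrusted input, since
    // data arriving over the wire carries no static guarantees.
    function isReading(x: unknown): x is Reading {
      return typeof x === "object" && x !== null &&
             typeof (x as any).sensor === "string" &&
             typeof (x as any).value === "number";
    }

    function receive(payload: unknown): Reading {
      if (!isReading(payload)) {
        throw new Error("ill-typed payload rejected at the boundary");
      }
      return payload;   // from here on, the static type system takes over
    }

    const incoming: unknown = JSON.parse('{"sensor":"t1","value":21.5}');
    console.log(average([receive(incoming)]));   // 21.5

Once a value has passed the boundary check, the rest of the code can rely on the annotations without further defensive checks.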

What’s new for awareness researchers?: an interview

I was interviewed at SASO as part of an exercise to collect what people think are the ways forward and the important topics for pervasive systems. The Aware project co-ordinates a collection of EU projects on self-awareness and adaptive systems. As part of its remit it collects the evolving opinions of the research community, to try to decide where the research is leading and, in particular, what emerging questions might be supported by future EU initiatives. Interviewing researchers is a departure I haven’t seen before, and they interviewed several people with diverse backgrounds to get a broad sweep of the area. I focused on how little we understand about how to model what people actually do in the course of their lives, and consequently how hard it is to explain these processes to computers and to build computer-mediated analysis and support. (Sorry about the noise in the background, we were standing in a corridor….)

Response to Science Foundation Ireland’s consultation on its strategic plan

Science Foundation Ireland (SFI) recently launched a consultation on its strategic plan for the coming years. This is my contribution to the consultation process. It’s quite unusual for there to be a consultation process, of course: many such documents are drafted by governments and their agencies without reference to the opinions of outside stakeholders, so it’s gratifying that SFI is confident enough to put its thoughts out for public comment. It’s also gratifying that the aims and aspirations embodied by the document are so forward-looking and ambitious, given the parlous state of the country’s finances: the straitened times only make government investment in science more important, as a route to improving the country’s position.

There are however some issues that I think merit further consideration. These include:

  • the orientation of basic research;

  • the demands on staff from different phases of the research lifecycle;

  • the career trajectories of PhD students;

  • the significance of capacity-building, especially in the area of spin-outs and other indirect benefits; and

  • the possibility of further leveraging the expertise present in Irish science within the organisation itself.

What follows is abstracted from the letter I submitted to the consultation. I’ve removed some of the extraneous detail and generalised slightly to make the content less Ireland-specific, as a lot of the issues will be faced (or indeed are being faced) by funding agencies elsewhere.

Orientation of basic research. Orientating research presupposes that one can form a clear vision of what is going to be successful and impactful. This is clearly not the case: many areas that have changed the world in the long term had little or no short-term applicability when they were first investigated (for example lasers and hypermedia). The notion of government funding as “sowing seeds” therefore needs to be regarded with caution, and the history of (for example) Silicon Valley is notable mostly for the lack of directed and co-ordinated government investment and a focus instead on capacity-building (see below).

To maximise the chances of breakthroughs, one must allow experimentation that cannot be justified in terms of its known or predicted impact. One hears a lot about the “impact” and “spillover” of “basic” research into more “applied” areas. It is worth noting that such a hard distinction between “basic” and “applied” research is now largely discredited: a more accurate characterisation might be between “applied” and “not applied (yet)”. This is important, as it implies that any calculation of the value for money of any piece of research is often more a matter of timescale than of any intrinsic property of the work or field. Much of the mathematics that now underlies on-line cryptography, for example, was developed decades before it was practically applied, at a time when such number theory had no obvious applications.

The basic difficulty with orientating basic research is that it is almost always a backward-facing activity, in the sense that one can only generate evidence in support of areas that have already demonstrated relevance and/or which already have a significant local industrial presence. Unless care is taken this can exclude promising areas for which there is currently no market but which have the capacity for enormous impact going forward. (3-D printing is the technology that springs most to mind here.) Setting a limit on the funding that will go to non-prioritised areas seems unlikely to provide for a broad exploration of speculative areas.

Phases and skills. It is important to recognise that excellent research scientists are often not the appropriate individuals to commercialise their own work. When one is speaking of the research lifecycle, it is clearly true that the translation phase is as creative and exciting as the discovery phase. However, it must be recognised that the skills required in these two phases are very different. While some exceptional individuals are able to muster excellence in both discovery and translation, this is extremely rare, as evidenced by the tendency of the founders of research-led start-ups to leave (or be eased out) as their companies grow: most scientists function better in one regime or the other. Put another way, excellent research scientists will be more productive overall if they are not forced into an inappropriate role. It would therefore be appropriate to create structures whereby research can be “handed off” between groups, and to introduce recruitment and funding structures that ensure scientists in each phase are treated equally and fairly – although not necessarily identically, to reflect their different motivations.

PhD careers. The decision whether to go into industry or academia is a complex one, driven by an individual’s temperament and interests. I believe that care is needed in aspiring to move some given percentage of PhD graduates into industry. It would be a mistake to attempt to direct such career decisions, since trained researchers wanting to pursue academic careers that are not available locally will not generally take up industrial posts as an alternative: they will simply move abroad. This cohort of researchers is highly mobile and motivated, and only by providing matching opportunities will their skills be retained.

Capacity-building. While there is clearly enormous potential value in direct commercialisation of research products, there is far more value in the ICT space from simply building capacity. I have been struck by the number of start-up companies in Dublin formed by former PhD students (including several of my own) – but I have been further struck by the work these companies are doing, which often does not relate to the research topics of their founders. Indeed, in most cases the companies’ work could not have led to a PhD.

This I think underlines the importance of intellectual capacity-building, and a corollary is that what is important is that the system generate researchers, rather than being solely concerned about the actual research done in training these individuals. Brilliant, educated minds will go on to do good work: if attracting the best minds is best accomplished by supporting them in basic research for their PhDs, this will be a good investment. It is noticeable that many staff in Silicon Valley companies have PhDs from local universities in very foundational topics.

Another aspect of capacity-building that often goes unmentioned is the progression of staff in post: the recognition that excellent researchers need to have their career aspirations met and respected. There is ample evidence that this function is not properly dealt with by many institutions within their current structures: an exclusive focus on importing “iconic” and “prize-winning” staff can be demoralising for staff already in post, who may then disengage or be induced to emigrate.

I believe the evidence supports the notion that staff promoted in post will overall be more motivated and more committed than many high-flying recruits, who may regard their appointment as a temporary base or a prelude to retirement, and who may not continue to do the world-leading work that underpinned their recruitment.

Integrating expertise. SFI aspires to be a “model” department in terms of supporting scientific activity. One approach that might be beneficial is that pursued by the NSF, to second scientists into the organisation as programme officers. This approach – which I believe is currently unique to NSF – seems to deliver very well-managed programmes, gives the organisation access to a range of scientific talent, ensures that the staff in charge of programmes are up-to-date with the latest science, and also immensely benefits the management skills of the scientists involved. It is true that it can be challenging to manage conflicts of interest, but the research community in the US is also “a small country” in this sense, so I am sure that mechanisms can be found. Providing seconded individuals with a funded postdoc ex officio (as we do in St Andrews for Heads of School) might allow their own research to proceed in their absence.

It’ll be interesting to see what happens to the strategic plan as a result of the consultation, but whatever the result it’s a creative and constructive exercise to test the plan against an outside audience. I’d like to think this can only improve the process of governance for State-supported science.

Distinguished lecture on artificial life

Our most recent distinguished lecture was the best I’ve heard so far, and on a fascinating topic I’d like to know more about. We run two distinguished lectures each academic year, inviting an academic in to teach a four-hour course on some topic that often we don’t really have expertise with in St Andrews. It exposes undergraduates, grad students and staff to new topics and ways of thinking about research. The series goes back to 1969 and includes some of the best-known names in computer science.

Last semester’s speaker was Larry Yaeger from the University of Indiana (on sabbatical at the University of Hertfordshire), who talked about artificial life: using computers to study the processes of evolution, speciation and adaptation. It’s a topic that sits on the boundary between novel computing and theoretical biology. Artificial life sounds too much like science fiction to be “real” computer science: do we understand enough about life to be able to study it in the abstract, and, if not, can artificial life have any scientific content? If you confine yourself enough, of course, then the answer to both questions is a qualified “yes”, but the question then becomes: does it tell you anything meaningful about anything? Larry’s talk showed that even quite abstracted artificial life scenarios still give some high-level information about the potential for systems design, especially for very dynamic adaptive systems in changing environments.

Larry’s work has focused on building multi-agent simulations of processes, seeing how simple rule sets can give rise to complex behaviours. This has culminated in a system called Polyworld, which lets users set up “genetically” based behaviours for agents. (There are some very cool movies of it all working.) The genetic basis — completely synthetic and higher-level than real genetics — means that agents can evolve through mutation and cross-over.

The part I found most interesting was the way that these systems — like the natural systems they’re abstractions of — tend not to do optimisation per se. Instead they find, and stick with, solutions that are “good enough”. You get to a balance between evolutionary pressure not being strong enough, and the benefits not being great enough, for further changes to take place. The difference with traditional engineering is quite profound, both in this satisfaction with the sub-optimal and in the fact that the selection is dynamic: if the chosen approach ceases to be “good enough” as the environmental pressures change, the system will shift to another process as a matter of course. You get this dynamism all over chemistry, too, where chemical equilibrium remains a dynamic process with lots of reactions going on all the time without changing the macroscopic concentrations of the reagents involved. It’s easy to mistake this for a static system, which it most definitely isn’t: I think this is a mistake a lot of scientists and engineers make, though, and it’s something we probably need to address when designing adaptive systems or sensor networks that need to operate against or within a complex environment.

To do this we’d need to give up a lot of intuitions we have about design, and the possibility of a single “correct” solution to a problem, and think instead of a design space in which the system is (to some extent) free to explore — and make this design space, and the exploration of it, concepts that are propagated to run-time.
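
As a toy illustration of this “good enough” dynamic (it has nothing to do with Polyworld’s actual implementation, and the fitness function, parameters and threshold are all invented), here is a minimal evolutionary loop in TypeScript that stops adapting once a candidate clears a satisficing threshold rather than pushing on towards an optimum:

    // A toy evolutionary loop: mutate and cross over a population of candidate
    // "genomes", stopping when one is "good enough" rather than optimal.
    // (Illustrative sketch only: fitness, parameters and threshold are made up.)

    type Genome = number[];

    const target = [3, 1, 4, 1, 5];              // hypothetical ideal genome
    const goodEnough = -1.0;                     // satisficing threshold

    function fitness(g: Genome): number {
      // Negative distance from the target: higher is better, 0 is optimal.
      return -Math.sqrt(g.reduce((s, x, i) => s + (x - target[i]) ** 2, 0));
    }

    function mutate(g: Genome): Genome {
      return g.map((x) => x + (Math.random() - 0.5));          // small drift
    }

    function crossover(a: Genome, b: Genome): Genome {
      return a.map((x, i) => (Math.random() < 0.5 ? x : b[i]));
    }

    let population: Genome[] = Array.from({ length: 20 }, () =>
      Array.from({ length: 5 }, () => Math.random() * 10)
    );

    for (let generation = 0; generation < 1000; generation++) {
      population.sort((a, b) => fitness(b) - fitness(a));      // best first
      if (fitness(population[0]) >= goodEnough) break;         // stop at "good enough"
      const parents = population.slice(0, 10);
      population = parents.concat(
        parents.map((p, i) => mutate(crossover(p, parents[(i + 1) % parents.length])))
      );
    }

    population.sort((a, b) => fitness(b) - fitness(a));
    console.log("settled on", population[0], "with fitness", fitness(population[0]));

The point of the sketch is the stopping rule: selection continues only while the population falls short of the threshold, so the system settles on whatever first proves adequate; in a continuously running version it would start adapting again if the environment (here, the target or the threshold) changed.
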
I think this kind of approach makes sense even if you don’t embrace the genetic algorithms style view of the world (which in the main I don’t). In some ways this is a measure of the success of artificial life research: it’s illuminating concepts that are of general utility outside the context from which they’re being drawn, that can be used to influence other approaches to systems design without our having to junk everything we already know, which we’re clearly not going to do. These sorts of incremental changes are far more useful than revolutions, in many ways, but they come about from thinking that’s more-or-less divorced from mainstream thinking. It’s a good illustration of why blue-skies research is important, and that knowledge really is all one piece with interconnections and interactions that we can’t predict.