The changing student computing experience

I'm jealous of my students in many ways, for the things they'll get to build and experience. But they should be jealous of me, too.

It was graduation week last week, which is always a great bash. The students enjoy it, obviously -- but it's mainly an event for the parents, many of whom are seeing their first child, or even the first child in their extended family, succeed at university. It's also the time of year when we conduct post mortem analyses of what and how we've taught throughout the year, and how we can change it for the better in the next session.

One of the modules I teach is for second-year undergraduates on data structures, algorithms, applied complexity and other really basic topics. It's a subject that's in serious danger of being as dry as ditchwater. It's also extremely important, not only because of its applications across computer science but also because it's one of the first experiences the students have of the subject, so it needs to convey the opportunities and excitement of computer science so they don't accidentally nuke their lives by going off to do physics or maths instead.

One of the aspects of making a subject engaging is appealing to the students' backgrounds and future interests -- which of course are rather different to the ones I had when I was in their position 25 years ago. (I initially wrote "quarter of a century ago," which sounds way longer somehow.) So what are the experiences and aspirations of our current students?

Many get their first experience of programming computers with us, but they're all experienced computer users who've been exposed to computers and the internet their entire lives. They're the first generation for whom this is true, and I don't think we've really assimilated what it means. They're completely at home, for example, in looking up background material on Wikipedia, or surfing for alternative sources of lectures and tutoring on YouTube and other, more specialised sites. They can do this while simultaneously writing email, using Facebook and replying to instant messages in a way that most older people can't. They're used to sharing large parts of themselves with their friends and with the world, and it's a world in which popularity can sometimes equate with expertise in unexpected ways. It's hard to argue that this diversity of experience is a bad thing, and I completely disagree with those who have done so: more information on more topics collected from more people can only be positive in terms of exposure to ideas. For an academic, though, this means that we have to change how and what we teach: the facts are readily available, but the interpretation and criticism of those facts, and the balancing of issues in complex systems, are things that still seem to benefit from a lecture or tutorial setting.

Many of the students have also built web sites, of course -- some very complex ones. Put another way, they've built distributed information systems by the age of 17, and in doing so have unknowingly made use of techniques that were at the cutting edge of research less than 17 years ago. They expect sites to be slick, to have decent graphics and navigation, to be linked into social media, and so forth. They've seen the speed at which new ideas can be assimilated and deployed, and the value that information gains when it's linked, tagged and commented upon by a crowd of people. Moreover they expect this to continue: none of them expects the web to fragment into isolated "gated communities" (which is a fear amongst some commentators), or to become anything other than more and more usable and connected with time.

I'm jealous of my students, for the web that they'll have and the web that many of them will help to build. But before I get too envious, it's as well to point out that they should be jealous of me too: of the experience my peers and I had of computers. It's not been without impact.

One of the things that surprises me among some students is that they find it hard to imagine ever building some of the underpinning software that they use. They can't really imagine building an operating system, for example, even though they know intellectually that Linux was built, and is maintained, by a web-based collaboration. They can't imagine building the compilers, web servers and other base technology -- even though they're happy to use and build upon them in ways that really surprise me.

I suspect the reasons for this are actually embedded into their experience. All the computers, web sites and other services they've used have always had a certain degree of completeness about them. That's not to say they were necessarily any good, but they were at least functional and usable to some degree, and targeted in general at a consumer population who expected these degrees of functionality and usability (and more). This is radically different to the experience we had of unpacking a ZX-80, Acorn Atom or some other 1980s-vintage home computer, which didn't really do anything -- unless we made it do it ourselves. These machines were largely blank slates as far as their functions were concerned, and you had to become a programmer to make them worth buying. Current games consoles criminalise these same activities: you need permission to program them.

It's not just a commercial change. A modern system is immensely complex and involves a whole stack of software just to make it function. It's hard to imagine that you can actually take control all the way down. In fact it's worse than that: it's hard to see why you'd want to, given that you'd have to re-invent so much to get back to the level of functionality you expect to have in your devices. As with programming languages, the level of completeness in modern systems is a severe deterrent to envisioning them, and re-building them, in ways other than they are.

Innovation, for our students, is something that happens on top of a large stack of previous innovation that's just accepted and left untouched. And this barrier -- as much mental as technological -- is the key difference between their experience of computing and mine. I grew up with computers that could be -- and indeed had to be -- understood from the bare metal up. One could rebuild all the software in a way that'd be immensely more challenging now, given the level of function we've come to expect.

This is far more of a difference than simply an additional couple of decades of experience with technology and research: it sits at the heart of where the next generation will see the value of their efforts, and of where they can change the world: in services that sit at the top of the value chain, rather than in the plumbing down at its base. Once we understand that, it becomes clearer what and how we should teach the skills they'll need in order best to apply themselves to the challenges they'll select as worth their time. And I'll look forward to seeing what these result in.

Congratulations to the graduating class of 2011. Have great lives, and build great things.

Why is my wheelie bin measured in decibels?

An unexpected curiosity arising from a common household appliance.

I've just been clearing up the remains of a tree that came down in my garden after the recent storms. Morningside isn't generally noted for hurricane-force winds, but we got them a fortnight or so ago while we were away. In doing so I noticed that the inside of the wheelie bin that we're supplied with by Edinburgh council for garden waste has a peculiar marking:

[Image: the decibel marking inside the wheelie bin]

This seems to indicate that there's something about the bin measured in decibels: a logarithmic unit of relative power. There's also what looks like a speaker next to it.

Which rather begs the question: why is my wheelie bin measured in decibels?

My first thought was sound: wheelie bins are notoriously noisy when wheeled around, so perhaps this is the sound output one expects from this particular model. However, 99dB is approximately the sound level of a jackhammer at 1m distance, and considerably more than the sound level at which one is recommended to wear ear defenders.  Much as I dislike wheelie bins being wheeled around early in the morning, I doubt they're a threat to my hearing.
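Incidentally, the comparison is easy to check: a decibel level is just ten times the base-10 logarithm of a power ratio, so every 10dB step is a tenfold increase in power. A few lines of Javascript (purely illustrative, of course) show how large a ratio 99dB actually represents:

```javascript
// A decibel level L corresponds to a power ratio of 10^(L/10):
// every 10dB step is a tenfold increase in power.
function dbToPowerRatio(db) {
  return Math.pow(10, db / 10);
}

const binRatio = dbToPowerRatio(99);            // nearly ten billion times the reference power
const vsConversation = dbToPowerRatio(99 - 60); // vs. ordinary speech at roughly 60dB
```

So whatever the 99dB refers to, it's nearly ten billion times the reference power -- and several thousand times the power of ordinary conversation, if we take speech to be roughly 60dB.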

Perhaps this is a level of sound that the wheelie bin can muffle? Quality product though it is, it hardly seems likely that it would muffle the sound of a jackhammer being used inside it. Even if one wanted to. And such an application seems, shall we say, a little exotic for the manufacturer to have printed this specification on every wheelie bin they sell. If this is the case, it should perhaps be better advertised.

It could be the sound made by a wheelie bin full of bottles being tipped out, but that's surely a function of the bottles and the manner in which they're emptied rather than being a property that should be stamped on the bin.

Another use of decibels is in electronics, specifically to measure the power of an amplifier or radio antenna. It'd certainly be possible to use a wheelie bin as an amplifier for a speaker, although that too sounds somewhat esoteric. As a radio antenna it suffers the disadvantage of not being conductive and so not suitable as a waveguide. I suppose you could coat the inside with tinfoil, but we're again into not-exactly-mainstream-customer-uses territory.

So I'm somewhat at a loss: any suggestions, however inane, gratefully received.

Lectureship in Computer Science at St Andrews

The School has an opening for a lecturer (assistant professor) in any discipline that matches our research interests.

Lectureship at St Andrews

Lecturer in Computer Science – SK8461

School of Computer Science £36,862 – £45,336 per annum. Starting September 2011 or as soon as possible thereafter. Standard appointment.

The Scottish Informatics and Computer Science Alliance (SICSA) is creating a world-leading Computer Science research community across the universities in Scotland. As part of this initiative, we seek lectureship applications from researchers with expertise in the SICSA themes of Next Generation Internet; Complex Systems Engineering; Multi-modal Interaction and Modelling & Abstraction. We are particularly interested in candidates with expertise in sensor systems, multicore systems, embedded systems, data intensive systems or wireless communications and especially those who bridge two or more of these areas.

You should have a PhD and have an outstanding research record as demonstrated by publications and research funding. You must be willing to cooperate with other researchers across Scotland and contribute to the work of SICSA and to teach in any area of computer science.

A period of postdoctoral experience and publications that reflect your experience is essential. Teaching is important to us and we expect candidates to be committed teachers, with appropriate experience. We have a growing portfolio of Masters level courses and welcome those who have an interest in contributing to their development.

Candidates interested in this post are welcome to informally contact the Head of School (Alan Dearle) to discuss the opportunity.

Ref No: SK8461

Closing Date: 27 July 2011

Further Particulars: SK8461 FPs.doc

PhD position available in intrusion detection

A colleague of mine has a fully-funded PhD studentship available in web security and intrusion detection.

PhD Studentship in Immuno-inspired Web Intrusion Detection

We invite applicants for a 3 year funded PhD position at the Department of Computer and Information Sciences, University of Strathclyde.

The increase in the number of reported cyber-attacks and the sophistication of their techniques is driving developments in Intrusion Detection Systems (IDS) that are able to detect them in order to take action against them. Despite significant success in the detection of previously known attacks (misuse detection) our ability to reliably detect novel attacks remains limited.

In this context, it has long been observed that the role of IDS in computer systems is analogous to the role of the Human Immune System (HIS), and models of HIS operation have inspired IDS techniques. Taking inspiration from the HIS Danger Theory model (DT), we have proposed distress detection as a symptom-based approach to IDS. A key challenge for investigations in this area is the lack of appropriate experimental testbeds and datasets. Existing datasets do not include the symptoms of cyber-attacks, while testbeds are limited in their ability to capture those symptoms. In order to address this challenge the aim of this project is to develop an experimental testbed for the exploration of symptom-based Web IDS.

The studentship will start in October 2011 and covers University Home fees plus a student stipend for 3 years (c. £13,590 for 2011/12).

The work will be co-supervised by Dr Sotirios Terzis and Dr Marc Roper.

All applicants must possess or be about to obtain a 1st class or 2.1 Honours degree or equivalent in a relevant discipline.

The successful candidate will have strong computer science and technical skills, ideally with some experience of web system administration and low-level systems programming; an enthusiasm for research coupled with dogged persistence; a broad outlook and the ability to explore new ideas in depth; and an ability to rapidly absorb new technical innovations.

For further details see here.

Call for papers: MidSens'11

Papers are invited on topics in middleware, tools and services for sensor and other embedded systems.

Sixth International Workshop on Middleware Tools, Services and Run-time Support for Networked Embedded Systems (MidSens’11)

Co-located with Middleware 2011 (December 12th - December 16th, 2011), Lisbon, Portugal

The aim of MidSens’11 is to stimulate research in the specific domain of middleware for networked embedded systems. This year’s focus is on sensor networks and robotics control – a broader focus than the previous editions – since we believe that the extended scope will result in complementary and synergetic submissions from researchers working in both niches. Along with the ‘core’ topic of middleware architectures, services and tool support, MidSens’11 will also seek quality papers describing novel programming languages, run-time support and relevant experience reports. As with previous editions of this workshop, MidSens’11 will investigate how middleware support can relieve developers from low-level, platform specific concerns, while enabling optimal exploitation of available resources. We hope that you will be able to join us in Lisbon on December 12th 2011.

Middleware for networked embedded systems such as sensor networks and robotics is a critical research domain which addresses key challenges that application developers are facing today. The five previous editions of this workshop (MidSens'06, MidSens'07, MidSens'08, MidSens'09 and MidSens'10) attracted researchers from Europe, Asia, and the United States. The MidSens workshop series has served to trigger and guide research efforts to create an integrated middleware vision, which is required to handle the challenges inherent in developing, deploying and managing complex networked embedded applications in an efficient way.

The workshop seeks papers in, but not limited to:

  • Middleware Tools and Architectures:
    • Architectures for networked embedded systems.
    • Novel programming abstractions.
    • Lightweight agent middleware for embedded systems.
    • Testing and simulation tools.
    • Fault identification, diagnosis and repair.
  • Middleware services:
    • Location tracking, localization, and synchronization.
    • Support for real-time and safety-critical systems.
    • Data management, aggregation and filtering.
    • Energy-aware middleware mechanisms.
    • Fault tolerance, reliability and quality of service.
    • Privacy and security services.
    • Virtualization, sharing and trading of resources.
  • Run-time Support:
    • Overlay and topology creation, maintenance and management.
    • Resource/Service discovery and management.
    • Support for reconfiguration and adaptation.
    • Effective naming and addressing schemes.
    • Support for modeling and enacting safe software reconfiguration.
  • Management and Experiences:
    • Managing heterogeneity and network dynamism.
    • Integration of embedded systems with web services.
    • Experience and evaluation of middleware platforms.
    • Support for the unification of various networked embedded platforms.
    • Shared infrastructure embedded systems.


Submitted papers must be original work in English without substantial overlap with papers that have been published or that are simultaneously submitted to a journal or conference with proceedings. Submissions must not exceed 6 pages, must strictly follow the ACM conference proceedings format, and must be submitted in PDF format. All workshop papers will be uploaded to the ACM Digital Library. Full instructions can be found here.

Important dates

  • Paper submission: 15 August 2011
  • Review notification: 29 September 2011
  • Camera-ready: 10 October 2011
  • Registration: 7 October 2011

Programme committee

  • Gordon Blair, Lancaster University, UK
  • Vinny Cahill, Trinity College, Ireland
  • Paolo Costa, Imperial College London, UK
  • Simon Dobson, University of St. Andrews, UK
  • Michael Fisher, University of Liverpool, UK
  • Wen Hu, CSIRO, Australia
  • Joerg Kaiser, University of Magdeburg, Germany
  • Torsten Kroeger, Stanford University, USA
  • Ajay Kshemkalyani, University of Illinois at Chicago, USA
  • Kristof Van Laerhoven, Technical University of Darmstadt, Germany
  • Sam Michiels, K.U.Leuven, Belgium
  • Nader Mohamed, United Arab Emirates University, UAE
  • Luca Mottola, SICS, Sweden
  • Mirco Musolesi, University of Birmingham, UK
  • Dennis Pfisterer, University of Lübeck, Germany
  • Kay Römer, University of Lübeck, Germany
  • Coen De Roover, Vrije Universiteit Brussel, Belgium
  • Romain Rouvoy, INRIA Lille, France
  • Jo Ueyama, Universidade de Sao Paulo, Brazil

Call for papers: D3Science

Papers are solicited for a workshop on data-intensive, distributed and dynamic e-science problems.

Workshop on D3Science - Call for Papers

To be held with IEEE e-Science 2011, Monday 5 December 2011, Stockholm, Sweden

This workshop is interested in data-intensive, distributed, and dynamic (D3) science. It will also focus on innovative approaches for scalability in the end-to-end real-time processing of scientific data. We refer to D3 applications as those that are data-intensive, are fundamentally (or need to be) distributed, and need to support and respond to data that may be non-persistent and dynamically generated. We are also looking to bring researchers together to look at holistic, rather than piecewise, approaches to the end-to-end processing and management of scientific data.

There has been a lot of effort in managing and distributing tasks where computation is dominant. Such applications have, after all, historically been the drivers of "grid" computing. There has been, however, relatively less effort on tasks where the computational load is matched by the data load, or even dominated by it. For such tasks to operate at scale, there are conceptually simple run-time tradeoffs that need to be made, such as determining whether to move data to compute versus moving compute to data, or possibly regenerating data on-the-fly. Due to fluctuating resource availability and capabilities, as well as insufficient prior information about application requirements, such decisions must be made at run-time. Furthermore, resource, connectivity and/or storage constraints may require the data to be manipulated in-transit so that it is "made right" for the consumer. Currently it is very difficult to implement these dynamic decisions or the underlying mechanisms in a general-purpose and scalable fashion. Although the increasing volumes and complexity of data will make many problems data-dominated, the computational requirements will still be high. In practice, data-intensive applications will encompass data-driven applications. For example, many data-driven applications will involve computational activities triggered as a consequence of independently created data; thus it is imperative for an application to be able to respond to unplanned changes in data load or content. Understanding how to support dynamic computations is therefore a fundamental, but currently missing, element in data-intensive computing.

The D3Science workshop builds upon a 3-year research theme on Distributed Programming Abstractions (DPA), which has held a series of related workshops including but not limited to e-Science 2008, EuroPar 2008 and the CLADE series, and on the ongoing 3DPAS research theme funded by the NSF and UK EPSRC, which is holding one workshop in June 2011: the 3DAPAS workshop. The workshop is intended to lead to a funding proposal for transcontinental collaboration, with contributors as potential members of the collaboration, and as such we are particularly interested in discussing both existing and future projects that are suitable for transcontinental collaboration.
Topics of interest include but are not limited to:
  • Case studies of development, deployment and execution of representative D3 applications, particularly projects suitable for transcontinental collaboration
  • Programming systems, abstractions, and models for D3 applications
  • Discussion of the common, minimally complete, characteristics of D3 applications
  • Major barriers to the development, deployment, and execution of D3 applications, and primary challenges of D3 applications at scale
  • Patterns that exist within D3 applications, and commonalities in the way such patterns are used
  • How programming models, abstraction and systems for data-intensive applications can be extended to support dynamic data applications
  • Tools, environments and programming support that exist to enable emerging distributed infrastructure to support the requirements of dynamic applications (including but not limited to streaming data and in-transit data analysis)
  • Data-intensive dynamic workflow and in-transit data manipulation
  • Adaptive/pervasive computing applications and systems
  • Abstractions and mechanisms for dynamic code deployment and "moving code to data"
  • Application drivers for end-to-end scientific data management
  • Runtime support for in-situ analysis
  • System support for high end workflows
  • Hybrid computing solutions for in-situ analysis
  • Technologies to enable multi-platform workflows

Submission instructions

Authors are invited to submit papers containing unpublished, original work (not under review elsewhere) of up to 8 pages of double-column text using single-spaced 10 point type on 8.5 x 11 inch pages, as per IEEE 8.5 x 11 manuscript guidelines. Templates are available: Authors should submit a PDF or PostScript (level 2) file that will print on a PostScript printer. Papers conforming to the above guidelines can be submitted through the workshop's paper submission system: It is a requirement that at least one author of each accepted paper register and attend the conference.

Important dates

  • 17 July 2011 - submission date
  • 23 August 2011 - decisions announced
  • 23 September 2011 - final versions of papers due to IEEE for proceedings


Organisers

  • Daniel S. Katz, University of Chicago & Argonne National Laboratory, USA
  • Neil Chue Hong, University of Edinburgh, UK
  • Shantenu Jha, Rutgers University & Louisiana State University, USA
  • Omer Rana, Cardiff University, UK

PC members

  • Gagan Aggarwal, Ohio State University, USA
  • Deb Agarwal, Lawrence Berkeley National Lab, USA
  • Gabrielle Allen, Louisiana State University, USA
  • Malcolm Atkinson, University of Edinburgh, UK
  • Adam Barker, University of St Andrews, UK
  • Paolo Besana, University of Edinburgh, UK
  • Jon Blower, University of Reading, UK
  • Yun-He Chen-Burger, University of Edinburgh, UK
  • Simon Dobson, University of St Andrews, UK
  • Gilles Fedak, INRIA, France
  • Cécile Germain, University Paris Sud, France
  • Keith R. Jackson, Lawrence Berkeley National Lab, USA
  • Manish Parashar, Rutgers, USA
  • Abani Patra, University of Buffalo, USA
  • Yacine Rezgui, Cardiff University, UK
  • Yogesh Simmhan, University of Southern California, USA
  • Domenico Talia, University of Calabria, Italy
  • Paul Watson, Newcastle University, UK
  • Jon Weissman, University of Minnesota, USA

Pervasive healthcare

This week's piece of shameless self-promotion: a book chapter on how pervasive computing and social media can change long-term healthcare.

My colleague Aaron Quigley and I were asked to contribute a chapter to a book put together by Jeremy Pitt as part of PerAda, the EU network of excellence in pervasive systems. We were asked to think about how pervasive computing and social media could change healthcare. This is something quite close to both our hearts -- Aaron perhaps more so than me -- as it's one of the most dramatic examples of how pervasive computing can really make an impact on society.

There are plenty of examples of projects that attempt to provide high-tech solutions to the issues of independent living -- some of which we've been closely involved with. For this work, though, we suggest that one of the most cost-effective contributions that technology can make might actually be centred around social media. Isolation really is a killer, in a literal sense. A lot of research has indicated that social isolation is a massive risk factor in both physiological and psychological illnesses, and this is something that's likely to get worse as populations age.

Social media can help address this, especially in an age when older people have circles of older friends, and where these friends and family can be far more geographically dispersed than in former times. This isn't to suggest that Twitter and Facebook are the cures for any social ills, but rather that the services they might evolve into could be of enormous utility for older people. Not only do they provide traffic between people, they can be mined to determine whether users' activities are changing over time, identify situations that can be supported, and so provide unintrusive medical feedback -- as well as opening up massive issues of privacy and surveillance. While today's older generation are perhaps not fully engaged with social media, future generations undoubtedly will be, and it's something to be encouraged.

Other authors -- some of them leading names in the various aspects of pervasive systems -- have contributed chapters about implicit interaction, privacy, trust, brain interfaces, power management, sustainability, and a range of other topics in accessible form.

The book has a web site (of course), and is available for pre-order on Amazon. Thanks to Jeremy for putting this together: it's been a great opportunity to think more broadly than we often get to do as research scientists, and see how our work might help make the world more liveable-in.

Mainstreaming Smalltalk

Smalltalk's influence has declined of late, at least in part because of the "all or nothing" architecture of its most influential distribution. We've got to the stage where we could change that.

Some programming languages achieve epic influence without necessarily achieving universal adoption. Lisp was phenomenally influential, but has remained in an AI niche; APL gave rise to notions of bulk processing and hidden parallelism without having much uptake outside finance. Smalltalk's influence has been similar: it re-defined what it meant to be an interactive programming environment and laid the ground for many modern interface concepts, as well as giving a boost to object-oriented programming that prepared the way for C++ and Java.

So why does no-one use it any more? Partly it's because of its very interactivity. Smalltalk -- and more especially Squeak, its most prominent modern implementation -- is unusual in basing software around complete interactive images rather than collections of source files. This isn't inherently poor -- images start immediately and encourage direct-manipulation design and interfacing -- but it's radically unfamiliar to programmers used to source code and version control. (Squeak provides version-controlled change sets to collect code outside an image, but that's still an unfamiliar structure.)

But a more important issue is the all-or-nothing nature of Squeak in particular, and in fact of novel languages in general. If you want to use Squeak for a project, all the components really need to be in Squeak (although it can also use web services and other component techniques). If you want to integrate web pages, someone needs to write an HTML rendering component, HTTP back-end and the like; for the semantic web someone needs to write XML parsers, triple stores, reasoners and the like. You can't just re-use the ones that're already out there, at least not without violating the spirit of the system somewhat. That means it plays well at the periphery with other tools, but at its core needs most of its services to be written again. This is a huge buy-in. Similarly Squeak provides a great user interface for direct manipulation -- but only within its own Squeak window, rendered differently and separately from other windows on the screen. These aren't issues inherent to Smalltalk -- it's perfectly possible to imagine a Smalltalk system that used "standard" rendering (and indeed they exist) -- but the "feel" of the language is rather more isolated than is common for modern systems used to integrating C libraries and the like. At bottom it's not necessarily all that different to Java's integration with the host operating system, but the style is very much more towards separation in pursuit of uniformity and expressive power. This is a shame, because Smalltalk is in many ways a perfect development environment for novice programmers (and especially for children, who find it captivating), who are a vast source of programming innovation for the small, focused services we find on the web and on smartphones.

So can we make Smalltalk more mainstream? Turn it into an attractive development platform for web services and mobile apps? Looking at some recent developments I think the answer is firmly yes -- and without giving up on the interactivity that gives it its attraction. The key change is (unsurprisingly) the web, or more precisely the current crop of browsers that support Javascript, style sheets, SVG, dynamic HTML and the like. The browser has now arrived at a point at which it can provide a complete GUI -- windows, moving and animated elements and the like -- in a standard, platform-independent and (most importantly) cloud-friendly way.

What I have in mind is essentially implementing a VM for a graphical Smalltalk system, complete with interactive compiler and direct-manipulation editing, in Javascript within a browser. The "image" is then the underlying XML document and its style sheet, which can be downloaded, shared and manipulated directly. The primitives are written in Javascript, together with an engine to run Smalltalk code and objects. Objects are persisted by turning them into document elements and storing them in the DOM tree, which incidentally allows their look and feel to be customised quite simply. Crucially, it can also appeal to any current or emerging features that can appear in style sheets, the semantic web, Javascript or embedded components: it's mainstream with respect to the other technologies being developed.
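To make the persistence idea concrete, here's a deliberately naive Javascript sketch. The names (persist, revive) and the attribute scheme are invented for illustration -- a real implementation would manipulate live DOM nodes and handle nested objects, references and behaviour, not just flat string-valued slots:

```javascript
// Flatten an object's slots into an XML-style element, in the spirit of
// storing objects as document elements. (Toy version: flat objects only,
// no nesting, no identity.)
function persist(obj, className) {
  const attrs = Object.entries(obj)
    .map(([slot, value]) => `${slot}="${String(value)}"`)
    .join(" ");
  return `<object class="${className}" ${attrs}/>`;
}

// Recover the slots from the serialised element. A real implementation
// would walk the DOM tree; parsing the string directly keeps this sketch
// self-contained.
function revive(element) {
  const obj = {};
  for (const m of element.matchAll(/(\w+)="([^"]*)"/g)) {
    if (m[1] !== "class") obj[m[1]] = m[2];
  }
  return obj;
}

const point = { x: 3, y: 4 };
const stored = persist(point, "Point");
// stored is '<object class="Point" x="3" y="4"/>'
const copy = revive(stored);  // { x: "3", y: "4" } -- slots come back as strings
```

Because the persisted form is an ordinary document element, its look and feel could indeed be customised with a style sheet, exactly as suggested above.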

Why use Smalltalk, and not Javascript directly? I simply think that the understanding we gained from Smalltalk's simplicity of programming model and embrace of direct manipulation is too valuable to lose. That's not to say that it doesn't need to be re-imagined for the web world, though. In fact, Smalltalk's simplicity and interactivity are ideally suited to the development of front-ends, components and mobile apps -- if they play well with the other technologies those front-ends and apps need to use, and with a suitably low barrier to entry. It's undoubtedly attractive to be able to combine local and remote components together as end-user programs, without the hassle of a traditional compile cycle, and then publish those mash-ups directly to the web to be shared (and, if desired, modified) by anyone.

One of the things that's always attracted me about Smalltalk (and Forth, and Scheme -- and Javascript to a lesser extent) is that the code lives completely within the dominant data structure: indeed, the code is just data in most cases, and can be manipulated using data-structure operations. This is very different from the separation you get between code and data in most other languages, and gives you a huge amount of expressive power. Conversely, one of the things that always fails to attract me about these same languages is their lack of any static typing and consequent dependence on testing. Perhaps these two aspects necessarily go hand in hand, although I can't think of an obvious reason why that should be.
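The code-as-data idea can be shown in a few lines. This is a deliberately minimal sketch, not any real language's representation: programs are plain nested lists, so the same list operations that work on data work on code too.

```python
# Code-as-data in miniature: a program is a nested list, evaluated by
# a tiny interpreter -- and rewritten with ordinary list operations.

def evaluate(expr):
    # numbers evaluate to themselves; lists are operator applications
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    vals = [evaluate(a) for a in args]
    if op == "+":
        return sum(vals)
    if op == "*":
        product = 1
        for v in vals:
            product *= v
        return product
    raise ValueError("unknown operator: " + str(op))

program = ["+", 1, ["*", 2, 3]]        # this is code...
assert evaluate(program) == 7

# ...and also data: rewrite every * into + using list manipulation
def rewrite(expr):
    if isinstance(expr, list):
        op = "+" if expr[0] == "*" else expr[0]
        return [op] + [rewrite(e) for e in expr[1:]]
    return expr

assert evaluate(rewrite(program)) == 6   # 1 + (2 + 3)
```

In Smalltalk, Forth and Scheme this is not a trick but the normal state of affairs: the compiler and the programmer manipulate the same structures.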

I know purists will scream at this idea, but to me it seems to go along with ideas that Smalltalk's co-inventor, Alan Kay, has expressed, especially with regard to the need to do away with closely-packaged applications and move towards a more fluid style of software:

The "no applications" idea first surfaced for me at PARC, when we realised that you really wanted to freely construct arbitrary combinations (and could do just that with (mostly media) objects). So, instead of going to a place that has specialised tools for doing  just a few things, the idea was to be in an "open studio" and pull the resources you wanted to combine to you. This doesn't mean to say  that e.g. a text object shouldn't be pretty capable -- so it's a mini app if you will -- but that it and all the other objects that intermingle with each other should have very similar UIs and have their graphics aspect be essentially totally similar as far as the graphics system is concerned -- and this goes also for user constructed objects. The media presentations I do in Squeak for my talks are good examples of the directions this should be taken for the future.

(Anyone who has seen one of Kay's talks -- as I did at the ECOOP conference in 2000 -- can attest to how stunningly engaging they are.) To which I would add that it's equally important today that their data work together seamlessly too, and with the other tools that we'll develop along the way.

The use of the browser as a desktop isn't new, of course: it's central to Google's Chrome OS and to cloud-friendly Linux variants like Jolicloud. But it hasn't really been used so much as a development environment, or as the host for a language that lives inside the web's main document data structure. I'm not hung up on it being Smalltalk -- a direct-manipulation front-end to jQuery UI might be even better -- but some form of highly interactive programming-in-the-web might be interesting to try.

Why we have code

Coding is an under-rated skill, even for non-programmers.

Computer science undergraduates spend a lot of time learning to program. While one can argue convincingly that computer science is about more than programming, it's without doubt a central pillar of the craft: one can't reasonably claim to be a computer scientist without demonstrating a strong ability to work with code. (What this says about the many senior computer science academics who can no longer program effectively is another story.) The reason is that it helps one to think about process, and some of the best illustrations of that come from teaching.

Firstly, why is code important? One can argue that both programming languages and the discipline of code itself are two of the main contributions computer science has made to knowledge. (To this list I would add the fine structuring of big data and the improved understanding of human modes of interaction -- the former is about programming, the latter an area in which the programming structures are still very weak.) They're so important because they force an understanding of a process at its most basic level.

When you write computer software you're effectively explaining a process to a computer in perfect detail. You often get to choose the level of abstraction at which to work. You can exploit the low-level details of the machine using assembler or C, or use the power of the machine to handle the low-level details and write in Haskell, Perl, or some other high-level language. But this doesn't alter the need to express precisely all that the machine needs to know to complete the task at hand.
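The choice of level changes who handles the detail, not whether the detail exists. A trivial sketch -- summing the squares of a list, expressed twice in Python -- makes the point:

```python
# The same process at two levels of abstraction. Both are complete
# explanations to the machine; the second delegates the mechanics of
# iteration and accumulation to the language.

def sum_squares_low(xs):
    total = 0
    i = 0
    while i < len(xs):             # manual iteration and accumulation
        total = total + xs[i] * xs[i]
        i = i + 1
    return total

def sum_squares_high(xs):
    return sum(x * x for x in xs)  # iteration handled by the language

assert sum_squares_low([1, 2, 3]) == 14
assert sum_squares_high([1, 2, 3]) == 14
```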

But that's not all. Most software is intended to be used by someone other than the programmer, and generally will be written or maintained in part by more than one person -- either directly as part of the programming team or indirectly through the use of third-party compilers and libraries. This implies that, as well as explaining a purpose to the computer, the code also has to explain a purpose to other programmers.

So code, and programming languages more generally, are about communication -- from humans to machines, and to other humans. More importantly, code is the communication of process reduced to its purest form: there is no clearer way to describe the way a process works than to read well-written, properly-abstracted code. I sometimes think (rather tenuously, I admit) this is an unexpected consequence of the halting problem, which essentially says that the simplest (and generally only) way to decide what a program does is to run it. The simplest way to understand a process is to express it as close to executable form as possible.

You think you know when you learn, are more sure when you can write, even more when you can teach, but certain only when you can program. -- Alan Perlis

There are caveats here, of course, the most important of which is that the code be well-written and properly abstracted: it needs to separate out the details so that there's a clear process description that calls into -- but is separate from -- the details of exactly what each stage of the process does. Code that doesn't do this, for whatever reason, obfuscates rather than explains. A good programming education will aim to impart this skill of separation of concerns, and moreover will do so in a way that's independent of the language being used.
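What "a clear process description that calls into the details" means is easiest to see in code. A small sketch, with an invented domain (a report over some text records) chosen purely for illustration:

```python
# Separation of concerns in miniature: the top-level function reads as
# a description of the process; each step hides its own details behind
# a name.

def clean(records):
    # normalise and drop empty entries
    return [r.strip().lower() for r in records if r.strip()]

def tally(records):
    # count occurrences of each record
    counts = {}
    for r in records:
        counts[r] = counts.get(r, 0) + 1
    return counts

def formatted(counts):
    # render counts as sorted "name: n" lines
    return [f"{name}: {n}" for name, n in sorted(counts.items())]

def report(records):
    # the process itself, readable at a glance
    return formatted(tally(clean(records)))

print(report(["Apple ", "pear", "apple", ""]))   # ['apple: 2', 'pear: 1']
```

Inline all three helpers into `report` and exactly the same behaviour becomes far harder to read as a process: the description and the details have been mixed.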

Once you adopt this perspective, certain things that are otherwise slightly confusing become clear. Why do programmers always find documentation so awful? Because the code is a clearer explanation of what's going on: a fundamentally better description of process than natural language.

This comes through clearly when marking student assessments and exams. When faced with a question of the form "explain this algorithm", some students try to explain it in words without reference to code, because they think explanation requires text. As indeed it does, but a better approach is to sketch the algorithm as code or pseudo-code and then explain with reference to that code -- because the code is the clearest description it's possible to have, and any explanation is just clearing up the details.

Some of the other consequences of the discipline of programming are slightly more surprising. Every few years some computer science academic will look at the messy, unstructured, ill-defined rules that govern the processes of a university -- especially those around module choices and student assessment -- and decide that they will be immensely improved by being written in Haskell/Prolog/Perl/whatever. Often they'll actually go to the trouble of writing some or all of the rules in their code of choice, discover inconsistencies and ambiguities, and proclaim that the rules need to be re-written. It never works out, not least because the typical university administrator has not the slightest clue what's being proposed or why, but also because the process always highlights grey areas and boundary cases that can't be codified. This could be seen as a failure, but can also be regarded as a success: coding successfully distinguishes between those parts of an organisation that are structured and those parts that require human judgement, and by doing so makes clear the limits of individual intervention and authority in the processes.
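The way codification surfaces grey areas is worth seeing concretely. The rule below is wholly invented, but the pattern is typical: a prerequisite rule that sounds complete in prose ("students need the second-year modules, unless exempted") turns out, once coded, to have cases with no defined answer -- and the code can be made to flag exactly those.

```python
# Codifying a hypothetical module-choice rule. The catalogue and
# module codes are invented for illustration.

PREREQUISITES = {"CS3052": {"CS2001", "CS2002"}}

def may_enrol(passes, module):
    needed = PREREQUISITES.get(module, set())
    missing = needed - passes
    if not missing:
        return True
    # The prose rules say nothing about partial credit or exemptions:
    # coding forces the ambiguity into the open rather than deciding it.
    raise ValueError(f"needs human judgement: missing {sorted(missing)}")

assert may_enrol({"CS2001", "CS2002"}, "CS3052") is True

try:
    may_enrol({"CS2001"}, "CS3052")
except ValueError as e:
    print(e)    # prints: needs human judgement: missing ['CS2002']
```

The useful output isn't the `True` branch but the exception: it marks precisely the boundary between the structured part of the process and the part that belongs to human judgement.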

The important point is that, by thinking about a non-programming problem within a programming idiom, you clarify and simplify the problem and deepen your understanding of it.

So programming has an impact not only on computers, but on everything to which one can bring a description of process; or, put another way, once you can precisely describe processes easily and precisely you're free to spend more time on the motivations and cultural factors that surround those processes without them dominating your thinking. Programmers think differently to other people, and often in a good way that should be encouraged and explored.

Evolving programming languages

Most programming languages have fixed definitions and hard boundaries. In thinking about building software for domains we don't understand very well, a case can be made for a more relaxed, evolutionary approach to language design.

I've been thinking a lot about languages this week, for various reasons: mainly about the recurring theme of what are the right programming structures for systems driven by sensors, whether they're pervasive systems or sensor networks. In either case, the structures we've evolved for dealing with desktop and server systems don't feel like they're the right abstractions to effectively take things forward.

A favourite example is the if statement: first decide whether a condition is true or false, and execute one piece of code or another depending on which it is. In a sensor-driven system we often can't make this determination cleanly because of noise and uncertainty -- and if we can, it's often only probably true, and only for a particular period. So are if statements (and while loops and the like) actually appropriate constructs, when we can't make the decisions definitively?
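One hypothetical shape such a construct might take -- and this is a sketch of the design question, not a proposal -- is a conditional that branches on evidence rather than on a crisp boolean, with an explicit outcome for "the evidence is too weak to decide either way":

```python
# A probabilistic "if": branch only when the estimated probability is
# decisive, and otherwise refuse to choose. The names and threshold
# are invented for illustration.

def pif(p_true, then, otherwise, unknown, threshold=0.9):
    if p_true >= threshold:
        return then()
    if p_true <= 1 - threshold:
        return otherwise()
    return unknown()     # neither branch is safe to take

# A noisy occupancy sensor, 0.95 confident the room is occupied:
assert pif(0.95, lambda: "lights on", lambda: "lights off",
           lambda: "hold state") == "lights on"

# At 0.60 confidence the construct declines to decide at all:
assert pif(0.60, lambda: "lights on", lambda: "lights off",
           lambda: "hold state") == "hold state"
```

The interesting design questions are exactly the ones the sketch dodges: where the threshold comes from, how long a decision remains valid, and what "hold state" should mean for loops rather than single branches.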

Whatever you think of this example (and plenty of people hate it) there are certainly differences between what we want to do in traditional and in highly sensorised systems, and consequently how we program them. The question is, how do we work out what the right structures are?

Actually, the question is broader than this. It should be: how do we improve our ability to develop languages that match the needs of particular computational and conceptual domains?

Domain-specific languages (DSLs) have a tangled history in computer science, pitched between those who like the idea and those who prefer their programming languages general-purpose and re-usable across a range of domains. There are strong arguments on both sides: general-purpose languages are more productive to learn and are often more mature, but can be less efficient and more cumbersome to apply; DSLs mean learning another language that may not last long and will probably have far weaker support, but can be enormously more productive and well-targeted in use.

In some ways, though, the similarities between traditional languages and DSLs are very strong. As a general rule both will have syntax and semantics defined up-front: they won't be experimental in the sense of allowing experimentation within the language itself. If we don't know what we're building, does it make sense to be this definite?

There are alternatives. One that I'm quite keen on is the idea of extensible virtual machines, where the primitives of a language are left "soft" to be extended as required. This style has several advantages. Firstly, it encourages experimentation by not forcing a strong division of concepts between the language we write (the target language) and the language this is implemented in (the host language): the two co-evolve. Secondly, it allows extensions to be as efficient as "base" elements, assuming we can keep the cost of building new elements appropriately low. Thirdly, it allows multiple paradigms and approaches to co-exist within the same system, since they can share some elements while having others that differ.
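A toy version of the idea, sketched in Python: the VM's instruction set is an ordinary dictionary, so new primitives added at run-time are dispatched exactly like the built-ins -- extensions cost no more than base elements. Everything here (the stack machine, the op names) is invented for illustration.

```python
# A minimal extensible virtual machine: "soft" primitives in a table
# that the running program can grow.

class VM:
    def __init__(self):
        self.stack = []
        self.ops = {
            "push": lambda vm, arg: vm.stack.append(arg),
            "add":  lambda vm, arg: vm.stack.append(
                vm.stack.pop() + vm.stack.pop()),
        }

    def extend(self, name, fn):
        # the target language grows a new primitive, written in the host
        self.ops[name] = fn

    def run(self, program):
        for op, arg in program:
            self.ops[op](self, arg)     # same dispatch for base and extension
        return self.stack[-1]

vm = VM()
vm.extend("square", lambda vm, arg: vm.stack.append(vm.stack.pop() ** 2))

result = vm.run([("push", 3), ("push", 4), ("add", None), ("square", None)])
print(result)    # prints: 49
```

A real system would make `extend` callable from within the target language itself, which is where the co-evolution of host and target begins.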

Another related feature is the ability to modify the compiler: that is, don't fix the syntax or the way in which it's handled. So as well as making the low level soft, we also make the high level soft. The advantage here is two-fold. Firstly, we can modify the forms of expression we allow to capture concepts precisely. A good example would be the ability to add concurrency control to a language: the low-level might use semaphores, but programming might demand monitors or some other construct. Modifying the high-level form of the language allows these constructs to be added if required -- and ignored if not.

This actually leads to the second advantage, that we can avoid features we don't want to be available, for example not providing general recursion for languages that need to complete all operations in a finite time. This is something that's surprisingly uncommon in language design despite being common in teaching programming: leaving stuff out can have a major simplifying effect.

Some people argue that syntax modification is unnecessary in a language that's sufficiently expressive, for example Haskell. I don't agree. The counter-example is actually in Haskell itself, in the form of the do block syntactic sugar for simplifying monadic computations. This had to be in the language to make it in any way usable, which implied a change of definition, and the monad designers couldn't add it without the involvement of the language "owners", even though the construct is really just a re-formulation and generalisation of one common in other languages. There are certainly other areas in which such sugaring would be desirable to make the forms of expression simpler and more intuitive. The less well we understand a domain, the more likely this is to happen.
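A rough Python analogy shows why the sugar matters (this is an analogy for the shape of the problem, not Haskell's actual desugaring rules). Chaining computations that may fail through a `bind` function works, but without sugar each step nests another lambda -- exactly the shape `do` blocks were introduced to flatten:

```python
# A Maybe-like chain: None propagates, values flow through.

def bind(value, fn):
    return None if value is None else fn(value)

def half(n):
    # succeeds only for even numbers
    return n // 2 if n % 2 == 0 else None

# Unsugared: one nested lambda per step.
nested = bind(8, lambda a: bind(half(a), lambda b: half(b)))
assert nested == 2

# A small "do"-like helper recovers the flat, sequential reading.
def do(value, *steps):
    for step in steps:
        value = bind(value, step)
    return value

assert do(8, half, half) == 2
assert do(6, half, half) is None    # 3 is odd, so the chain aborts
```

The helper changes nothing semantically; it only restores a readable surface form -- which is precisely the kind of change that, in a fixed-syntax language, needs the owners' blessing.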

Perhaps surprisingly, there are a couple of existing examples of systems that do pretty much what I'm suggesting. Forth is a canonical example (which explains my current work on Attila); Smalltalk is another, where the parser and run-time are almost completely exposed, although abstracted behind several layers of higher-level structure. Both languages are quite old, have devoted followings, and are weakly and dynamically typed -- and may have a lot to teach us about how to develop languages for new domains. They share a design philosophy of allowing a language to evolve to meet new applications. In Forth, you don't so much write applications as extend the language to meet the problem; in Smalltalk you develop a model of co-operating objects that provide direct-manipulation interaction through the GUI.

In both cases the whole language, including the definition and control structures, is built in the language itself via bootstrapping and cross-compilation. Both languages are compiled, but in both cases the separation between run-time and compile-time is weak, in the sense that the compiler is by default available interactively. Interestingly this doesn't stop you building "normal" compiled applications: cross-compile a system without including the compiler itself, a process that can still take advantage of any extensions added into the compiler without cluttering up the compiled code. You're unlikely to get strictly the best performance or memory footprint that you might get with a mature C compiler, but you do get advantages in terms of expressiveness and experimentation which seem to outweigh these in a domain we don't understand well. In particular, it means you can evolve the language quickly, easily, and within itself, to explore the design space more effectively and find out whether your wackier ideas are actually worth pursuing further.