Call for panels in integrated network management

We're looking for expert panels to be run at the IFIP/IEEE International Symposium on Integrated Network Management in Dublin in May 2011.

IFIP/IEEE International Symposium on Integrated Network Management (IM 2011)

Dublin, Ireland, 23-27 May 2011
http://www.ieee-im.org/

Call for panels

Background

IM is one of the premier venues in network management, providing a forum for discussing a broad range of issues relating to network structure, services, optimisation, adaptation and management. This year's symposium has a special emphasis on effective and energy-efficient design, deployment and management of current and future networks and services. We are seeking proposals for expert discussion panels to stimulate and inform the audience as to the current "hot topics" within network management. An ideal panel will bring the audience into a discussion prompted by position statements from the panel members, taking account of both academic and industrial viewpoints. We are interested in panels on all aspects of network management, especially those related to the theme of energy awareness and those discussing the wider aspects of networks and network management. This would include the following topics, amongst others:

  • Multi-transport and multi-media networks
  • Network adaptation and optimisation
  • Co-managing performance and energy
  • The uses and abuses of sensor networks
  • The science of service design and implementation
  • Programming network management strategies
  • Tools and techniques for network management
  • Socio-technical integration of networks
  • Energy-efficiency vs equality of access
  • Network-aware cloud computing
  • The future of autonomic management
  • Coping with data obesity
  • Managing the next-generation internet

How to propose a panel

Please send a brief (1-page) proposal to the panel chairs, Simon Dobson and Gerard Parr. Your proposal should indicate the relevance of the panel to the broad audience of IM, and include the names of proposed panel members.

Important dates

Submission of panel proposals: 20 October 2010
Notifications of acceptance: mid-November 2010
Conference dates: 23-27 May 2011

Funded PhD studentship in adaptive service ecosystems

We have an opening for someone interested in doing a PhD in adaptive services.

University of St Andrews
School of Computer Science
http://www.cs.st-andrews.ac.uk

Funded PhD studentship in adaptive service ecosystems

The opportunity

We have an opening for a talented and well-motivated individual to investigate mechanisms for the design and implementation of long-lived, responsive and adaptive ecosystems of distributed services. This work will take place in the context of the SAPERE project, funded by the EU Seventh Framework Programme. SAPERE seeks to develop both the theory and practice of "self-aware" component technology, able to orchestrate and adapt to changing requirements, constraints, conditions and technologies. The University of St Andrews is leading the strand of research into describing and identifying opportunities for adaptation, building on past work in sensor fusion and situation identification.

The successful applicant will work closely with the other St Andrews staff (Prof Simon Dobson and Dr Juan Ye), as well as with the other SAPERE consortium partners. An interest in one or more of the fields of adaptive systems, pervasive computing, sensor-driven systems, uncertain reasoning and software engineering would be desirable.

The studentship is funded for three years, including a stipend and all tuition fees. Please note that the conditions of funding restrict this opportunity to applicants who are nationals of an EU/EEA country.

Background

The University of St Andrews is the oldest university in Scotland (founded around 1410) and the third-oldest in the English-speaking world. Situated in a small town on the east coast of Scotland, it has a student population of around 8000 and an active research programme in a wide range of disciplines across the sciences and humanities. It has been consistently ranked in the top-10 universities in the UK by a number of recent league table exercises, and has been described by the Sunday Times as "now firmly established as the leading multi-faculty alternative to Oxford and Cambridge."

The School of Computer Science had 60% of its staff rated at 4 ("world-leading in terms of originality, significance and rigour," the highest available) or 3 in the most recent UK Research Assessment Exercise. The School's academic staffing consists of 22 academics and over 25 post-doctoral fellows. There are also around 150 undergraduate full-time equivalent students and 30 PhD students. To stimulate a dynamic high-quality research environment, the School's research strategy concentrates on three research groups, each incorporating a number of key research topics pursued independently and in collaboration. The three research groups work in the areas of Artificial Intelligence and Symbolic Computation, Networks and Distributed Systems, and Systems Engineering. Each group attracts a large amount of external funding from both the traditional research grant agencies (EPSRC, EU, etc.) and from companies such as SUN Microsystems, ICL, Data Connection, Microsoft and BAe. The School is also a leading member of SICSA, a research alliance of leading computer science Schools across Scotland, with members of the School leading a number of core strategic activities.

Within the School, the Networks and Distributed Systems group is a world-class research centre, not only in software artefacts such as programming languages, object stores and operating systems but also in adaptive, pervasive and autonomic systems engineering, and in the composition technologies used in large-scale distributed systems, such as peer-to-peer overlay mechanisms, distributed mediation, distributed termination and distributed garbage collection.

Application

To apply, please send a CV and statement of research interests by email to Prof Simon Dobson (sd@cs.st-andrews.ac.uk), to whom informal inquiries can also be directed. The closing date for applications is 11 October 2010.

Getting rid of the laptop

I've been evaluating two important new pieces of technology: an iPad and a Pulse SmartPen. The combination almost makes me ready to ditch my netbook -- or at least it's got me thinking carefully about why I still have one.

The iPad is well-known; the SmartPen perhaps not so. It's made by a company called LiveScribe, and works with special notebook paper. A camera in the nib watches what's been written and tracks the pen by looking at a background of dots arranged to provide location information about the pen on the page. The pen can also record what's being said, and cleverly links the two data streams together: later you can tap a word and hear what was being said at the time. I've been using one for a fortnight.

I used to keep written notebooks, but moved to taking notes purely on my netbook when I realised I was forgetting what I'd written where: a notebook is just a dead tree with dead information on it, and I've become used to everything being searchable. However, getting searchability meant converting my note-taking style to linear text rather than mindmaps or sketches, since that's what the tools typically support. (There are mindmap tools, of course, but they're completely separate from other note-taking tools and so get in the way somewhat.) There's also a barrier to note-taking in having to get the netbook out, rather than just picking up a (special) pen and (special) paper. The resulting data is searchable, since the desktop tool does (fairly decent) handwriting recognition: I can "tag" pages in the margins, writing slightly more carefully than usual, and search for the tags even if full content searching is a bit aspirational.

For what I do this covers a lot, but not quite everything: I spend a lot of time reading, looking up information and writing emails, papers and the like. A Kindle or other e-reader would be great for the reading, but not for the net access. That's where the iPad comes in: can it replace the need for a more traditional web-access and writing device? It's certainly a lovely piece of kit, fast and stable, and allows easy browsing. The keyboard is pretty good for a "soft" device, and one could easily see writing email and editing small amounts of text using it. I can also see that it'd be an awesome platform for interactive content and mixed-media books/films, assuming the editing tools are available elsewhere.

Of course neither netbooks nor iPads are really optimised for the creation of content: they're very much consumer devices intended for the consumption of content written on other, bigger machines. I don't think that's a criticism -- no-one does smartphone software development on a smartphone, after all -- but it does mean that neither is optimal as a device for someone who creates a lot. But the combination of a digitised paper notebook with an internet-access device is extremely attractive. Both devices are extremely portable and friendly, and link well to the larger "back office" machines I use for "serious" work.

I have two worries, one about both devices and one about the iPad alone. The first worry is the almost completely closed nature of the software. The Pulse loads its recorded sound and images into its own desktop tool, and they are then only available through that tool, despite (I imagine) using standard data formats internally with some clever hyperlinking. The tool does provide important value-add, of course, specifically the links from written text to recorded audio. But that should be separate from the actual content, and it isn't. One can "archive" a notebook, or turn individual pages into PDF, but not (as far as I can tell) get access to the content programmatically as it stands. That's simply obstructive on the part of Livescribe -- and also a little shortsighted, I think, since their linking technology could clearly be applied to any print-linked-to-sound data if their tool were open and able to access arbitrary content. I think this is a great example of where openness is both friendly to the community and potentially a commercial virtue. (Oh, and the Livescribe desktop only works on Intel Macs: who exactly writes non-universal binaries these days? and why?)

The iPad has a similar ecosystem, of course, which is "open" in the sense that anyone can write programs for it but "closed" in the sense that (firstly) access to the App Store is carefully constrained and (secondly) there are features of the platform and operating system that aren't freely available to everyone.

I can understand Apple's contention that -- for phones especially -- it's important to only download apps that won't brick the device. This doesn't of course imply that there should be a single gatekeeper, as has happened with the App Store: one could provide a default store but allow external ones, as happens with Android Market. A single gatekeeper is basically just a way to extract rents from the software ecosystem. This can stifle innovation and distort prices, to Apple's advantage.

What worries me more, though, is the extra, non-commercial dimension in terms of content control, which I think is more broadly damaging than just software. I was looking at an app for cocktail recipes (Linda's a big fan). There are several available, of course, but all come with a rating of 17+ because of their mention of frequent drinking or drug use. There's a suggestion of the illicit becoming the illegal there. It's also well-known that Apple enforces a "no porn" rule on the App Store. Whatever one's attitude to pornography, much of it isn't illegal and it's not clear that a software company should restrict the uses of a device above and beyond the law.

The whole experience reminds me very strongly of Disneyland: safe, beautiful, welcoming, friendly -- and utterly fake, and utterly anodyne. One can choose not to go to Disneyland, of course -- and certainly not to live there -- but it's another thing to hand control of access to information and information technology off to a commercial third party. Anything can be disallowed on a whim, or for the greater commercial good -- and can of course be disallowed or edited retrospectively.

Whether we like it or not, human culture includes material that is distasteful to many people. That's why we have critical faculties, and diversity, and laws on free speech. Commercial device providers and operators are not constrained by requirements of fairness in the way that newspapers and public broadcasters are, and could easily be persuaded to silence some forms of speech on the basis of commercial interest, regardless of their wider legality.

For the present I'll be keeping the netbook for internet access, but using the SmartPen for note-taking, and thinking a bit more about a dedicated ebook reader. It's a compromise between openness and convenience that I'm conscious of making, and not without some hesitation. Time will tell how the choices play out and evolve, and maybe I'll buy an Android tablet when they mature a little :-)

Monads: a language designer's perspective

Monads are one of the hottest topics in functional programming, and arguably simplify the construction of a whole class of systems. Which makes it surprising that they're so opaque and hard to understand for people whose main experience is in imperative or object-oriented languages.

There are a lot of explanations of, and tutorials on, monads, but most of them seem to take one of two perspectives: either start with a concrete example, usually in I/O handling, and work back, or start from the abstract mathematical formulation and work forwards. This sounds reasonable, but apparently neither works well in practice -- at least, judging from the comments one receives from intelligent and able programmers who happen not to have an extensive functional programming or abstract mathematical background. Such a core concept shouldn't be hard to explain, though, so I thought I'd try a different tack: monads from the perspective of language design.

In Pascal, C or Java, statements are separated (or terminated) by semicolons. This is usually regarded as a piece of syntax, but let's look at it slightly differently. Think of the semicolon as an operator that takes two program fragments and combines them together to form a bigger fragment. For example:

  int x = 4;
  int y = x * 3;
  printf("%d", y);

We have three program fragments. The semicolon operator at the end of the first line takes the fragment on its left-hand side and combines it with the fragment on its right-hand side. Essentially the semicolon defines how the RHS is affected by the code on the LHS: in this case the RHS code is evaluated in an environment that includes a binding of variable x, effectively resolving what is otherwise a free variable. Similarly, the semicolon at the end of the second line causes the third line to be evaluated in an environment that includes y. The meaning of the semicolon is hard-wired into the language (C, in this case) and defines how code fragments are sequenced and their effects propagated.

Now from this perspective, a monad is a programmable semicolon. A monad allows the application programmer, rather than the language designer, to determine how a sequence of code is put together, and how one fragment affects those that come later.

Let's revert to Haskell. In a slightly simplified form, a monad is a type class with the following signature:

  class Monad m where
    return :: a -> m a
    (>>=)  :: m a -> (a -> m b) -> m b

So a monad is a constructed type, wrapping-up some underlying element type, that defines two functions, return and (>>=). The first function injects an element of the element type into the monadic type. The second takes an element of the monadic type and a function that maps an element of that monad's element type to some other monadic type, and returns an element of this second monadic type.

The simplest example of a monad is Haskell's Maybe type, which represents either a value of some underlying element type or the absence of a value:

  data Maybe a = Just a | Nothing

Maybe is an instance of Monad, simply by virtue of defining the two functions that the type class needs:

  instance Monad Maybe where
    return a       = Just a
    (Just a) >>= f = f a
    Nothing  >>= _ = Nothing

return injects an element of a into an element of Maybe a. (>>=) takes an element of Maybe a and a function from a to Maybe b. If the element of Maybe a it's passed is of the form Just a, it applies the function to the element value a. If, however, the element is Nothing, it returns Nothing without evaluating the function.

It's hard to see what this type has to do with sequencing, but bear with me. Haskell provides a do construction which gives rise to code like the following:

  do v <- if b == 0 then Nothing else Just (a / b)
     return (26 / v)

Intuitively this looks like a sequence of code fragments, so we might infer that the conditional executes first and binds a value to v, and then the next line computes with that value -- which is in fact what happens, but with a twist. The way in which the fragments relate is not pre-defined by Haskell. Instead, the relationship between the fragments is determined by the monad whose values the fragments manipulate (usually expressed as which monad the code executes in). The do block is just syntactic sugar for a stylised use of the two monad functions. The example above expands to:

  (if b == 0 then Nothing else Just (a / b)) >>= (\v -> return (26 / v))

So the do block is syntax that expands into user-defined code, depending on the monad that the expressions within it use. In this case, we execute the first expression and then compose it with the function on the right-hand side of the (>>=) operator. The definition says that, if the left-hand side value is Just a, the result is that we call the RHS passing the element value; if the LHS is Nothing, we return Nothing immediately. The result is that, if any code fragment in the computation returns Nothing, then the entire computation returns Nothing, since all subsequent compositions will immediately short-circuit: the Maybe type acts like a simple exception that escapes from the computation the moment Nothing is encountered. So the monad structure introduces what's normally regarded as a control construct, entirely within the language. It's fairly easy to see that we could provide "real" exceptions by hanging an error code off the failure value. It's also fairly easy to see that the monad sequences the code fragments and aborts when one of them "fails". In C we can think of the same function being provided by the semicolon "operator", but with the crucial difference that it is the language, and not the programmer, that decides what happens, once and for all. Monads reify the control of sequencing into the language.
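
To make the short-circuiting concrete, here's a minimal sketch of the do block above wrapped into a function we can actually evaluate. (safeDiv is just a name I've invented for this example; the constants come from the fragment above.)

  -- The do block above, wrapped into a function so it can be run.
  -- safeDiv is a hypothetical name introduced purely for illustration.
  safeDiv :: Double -> Double -> Maybe Double
  safeDiv a b =
    do v <- if b == 0 then Nothing else Just (a / b)
       return (26 / v)

  -- safeDiv 52 2  ==>  Just 1.0   (52 / 2 = 26, and then 26 / 26 = 1.0)
  -- safeDiv 1  0  ==>  Nothing    (the first fragment "fails", so the second
  --                                is never evaluated)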

To see how this can be made more general, let's think about another monad: the list type constructor. Again, to make lists into monads we need to define return and (>>=) with appropriate types. The obvious injection is to turn a singleton into a list:

  instance Monad [] where
    return a = [a]

The definition of (>>=) is slightly more interesting: which function of type [a] -> (a -> [b]) -> [b] is appropriate? One could choose to select an element of the [a] list at random and apply the function to it, giving a list [b] -- a sort of non-deterministic application of a function to a set of possible arguments. (Actually this might be interesting in the context of programming with uncertainty, but that's another topic.) Another definition -- and the one that Haskell actually chooses -- is to apply the function to all the elements of [a], taking each a to a list [b], and then concatenating the resulting lists together to form one big list:

  l >>= f = concat (map f l)

What happens to the code fragments in a do block now? The monad threads them together using the two basic functions. So if we have code such as:

  do x <- [1..10]
     y <- [20..30]
     return (x, y)

What happens? The first and second fragments clearly define lists, but what about the third, which seems to define a pair? To see what happens, we need to consider all the fragments together. Remember, each fragment is combined with the next by applying concat (map f l). If we expand this out, we get:

  concat (map (\x -> concat (map (\y -> return (x, y)) [20..30])) [1..10])
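
Working this through (a small worked example of my own, not from the original text): return in the list monad is \x -> [x], so the inner map builds the list of pairs for each fixed x, and the concats flatten everything into one list of all combinations -- exactly what the corresponding list comprehension produces:

  -- The expansion above evaluates to the list of every (x, y) combination,
  -- the same result as the equivalent list comprehension.
  pairs :: [(Int, Int)]
  pairs = concat (map (\x -> concat (map (\y -> [(x, y)]) [20..30])) [1..10])

  pairs' :: [(Int, Int)]
  pairs' = [ (x, y) | x <- [1..10], y <- [20..30] ]

  -- pairs == pairs'  evaluates to  True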

So to summarise, Haskell provides a do block syntax that expands to a nested sequence of monadic function calls. The actual functions used depend on the monadic type in the do block, so the programmer can define how the code fragments relate to one another. Common monads include some simple types, but also I/O operations and state bindings, allowing Haskell to perform operations that are typically regarded as imperative without losing its laziness. The Haskell tutorial explains the I/O syntax.

What can we say about monads from the perspective of language design? Essentially they reify sequencing, in a functional style. They only work as seamlessly as they do because of Haskell's flexible type system (allowing the definition of new monads), and also because of the do syntax: without the syntactic sugar, most monadic code is incomprehensible. The real power is that they allow some very complex functionality to be abstracted into functions and re-used. Consider the Maybe code we used earlier: without the "escape" provided by the Maybe monad, we'd have to guard each statement with a conditional to make sure there wasn't a Nothing returned at any point. This quickly gets tiresome and error-prone: the monad encapsulates and enforces the desired behaviour. When you realise that one can also compose monads using monad transformers, layering monadic behaviours on top of each other (albeit with some contortions to keep the type system happy), it becomes clear that this is a very powerful capability.
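
For comparison, here's a sketch (mine, reusing the hypothetical safeDiv from earlier) of what the same computation looks like with the Nothing-checking written out by hand at every step -- exactly the boilerplate the monad removes:

  -- The same computation as safeDiv, but guarding each step explicitly.
  safeDivByHand :: Double -> Double -> Maybe Double
  safeDivByHand a b =
    case (if b == 0 then Nothing else Just (a / b)) of
      Nothing -> Nothing
      Just v  -> Just (26 / v)

With only two fragments this is tolerable; with ten it isn't, and it's exactly the pattern that (>>=) captures once and for all.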

I think one can also easily identify a few drawbacks, though. One that immediately springs to mind is that monads reify one construction, of the many that one might choose. A more general meta-language, like the use of meta-object protocols or aspects, or structured language and compiler extensions, would allow even more flexibility. A second -- perhaps with wider impact -- is that one has to be intimately familiar with the monad being used before one has the slightest idea what a piece of code will do. The list example above is not obviously a list comprehension, until one recognises the "idiom" of the list monad. Thirdly, the choice of monadic function definitions often isn't canonical, so there can be a certain arbitrariness to the behaviour. It'd be interesting to consider generalisations of monads and language constructs to address these issues, but in the meantime one can use them to abstract a whole range of functionality in an interesting way. Good luck!

Contextual processes

Context-aware systems are intended to follow and augment user-led, real-world processes. These differ somewhat from traditional workflow processes, but share some characteristics. Might the techniques used to implement business processes via web service orchestration fit into the context-aware landscape too?

These ideas arose as a result of discussions at the PMMPS workshop at PERVASIVE 2010 in Helsinki. In particular, I was thinking about comments Aaron Quigley made in his keynote about the need to build platforms and development environments if we're to avoid forever building just demos. The separation of function from process seems to be working in the web world: might it work in the pervasive world too?

In building a pervasive system we need to integrate several components:

  1. A collection of sensors that allow us to observe users in the real world
  2. A user or situation model describing what the users are supposed to be doing, in terms of the possible observations we might make and inferences we might draw
  3. A context model that brings together sensor observations and situations, allowing us to infer the latter from a sequence of the former
  4. Some behaviour that's triggered depending on the situation we believe we're observing
Most current pervasive systems have quite simple versions of all these components. The number of sensors is often small -- sometimes only one or two, observing one user. The situation model is more properly an activity model, in that it classifies a user's immediate current activity independently of any other activity at another time. The context model encapsulates a mapping from sensors to activities, which then manifests itself in activating or deactivating a single behaviour. Despite their simplicity, such systems can perform a lot of useful tasks.

However, pervasive activities clearly live on a timeline: you leave home and then walk to work and then enter your office and then check your email, and so forth. One can treat these activities as independent, but that might lose continuity of behaviour, when what you want to do depends on the route by which you got to a particular situation. Alternatively we could treat the timeline as a process, and track the user's progress along it, in the manner of an office workflow.

Of course the problem is that users don't actually follow workflows like this -- or, rather, they tend to interleave actions, perform them in odd orders, leave bits out, drop one process and do another before picking-up the first (or not), and so on. So pervasive workflows aren't at all like "standard" office processes. They aren't discriminated from other workflows (and non-workflow activities) happening simultaneously in the same space, with the same people and resources involved. In some simple systems the workflow actually is "closed", for example computer theatre (Pinhanez, Mase and Bobick. Interval scripts: a design paradigm for story-based interactive systems. Proceedings of CHI'97. 1997.) -- but in most cases it's "open". So the question becomes: how do we describe "loose" workflows in which there is a sequence of activities, each one of which reinforces our confidence in later ones, but which contain noise and extraneous activities that interfere with the inferencing?

There are several formalisms for describing sequences of activities. The one that underlies Pinhanez' work mentioned above is Allen algebra (Allen and Ferguson. Actions and events in interval temporal logic. Journal of Logic and Computation 4(5), pp.531--579. 1994.), which provides a notation for specifying how intervals of time relate: an interval a occurs strictly before another b, for example, which in turn contains wholly within it another interval c. It's easy to see how such a description provides a model for how events from the world should be observed: if we see that b has ended, we can infer that c has ended also, because we know that c is contained within b, and so forth. We can do this even if we don't -- or can't -- directly observe the end of c. However, this implies that we can specify the relationships between intervals precisely. If we have multiple possible relationships, the inferencing power degrades rapidly.
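
As a tiny illustration of that style of inference -- a sketch of my own, using a made-up three-relation fragment rather than the full algebra -- the "c is contained within b" example above becomes:

  -- A hypothetical knowledge base of interval relationships: (x, r, y) means
  -- "interval x stands in relation r to interval y".
  data Relation = Before | During | Contains   -- a tiny subset of the Allen relations

  relations :: [(String, Relation, String)]
  relations = [ ("a", Before, "b"), ("c", During, "b") ]

  -- If an interval has ended, every interval lying During it has ended too,
  -- whether or not we observed its end directly.
  inferEnded :: String -> [String]
  inferEnded ended = ended : [ x | (x, During, y) <- relations, y == ended ]

  -- inferEnded "b"  ==>  ["b", "c"]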

Another way to look at things is to consider what "noise" means. In terms of the components we set out earlier, noise is the observation of events that don't relate to the process we're trying to observe. Suppose I'm trying to support a "going to work" process. If I'm walking to work and stop at a shop, for example, this doesn't interfere with my going to work -- it's "noise" in the sense of "something that happened that doesn't contradict what we expected to see". On the other hand if, after leaving the shop, I go home again, that might be considered "not noise", in the sense of "something that happened that contradicts the model we have of the process". As well as events that support a process, we also have events that contradict it, and events that provide no information.

Human-centred processes are therefore stochastic, and we need a stochastic process formalism. I'm not aware of any that really fit the bill: process algebras seem too rigid. Markov processes are probably the closest, but they're really designed to capture frequencies with which paths are taken rather than detours and the like. Moreover we need to enrich the event space so that observations support or refute hypotheses as to which process is being followed and where we are in it. This is rather richer than is normal, where events are purely confirmatory. In essence what we have is process as hypothesis in which we try to confirm that this process is indeed happening, and where we are in it, using the event stream.

It's notable that we can describe a process separately from the probabilities that constrain how it's likely to evolve, though. That suggests to me that we might need an approach like BPEL, where we separate the description of the process from the actions we take as a result, and also from the ways in which we move the process forward. In other words, we have a description of what it means to go to work, expressed separately from how we confirm that this is what's being observed in terms of sensors and events, and separated further from what happens as a result of this process being followed. That sounds a lot easier than it is, because some events are confirmatory and some aren't. Furthermore we may have several processes that can be supported by observations up to a point and then diverge: going to work and going shopping are pretty similar until I go into a shop, and/or until I leave the shop and don't go to work. How do we handle this? We could enforce common-prefix behaviour, but that breaks the separation between process and action. We could insist on "undo" actions for "failed", no-longer-supported-by-the-observations processes, which severely complicates programming and might lead to interactions between different failed processes. Clearly there's something missing from our understanding of how to structure more complex, temporally elongated behaviours that'll need significant work to get right.

The power of comics

I've been fortunate enough to spend some of the past couple of days with a comic-writer who studies the academic experience, and who might well have a greater aggregate impact on science than almost anyone else I've ever met.

This week has been the SICSA graduate student conference, giving SICSA's PhD students a chance to share their ideas in front of a friendly audience. As well as the science, one of the goals of the event was to improve the student experience in social ways, letting them find new collaborators and share their experiences and worries. And what better way to facilitate this than by inviting the writer of PhD Comics, one of the most popular internet comic strips, to come and talk?

The man behind PhD Comics is Jorge Cham, whom I have to say is one of the nicest guys you could want to hang out with.

Jorge has a PhD himself, of course. His research topic was robotics, specifically small, fast robots mimicking cockroach locomotion to move over uneven surfaces. These sorts of systems have huge potential applications, from space missions and environmental rovers to accident-victim location and disaster recovery. However, his main passion even during his PhD was cartooning, reflecting on and responding to the graduate student experience. It started as a print comic in a Stanford newspaper and predictably did well in a place where the student density is so high.

But it was only when he put it onto the internet that it really took off. Like many things on the internet, there's a law-of-large-numbers effect that can come unexpectedly into play. The number of graduate students in any particular place is usually small, but integrated over the world you have a respectable audience -- and PhD Comics now sustains around half a million hits per day.

The goal of PhD Comics is to act as an encouragement to graduate students. For anyone who's been through it -- as I have -- it's overall an extremely rewarding, liberating intellectual, social and life experience; it's also a lonely, frustrating, depressing, isolating and self-critical one. It takes an effort of will to believe that you're making a contribution, making discoveries that others will find interesting and worthwhile. Even those with unbounded self-confidence -- which most certainly does not include me, not now and certainly not then -- will find themselves questioning their motivations and capabilities over the course of their PhD.

Often the most sobering part of the whole experience is the realisation of how smart other people are. Most graduate students come from being top or near-top of their undergraduate class. They then land in an environment where everyone was top of their class: the average suddenly lurches upwards, which can be disorienting. Not only that, but graduate students generally mix, on fairly equal terms, with postdocs and staff who have enormously more experience and who may in some cases be quite famous within the limited bounds of their fields, putting further strain on self-confidence.

I have a quite visceral memory of going to my first graduate-level presentation on a topic (type theory) that I thought I understood well -- as indeed I did, at an undergraduate level. Three slides into the talk, I realised that I knew absolutely nothing about type theory as it actually is, its important concepts, challenges and uses. It was quite wrenching to realise the extent of my own ignorance. Conversely, though, when I now talk about my own work I'm conscious of the gap that exists between people with a reasonable (in every sense) knowledge of a field and those with an expert knowledge, and try to pitch the material accordingly.

Which brings us back to PhD Comics. Every individual graduate student will feel overwhelmed at some point, and may not be able to reach out locally to find support. But they can reach out to the shared experience that is the comic and its archive and see how others feel about the same situations and challenges that they face -- and do so in a way that's far more entertaining than talking to a counsellor. I suspect this is an incredibly valuable service, and one that I'd've welcomed when doing my own PhD.

Does this help with the process of doing science? The completion rate for PhDs is high -- over 99% for Scottish computer science, for example -- but the time taken, and stress endured, in that process varies widely. Anything that helps mitigate the strain, helping students cement their self-confidence and deal with the challenges, is very much to be welcomed.

This got me thinking. Robotics is an important field, and it's impossible to say what we lost in terms of research and innovation when Jorge followed his passion. But it's almost certain that he's influenced more scientific activity, more widely, as a cartoonist than he ever would have done as a researcher or an academic. Not everyone can be a researcher, but even fewer can provide insight and entertainment as cartoonists, and even fewer can spot and take the opportunity to become the voice of graduate students worldwide.

Following this logic a little further, I suspect that bringing Jorge over to SICSA may have been the single most effective "soft spend" in the whole programme to date. We don't have a problem with completion, but we do, like all universities, have issues with confidence and motivation, and anything we can do to improve those is money well spent. I wish I could think of a way to confirm that value empirically, but I can't -- not that that's going to stop me recommending Jorge as a speaker to anyone wanting to improve their research environment.

Made smarter by the internet?

Over the weekend there was a fascinating exchange of viewpoints in the Wall Street Journal taking opposing sides of the argument as to what effect the internet is having on us: is it making us smarter and better-informed, or more shallow and undisciplined compared to our book-reading days? Perhaps more importantly, is there anything we can do to leverage the internet to promote smartness more effectively?

Clay Shirky's article takes the positive view, arguing that amid the spam and the videos of people tripping over chairs are the beginnings of a new media culture that's attuned to writing for a non-linear, multimedia experience. Opposed to this is Nicholas Carr's view that the internet works against the "stillness" that books encourage, and that the mental discipline that attends reading is of value in itself.

This is a critical (indeed, perhaps the critical) cultural argument over the "content revolution" that we're currently in the middle of. As a computer scientist with a deep interest in writing and literature, I find it fascinating that computers are at the forefront of a societal change, just as they're at the forefront of scientific change. I think we can critique the points made by both authors, and also use them to move on from what risks being a sterile discussion to consider what computers have to offer in terms of literature and writing.

It's perhaps unsurprising that overall I take Shirky's position: the internet is mind-expanding, and the mind-blowing volume of mediocrity created in the process doesn't alter that. It's undoubtedly true that there is an amazing amount of trivial and/or self-serving junk on the web (probably including most of what I write), as well as material that's offensive and/or dangerous. The same is true for print.

Carr's argument seems to me to centre around a critique of hyperlinking rather than of the web per se, and it's hard to argue with the suggestion that rapidly clicking from page to page isn't conducive to deep study or critical understanding. It's also clearly true that this sort of frenetic behaviour is at least facilitated, if not encouraged, by pages with lots of links such as those found on Wikipedia and many news sites. There's a tendency to encounter something one doesn't know and immediately look it up -- because doing so is so easy -- only to do the same thing with this second page, and so on. On the other hand, few things are less rewarding than continuing to read material for which one doesn't have the background, whose arguments will never make sense and whose content will never cohere as a result. Hyperlinking makes such context readily available, alongside a potentially destabilising loss of focus.

It's important to realise that such distraction isn't inevitable, though. When reading Carr's article I was reminded of a comment by Esther Dyson (in another context) to the effect that the internet is simply an amplifier that accentuates what one would do anyway. Deep thinkers can use hyperlinking to find additional information, simplify their learning and generally enrich their thinking; conversely, shallow thinkers can skim more material with less depth. I think there's an unmistakable whiff of cultural elitism in the notion that book-reading is self-evidently profound and web-page-reading necessarily superficial.

It's tempting to suggest that books better reflect and support a shared cultural experience, a value system that's broadly shared and considered, while the internet fosters fragmentation into ill-considered and narrowly-shared sub-cultures. I suspect this suggestion is broadly true, but not in a naive cause-and-effect way: books cost money to print and distribute, which tends to throttle the diversity of expression they represent. In other words, there's a shared cultural space because that's all people were offered. Both the British government and the Catholic church maintained lists of censored and banned books that effectively limited the space of public discourse through books. Both systems survived until recently: the Index Librorum Prohibitorum was only abolished in 1966 (and hung around for longer than that in Ireland), and the British government domestically banned Spycatcher in the 1980s.

What may be more significant than hyperlinking, though, is closed hyperlinking and closed platforms in general. This is a danger that several writers have alluded to in analysing the iPad. The notion of curated computing -- where users live in a closed garden whose contents are entirely pre-approved (and sometimes post-retracted, too) -- seems to me to be more conducive to shallow thinking. Whatever else the open internet provides, it provides an informational and discursive space that's largely unconstrained, at least in the democratic world. One can only read deeply when there is deep material to read, and when one can find the background, context and dissenting material against which to check one's reading. To use Dyson's analogy again, it'd be easy to amplify the tendency of people to look for material that agrees with their pre-existing opinions (confirmation bias) and so shape the public discussion. There might be broad cultural agreement that Mein Kampf and its recent derivatives should be excluded in the interests of public safety, but that's a powerful decision to give to someone -- especially when digital technology gives them the power to enforce it, both into the future and retroactively.

(As an historical aside, in the early days of the web a technology called content selection was developed, intended to label pages with a machine-readable tag of their content to enable parental control amongst other things. There was even a standard developed, PICS, to attach labels to pages. The question then arose as to who should issue the labels. If memory serves me correctly, a consortium of southern-US Christian churches lobbied W3C to be nominated as the sole label-provider. It's fair to say this would have changed the internet forever....)

But much of this discussion focuses on the relationship between the current internet and books. I suspect it's much more interesting to consider what post-book media will look like, and then to ask what might make such media more conducive to "smart study". There are shallow and simple changes one might make. Allowing hyperlinks that bring up definitions of terms in-line or in pop-ups (as allowed by HyTime, incidentally, a far older hypertext model than the web) would reduce de-contextualisation and attention fragmentation. I find tools like Read It Later invaluable, allowing me quickly to mark pages for later reading rather than having to rely on memory and the inevitable cognitive load, especially on mobile devices. Annotating pages client-side would be another addition, on the page rather than at a separate site. More broadly, multimedia and linking invite a whole new style of book. The iPad has seen several "concept" projects for radically hyperlinked multimedia works, and projects like Sophie are also looking at the readability of hypermedia. Unsurprisingly a lot of the best work is going on within the Squeak community, which has been looking at these issues for years: it has a rich history in computer science, albeit somewhat outwith the mainstream.

I doubt the internet can ever make someone smarter, any more than it can make someone careful. What it can do is facilitate new ways of thinking about how to collect, present, organise and interact with information in a dynamic and semantically directed fashion. This is definitely an agenda worth following, and it's great to see discussions on new media taking place in the general wide-circulation press and newspapers.

Languages for extensible virtual machines

Many languages have an underlying virtual machine (VM) to provide a more portable and convenient substrate for compilation or interpretation. For language research it's useful to be able to generate custom VMs and other language tools for different languages. Which raises the question: what's the appropriate language for writing experimental languages?

What I have in mind is slightly more than just VMs, and more a platform for experimenting with language design for novel environments such as sensor-driven systems. As well as a runtime, this requires the ability to parse, to represent and evaluate type and semantic rules, and to provide a general framework for computation that can then be exposed into a target language as constructs, types and so forth. What's the right language in which to do all this?

This isn't a simple question. It's well-accepted that the correct choice of language is vital to the success of a coding project. One could work purely at the language level, exploring constructs and type systems without any real-world constraints (such as being runnable on a sensor mote). This has to some extent been traditional in programming language research, justified by the Moore's law increases in performance of the target machines. It isn't justifiable for sensor networks, though, where we won't see the same phenomenon. If we want to prototype realistic language tools in the same framework, we need at least a run-time VM that is appropriate for these target devices; alternatively we could ignore this, focus on the language, and prototype only when we're happy with the structures, using a different framework. My gut feeling is that the former is preferable, if it's possible, for reasons of conceptual clarity, impact and simplicity. But even without making this decision we can consider the features of different candidate language-writing languages:

C

The most obvious approach is to use C, which is run-time-efficient and runs on any potential platform. For advanced language research, though, it's less attractive because of its poor symbolic data handling. That makes it harder to write type-checking sub-systems and the like, which are essentially symbolic mathematics.

Forth

I've wondered about Forth before. At one level it combines the drawbacks of C -- poor symbolic and dynamic data handling -- with the additional drawback of being unfamiliar to almost everyone.

Forth does have some redeeming features, though. Firstly, threaded interpretation means that additional layers of abstraction are largely cost-free: they run at the same speed as the language itself. Moreover there's a sense in which threaded interpretation blurs the distinction between host language and meta-language: you don't write Forth applications, you extend it towards the problem, so the meta-language becomes the VM and language tool. This is something that needs some further exploration.

Scheme

Scheme's advantages are its simplicity, regularity, and pretty much unrivalled flexibility in handling symbolic data. There's a long tradition of Scheme-based language tooling, and so a lot of experience and libraries to make use of. It's also easy to write purely functional code, which can aid re-use.

Scheme is dynamically typed, which can be great when exploring approaches like partial evaluation (specialising an interpreter against a particular piece of code to get a compiled program, for example).

Haskell

In some ways, Haskell is the obvious language for a new language project. The strong typing, type classing and modules mean one can generate a typed meta-language. There are lots of libraries and plenty of activity in the research community. Moreover Haskell is in many ways the "mathematician's choice" of language, since one can often map mathematical concepts almost directly into code. Given that typing and semantics are just mathematical operations over symbols, this is a significant attraction.
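
As a small example of what I mean -- a sketch of my own, with made-up names, not part of any existing tool -- a toy expression language and its type checker transcribe almost directly from the usual typing rules:

  -- A toy expression language and its type checker, written nearly verbatim
  -- from its typing rules. Purely illustrative.
  data Type = TInt | TBool
    deriving (Eq, Show)

  data Expr = IntLit Int
            | BoolLit Bool
            | Add Expr Expr
            | If Expr Expr Expr

  typeOf :: Expr -> Maybe Type
  typeOf (IntLit _)  = Just TInt
  typeOf (BoolLit _) = Just TBool
  typeOf (Add l r) =
    do TInt <- typeOf l          -- both operands must be integers
       TInt <- typeOf r
       return TInt
  typeOf (If c t e) =
    do TBool <- typeOf c         -- the condition must be boolean
       tt <- typeOf t
       te <- typeOf e
       if tt == te then return tt else Nothing

Each clause is more or less the corresponding inference rule read aloud, which is exactly the attraction.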

Where Haskell falls over, of course, is its runtime overheads -- mostly these days in terms of memory rather than performance. It essentially mandates a fairly meaty target platform, which closes-off some opportunities. There are some "staged" Haskell strategies that might work around this, and one could potentially stage the code to another runtime virtual machine. Or play games like implementing a Forth VM inside Haskell for experimentation, and then emitting code for a different Forth implementation for the actual runtime.

Java

Java remains the language du jour for most new projects. It has decent dynamic data handling, poor symbolic data handling, fairly large run-time overheads and a well-stocked library for re-use. (Actually I used Java for Vanilla, an earlier project in a similar area.) Despite the attractions, Java feels wrong. It doesn't provide a good solution to any of the constraints, and would be awkward as a platform for manipulating rules-based descriptions.

Smalltalk

Smalltalk -- and especially Squeak -- isn't a popular choice within language research, but does have a portable virtual machine, VM generation, and other nice features and libraries. The structure is also attractive, being modern and object-oriented. It's also a good platform for building interactive systems, so one could do simulation, visual programming and the like within the same framework -- something that'd be much harder with other choices. There are also some obvious connections between Smalltalk and pervasive systems, where one is talking about the interactions of objects in the real world.

Where does that leave us? Nowhere, in a sense, other than with a list of characteristics of different candidate languages for language research. It's unfortunate there isn't a clear winner; alternatively, it's positive that there's a choice depending on the final direction. The worry has to be that a project like this is a moving target that moves away from the areas of strength for any choice made.

Impressions of Pervasive 2010

I've spent this week at the Pervasive 2010 conference on pervasive computing, along with the Programming Methods for Mobile and Pervasive Systems workshop I co-arranged with Dominic Duggan. Both events have been fascinating.

The PMMPS workshop is something we've wanted to run for a while, bringing together the programming language and pervasive/mobile communities to see where languages ought to go. We received a diverse set of submissions: keynotes from Roy Campbell and Aaron Quigley, talks covering topics including debugging, software processes, temporal aspects (me), context collection, visual programming and a lot more. Some threads emerged quite strongly, but I think they'll have to wait for a later post after I've collected my thoughts a bit more.

The main conference included many papers so good that it seems a shame to single any out. The following are simply those that spoke most strongly to me:

Panorama and Cascadia. The University of Washington presented work on a "complex" events system, combining lower-level raw events. Simple sensor events are noisy and often limited in their coverage. Cascadia is an event service that allows complex events to be defined over the raw event stream, using Bayesian particle filters to interpolate missing events or those from uncovered areas: so it's possible in principle to inferentially "sense" someone's location even in places without explicit sensor coverage, using a model of the space being observed. This is something that could be generalised to other model-based sensor streams. The Panorama tool allows end-users to specify complex events by manipulating thresholds, which seems a little unsatisfactory: there's no principled way to determine the thresholds, and it still begs the question of how to program with the uncertain event stream. Still, I have to say this is the first complex event system I've seen that I actually believe could work.

Eyecatcher. How do you stop people hiding from a camera, or playing-up to it? Work from Ochanomizu University in Japan places a small display on top of the camera, which can be used to present images to catch the subject's attention and to suggest poses or actions. (Another version barks like a dog, to attract your pet's attention.) I have to say this research is very Japanese: a very unusual but perceptive view of the world and the problems appropriate for research.

Emotion modeling. Jennifer Healey from Intel described how to monitor and infer emotions from physiological data. The main problem is that there is no common language for describing emotions -- "anxious" is good for some and bad for others -- so getting ground truth is hard even given extensive logging.

Indoor location tracking for non-experts. More University of Washington work, this time looking at an indoor location system simple enough to be used by non-experts such as rehabilitation therapists. They used powerline positioning, injecting different frequencies into a home's power network and detecting the radiated signal using what are essentially AM radios. Interestingly one of the most important factors was the aesthetics of the sensors: people don't want ugly boxes in their home.

Transfer learning. Tim van Kasteren of the University of Amsterdam has generated one of the most useful smart-home data sets, used across the community (including by several of my students). He reported experiences with transferring machine-learned classifiers from one sensor network to another, by mapping the data into a new, synthetic feature space. He also used the known distribution of features from the first network to condition the learning algorithm in the second, to improve convergence.

Common Sense. Work from UC Berkeley on a platform for participative sensing: Common Sense. The idea is to place environmental sensors onto commodity mobile devices, and give them to street cleaners and others "out and about" in a community. The great thing about this is that it gives information on pollution and the like to the communities themselves, directly, rather than mediated through a (possibly indifferent or otherwise) State agency.

Energy-aware data traffic management. I should add the disclaimer that this is work by my colleague, Mirco Musolesi of the University of St Andrews. Sensor nodes need to be careful about the energy they use to transmit data back to their base station. This work compares a range of strategies that trade off the accuracy of returned data against the amount of traffic exchanged, and so the impact on the node's battery. This is really important for environmental sensing, and makes me think about further modifying the models to account for what's being sensed, to trade off information content as well.

Tutorials. AJ Brush did a wonderful tutorial on how to do user surveys. This is something we've done ourselves, and it was great to see the issues nailed-down -- along with war stories of how to plan and conduct a survey for greatest validity and impact. Equally, John Krumm did a fantastic overview of signal processing, particle filters, hidden Markov models and the like that made the maths far more accessible than it normally is. Adrian Friday heroically took the graveyard slot with experiences and ideas about system support for pervasive systems.

This is the first large conference I've attended for a while, for various reasons, and it's been a great week both scientifically and socially. The organisers at the University of Helsinki deserve an enormous vote of thanks for their efforts. Pervasive next year will be in San Francisco, and I'll definitely be there -- hopefully with a paper to present :-)

The only computer science book worth reading twice?

I was talking to one of my students earlier, and lent him a book to read over summer. It was only after he'd left that I realised that -- for me at any rate -- the book I'd given him is probably the most seminal work in the whole of computer science, and certainly the book that's most influenced my career and research interests.

So what's the book? Structure and interpretation of computer programs by Hal Abelson and Jerry Sussman (MIT Press. 1984. ISBN 0-262-01077-1), also known as SICP. The book's still in print, but -- even better -- is available online in its entirety.

OK, everyone has their favourite book: why's this one so special to me? The first reason is the time I first encountered it: in Newcastle upon Tyne in the second year of my first degree. I was still finding my way in computer science, and this book was a recommended text after you'd finished the first programming courses. It's the book that introduced me to programming as it could be (rather than programming as it was, in Pascal at the time). What I mean by that is that SICP starts out by introducing the elements of programming -- values, names, binding, control and so on -- and then runs with them to explore a quite dazzling breadth of issues including:

  • lambda-abstraction and higher-order computation
  • complex data structures, including structures with embedded computational content
  • modularity and mutability
  • streams
  • lazy evaluation
  • interpreter and compiler construction
  • storage management, garbage collection and virtual memory
  • machine code
  • domain-specific languages
...and so forth. The list of concepts is bewildering, and only stays coherent because the authors are skilled writers devoted to their craft. But it's also a remarkable achievement to handle all these concepts within a single language framework -- Scheme -- in such a way that each builds on what's gone before.

The second reason is the way in which Hal and Jerry view everything as an exercise in language design:

We have also obtained a glimpse of another crucial idea about languages and program design. This is the approach of stratified design, the notion that a complex system should be structured as a sequence of levels that are described using a sequence of languages. Each level is constructed by combining parts that are regarded as primitive at that level, and the parts constructed at each level are used as primitives at the next level. The language used at each level of a stratified design has primitives, means of combination, and means of abstraction appropriate to that level of detail.

Layered abstraction of course is second nature to all computer scientists. What's novel in this view is that each level should be programmable: that the layers are all about computation and transformation, and not simply about hiding information. We don't see that in the mainstream of programming languages, because layering doesn't extend the language at all: Java is Java from top to bottom, with classes and libraries but no new control structures. If a particular domain has concepts that would benefit from dedicated language constructs, that's just tough. Conversely (and this is something that very much interests me) if there are constructs it'd be desirable not to have in some domain, they can't be removed. (Within the language, anyway: Java-ME dumps some capabilities in the interests of running on small devices, but that's not something you can do without re-writing the compiler.)

The third influential feature is the clear-sighted view of what computer science is actually about:

The computer revolution is a revolution in the way we think and in the way we express what we think. The essence of this change is the emergence of what might best be called procedural epistemology -- the study of the structure of knowledge from an imperative point of view, as opposed to the more declarative point of view taken by classical mathematical subjects. Mathematics provides a framework for dealing precisely with notions of "what is." Computation provides a framework for dealing precisely with notions of "how to."

I've taken a view before about computers being the new microscopes, opening-up new science on their own as well as facilitating existing approaches. The "how to" aspect of computer science re-appears everywhere in this: in describing the behaviours of sensor networks that can adapt while continuing to reflect the phenomena they've been deployed to sense; in the interpretation of large-scale data mined and mashed-up across the web; in capturing scientific methods and processes for automation; and so forth. The richness of these domains militates against packaged software and encourages integration through programming languages like R, so that the interfaces and structures remain "soft" and open to experimentation.

When I looked at my copy, the date I'd written on the inside was September 1988. So a book I bought nearly 22 years ago is still relevant. In fact, I'd go further and say that it's the only computer science book of that age that I'd happily and usefully read again without it being just for historical interest: the content has barely aged at all. That's not all that unusual for mathematics books, but it's almost unheard of in computer science, where the ideas move so quickly and where much of what's written about is ephemeral rather than foundational. It goes to show how well SICP nailed the core concepts. In this sense, it's certainly one of the very few books on computer science that it's worth reading twice (or more). SICP is to computer science what Feynman's Lectures on Physics are to physics: an accessible distillation of the essence of the subject that's stood the test of time.