
Ambient backscatter

An interesting article on how to power sensors and other "Internet of Things" devices.

A group at the University of Washington has developed a way of making use of "stray" radiation to power simple radio transmitters and receivers. Rather than use a dedicated power source, whether on-board like a battery or transmitted as in near-field communications, this technique makes use of the ambient radiation of cellphone signals, wifi networks and the like to provide enough power to energise a simple radio link.

Recycled Energy: Ambient Backscatter Allows Wireless Communications with no Batteries

If it works reliably, this'll be a huge contribution to low-power environmental sensing as well as to the applications the authors are targeting.

How to write an abstract

An abstract is an advertisement, not an introduction.

I've spent much of this week working with MSc students writing their dissertations, and this has inevitably led to the part of a dissertation that often causes the most pain to write (and read, for that matter): the abstract.

What is the abstract of a report or paper? What it isn't is an introduction or guide to the rest of the document: that, unsurprisingly, is what the introduction is for. The goal of an abstract is much simpler: it's intended to persuade the reader to read some or all of the rest of the document. It may be surprising that this is an issue, but in a world in the grip of an information explosion it's clear that readers' attention is a limiting quantity in information processing. As a reader, should you bother to read a paper? Or not? And how do you make this decision?

Many new researchers, when confronted with a paper, start at the front and work forward. More experienced researchers know this is a mistake that leads to reading whole tracts of irrelevance or nonsense. (The former is worse: nonsense at least has occasional entertainment value.) This leads many people (myself included) to adopt a non-linear reading style:

  1. Read the abstract
  2. If still interested, read the conclusion
  3. If still interested, read the introduction
  4. If it's really interesting, read the rest of the paper

(There are several variants, a couple of which are described here and here.) The point here is that we can filter out papers of lesser interest, reserving time and head space for the most interesting. Only if the results grab your attention do you need to spend time discovering the detail of how they were obtained -- rather than doing that work only to discover that you don't care about the results being reported.

The abstract is the key: without a decent description of what the paper is about, the discerning reader will not proceed even to the introduction, and therefore any work contained in the paper will remain unread -- and therefore be largely worthless.

So how does one write a decent abstract? Most experienced scientists have their own technique, and the approach does need to vary from field to field and between paper styles: a review paper is different to a piece of primary research. The approach I've come to rely on for research in computer science and mathematics can be described succinctly as the five-sentence abstract. I've found that five sentences seems to be about optimal, structured as follows:

  • The area of the paper (1 sentence). The problem area to which this paper makes a contribution.
  • The issue the paper addresses (1 sentence). Presumably the area is not yet fully explored, and you've found a problem that needs tackling -- otherwise what's in your paper?
  • What you've done, the results you've obtained (2 sentences). The key contribution of the paper: what you've added to practice and/or knowledge.
  • What this means (1 sentence). Why should anyone care?

Let's do an example, from a paper I've always wanted to write about finding unicorns:

There are lots of interesting animals out there, many of which have horns. No-one has yet reported observing any one-horned horses, however. We describe our research survey of the horses of the West of Ireland. While we found many horses, and many other horned animals, we failed to locate any horned horses. We conclude that further research is required to find unicorns, preferably in an equally pleasant holiday destination.

OK, perhaps not a great example. Let's try another, from a real paper:

In the domain of ubiquitous computing, the ability to identify the occurrence of situations is a core function of being context-aware. Given the uncertain nature of sensor information and inference rules, reasoning techniques that cater for uncertainty hold promise for enhancing the reasoning process. In our work, we apply the Dempster Shafer theory of evidence to infer situation occurrence with the minimal use of training data. We describe a set of evidential operations for sensor mass functions using context quality and evidence accumulation for continuous situation detection. We demonstrate how our approach enables situation inference with uncertain information using a case study based on a published smart home data set.

(Taken from McKeever, Ye, Coyle and Dobson. Using Dempster-Shafer theory of evidence for situation inference. In Proceedings of the 4th European Conference on Smart Sensing and Context (EuroSSC). Volume 5741 of LNCS. Springer-Verlag. Guildford, UK. 2009.) The abstract goes from domain to challenge to approach to significance: having read it, the reader hopefully has a fairly good idea of what the paper contributes to which domain, and why this contribution is significant (in the authors' minds, at least).

Shorter is often better, of course:

Wireless sensor networks are attracting increasing interest but suffer from severe challenges such as power constraints and low data reliability. Sensors are often energy-hungry and cannot operate over the long term, and the data they gather are frequently erroneous in complex ways. The two problems are linked, but existing work typically treats them independently: in this paper we consider both side-by-side, and propose a self-organising solution for model-based data collection that reduces errors and communications in a unified fashion.

(From Fang and Dobson. Unifying sensor fault detection with energy conservation. In Proceedings of the 7th International Workshop on Self-Organising Systems (IWSOS'13). Palma de Mallorca, ES. May 2013.) In this case my student Lei messed with the sentence structure because we wanted to get across the idea that the main problem was the totality of the existing approach to the problem, which we wanted to address as a whole. The abstract still follows the basic structure, and I think is stronger for being shorter.

No rule of writing is hard-and-fast, of course, and so you'll often find great abstracts that adopt a completely different approach. I don't think this matters so much as ensuring that the abstract is fit for purpose: an enticement to read further.

Big, or just rich?

The current focus on "big data" may be obscuring something more interesting: it's often not the sheer size of a dataset that's important.

The idea of extracting insight from large bodies of data promises significant advances in science and commerce. Given a large dataset, "big data" techniques cover a number of possible approaches:

  • Look through the data for recurring patterns (data mining)
  • Present a summary of the data to highlight features (analytics)
  • (Less commonly) Identify automatically from the dataset what's happening in the real world (situation recognition)

There's a wealth of UK government data available, for example. Making it machine-readable means it can be presented in different ways, for example geographically. The real opportunities seem to come from cross-overs between datasets, though, where they can be mined and manipulated to find relationships that might otherwise remain hidden, for example the effects of crime on house prices.
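
As a toy illustration of such a cross-over, here's a sketch in Python using pandas. The districts and figures are entirely invented; real analyses would join the published government datasets:

```python
import pandas as pd

# Two hypothetical datasets keyed by the same district identifier.
# All figures are made up for illustration.
crime = pd.DataFrame({
    "district": ["KY16", "EH1", "G12"],
    "crimes_per_1000": [12.0, 48.5, 31.2],
})
prices = pd.DataFrame({
    "district": ["KY16", "EH1", "G12"],
    "median_price": [285_000, 310_000, 240_000],
})

# The cross-over: a relationship only visible once the datasets are joined.
merged = crime.merge(prices, on="district")
print(merged[["crimes_per_1000", "median_price"]].corr())
```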

Although the size and availability of datasets clearly makes a difference here -- big open data -- we might be confusing two issues. In some circumstances we might be better looking for smaller but richer datasets, and for richer connections between them.

Big data is a strange name to start with: when is data "big"? The only meaningful definition I can think of is "a dataset that's large relative to the current computing and storage capacity being deployed against it" -- which of course means that big data has always been with us, and indeed always will be. It also suggests that data might become less "big" if we become sufficiently interested in it to deploy more computing power to processing it. The alternative term popular in some places, data science, is equally tautologous, as I can't readily name a science that isn't based on data. (This isn't just academic pedantry, by the way: terms matter, if only to distinguish what topics are, and aren't, covered by big data/data science research.)

It's worth reviewing what big data lets us do. Having more data is useful when looking for patterns, since it makes the pattern stand out from the background noise. Those patterns in turn can reveal important processes at work in the world underlying the data, processes whose reach, significance, or even existence may be unsuspected. There may be patterns in the patterns, suggesting correlation or causality in the underlying processes, and these can then be used for prediction: if pattern A almost always precedes pattern B in the dataset, then when I see a pattern A in the future I may infer that there's an instance of B coming. The statistical machine learning techniques that let one do this kind of analysis are powerful, but dumb: it still requires human identification and interpretation of the underlying processes to conclude that A causes B, as opposed to A and B simply occurring together through some acausal correlation, or being related by some third, undetected process. A data-driven analysis won't reliably help you to distinguish between these options without further, non-data-driven insight.
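
Here's a small synthetic illustration of the trap. A hidden process C drives both A and B with different lags, so A "predicts" B almost perfectly even though neither causes the other (all data below is made up):

```python
import random

random.seed(42)

# Synthetic event streams: a hidden process C drives both A (one step
# later) and B (two steps later). A then "predicts" B almost perfectly,
# purely through the shared, undetected cause.
T = 10_000
C = [random.random() < 0.1 for _ in range(T)]
A = [False] * T
B = [False] * T
for t in range(T):
    if C[t]:
        if t + 1 < T:
            A[t + 1] = True
        if t + 2 < T:
            B[t + 2] = True

# A data-driven analysis sees pattern A at time t followed by pattern B
# at t+1 -- indistinguishable from A causing B.
follows = sum(1 for t in range(T - 1) if A[t] and B[t + 1])
print(f"P(B follows A) = {follows / sum(A):.2f}")  # close to 1.0
```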

Are there cases in which less data is better? Our experience with situation recognition certainly suggests that this is the case. When you're trying to relate data to the real world, it's essential to have ground truth, a record of what actually happened. You can then make a prediction about what the data indicates about the real world, and verify this prediction against the known circumstances. Doing this well over a dataset provides some confidence that the technique will work well against other data, where your prediction is all you have. In this case, what matters is not simply the size of the dataset, but its relationship to another dataset recording the actual state of the world: it's the richness that matters, not strictly the size (although having more data to train against is always welcome).
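
As a minimal sketch of that validation step (the situation labels and the trace are entirely illustrative):

```python
# 'predicted' would come from the recognition technique; 'actual' is the
# ground-truth annotation of what really happened in the world.
predicted = ["cooking", "sleeping", "cooking", "away", "cooking"]
actual    = ["cooking", "sleeping", "watching_tv", "away", "cooking"]

# Confidence in the technique comes from how well predictions match
# the ground truth across the annotated dataset.
correct = sum(p == a for p, a in zip(predicted, actual))
print(f"accuracy = {correct / len(actual):.0%}")   # 80% on this toy trace
```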

Moreover, rich connections may help with the more problematic part of data science, the identification of the processes underlying the dataset. While there may be no way to distinguish causality from correlation within a single dataset -- within it the two look indistinguishably alike -- the patterns of data points in one dataset may often be related to patterns and data points in another dataset in which they don't look alike. So the richness provides a translation from one system to another, where the second provides discrimination not available in the first.

I've been struggling to think of an example of this idea, and this is the best I've come up with (and it's not all that good). Suppose we have tracking data for people around an area, and we see that person A repeatedly seems to follow person B around. Is A following B? Stalking them? Or do they live together, or work together (or even just close together)? We can distinguish between these alternatives by having a link from people to their jobs, homes, relationships and the like.
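
A toy sketch of the disambiguation, with a hypothetical relationship table standing in for the second, richer dataset (in practice it might be built from household or employment records):

```python
# Co-movement mined from the tracking data: how often one person was
# observed following another. Counts are illustrative.
co_movement = {("A", "B"): 57}

# A second dataset linking people: the richness that the tracking
# data alone cannot provide.
relationships = {
    ("A", "B"): "colleagues",
}

for (follower, followed), count in co_movement.items():
    link = relationships.get((follower, followed))
    if link:
        print(f"{follower} follows {followed} {count} times: plausibly {link}")
    else:
        print(f"{follower} follows {followed} {count} times: flag for review")
```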

There's a converse concern, which is that poor discrimination can lead to the wrong conclusions being drawn: classifying person A as a potential stalker when he's actually an innocent who happens to follow a similar schedule. An automated analysis of a single dataset risks finding spurious connections, and it's increasingly the case that these false positives (or negatives, for that matter) could have real-world consequences.

Focusing on connections between data has its own dangers, of course, since we already know that we can make very precise classifications of people's actions from relatively small, but richly connected, datasets. Maybe the point here is that focusing exclusively on the size of a dataset masks both the advantages to be had from richer connections with other datasets, and the benefits and risks associated with smaller but better-connected datasets. Looking deeply can be as effective as looking broadly, or more so.

The edge of computer science

Where does mathematics end and computer science begin?

I don't seem to have posted to this blog recently, so let's re-start with a big question: where is the edge of computer science? That is to say, what separates it from maths? How do mathematicians and computer scientists see problems differently, and indeed select differently what constitutes an interesting problem?

This has been on my mind recently because of some work I've been doing with my student Saray on adaptive complex networks. A complex network is one that has statistical regularity in the distribution of the wires, or links, between its nodes. The internet is a complex network where the links obey a power-law distribution: a small number of sites (Yahoo, Google, IBM, ...) have a huge number of links to them, while the majority (this site) have almost none. Complex networks are created naturally by lots of processes, and are useful for describing a whole range of phenomena. (An accessible introduction is Albert-László Barabási. Linked: how everything is connected to everything else, and what it means for business and daily life. 2003.)

An adaptive complex network is one where the way the network is wired changes with time. A good example is a meeting-with-friends network where there is a link between you and those people you meet in a particular timeframe. You might change the people you meet if you discover that one of them is ill, for example, so the friend-meeting network would be re-wired to remove links to your sick friends. If we were to model the spread of an illness through this network, there would be two processes at work: a spreading process that made people who met sick people ill themselves (with some probability); and a re-wiring process that tried to remove links to those friends known to be sick (again perhaps with some probability). Our paper (Saray Shai and Simon Dobson. Complex adaptive coupled networks. Physical Review E 87(4). 2013.) shows how there are unsuspected subtleties in the way spreading processes work on such networks, where common simplifications can actually hide crucial features of the network's behaviour.
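
To give a concrete flavour of the two processes, here's a toy simulation sketch in Python using the networkx library. It is not the model from the paper: the infection probability beta, the re-wiring probability w, and the network itself are all made up for illustration:

```python
import random
import networkx as nx

random.seed(1)

# Toy adaptive network: infection spreads along links with probability
# beta, while susceptible nodes re-wire away from infected neighbours
# with probability w, attaching to a random healthy node instead.
N, beta, w, steps = 500, 0.05, 0.2, 50
g = nx.barabasi_albert_graph(N, 3)       # power-law degree distribution
infected = set(random.sample(list(g.nodes), 5))

for _ in range(steps):
    newly_infected = set()
    for u in g.nodes:
        if u in infected:
            continue
        for v in list(g.neighbors(u)):   # snapshot: we re-wire as we go
            if v not in infected:
                continue
            if random.random() < w:      # the re-wiring process
                g.remove_edge(u, v)
                healthy = [x for x in g.nodes
                           if x != u and x not in infected
                           and not g.has_edge(u, x)]
                if healthy:
                    g.add_edge(u, random.choice(healthy))
            elif random.random() < beta: # the spreading process
                newly_infected.add(u)
    infected |= newly_infected

print(f"{len(infected)} of {N} nodes infected after {steps} steps")
```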

The literature on network science is full of papers analysing such processes. Typically the analysis is both analytic and numerical. That is to say, a mathematical model is developed that describes the state of the network after lots of time has passed (its equilibrium behaviour); and numerical simulation is then performed by creating a large number of networks, running the spreading processes on them, and seeing whether the results obtained match the analytical model. (It was an unexpected mis-match between analytical and numerical results that led us to the main result reported in our paper.) Typically the community finds analytical results more interesting than numerical results, and with good reason: an analytic result provides both a final, closed-form solution that can be used to describe any network with particular statistical properties, without simulation; and it often also provides insight into why a given equilibrium behaviour occurs. These are the sorts of general statements that can lead to profound understanding of wide ranges of phenomena.

There's a sting in the tail of analysis, however, which is this. In order to be able to form an analytic model, the process being run over the network has to be simple enough that the mathematics converges properly. A typical analysis might use a probabilistic re-wiring function, for example, where nodes are re-wired with a fixed probability, or one that varies only slowly. Anything more complex than this defeats analysis, and as a result one never encounters anything other than simple spreading processes in the literature.

As a computer scientist rather than a mathematician I find that unsatisfying, and I think my dissatisfaction may actually define the boundary between computing and mathematics. The boundary is the halting problem -- or, more precisely, sustaining your interest in a problem once you've hit it.

The halting problem is one of the definitive results in computer science, and essentially says that there are some problems for which it's impossible to predict ahead of time whether they'll complete with a solution or simply go on forever. Put another way, there are some problems where the only way to determine the solution is to run a piece of code that computes it, and that code may or may not deliver a solution. Put yet another way, there are problems for which the code that computes the solution is the most concise description available.
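
The standard diagonal argument behind this result can even be sketched in a few lines of code, assuming a hypothetical halts function that the theorem shows cannot exist:

```python
def halts(program, input):
    # Hypothetical perfect halting predicate -- cannot actually exist.
    ...

def paradox(program):
    # If the predicate says this call halts, loop forever; otherwise halt.
    if halts(program, program):
        while True:
            pass

# paradox(paradox) halts if and only if it doesn't halt --
# a contradiction, so no such halts() can be written.
```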

What this has to do with complex systems is the following. When a computer scientist sees a problem, they typically try to abstract it as far as possible. So on encountering a complex network, our first instinct is to build the network and then build the processes running on it as separate descriptions that can be defined independently. That is, we don't limit what kind of functions can hang off each node to define the spreading process: we just allow any function -- any piece of code -- and then run the dynamics of the network with that code defining what happens at each node at each timestep. The immediate consequence of this approach is that we can't say anything a priori about the macroscopic properties of the spreading process, because to do so would run into the fact that there isn't anything general one can say about an arbitrary piece of code. The generality we typically seek precludes our making global statements about behaviour.
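
Here's a sketch of that instinct in code. Each node carries an arbitrary callable defining its update rule; the rules below are purely illustrative, and the point is that nothing general can be said about the second one without executing it:

```python
import random
import networkx as nx

random.seed(1)

def fixed_probability(node, state, neighbours):
    # The kind of well-behaved rule a mathematician can analyse:
    # become active with fixed probability if any neighbour is active.
    return state or (any(neighbours) and random.random() < 0.1)

def arbitrary_rule(node, state, neighbours):
    # An "arbitrary code" rule: activation depends on a value computed
    # from the neighbourhood, defeating closed-form analysis.
    return state or (len(neighbours) > 2 and sum(neighbours) % 3 == 1)

g = nx.erdos_renyi_graph(50, 0.1)
state = {n: n == 0 for n in g.nodes}     # one initially active node
rule = {n: fixed_probability if n % 2 else arbitrary_rule for n in g.nodes}

# The dynamics is just "run whatever code hangs off each node": the only
# way to learn the macroscopic behaviour is to execute it, timestep by
# timestep.
for _ in range(10):
    state = {n: rule[n](n, state[n], [state[m] for m in g.neighbors(n)])
             for n in g.nodes}

print(sum(state.values()), "active nodes")
```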

Mathematicians don't see networks this way, because they want to make precisely the global statements that the general approach precludes -- and so don't allow arbitrary functions into the mix. Instead they use functions that aggregate cleanly, like fixed-probability infection rates, about which one can make global statements. One way to look at this is that well-behaved functions allow one to make global statements about their aggregate behaviour without having to perform any computation per se: they remain within an envelope whose global properties are known. A mathematician who used an ill-behaved function would be unable to perform analysis, and analysis is precisely what they're interested in doing, even though by doing so they exclude a range of possible network behaviours. In fact, it's worse than that: the range of behaviours excluded is infinite, and contains a lot of networks that seem potentially very interesting, for example those whose behaviours depend on some transmitted value, or on one computed from values spread by the process.

So a mathematician's interest in a problem (at least as represented in most of the complex systems literature) is lost at precisely the point where a computer scientist's interest kicks in: where the question is about the behaviour of arbitrary computations. The question this leads to is: which model do real-world networks follow more closely? Are they composed of simple, well-behaved spreading processes? Or do they more resemble arbitrary functions hanging off a network of relationships, whose properties can only be discovered numerically? And what effect does the avoidance of arbitrary computation have on the choice of problems to which scientists apply themselves? Perhaps the way forward here is to try to find the boundary of the complexity of functions that remain analytic when used as part of a network dynamics, to get the best of both worlds: global statements about large categories of networks, without needing numerical simulation of individual cases.

Such a classification would have useful consequences for general computer science as well. A lot of the problems in systems design come from the arbitrariness of code and its interactions, and from the fact that we're uncomfortable restricting that generality a priori because we don't know what the consequences will be for the re-usability and extensibility of the systems being designed. A more nuanced understanding of behavioural envelopes might help.

Studentship available in Newcastle upon Tyne

My colleagues in Newcastle (where I did my first degree) have an MRes/PhD position available in computational behaviour analysis.

Together with colleagues from Agriculture, the School of Computing Science in Newcastle have recently been active in the field of computational behaviour analysis in livestock animals. They have a small grant on which they are currently doing some pilot work related to food safety and animal wellbeing, and this has now given rise to funding for an integrated MRes/PhD studentship to work in this field.

The group wants to work on a system for automatic tracking of farm pigs (indoors) and on modelling their behaviour (as individuals and in the group), with the goal of early detection of changes therein that might be linked to certain diseases. The project involves some very strong partners from, and connections to, agriculture, and access to farms and industry. The topic is extremely relevant for UK industry and has wider societal impact (food safety, animal welfare). They are looking for strong CS candidates with a background in machine learning, signal processing, or computer vision, and an interest in working in a multi-disciplinary team on a challenging and very interesting topic with substantial societal and economic relevance.

The funding covers four years of fees and maintenance. UK home students (and EU citizens who have been resident for at least three years) are eligible for the full funding; EU students essentially for 50%. Throughout the first year the student will be enrolled in an MRes programme with courses and time to get familiar with the topic, which is a fantastic opportunity for students to obtain an interdisciplinary skill-set.

More details can be found here. Application deadline is the 5th April 2013.