Laureation for Professor Dana Scott

I had the honour (and the great personal pleasure) of inviting the Vice-Chancellor to bestow an honorary degree upon Dana Scott, the inventor of some of the most influential ideas in computer science.

Vice-Chancellor, I have the privilege to present Professor Dana Scott for the degree of Doctor of Science, honoris causa.

Vice-Chancellor, colleagues, friends, ladies and gentlemen:

For millennia, people have performed calculations, sometimes changing the way we live or understand the world. Many of these calculations have involved long, complicated sequences of actions — what we now refer to as algorithms. But it was only in the 1930s that researchers such as Alonzo Church, John von Neumann, Alan Turing, and others formally studied how we perform calculations, which rapidly opened up the mechanisation of such operations and led to what we now know as computer science.

What does it mean to describe a calculation? For Turing, it meant designing an ideal machine whose small set of simple operations could perform computation — an operational view of computing that allows machines to perform tasks previously thought to require humans. But we can also think of computation independent of mechanisation, where mathematics can be applied to studying computation, and a theory of computation becomes available for the study of mathematics, physics, and other disciplines. And when we take this view, we are making use of ideas that owe their modern existence to the work of Dana Scott.

Scott was a PhD student of the logician Alonzo Church, whom I mentioned earlier. Working with the late Christopher Strachey at Oxford, Scott developed a theory of computation that allows calculations to be analysed, studied, and compared. Scott’s insight was to view computation as a steady increase in information. His development of the mathematical structures now known as Scott domains provided a way of precisely describing this progression. They in turn led directly to an approach for formally describing programs and programming languages — the Scott-Strachey approach to denotational semantics — and indirectly both to approaches to proving programs correct, and to the development of lazy functional programming languages that today form a major strand of computer science research: one to which St Andrews is proud to be making an ongoing contribution.

If asked, most computer scientists would agree that denotational semantics forms Scott’s most lasting contribution; they might marvel that, later this year, at the age of 81, he will be delivering a keynote lecture in Vienna at the main international conference on computational logic; and they would probably be able to tell you that he is a recipient of the Turing Award, often referred to as the “Nobel Prize for Computer Science”. However, Scott in fact won the Turing Award, jointly with Michael Rabin, for work on automata theory that predates his work on semantics. In other words, he won the highest accolade his discipline has to offer for work not generally considered to be his most significant. As you might imagine, this is a rather unusual occurrence: in fact, the only other example I can find in the entire history of science is the award of the Nobel Prize to Albert Einstein for work other than his theory of relativity. That’s not bad company to be keeping.

When we think of computers, we often think of their visible manifestations: the internet, mobile phones, aircraft flight control systems, Angry Birds. But no matter how impressive, and how much they continue to change our lives for the better, these systems are possible only because of the foundational intellectual developments that let us reason about proofs, calculations, and computations, as well as simply carrying them out. Vice-Chancellor, the work of Dana Scott grounded the discipline of computer science, not only in a specific piece of theory, but also in an approach and a mindset that changed how we think about computing and, through this, has had a profound influence across the whole of human endeavour. It is in recognition of these seminal contributions to science that I invite you to confer upon Professor Dana Scott the degree of Doctor of Science, honoris causa.

(Thanks to Al Dearle, Steve Linton, Lisa Dow, and Muffy Calder for comments that made this better than the first draft I did.)

Research fellowships available in Dublin

Two post-doctoral positions in smart cities now available at Trinity College Dublin.

Research Fellowships in Autonomic Service-Oriented Computing for Smart Cities

Applications are invited for two Postdoctoral Research Fellowships at Trinity College Dublin’s Distributed Systems Group, to investigate a new service-oriented computing infrastructure providing demand-based composition of software services that interact with a city-wide, dynamic network infrastructure. The project will investigate autonomic adaptation of services and infrastructure, ensuring resilient service provision within an integrated, city-wide system.

Applicants should have a Ph.D. in Computer Science, Computer Engineering, or a closely related discipline, and strong C++/C#/Java development skills. Experience with autonomic computing, service-oriented middleware, and/or smart city technologies is desirable, as are strong mathematical skills.

The project is supported by Science Foundation Ireland under the Principal Investigator programme from 2014 to 2018, and will be conducted in collaboration with Cork Institute of Technology, NUI Maynooth, IBM Smarter Cities Research Centre, Intel Intelligent Cities Lab, EMC2 Research Europe, and Arup. The positions are tenable from September 2014.

Please apply by email to Siobhan.Clarke@scss.tcd.ie quoting “Smart Cities Fellowship” in the subject line. Applications should include a curriculum vitae, in PDF format, giving full details of qualifications and experience, together with the names of two referees. The closing date for applications is 20th June 2014.

Trinity College is an equal opportunities employer.

Call for papers: new journal on self-adaptive systems

Papers are welcome for the EAI Endorsed Transactions on Self-Adaptive Systems.

EAI Transactions on Self-Adaptive Systems

http://eai.eu/transaction/self-adaptive-systems

Editor-in-Chief: Dr. Emil Vassev, Lero, the Irish Software Engineering Research Centre, University of Limerick, Ireland

SCOPE

This journal seeks contributions from leading experts in the research and practice of self-adaptive systems, providing a connection between theory and practice with the ultimate goal of bringing both science and industry closer to the so-called "autonomic culture" and the successful realization of self-adaptive systems. Both theoretical and applied contributions on the relevance and potential of engineering methods, approaches, and tools for self-adaptive systems are particularly welcome. This applies to application areas and technologies such as:

  • adaptable user interfaces
  • adaptable security and privacy
  • autonomic computing
  • dependable computing
  • embedded systems
  • genetic algorithms
  • knowledge representation and reasoning
  • machine learning
  • mobile ad hoc networks
  • mobile and autonomous robots
  • multi-agent systems
  • peer-to-peer applications
  • sensor networks
  • service-oriented architectures
  • ubiquitous computing
The same holds for many research fields that have already investigated some aspects of self-adaptation from their own perspectives, such as fault-tolerant computing, distributed systems, biologically inspired computing, distributed artificial intelligence, integrated management, robotics, knowledge-based systems, machine learning, and control theory.

MANUSCRIPT SUBMISSION

Manuscripts should present original work within the scope of the journal, must be exclusively submitted to this journal, must not have been published before, and must not be under consideration for publication elsewhere. Significantly extended and expanded versions of papers published in conference proceedings can be submitted, accompanied by a detailed description of the additions. Regular papers are limited to a maximum of 20 pages. Prepare and submit your manuscript by following the instructions provided here.

OPEN ACCESS

Authors are not charged any publication fees and their papers will be published online with Open Access. Open Access is a publishing model in which the electronic copy of the article is made freely available with permission for sharing and redistribution. Currently, all articles published in all journals in the EAI Endorsed Transactions series are Open Access under the terms of the Creative Commons Attribution license, and are published in the European Union Digital Library.

EDITORIAL BOARD

  • Christopher Rouff, Johns Hopkins Applied Physics Laboratory, USA
  • Danny Weyns, Linnaeus University, Sweden
  • Franco Zambonelli, UNIMORE, Italy
  • Genaina Rodrigues, University of Brasilia, Brazil
  • Giacomo Cabri, UNIMORE, Italy
  • Imrich Chlamtac, CREATE-NET Research Consortium, University of Trento, Italy
  • James Windsor, ESTEC, European Space Agency, Netherlands
  • Michael O'Neill, UCD, Ireland
  • Mike Hinchey, Lero, the Irish Software Engineering Research Centre, University of Limerick, Ireland
  • Richard Antony, University of Greenwich, UK
  • Simon Dobson, University of St Andrews, UK

Let's teach everyone about big data

Demolishing the straw men of big data.

This post comes about from reading Tim Harford's opinion piece in the Financial Times, in which he offers a critique of "big data": the idea that we can perform all the science we want to simply by collecting large datasets and then letting machine learning and other algorithms loose on them. Harford deploys a whole range of criticisms against this claim, all of which are perfectly valid: sampling bias will render a lot of datasets worthless; correlations will appear without causation; the search goes on without hypotheses to guide it, and so isn't well-founded in falsifiable predictions; and an investigator without a solid background in the science underlying the data will have no way to correct these errors.

The critique is, in other words, damning. The only problem is, that's not what most scientists with an interest in data-intensive research are claiming to do.

Let's consider the biggest data-driven project to date, the Large Hadron Collider's search for the Higgs boson. This project involved building an enormous experiment that generated vast volumes of data, which were then trawled for the signature of Higgs interactions. The challenge was so great that the consortium had to develop new computer architectures, data storage, and triage techniques just to keep up with the avalanche of data.

None of this was, however, a "hypothesis-free" search through the data for correlations. On the contrary, the theory underlying the search for the Higgs made quite definite predictions as to what its signature should look like. Nonetheless, there would have been no way of confirming or refuting those predictions without collecting the data volumes necessary to make the signal stand out from the noise.
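
As a toy illustration of the statistics involved (and nothing like the real LHC analysis pipeline), suppose a predicted effect shifts a measured quantity by a small amount relative to the noise; the effect size and noise level below are invented. The predicted signal only stands out once enough samples are collected:

    # Toy example: a small hypothesised signal buried in noise becomes
    # statistically visible only as the sample size grows.
    import numpy as np

    rng = np.random.default_rng(42)
    signal = 0.05   # hypothesised effect size (invented)
    noise = 1.0     # per-sample noise level (invented)

    for n in [100, 10_000, 1_000_000]:
        samples = signal + noise * rng.standard_normal(n)
        mean = samples.mean()
        stderr = samples.std(ddof=1) / np.sqrt(n)
        print(f"n={n:>9,}: mean={mean:+.4f}, significance={mean / stderr:.1f} sigma")

The significance grows roughly as the square root of the sample size, which is why even a quite definite prediction can demand enormous data volumes to test.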

That's data-intensive research: using new data-driven techniques to confirm or refute hypotheses about the world. It gives us another suite of techniques to deploy, changing both the way we do science and the science that we do. It doesn't replace the other ways of doing science, any more than the introduction of any other technology necessarily invalidates hat came before. Microscopes did not remove the need for, or value of, searching for or classifying new species: they just provided a new, complementary approach to both.

That's not to say that all the big data propositions are equally appropriate, and I'm certainly with Harford in the view that approaches like Google Flu are deeply and fundamentally flawed, over-hyped attempts to grab the limelight. Where he and I diverge is that Harford is worried that all data-driven research falls into this category, and that's clearly not true. He may be right that a lot of big data research is a corporate plot to re-direct science, but he's wrong to worry that all projects working with big data are similarly driven.

I've argued before that "data scientist" is a nonsense term, and I still think so. Data-driven research is just research, and needs the same skills of understanding and critical thinking. The fact that some companies and others with agendas are hijacking the term worries me a little, but in reality is no more significant than the New Age movement's hijacking of terms like "energy" and "quantum" -- and one doesn't stop doing physics because of that.

In fact, I think Harford's critique is a valuable and significant contribution to the debate precisely because it highlights the need for understanding beyond the data: it's essentially a call for scientists to only use data-driven techniques in the service of science, not as a replacement for it. An argument, in other words, for a broadly-based education in data-driven techniques for all scientists, and indeed all researchers, since the techniques are equally (if not more) applicable to the social sciences and humanities. The new techniques open up new areas, and we have to understand their strengths and limitations, and use them to bring our subjects forwards -- not simply step away because we're afraid of their potential for misuse.

UPDATE 7Apr2014: An opinion piece in the New York Times agrees: "big data can work well as an adjunct to scientific inquiry but rarely succeeds as a wholesale replacement." The number of statistical land mines is enormous, but the right approach is to be aware of them and make the general research community aware too, so we can use the data properly and to best effect.

It's the tooling, not the terabytes

With all the hype about big data it's sometimes hard to realise that it's about more than just data. In fact, its real interest doesn't come from big-ness at all.

The term big data is deceptively hard to parse. It's a relative term, for a start: when you get down to it, all it really means is data that's large relative to the available storage and processing capacity. From that perspective, big data has always been with us -- and always will be. It's also curiously technology-specific for something that's garnering such broad interest. A volume of data may be "big" for one platform (a laptop, for example) and not for others (a computing cluster with a large network-attached store).

Getting away from the data, what researchers typically mean by big data is a set of computational techniques that can be applied broadly to data somewhat independent of its origins and subject. I'm not a fan of "data science", but the term does at least focus on techniques and not on data size alone. This is much more interesting, and poses a set of challenges to disciplines -- and to computer science -- to identify and expand the set of techniques and tools researchers can make use of.

What often frightens people about big data (or makes it an attractive career niche, depending on your point of view) is that there is a step-change in how you interact with data beyond a certain size -- and it's this change in tooling requirements that I think is a more significant indicator of a big-data problem than simply the data size.

Suppose you collect a dataset consisting of a few hundred samples, each with maybe ten variables -- something that could come from a small survey, for example. This size of data can easily be turned into a spreadsheet model, and a large class of researchers have become completely comfortable with building such models. (This didn't used to be the case: I had a summer job in the late 1980s building spreadsheet models for a group that didn't have the necessary experience. Research skills evolve.) Even a survey with thousands of rows would be possible.

But what if the survey has several million rows? -- for example because it came from an online survey, or because it was sensed automatically in some way. No spreadsheet program can ingest that much data.

A few million rows does not constitute what many people would regard as "big data": it'll be at most several megabytes. But that's not the point: rather, the point is that, in order to deal with it, the researchers have to change tools -- and change them quite radically. Rather than using a spreadsheet, they have to become programmers; not just that, they have to become programmers who are familiar with languages like Python (to get the libraries), and with Hadoop and cloud computing (to be able to scale out their solutions). They could employ programmers, of course, but that removes them from the immediate and intimate contact with their data that many researchers find extremely valuable, and that personal computers have given us. To many people, hiring a programmer and running calculations in the cloud is suspiciously like a return to the mainframe computing we thought was rendered obsolete decades ago.
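
To make the change of tooling concrete, here is a minimal sketch of the kind of program that replaces the spreadsheet (the file name and column names are invented for illustration): a per-group average computed by streaming a multi-million-row CSV one record at a time, so the data never has to fit into a spreadsheet, or even into memory:

    # Stream a large CSV and compute a grouped mean without ever
    # loading the whole dataset at once.
    import csv
    from collections import defaultdict

    totals = defaultdict(float)
    counts = defaultdict(int)

    with open("survey.csv", newline="") as f:        # hypothetical file
        for row in csv.DictReader(f):
            group = row["region"]                    # hypothetical columns
            totals[group] += float(row["response"])
            counts[group] += 1

    for group in sorted(totals):
        print(group, totals[group] / counts[group])

A dozen lines, but a dozen lines in a different world: a text editor, a language, and a command line rather than a familiar grid of cells.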

It's not the terabytes, it's the tooling. Big data starts when you cross the threshold from your familiar world of interactive tools into a more specialised programmer's world.

This transition is becoming common in the humanities and social sciences, as well as in science and medicine, and to my mind it's the major challenge of big data. The basic problem is simple: changing tools takes time, expertise, and mental effort, all of which are taken away from a researcher's actual research interests. A further disincentive is that the effort may not be rewarded: after all, if this really is research, one is often not sure whether there is actually anything valuable on the other side of a data analysis. In fiercely competitive fields like medicine, this feels like a lot of risk for an uncertain reward. There's evidence to suggest -- and I can't prove this contention -- that people are steering clear of experiments they know they should do, because they know they'll generate data volumes they're uncomfortable with.

This, then, is where the big data challenge actually lies: minimising the cost of the transition from tools that are familiar to tools that are effective on the larger data volumes now becoming commonplace. This is a programming and user interface challenge: to make complex infrastructure appear easy and straightforward to people who want to accomplish tasks on it. A large challenge, but not an unprecedented one: I'm writing this just after the 25th birthday of the World Wide Web, which took a complex infrastructure (the internet) and made it usable by everyone. Let's just not get too hung up on the idea that data needs to be really big to be interesting in this new data-driven world.

No data scientists

"Data science" and "data scientist" are not good terms -- for anything.

The recent revolution in our ability to monitor more and more human (and indeed non-human) activity has spawned a whole new field of study. Known variously as "big data", "data science", and "digital humanities", the idea is that -- by studying the data collected -- we can perform feats of prediction and customisation that defeat other approaches.

This is not all hype. There's no doubt that deriving algorithms from data -- known as "data-driven design" -- can often work better than a priori design. The best-known example of this is Google Translate, whose translation approach is driven almost entirely by applying statistical learning to a large corpus of documents for which exact translations exist across languages, and using the correlations found as the basis for translation. (The documents in question were actually the core agreements governing the operation of the EU, the acquis communautaire, which explains why Google Translate works best on bureaucratese and worst on poetry.) It turns out that data-driven learning works better in most cases than grammar-directed parsing.

The data-driven approach rests on several pillars. Chief among them is applied machine learning as mentioned above, allowing algorithms to look through bodies of data and learn the correlations that exist between events. (We use similar techniques to analyse sensor data and perform situation recognition. See Ye, Dobson and McKeever. Situation identification techniques in pervasive computing: a review. Pervasive and Mobile Computing 8(1), pp. 36--66. 2012.) Various other statistical techniques can also be applied, ranging up from simple mean and variance calculations. One can usually augment the basic techniques using more structural methods: if you know the structure of the links in a web site, for example, you can learn more about how people navigate because you know that some routes are more probable (by clicking hyperlinks) than others. The results of these analyses then need to be presented as analytics for consumption by managers and decision-makers, and distilling large volumes of information into visually compelling and comprehensible form is a skill in itself.

Going a stage further, one may conduct experiments directly. If you see people searching for the same term, present the results to different people in different ways and see how that influences the links they click. Google again are at the forefront.

The excitement of these techniques has spilled over into the wider science and humanities landscapes. Just within St Andrews this week I've talked about projects for analysing DNA data using machine learning, improving clinical trials, building a concept map for two centuries-worth of literature pertaining to Edinburgh, detecting intruders in computer systems, and mapping the births, deaths, and marriages of Scotland from parish records -- and it's only Wednesday. All these activities are inherently data-driven: they only become possible because we can ingest and analyse large data volumes at high-enough speeds.

However, none of this implies that there is a subject called "data science", and I'm starting to worry that a false premise is being established.

The term "data science" is a tricky one to parse. How many sciences do you know that aren't based in data? "Data-driven" is the same. That's not to say that they're meaningless, exactly: they identify a sub-set of techniques that are qualitatively different to the more theory-driven approaches, and even differentiate themselves from experimentally-driven approaches by their attempt to correlate across datasets rather than being based on a single homogeneous sampling (however large). But that's a nuanced reading, and a general reader would be forgiven for believing that "data science" was somehow a separate discipline from "ordinary" science, instead of it denoting a set of computational techniques with general applicability across the range of (non-)traditional sciences.

But it's "data scientist" that really troubles me. This seems to suggest that one can find scientific meaning in the data and the data alone. It seems to suggest that there's a short-cut to scientific insight through the data (and by implication, computer science) rather than through traditional scientific training. And this simply isn't true.

The problem is the age-old difference between correlation and causality. Correlation is when things happen together: you leave a cup of tea and a biscuit on a table for long enough, the tea goes cold and the biscuit gets stale. Causality is when one thing happening triggers another thing happening: the tea got cold, and that made the biscuit become stale. Mistaking one for the other leads to all sorts of interesting possibilities: if we put the tea in an insulated cup, the biscuit will stay fresh longer.

Now, the final tea-and-biscuit example is clearly meaningless, but ask yourself this: how did you recognise it as meaningless? Because you understand that the two effects are happening independently because of the passage of time, not because of each other: they are correlated but not causative. You understand this because you have insight into the processes of the world.
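
The point can be made in a few lines of code. In this sketch (the cooling and staling curves are invented), two quantities each change with time, entirely independently of one another, and yet correlate almost perfectly; nothing in the data itself distinguishes this from causation:

    # Two independent, time-driven processes: tea cooling and a biscuit
    # going stale. Each depends only on elapsed time, not on the other.
    import numpy as np

    rng = np.random.default_rng(1)
    minutes = np.arange(120)

    tea_temperature = 80 * np.exp(-minutes / 30) + rng.normal(0, 1, 120)
    biscuit_moisture = 20 * np.exp(-minutes / 45) + rng.normal(0, 0.5, 120)

    r = np.corrcoef(tea_temperature, biscuit_moisture)[0, 1]
    print(f"correlation: {r:.3f}")   # close to +1, with no causal link at all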

And it's here that the problems start for data scientists. In order to interpret the machine learning, statistics, analytics, or other results, you have to have an understanding of the underlying processes. It doesn't happen in the data at all. That's fine for tea and biscuits, and probably also fine for sales of consumer goods on mass-market web sites, where we have a good intuitive understanding of the processes involved, but it will drop off rapidly as we approach areas that are more complex, more noisy, less intuitive, and less well-understood. How can you differentiate correlation from causality if you don't understand what's possible in the underlying domain? How, in fact, can you determine anything from the data in and of itself?

This suggests to me that data scientist isn't a job description, or even an aspiration: it's a misnomer that should really be read as "a trained scientist working with lots of rich empirical data". Not as sexy, but probably more useful and less likely to disappoint everyone involved.

No-cost journals

What have scientific journals ever done for us? And can we get the benefits without the access issues?

"Open access" is a big thing in scientific publishing these days. The UK research councils, who fund a large fraction of the UK's academic research, have decided that papers arising from their research have to made available to any interested reader at no charge. The argument is that publicly-funded research results are a public good, and that other researchers should not be impeded in building on results. Since science progresses by researchers building on each others' work, there is plenty of justification for this view.

You would think that open access wouldn't be a problem in these days of personal web pages and Google. However, when publishing a paper in a major journal, the authors typically sign away their copyright to the journal publisher, meaning that they can't legally make the paper freely available. The publishers in turn lock the papers away, either in dead-tree form (which they then sell to university libraries at exorbitant cost) or behind paywalls requiring individual or institutional subscription. The journals who do this are often the most important and prestigious venues, places where you want your work to appear, and scientists aren't going to stop publishing in these places any time soon.

To address open access, some journals have started charging open access fees, whereby an author can pay to have their article made open access (i.e., to appear outside the paywall). Of course, anyone funded by a UK research council basically has to pay these fees to be compliant with their funding. Effectively, though, it means that institutions typically pay twice for publication: they pay the open access fee for individual articles, but still need to subscribe to the paid-for journal to get access to other papers. There are also open access journals that charge a publication fee for each accepted paper, but these are still quite new and, with some exceptions (most notably PLoS), fairly low-grade.

These issues got me thinking: what do the journals actually give us? And could we get the benefit using internet technology without the costs?

Historically journals served as the primary means of academic communication, but clearly that time has passed. Nowadays journals give us three things:

  1. An editor and editorial board acting as guardians of the quality of the papers accepted. As a general rule, you never publish in a journal where you don't recognise any of the names on the editorial board: you want a journal managed by people known in your field.
  2. A brand that gives readers confidence that this journal will contain significant work that justifies the time spent reading it.
  3. Persistent storage of articles to give confidence that they can be found, referenced, and accessed in the future.
Clearly (2) is a function of (1), in the sense that the brand is built by the quality consistently demonstrated by the editorial board; it typically takes time to develop for a new journal. As to (3), persistent storage isn't much of a problem these days, but finding a copy of a paper could well be.

In building our no-cost journal, we therefore need to replicate (1) and (3) in order to build (2).

Here's a possible workflow. We establish the journal's web site: the St Andrews Journal of Interesting Things, perhaps. Like most journals, this allows prospective authors to submit their manuscripts, which are passed to the editorial board for review.

Academics typically serve on editorial boards for free. They are self-organising, in the sense that the editor-in-chief (EIC) appoints a set of trusted lieutenants consisting of their friends, colleagues, and people well-known in the field of the journal. Doing this well is a major skill -- but an individual one, dependent on the selection of a good EIC. The editorial board are assigned incoming papers, which they ask their friends and colleagues to review and provide comments on. Again, the selection of reviewers is critical to the quality of papers, as the reviewers are expected to assess the work presented and to suggest changes (or reject the paper completely). Academics typically review for free, too, so you'll notice that, for a typical journal, the total cost so far has been running the web server that manages the editorial process.

Papers typically go around one or two rounds of revision before being accepted and published. The problem here is that we need to show readers that the paper has passed through the quality assurance of review. Anyone can put an article on the web, but journals guarantee that the work has been looked at and approved by the authors' scientific peers. (This doesn't guarantee that journal-published work is correct, merely that it's sufficiently convincing to a suitably qualified set of reviewers. There is always a steady stream of corrections, retractions, and withdrawals as flaws are found in work post-publication.) In a paid-for journal, the guarantee comes from printing on paper: you can check whether a paper purporting to appear in a journal actually does so by checking the appropriate volume. This of course implies printing and distribution, which publishers claim is the source of their need for fees. For the St Andrews Journal of Interesting Things we want to avoid this cost.

Actually, this is technically straightforward in a number of ways. The complicated way is to create a machine-readable metadata file containing the paper's title, authors, abstract, journal reference, associate editor in charge, and maybe some other details. We then bind this file cryptographically to the final ("camera-ready") manuscript. The cryptography guarantees two things: that the binding was done by the journal editor, and that neither file has been changed since being bound together. Anyone downloading the file bundle can then check that the metadata and manuscript match, and therefore knows that the paper is the one "published".
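
By way of illustration, here is a minimal sketch of that binding step in Python, using the third-party cryptography package and an Ed25519 signature (my choices for the sketch, not a prescription; the metadata fields and file name are invented):

    # Bind machine-readable metadata to a manuscript by signing a hash
    # of the two together with the journal's private key.
    import hashlib
    import json

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    metadata = {
        "title": "An Interesting Result",
        "authors": ["A. Author"],
        "journal": "St Andrews Journal of Interesting Things",
        "volume": 1,
    }

    def bundle_digest(metadata, manuscript):
        # Hash the canonical form of the metadata together with the manuscript.
        h = hashlib.sha256()
        h.update(json.dumps(metadata, sort_keys=True).encode())
        h.update(manuscript)
        return h.digest()

    journal_key = Ed25519PrivateKey.generate()   # held by the journal editor
    manuscript = open("paper.pdf", "rb").read()  # hypothetical camera-ready file

    signature = journal_key.sign(bundle_digest(metadata, manuscript))

    # Anyone with the journal's public key can check the bundle; verify()
    # raises InvalidSignature if either file has been changed.
    journal_key.public_key().verify(signature, bundle_digest(metadata, manuscript))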

(The simple way is to add a header to the manuscript text and then cryptographically sign the resulting file. This is trivially accomplished using a tool like Adobe Acrobat Pro, but is less attractive than the metadata approach because the header isn't machine-readable, making it harder to index the paper.)

There is no cost associated with either of these approaches. We can then give the signed file back to the authors and tell them to place it on any web server that Google will index. This will let anyone searching for the file find it: that's what search engines do. If we want to be thorough, we can keep track of where the files are stored, and/or perform regular searches to locate them (easy enough given machine-readable metadata), and maintain a journal web page listing the published papers and linking to them. (Total cost: one web page.) If we want to be really thorough we can mint Digital Object Identifiers that resolve through our web server to the paper locations. (Total cost: a small database and a single CGI script on our web server.)
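
That CGI script could be as small as the following sketch (the database schema, file names, and query parameter are all assumptions): look the identifier up in the database and redirect to wherever the signed manuscript currently lives.

    #!/usr/bin/env python
    # Resolve a journal identifier to the paper's current location.
    import os
    import sqlite3
    from urllib.parse import parse_qs

    query = parse_qs(os.environ.get("QUERY_STRING", ""))
    doi = query.get("doi", [""])[0]

    db = sqlite3.connect("papers.db")   # hypothetical database of locations
    row = db.execute("SELECT url FROM papers WHERE doi = ?", (doi,)).fetchone()

    if row:
        print("Status: 302 Found")      # redirect to the hosted manuscript
        print("Location: " + row[0])
        print()
    else:
        print("Status: 404 Not Found")
        print("Content-Type: text/plain")
        print()
        print("Unknown identifier")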

We've now recreated the publication side of the journal industry, essentially for free. This leaves the branding issue. There are two sides to establishing a brand: quality and visibility. The quality of the product, as mentioned above, relies on the selection of editorial and review teams and their willingness to serve at no cost, as is normal in academic publishing. The visibility issue is harder to crack, but could be addressed using the web, by viral marketing, by appearances at conferences that editorial board members attend, by word of mouth through the research community -- and perhaps even by advertising in the paid-for journals themselves. One great thing about the web and social media is that word will get out: after that, it's a matter of the quality of papers accepted and the willingness of authors to contribute.

I'm not actually planning on setting up a new journal. The point is that 21st century research doesn't need the friction and costs imposed by journals whose main editorial services are provided free by their consumers anyway. We should be able to do away with these costs without sacrificing the quality of material that we read or the reliance we place upon it.

First SleepySketch release

Happy 2014! We're particularly happy to be making the first release of the SleepySketch library for writing low-power Arduino sketches.

SleepySketch changes the way you write Arduino sketches by letting the library, rather than the main body of the sketch, decide when to run code. The sketch stays asleep as much as possible, with the Arduino placed into a low-power state to preserve battery.

This is a first release of SleepySketch, for comments. It provides a sketch framework, a basic sleep manager, and an example "blinkenlights" demonstration to show how the system fits together. Future releases will provide more flexible sleep management and support for component-level power management for common components like Xbee radios.

You can download SleepySketch v. 0.1 from here.

Funded PhD positions available

The School of Computer Science at the University of St Andrews has around eight fully-funded PhD positions available. I'd welcome applicants interested in sensor networks, complex systems, and data science.

We welcome students from a wide range of countries, our only major requirements being that you're excited by the idea of research and are able to conduct a complex programme within a small, friendly, and supportive environment.

In my case, I'm interested in hearing from potential students with interests in the following areas:

  • Sensor networks, especially deploying sensors into the environment;
  • Complex system modelling, trying to model phenomena that operate on a range of scales; and
  • Data science, particularly how we collect, categorise, and work with large scientific datasets.
An early conversation by Skype or email could be followed by a formal application, the details of which are available here.