The Psychopath Test: A Journey Through the Madness Industry
3/5. Finished 02 April 2014.
(Originally published on Goodreads.)
I Spend Therefore I Am: How Economics Has Changed the Way We Think and Feel
A fantastic exploration of the purpose and dangers of economics and its effects on society.
(Full disclosure: like Philip Roscoe, I work at the University of St Andrews, and we know each other slightly.)
Roscoe's main observation is of a fallacy: that economics is a descriptive endeavour. Instead, he argues that it becomes normative: it causes people to enact the behaviour that it thinks it simply describes. Introducing an economic model into a situation causes those concerned to act in accordance with the model, rather than (as we often think) the model simply exposing behaviours and motivations that were there already. Roscoe supports this hypothesis with a fabulous range of examples, from Norwegian fishermen to prostitution and internet dating.
In many ways this argument is reminiscent of that made by Adam Curtis in The Trap. Mathematical and scientific models are necessarily abstractions of the agents and phenomena they describe, but when these models are used to design systems, we can find that – rather than the systems failing because the models are incomplete – the people involved respond to the systems' incentives by becoming more like the simplified abstractions. What game theory is for Curtis, economics is for Roscoe: simplifying assumptions lead to a reduction in human richness and behaviour through the power of incentives.
One can read this book as a call to arms to reclaim economics (and society) by embracing these characteristics about systems and people, and to design systems with this power in mind: systems that enrich and encourage people rather than simply following the dictates of money and its tendency to further enrich the already wealthy at the expense of the poor. That would be a noble outcome for what is a fascinating piece of work.
5/5. Finished 29 March 2014.
(Originally published on Goodreads.)
Writing on the Wall: Social Media - The First 2,000 Years
A tour through the history of media, beginning with Cicero's speeches and letters and continuing through to the familiar Facebook and Twitter.
The author asks a simple question: are social media, and the questions they raise about people's interaction styles, really new phenomena? He asked similar questions about the internet as a whole in The Victorian Internet, his (excellent) history of the telegraph system, and the answer he arrives at here is similar. Unlikely though it may seem at first glance, the majority of the history of media has actually been of social media, in the sense of information flowing informally along the relationships between individuals. The recent prominence of newspapers, radio, and television has blinded us to the fact that these broadcast media are actually historical anomalies, expressing a centralising and one-way tendency that is singularly unusual in the annals of human communication.
For me, the most interesting observation came in the epilogue. Through fretting about the impact of social media, families sometimes adopt strategies like "Unplugged Sunday" where internet-connected devices are banned in favour of more communal pursuits – and these pursuits may involve watching television together, making use of a technology that was previously targeted as the destroyer of family time. (The same might be said of reading novels, targeted in their turn in an earlier age.) Yesterday's radical, disruptive technology becomes today's comfort blanket sanctified by time and familiarity with startling regularity.
4/5. Finished 22 March 2014.
(Originally published on Goodreads.)
What Has Nature Ever Done for Us?: How Money Really Does Grow on Trees
An excellent discussion of ecology and ecological services. Essentially the author is putting the case for measuring the value associated with various elements of nature and their interactions, as a way to make more compelling arguments for conservation and environmental protection. This is an approach I agree with very strongly: while the emotional arguments are of course very strong, backing them up with numbers and prices can only be helpful.
The author doesn't make the mistake of so many books of this kind, of painting a picture of ecological damage that's irredeemable in any realistic sense. One can argue whether or not this is actually the case, but it's certainly true that encouraging a state of learned helplessness amongst the citizens of the developed world isn't going to be helpful.
The book covers a huge range, and everyone will find a new perspective on some familiar aspect of nature, from the evolution of pollinators to the demise of the oyster beds off the New England coast and their possible effects on the behaviours of hurricanes.
In terms of economics, the author makes a couple of points, one familiar and one less so. The point of including the "externalities" of natural services into prices and company accounts is still strong despite its familiarity: the problem remains coming up with good pricing structures. The less familiar point, though, is that companies treat the services they receive from nature as dividends that are renewed rather than as capital being spent, although many ecosystems have now passed the point of un-managed self-recovery, and so their degradation should be costed in. Having long-term investors think like this would, it is argued, have a significant effect on company behaviour in encouraging them to behave more sustainably. While a lot of the degradation is coming from the other end of the economic spectrum -- subsistence farmers trying to make a living in competition with more efficient large-scale actors -- there's still a lot to be said for this approach.
4/5. Finished 17 March 2014.
(Originally published on Goodreads.)
With all the hype about big data it's sometimes hard to realise that it's about more than just data. In fact, its real interest doesn't come from big-ness at all.
The term big data is deceptively hard to parse. It's a relative term, for a start: when you get down to it, all it really means is data that's large relative to the available storage and processing capacity. From that perspective, big data has always been with us -- and always will be. It's also curiously technology-specific for something that's garnering such broad interest. A volume of data may be "big" for one platform (a laptop, for example) and not for others (a computing cluster with a large network-attached store).
Getting away from the data, what researchers typically mean by big data is a set of computational techniques that can be applied broadly to data somewhat independent of its origins and subject. I'm not a fan of "data science", but the term does at least focus on techniques and not on data size alone. This is much more interesting, and poses a set of challenges to disciplines -- and to computer science -- to identify and expand the set of techniques and tools researchers can make use of.
What often frightens people about big data (or makes it an attractive career niche, depending on your point of view) is that there is a step-change in how you interact with data beyond a certain size -- and it's this change in tooling requirements that I think is a more significant indicator of a big-data problem than simply the data size.
Suppose you collect a dataset consisting of a few hundred samples, each with maybe ten variables -- something that could come from a small survey, for example. This size of data can easily be turned into a spreadsheet model, and a large class of researchers have become completely comfortable with building such models. (This didn't use to be the case: I had a summer job in the late 1980s building spreadsheet models for a group that didn't have the necessary experience. Research skills evolve.) Even a survey with thousands of rows would be possible.
But what if the survey has several million rows -- for example because it was run online, or because the data was sensed automatically in some way? No spreadsheet program can ingest that much data.
A few million rows does not constitute what many people would regard as "big data": it'll be at most a few hundred megabytes. But that's not the point: rather, the point is that, in order to deal with it, the researchers have to change tools -- and change them quite radically. Rather than using a spreadsheet, they have to become programmers; not just that, they have to become programmers who are familiar with languages like Python (to get the libraries), and with platforms like Hadoop and cloud computing (to be able to scale out the solutions). They could employ programmers, of course, but that removes them from the immediate and intimate contact with their data that many researchers find extremely valuable, and that personal computers have given us. To many people, hiring a programmer and running calculations in the cloud is suspiciously like a return to the mainframe computing we thought was rendered obsolete decades ago.
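To make the change of tooling concrete, here is a minimal sketch of what replaces the spreadsheet once the survey no longer fits in one: a short Python program, using only the standard library, that reads the data one row at a time and keeps running totals instead of loading everything at once. The column names, the sample data, and the function names column_means and column_means_from_csv are all assumptions invented for illustration, not taken from any real survey.

```python
import csv
import io
from collections import defaultdict


def column_means(rows):
    """Compute the mean of every column in one pass over an iterable of dict rows.

    Only running totals and counts are held in memory, so the number of
    rows can be far larger than a spreadsheet could open.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        for name, value in row.items():
            if value is None or value == "":
                continue  # treat empty cells as missing answers
            totals[name] += float(value)
            counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


def column_means_from_csv(handle):
    """Stream a CSV file-like object through column_means, row by row."""
    return column_means(csv.DictReader(handle))


# A tiny in-memory stand-in for a multi-million-row survey file
# (hypothetical columns "age" and "score"; one score is missing).
sample = io.StringIO("age,score\n30,4\n40,\n50,2\n")
means = column_means_from_csv(sample)
print(means)
```

For a real file, one would pass an open file handle (for example from open("survey.csv", newline="")) instead of the in-memory sample; the path here is hypothetical. Nothing in this is hard for a programmer, which is rather the point: it is a different kind of skill from building a spreadsheet model.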
It's not the terabytes, it's the tooling. Big data starts when you cross the threshold from your familiar world of interactive tools and into a more specialised, programmer's world.
This transition is becoming common in humanities and social sciences, as well as in science and medicine, and to my mind it's the major challenge of big data. The basic problem is simple: changing tools takes time, expertise, and mental effort that are taken away from a researcher's actual research interest. A further disincentive is that the effort may not be rewarded: after all, if this really is research, one is often not sure whether there actually is anything valuable on the other side of a data analysis. In fields where competition is fierce, like medicine, this feels like a lot of risk for an uncertain reward. There's anecdotal evidence to suggest -- though I can't prove this contention -- that people are steering clear of doing the experiments they know they should do because they know they'll generate data volumes they're uncomfortable with.
This, then, is where the big data challenge actually is: minimising the cost of transition from tools that are familiar to tools that are effective on the larger data volumes that are now becoming commonplace. This is a programming and user interface challenge, to make the complex infrastructure appear easy and straightforward to people who want to accomplish tasks on it. A large challenge, but not an unprecedented one: I'm writing this just after the 25th birthday of the World Wide Web, which took a complex infrastructure (the internet) and made it usable by everyone. Let's just not get too hung up on the idea that data needs to be really big to be interesting in this new data-driven world.