What have scientific journals ever done for us? And can we get the benefits without the access issues?
“Open access” is a big thing in scientific publishing these days. The UK research councils, who fund a large fraction of the UK’s academic research, have decided that papers arising from their research have to be made available to any interested reader at no charge. The argument is that publicly-funded research results are a public good, and that other researchers should not be impeded in building on results. Since science progresses by researchers building on each other’s work, there is plenty of justification for this view.
You would think that open access wouldn’t be a problem in these days of personal web pages and Google. However, when publishing a paper in a major journal, the authors typically sign away their copyright to the journal publisher, meaning that they can’t legally make the paper freely available. The publishers in turn lock the papers away, either in dead-tree form (which they then sell to university libraries at exorbitant cost) or behind paywalls requiring individual or institutional subscription. The journals that do this are often the most important and prestigious venues, places where you want your work to appear, and scientists aren’t going to stop publishing in these places any time soon.
To address open access, some journals have started charging open access fees, whereby an author can pay to have their article made open access (i.e., to appear outside the paywall). Of course, anyone funded by a UK research council basically has to pay these fees to be compliant with their funding. Effectively, though, it means that institutions typically pay twice for publication: they pay the open access fee for individual articles, but still need to subscribe to the paid-for journal to get access to other papers. There are also open access journals that charge a publication fee for each accepted paper, but these are still quite new and with some exceptions (most notably PLoS) still fairly low-grade.
These issues got me thinking: what do the journals actually give us? And could we get the benefit using internet technology without the costs?
Historically, journals served as the primary means of academic communication, but clearly that time has passed. Nowadays journals give us three things:
1. An editor and editorial board acting as guardians of the quality of the papers accepted. As a general rule, you never publish in a journal where you don’t recognise any of the names on the editorial board: you want a journal managed by people known in your field.
2. A brand that gives readers confidence that this journal will contain significant work that justifies the time spent reading it.
3. Persistent storage of articles to give confidence that they can be found, referenced, and accessed in the future.
Clearly (2) is a function of (1), in the sense that the brand is built by the quality consistently demonstrated by the editorial board. For a new journal, that brand typically takes time to develop. As to (3), persistent storage isn’t much of a problem these days, but finding a copy of a paper could well be.
In building our no-cost journal, we therefore need to replicate (1) and (3) in order to build (2).
Here’s a possible workflow. We establish the journal’s web site: the St Andrews Journal of Interesting Things, perhaps. Like most journals, this allows prospective authors to submit their manuscripts, which are passed to the editorial board for review.
Academics typically serve on editorial boards for free. They are self-organising, in the sense that the editor-in-chief (EIC) appoints a set of trusted lieutenants consisting of his friends, colleagues, and people well-known in the field of the journal. Doing this well is a major skill, but an individual one, dependent on the selection of a good EIC. The editorial board are assigned incoming papers, which they ask their friends and colleagues to review and provide comments on. Again, the selection of reviewers is critical to the quality of papers, as the reviewers are expected to assess the work presented and to suggest changes (or reject the paper completely). Academics typically review for free, too, so you’ll notice that, for a typical journal, the total cost so far has been running the web server that manages the editorial process.
Papers typically go through one or two rounds of revision before being accepted and published. The problem here is that we need to show readers that the paper has passed through the quality assurance of review. Anyone can put an article on the web, but journals guarantee that the work has been looked at and approved by the authors’ scientific peers. (This doesn’t guarantee that journal-published work is correct, merely that it’s sufficiently convincing to a suitably qualified set of reviewers. There is always a steady stream of corrections, retractions, and withdrawals as flaws are found in work post-publication.) In a paid-for journal, the guarantee comes from printing on paper: you can check whether a paper purporting to appear in a journal actually does so by checking the appropriate volume. This of course implies printing and distribution, which publishers claim is the source of their need for fees. For the St Andrews Journal of Interesting Things we want to avoid this cost.
Actually, this is technically straightforward in a number of ways. The complicated way is to create a machine-readable metadata file containing the paper’s title, authors, abstract, journal reference, associate editor in charge, and maybe some other details. We then bind this file cryptographically to the final (“camera-ready”) manuscript. The cryptography guarantees two things: that the binding was done by the journal editor, and that neither file has been changed since being bound together. Anyone downloading the file bundle can then check that the metadata and manuscript match, and therefore knows that the paper is the one “published”.
(The simple way is to add a header to the manuscript text and then cryptographically sign the resulting file. This is trivially accomplished using a tool like Adobe Acrobat Pro, but is less attractive than the metadata approach because the header isn’t machine-readable, making it harder to index the paper.)
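The binding step of the metadata approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the `manuscript_sha256` field are made up for this example, and the sketch covers the binding alone. The metadata records a SHA-256 digest of the manuscript, so that signing the metadata file with the journal's private key (using a separate public-key tool, which is not shown here) would cover both files at once.

```python
import hashlib
import json


def bind_metadata(manuscript: bytes, details: dict) -> bytes:
    """Return metadata bytes that embed the manuscript's SHA-256 digest.

    `details` holds the title, authors, abstract and so on. The field name
    `manuscript_sha256` is a hypothetical choice for this sketch.
    """
    record = dict(details)
    record["manuscript_sha256"] = hashlib.sha256(manuscript).hexdigest()
    # Sorted keys and fixed separators give a byte-stable encoding, so the
    # same metadata always produces the same bytes to sign.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def binding_holds(metadata: bytes, manuscript: bytes) -> bool:
    """Check that the manuscript is the one the metadata was bound to."""
    record = json.loads(metadata.decode("utf-8"))
    return record.get("manuscript_sha256") == hashlib.sha256(manuscript).hexdigest()
```

A reader who downloads the bundle would first verify the journal's signature on the metadata and then call `binding_holds` on the two files. Changing a single byte of the manuscript makes the check fail.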
There is no cost associated with either of these approaches. We can then give the signed file back to the authors and tell them to place it on any web server that Google will index. This will let anyone searching for the file find it: that’s what search engines do. If we want to be really thorough, we would keep track of where the files are stored, and/or perform regular searches to locate them (easy enough given machine-readable metadata), and maintain a journal web page listing the published papers and linking to them. (Total cost: one web page.) If we want to be even more thorough, we can mint Digital Object Identifiers that resolve through our web server to the paper locations. (Total cost: a small database and a single CGI script on our web server.)
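To give a feel for how small that "small database and a single script" could be, here is a rough sketch using SQLite and a WSGI application in place of a CGI script. The table layout, function names, and identifiers are all assumptions made for illustration. It covers only the local lookup-and-redirect step; real DOIs are issued through a registration agency, which this sketch does not address.

```python
import sqlite3


def open_registry(path=":memory:"):
    """Open (or create) the identifier-to-URL table. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS papers ("
        "identifier TEXT PRIMARY KEY, url TEXT NOT NULL)"
    )
    return conn


def register(conn, identifier, url):
    """Record, or update, where a published paper currently lives."""
    conn.execute(
        "INSERT OR REPLACE INTO papers (identifier, url) VALUES (?, ?)",
        (identifier, url),
    )
    conn.commit()


def resolve(conn, identifier):
    """Return the stored URL for an identifier, or None if it is unknown."""
    row = conn.execute(
        "SELECT url FROM papers WHERE identifier = ?", (identifier,)
    ).fetchone()
    return row[0] if row else None


def make_app(conn):
    """Build a WSGI app that redirects /<identifier> to the paper's location."""
    def app(environ, start_response):
        identifier = environ.get("PATH_INFO", "").lstrip("/")
        url = resolve(conn, identifier)
        if url is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Unknown identifier\n"]
        start_response("302 Found", [("Location", url)])
        return [b""]
    return app
```

When an author moves a paper to a new server, a single `register` call updates the location, and every existing link through the journal's resolver keeps working.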
We’ve now recreated the publication side of the journal industry, essentially for free. This leaves the branding issue. There are two sides to establishing a brand: quality and visibility. The quality of the product, as mentioned above, relies on the selection of editorial and review teams and their willingness to serve at no cost, as is normal in academic publishing. The visibility issue is harder to crack, but could be addressed using the web, by viral marketing and appearances at conferences that editorial board members were attending, by word of mouth through the research community — and even by advertising in the paid-for journals themselves, possibly. One great thing about the web and social media is that word will get out: after that, it’s a matter of the quality of papers accepted and the willingness of authors to contribute.
I’m not actually planning on setting up a new journal. The point is that 21st century research doesn’t need the friction and costs imposed by journals whose main editorial services are provided free by their consumers anyway. We should be able to do away with these costs without sacrificing the quality of material that we read or the reliance we place upon it.