It is now old news that contributions to Wikipedia, in the form of new articles and edits, have slowed dramatically since the end of 2006. The change was so abrupt that it could (should) not be interpreted as a ‘saturation of knowledge’ but is much more likely to come from the mundane – contributors had moved on to other forms of social interaction that can consume as many hours as one is willing to give (yes, I’m probably talking about Twitter, even though Twitter’s massive growth started two years later).
So here is what the Wikipedia slow-down looks like, measured in characters per month (the reason for using characters instead of articles or words will become apparent). I have also included the distribution of sizes for Wikipedia article content, by character.
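To make the measurement concrete, here is a minimal sketch of how characters-per-month could be tallied from revision records. The `revisions` data and the record format (timestamp, characters added) are hypothetical stand-ins, not the actual dataset or tooling used for the plots.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical revision records: (ISO timestamp, characters added by that edit).
revisions = [
    ("2006-11-03T12:00:00", 1200),
    ("2006-11-21T09:30:00", 450),
    ("2006-12-05T15:45:00", 300),
    ("2007-01-10T08:20:00", 150),
]

def characters_per_month(revs):
    """Aggregate contributed characters into year-month buckets."""
    totals = defaultdict(int)
    for ts, chars in revs:
        month = datetime.fromisoformat(ts).strftime("%Y-%m")
        totals[month] += chars
    return dict(totals)

print(characters_per_month(revisions))
# {'2006-11': 1650, '2006-12': 300, '2007-01': 150}
```

Plotting those monthly totals over time is what produces the slow-down curve described above.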
Further evidence that the slow-down is related to the amount of time humans are willing to spend on the endeavour comes from the parallel (but not mutually exclusive) slow-down in the number of edits. What this means for Wikipedia is relatively simple:
It is unlikely that Wikipedia will be able to keep up with the vast human knowledge enterprise if it becomes stable or shrinks further.
So the question is: Should we be concerned about the slow-down in a repository as important as Wikipedia? And can we learn from it when forecasting for Twitter (and other crowd-sourced streams of knowledge), or perhaps more importantly, the evidence base of the biomedical literature as a whole?
So let’s now look at another very large repository of well-linked knowledge, called PubMed. PubMed is a reasonable reflection of modern biomedical knowledge as a whole, but only around 10% of that knowledge is actually fully accessible for free (via PubMed Central, for example). I used an even smaller subset (the Open Access subset of PubMed Central) to estimate the median size of PubMed articles, using the highest peak you can see (the other peaks come from articles where only the PDF is available, and are therefore not a true reflection of the length).
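The peak-picking step above amounts to binning article lengths into a histogram and reading off the tallest bin. A minimal sketch, with made-up lengths and an assumed 5,000-character bin width (neither reflects the actual analysis):

```python
import math

# Hypothetical article lengths in characters (stand-in for the PMC Open Access subset).
lengths = [1200, 18000, 21000, 19500, 22000, 400, 20500, 600, 19800, 23000]

def modal_peak(values, bin_width=5000):
    """Bin values into a histogram and return the centre of the tallest bin."""
    bins = {}
    for v in values:
        b = math.floor(v / bin_width)
        bins[b] = bins.get(b, 0) + 1
    tallest = max(bins, key=bins.get)
    return tallest * bin_width + bin_width / 2

print(modal_peak(lengths))
# 22500.0
```

Taking the tallest peak rather than a plain median is what lets the short PDF-only stubs be ignored: they pile up in their own low-length bins and never dominate the histogram.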
So which of Twitter (a stream of mostly redundant information, with super-linear growth) or Wikipedia (a curated repository with very little redundant information, and slowing growth) is more likely to resemble a fast-forwarded version of the biomedical literature? Is the biomedical literature a stream or a repository? Does it contain a lot of redundant information or a little? Is it likely to stabilise in the future once we deal with the issue of information overload? Or does the production of biomedical literature operate under a completely separate set of principles from other modern streams of knowledge, by virtue of its place in history and perceived importance?
These are not rhetorical questions. If you have an idea about this, tweet me at @adamgdunn because I’d love to hear what you think.