trial2rev: seeing the forest for the trees in the systematic review ecosystem

tl;dr: we have a new web-based system called trial2rev that links published systematic reviews to included and relevant trials based on their trial registrations. Our report on this system has been published today in JAMIA Open. The first aim is to make it easy for systematic reviewers to monitor the status of ongoing and completed trials that are likely to be included in the updates of systematic reviews, but we hope the system will be able to do much more than just that.

Skip to the subsection trial2rev below for the details or continue reading here for the background.

Systematic reviews are facing a weird kind of crisis at the moment. For years the problem was that systematic reviews were time consuming and resource intensive, and clinical trials and studies were being published at such a rate that it was impossible to do enough systematic reviewing to keep up. Back in 2010, Hilda Bastian and colleagues wrote “Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up?”, and in the same year Fraser and Dunstan wrote “On the impossibility of being expert”. Seeing it as an automation problem, a range of different folk, including smart computer scientists and information retrieval experts, decided to try to speed up the individual processes that make up systematic reviews.

What is the problem?

Some of the most recent methods for screening articles for inclusion in a systematic review are clever and will likely be implemented to make a difference in how individual systematic reviews are performed. But we struggle to take them beyond a support role (say, as a second independent screener) because we don’t have enough training data. Most novel methods have used data from 24 or fewer systematic reviews. In a 2017 example, Shekelle and others proposed a neat new method for screening articles that might be included in a systematic review. The training and testing data that underpinned that method? Included and excluded studies from just 3 systematic reviews.

Back in 2012, I was writing in Science Translational Medicine about what the future of evidence synthesis might look like—doing away with cumbersome systematic reviews and creating a utopia of shared access to patient-level clinical trial data with an ecosystem of open source software tools to support the rapid incorporation of new trial data into systematic reviews at the press of a button. That obviously hasn’t happened yet. What we didn’t really think about at the time was the quality and integrity of systematic reviews. That there would be so many bad systematic reviews being done that it might drown out the good ones while still leaving other clinical questions un-synthesised. Fast-forward to 2016 and we have at least one person suggesting that the vast majority of systematic reviews being published are unnecessary, redundant, or misleading.

So there are two major problems here. First, the more resources we throw at systematic reviews and the better we get at automating the individual processes that go into them, the more likely it is that systematic reviewers will continue to flood the space with reviews that are unnecessary, redundant, or misleading. Second, with so few training and testing examples being used and no one openly sharing data linking systematic reviews to included/excluded trial articles en masse, I don’t think we will ever get to the place we need to be to truly transform the way we do systematic reviews and stem the tide of garbage being published.

Can we fix it?

There have been some efforts aimed at helping systematic reviewers decide if and when they should actually do a systematic review. There was even a useful tool developed in 2013 that uses empirical data from lots of old systematic reviews and their updates to try and guess whether a systematic review is likely to need updating.

Back in 2005, Ida Sim absolutely had the right idea when she proposed a global trial bank for reporting structured results, and even did some work on automatically extracting key information from trial reports.

Structured and machine-readable representations of trials and tools for knowing when to update an individual systematic review are a good start, but here’s what else I think needs to be done:

  • The design and development of new and more general empirical methods for predicting which systematic reviews actually need to be updated based on likelihood of conclusion change.
  • New journal policies that integrate the use and reporting of these tools, which will hopefully reduce the number of bad systematic reviews being published, or at least be used to assess and downgrade journals that publish too many unnecessary and redundant systematic reviews.
  • Broad release of clinical trial results data where the outcomes are properly mapped to standardised sets of clinical outcomes, and eventually to include patient-level data made accessible in privacy-protecting ways.
  • Then finally and most importantly, full structured representations not just of trials and their results, but of systematic reviews that include detailed and standardised representations of inclusion and exclusion criteria, and lists of links to the included and excluded studies.

We should end up with completely transparent and machine-readable data covering everything from the details of the systematic reviews linked to the trials they included (or excluded) down to the individual participants used to synthesise the results. With this, we can skip over the band-aid solution of automating individual systematic review processes and move to a new phase in evidence synthesis where trial results are immediately incorporated into a smaller number of trustworthy, complete, living, and useful systematic reviews.
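To make the idea of a structured, machine-readable systematic review a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class names, the field names, and the placeholder identifiers are hypothetical and do not describe the format of any existing system.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TrialLink:
    """Link from a review to one trial registration (hypothetical schema)."""
    registration_id: str      # placeholder identifier, not a real trial
    decision: str             # "included" or "excluded"
    verified: bool = False    # True once a human has confirmed the link


@dataclass
class ReviewRecord:
    """Structured representation of one systematic review (hypothetical schema)."""
    review_id: str
    title: str
    inclusion_criteria: list = field(default_factory=list)
    exclusion_criteria: list = field(default_factory=list)
    trials: list = field(default_factory=list)

    def included_ids(self):
        """Return the registration IDs of trials the review included."""
        return [t.registration_id for t in self.trials if t.decision == "included"]


record = ReviewRecord(
    review_id="example-review-1",
    title="An example systematic review",
    inclusion_criteria=["randomised controlled trial", "adult participants"],
    exclusion_criteria=["observational design"],
    trials=[
        TrialLink("TRIAL-0001", "included", verified=True),
        TrialLink("TRIAL-0002", "excluded"),
    ],
)

# Serialise the whole record, including the nested trial links, to JSON.
as_json = json.dumps(asdict(record), indent=2)
```

A record like this keeps the criteria and the included/excluded decisions together in one place, which is the part that is usually locked inside a PDF.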


trial2rev

So to give this a bit of a nudge in the right direction, we created a shared space for humans and AI to work together to fill in some of the gaps and take advantage of a set of simple tools that emerge from these data, and then we give it all away for free, including the code for the platform itself.

We designed the trial2rev system to do two things. First, we wanted it to be simple and fast enough that systematic reviewers can use it to track the status of ongoing and completed trials that might be relevant to the published systematic reviews they care about, without wasting their time or requiring special expertise.

Second, we designed it to provide access to lots of machine-readable information about trials included in systematic reviews for use in training and testing of new methods for making systematic reviews better. And although the information is currently imperfect and noisy for most systematic reviews, we wanted to make it easy to quickly access a decent-sized sample of verified examples that are known to be complete and correct.


Figure. The user interface for engaging with a systematic review. The trial2rev system currently has partial information for more than 12,000 systematic reviews.

We also wanted the system to be efficient and avoid the duplication of effort that currently happens as systematic reviewers go about their business without properly sharing the data they produce. So in our system, when registered users go in to fix up and improve information about their own systematic reviews, everyone else immediately gets to access that information. Not only that, but the more that people use the system, the smarter the machine learning methods get at identifying relevant trials and the less work humans will need to do.

The trial2rev system also helps track similar systematic reviews by looking at the overlap in included trials. We know that for a variety of reasons systematic reviews that include the same trials can end up producing different conclusions. We also know that sometimes systematic reviews are being repeated for the same topic with the same sets of studies simply for the sake of publishing a systematic review rather than to fill a genuine gap in the literature. Our hope is that we can do a better job of monitoring how this has happened in different areas, as well as potentially develop the system as a tool for journal editors to assess the novelty of the systematic review manuscripts they receive.
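As a rough illustration of what "overlap in included trials" could mean in practice, here is a small Python sketch that scores pairs of reviews by the Jaccard similarity of their included-trial sets. This is an assumption for the sake of the example, not a description of the measure trial2rev actually uses, and the review names, registration IDs, and the 0.5 threshold are all made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_reviews(included, threshold=0.5):
    """Return (review, review, score) for pairs whose overlap meets the threshold.

    `included` maps a review ID to the set of trial registration IDs it includes.
    Pairs are returned with the highest overlap first.
    """
    pairs = []
    for (r1, t1), (r2, t2) in combinations(sorted(included.items()), 2):
        score = jaccard(t1, t2)
        if score >= threshold:
            pairs.append((r1, r2, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Placeholder data: these IDs are invented for illustration only.
included = {
    "review_A": {"NCT00000001", "NCT00000002", "NCT00000003"},
    "review_B": {"NCT00000002", "NCT00000003", "NCT00000004"},
    "review_C": {"NCT00000009"},
}

overlapping = similar_reviews(included)
```

Here review_A and review_B share two of four distinct trials, so they are flagged as similar, while review_C shares nothing with either. A real system would also have to cope with incomplete and noisy trial lists, which a plain set overlap ignores.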

The system is a prototype. It is still missing a number of things that would make it more useful, like the ability to add published reports for unregistered trials that were included in systematic reviews; the ability to add systematic review protocols and get back a list of trial registrations that are likely to be included; and bots that use active learning approaches to reduce workload in screening of trial registrations for inclusion in these new systematic reviews.

Some of these will definitely be added in the very near future. Others we have decided not to add specifically because our aim was to reduce our collective reliance on published trial reports and focus more on the use of structured trial results data.

In the meantime, you can check out the article:

P Newman, D Surian, R Bashir, FT Bourgeois, AG Dunn (2018) Trial2rev: Combining machine learning and crowd-sourcing to create a shared space for updating systematic reviews. JAMIA Open, ooy062. [doi:10.1093/jamiaopen/ooy062]
