How to predict the conclusion of a review without even reading it…

Short version: We published a new article in the Journal of Clinical Epidemiology all about selective citation in reviews of neuraminidase inhibitors – like Tamiflu and Relenza.

Lots of reviews get written about drugs (especially the ones that get prescribed often), and the drugs used to treat and prevent influenza are no exception. There are more reviews written than there are randomised controlled trials, and I think it is hard to justify why doctors and their patients would need so many different interpretations of the same evidence. When so many overlapping reviews get written, I like to call it “flooding”.

The reason why so many reviews get written probably has something to do with a problem that has been described many times over by people far more eloquent than I am: marketing disguised as clinical evidence.

We recently undertook some research to try to understand how authors of reviews (narrative and systematic) manage to come up with conclusions that appear to be diametrically opposed. For the neuraminidase inhibitors (e.g. Tamiflu, Relenza), conclusions varied from recommending early use in anyone who looks unwell, or massive global stockpiling for preventive use, through to questioning whether the drugs should be used in clinical practice at all and raising safety concerns. We hypothesised that one of the ways these differences could manifest in reviews was through something called selective citation bias.

Selective citation bias happens when authors of reviews are allowed to pick and choose what they cite in order to present the evidence in ways that fit their predetermined views. And of course, we often associate this problem with conflicts of interest. This has in the past led to drugs being presented as safe and effective (repeatedly) when they simply aren’t.

By the way, here’s a picture of approximately where I am right now while I’m writing this quick update. I’m on a train between Boston and NYC in the United States, passing through a place called New Haven.


To test our hypothesis about selective citation bias, we did something quite new and unusual with the citation patterns among the reviews of neuraminidase inhibitors. We looked at 152 different reviews published since 2005, as well as the 10,086 citations in their reference lists pointing at 4,574 unique articles. Two members of the team (Diana and Joel) graded each review as favourable or otherwise, and when they both agreed that a review presented the evidence favourably, we put it in the favourable pile. The majority of reviews (61%) ended up in this group.

We then did two things: we undertook a statistical analysis to see if we could find articles that were, by themselves, much more likely to be cited by favourable reviews; and we constructed a bunch of classifiers using supervised machine learning algorithms to see how well we could predict which reviews were favourable by looking only at their reference lists.
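For anyone curious about what this looks like in practice, here is a minimal sketch of the general approach (not our actual pipeline): represent each review as a binary vector of which articles it cites, then train a standard classifier on those vectors. The review data and article IDs below are invented placeholders.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical input: one dict per review, with keys being the IDs of the
# articles that appear in that review's reference list.
reviews = [
    {"pmid_111": 1, "pmid_222": 1, "pmid_333": 1},  # reference list of review 1
    {"pmid_222": 1, "pmid_444": 1},                  # reference list of review 2
]
labels = [1, 0]  # 1 = favourable, 0 = not favourable (graded by two reviewers)

# Binary citation matrix: rows are reviews, columns are unique cited articles.
vectorizer = DictVectorizer(sparse=True)
X = vectorizer.fit_transform(reviews)

# Any supervised classifier could stand in here; logistic regression also gives
# per-article coefficients, i.e. which citations push a review towards the
# "favourable" or "not favourable" conclusion.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)
print("In-sample accuracy:", clf.score(X, labels))

# With a realistic number of reviews, cross-validation (e.g.
# sklearn.model_selection.cross_val_score) gives a less optimistic estimate
# than the in-sample accuracy printed above.
```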

What we found was relatively surprising: we could predict a favourable conclusion with an accuracy of 96.5% (in sample) using only the reference lists, without ever looking at the text of the review itself.

A further examination of the articles that were most useful (in combination) for predicting the conclusions of the reviews suggested that reviews in the not-favourable pile tended to cite studies about viral resistance much more often than their favourable counterparts.

What we expected to find, but didn’t, was that industry-funded studies would be over-represented in favourable reviews. To me, the lack of a finding here means that the method we devised was probably better at finding what was “missing” from the reference lists of the majority than at finding what was over-represented in the majority. The maths on this makes sense too.

So we think that applying machine learning to the metadata from published reviews could be useful for editors assessing new narrative reviews. More importantly, when faced with multiple reviews that clearly disagree with each other, these methods could help identify what is missing from each of them, and restore some balance in how the primary clinical evidence is represented in reviews and guidelines.

Should we ignore industry-funded research in clinical medicine?

A quick update to explain our most recent editorial [pdf] on evidence-based medicine published in the Journal of Comparative Effectiveness Research. It’s free for anyone to access.

What do we know?

Industry-funded research does not always lead to biases that are detrimental to the quality of clinical evidence, and biased research may come about for many reasons other than financial conflicts of interest. But there are clear and systemic differences in the research produced by people who have strong financial ties to pharmaceutical companies and other groups. These differences have in the past been connected to problems like massive opportunity costs (ineffective drugs) and widespread harm (unsafe drugs).

[Spoiler alert]

What do we think?

Our simple answer is no, we don’t think that industry-funded research should be excluded from comparative effectiveness research. To put it very simply, around half of the completed trials undertaken each year are funded by the industry, and despite the overwhelming number of published trials we see, we still don’t have anywhere near enough of them to properly answer all the questions that doctors and patients have when trying to make decisions together.

Instead, we think improvements in transparency and access to patient-level data, the surveillance of risks of bias, and new methods for combining evidence and data from all available sources at once are much better alternatives. You can read more about all of these in the editorial.

Bonus:

Also, check out the new article from our group on automated citation snowballing, published in the Journal of Medical Internet Research. It describes a recursive search and retrieval method that finds peer-reviewed articles online, downloads them, extracts their reference lists, and follows those citations to find and retrieve further articles. It is particularly interesting because it can automatically construct a citation network starting from a single paper.
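To give a flavour of what citation snowballing means, here is a toy sketch of the recursive idea only; fetch_article and extract_references are hypothetical stand-ins for the real retrieval and reference-extraction components described in the paper.

```python
# Toy sketch of recursive citation snowballing, not the system from the paper.
# fetch_article() and extract_references() are hypothetical placeholders.

def fetch_article(article_id):
    """Hypothetical: download the full text (e.g. PDF/HTML) for an article."""
    raise NotImplementedError

def extract_references(full_text):
    """Hypothetical: parse the reference list and return cited article IDs."""
    raise NotImplementedError

def snowball(seed_id, max_depth=2):
    """Follow reference lists outward from a single seed paper, building a
    citation network as a dict: article ID -> list of cited article IDs."""
    network = {}
    frontier = [(seed_id, 0)]
    seen = {seed_id}
    while frontier:
        article_id, depth = frontier.pop()
        try:
            cited = extract_references(fetch_article(article_id))
        except NotImplementedError:
            cited = []  # placeholder behaviour for this sketch
        network[article_id] = cited
        if depth < max_depth:
            for ref in cited:
                if ref not in seen:
                    seen.add(ref)
                    frontier.append((ref, depth + 1))
    return network
```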

Neuropsych trials involving kids are designed differently when funded by the companies that make the drugs

Over the short break that divided 2013 and 2014, we had a new study published looking at the designs of neuropsychiatric clinical trials that involve children. Because we study trial registrations and not publications, many of the trials that are included in the study are yet to be published, and it is likely that quite a few will never be published.

Neuropsychiatric conditions are a big deal for children and make up a substantial proportion of the burden of disease. In the last decade or so, more and more drugs have been prescribed to children to treat ADHD, depression, autism spectrum disorders, seizure disorders, and a few others. The major problem we face in this area right now is the lack of evidence to help guide the decisions that doctors make with their patients and their patients’ families. Should kids be taking Drug A? Why not Drug B? Maybe a behavioural intervention? A combination of these?

I have already published a few things about how industry-funded and non-industry-funded clinical trials are different. To look at how clinical trials differ based on who funds them, we often use the clinicaltrials.gov registry, which currently holds about 158,000 registered trials, roughly half of them conducted in the US and half conducted entirely outside the US.

Some differences are generally expected (by cynical people like me) because of the different reasons why industry and non-industry groups decide to do a trial in the first place. We expect that industry trials are more likely to look at their own drugs, the trials are likely to be shorter, more focused on the direct outcomes related to what the drug claims to do (e.g. lower cholesterol rather than reduce cardiovascular risk), and of course they are likely to be designed to nearly always produce a favourable result for the drug in question.

For non-industry groups, there is a kind of hope that clinical trials funded by the public will be for the public good – to fill in the gaps by doing comparative effectiveness studies (where drugs are tested against each other, rather than against a placebo or in a single group) whenever they are appropriate, to focus on the real health outcomes of the populations, and to be capable of identifying risk-to-benefit ratios for drugs that have had questions raised about safety.

The effects of industry sponsorship on clinical trial designs for neuropsychiatric drugs in children

So those differences you might expect to see between industry and non-industry are not quite what we found in our study. For clinical trials that involve children and test drugs used for neuropsychiatric conditions, there really isn’t that much difference between what industry chooses to study and what everyone else does. So even though we did find that industry is less likely to undertake comparative effectiveness trials for these conditions, and the different groups tend to study completely different drugs, the striking result is just how little comparative effectiveness research is being done by either group.


A network view of the drug trials undertaken for ADHD by industry (black) and non-industry (blue) groups – each drug is a node in the network; lines between them are the direct comparisons from trials with active comparators.
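For readers wondering how a comparison network like the one in the figure is put together, here is a small sketch using networkx. The trial records and field names are made up for illustration; this is not the code behind the paper.

```python
# Sketch of assembling a drug-comparison network from trial records.
# The records and field names below are invented; the real data come from
# clinicaltrials.gov registrations.

from itertools import combinations
import networkx as nx

trials = [
    {"funder": "industry",     "arms": ["methylphenidate", "placebo"]},
    {"funder": "non-industry", "arms": ["methylphenidate", "atomoxetine"]},
    {"funder": "non-industry", "arms": ["atomoxetine", "behavioural therapy"]},
]

G = nx.Graph()
for trial in trials:
    # Only trials with an active comparator produce edges between two
    # interventions; placebo-controlled and single-group trials add nodes
    # but no comparisons.
    drugs = [arm for arm in trial["arms"] if arm != "placebo"]
    G.add_nodes_from(drugs)
    for a, b in combinations(drugs, 2):
        G.add_edge(a, b, funder=trial["funder"])

# Head-to-head comparisons, and who funded them:
for a, b, data in G.edges(data=True):
    print(f"{a} vs {b} ({data['funder']})")
```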

To make a long story short, it doesn’t look like either side is doing a very good job of systematically addressing the questions that doctors and their patients really need answered in this area.

Some of the reasons for this probably include the way research is funded (small trials might be easier to fund and undertake), the difficulties associated with acquiring ethics approval and recruiting children to be involved in clinical trials, and the complexities of testing behavioural therapies and other non-drug interventions against and alongside drugs.

Of course, there are other good reasons for undertaking trials that involve a single group or only test against a placebo (including safety and ethical reasons)… but for conditions like seizure disorders, where there are already approved standard therapies that are known to be safe, it is quite a shock to see that nearly all of the clinical trials undertaken in children are placebo-controlled or tested only in a single group.

What should be done?

To really improve the way we produce and then synthesise evidence for children, we need much more cooperation and smarter trial designs that will actually fill the gaps in knowledge and help doctors make good decisions. It’s true that it is very hard to fund and successfully undertake a big coordinated trial even when it doesn’t involve children, but the mess of clinical trials being undertaken today often seems to serve other purposes – to get a drug approved, to expand a market, to fill a clinician-scientist’s CV – or is constrained to the point where the design is too limited to be really useful. And these problems flow directly into synthesis (systematic reviews and guidelines), because you simply can’t review evidence that doesn’t exist.

I expect that long-term clinical trials that take advantage of electronic medical records, retrospective trials, and observational studies involving heterogeneous sets of case studies will come back to prominence for as long as the evidence produced by current clinical trials is hampered by compromised design, resource constraints, and a lack of coordinated cooperation. We really do need better ways to know which questions need to be answered first, and to find better ways to coordinate research among researchers (and patient registries). Wouldn’t it be nice if we knew exactly which clinical trials are most needed right now, and we could organise ourselves into large-enough groups to avoid redundant and useless trials that will never be able to improve clinical decision-making?

How about a systematic review that writes itself?

Guy Tsafnat, me, Paul Glasziou and Enrico Coiera have written an editorial for the BMJ on the automation of systematic reviews. I helped a bit, but the clever analogy with the ticking machines from Player Piano fell out of Guy’s brain.

In the editorial, we covered the state of the art in automating specific tasks in the process of synthesising clinical evidence. The basic problem is that we waste a lot of time and effort re-doing systematic reviews when new evidence becomes available – and in a lot of cases, systematic reviews are out of date nearly as soon as they are published.

The solution – using an analogy from Kurt Vonnegut’s Player Piano, which is a dystopian science fiction novel in which ticking automata are able to replicate the actions of a human after observing them – is to replace the standalone systematic reviews with dynamically and automatically updated reviews that change when new evidence is available.

At the press of a button.

The proposal is that after developing the rigorous protocol for a systematic review (something that is already done), we should have enough tech so that clinicians can simply find the review they want, press a button, and have the most recent evidence synthesised in silico. The existing protocols determine which studies are included and how they are analysed. The aim is to dramatically improve the efficiency of systematic reviews and improve their clinical utility by providing the best evidence to clinicians whenever they need it.
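To make the “press of a button” idea concrete, here is a minimal sketch of the final synthesis step only: a fixed-effect inverse-variance meta-analysis over already-extracted effect estimates. The numbers and the function are invented for illustration and are not the tools discussed in the editorial, which covers the whole pipeline (search, screening, extraction, synthesis).

```python
import math

def pool_fixed_effect(effects, standard_errors):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log odds ratios and standard errors from three included trials.
effects = [-0.35, -0.10, -0.22]
ses = [0.15, 0.20, 0.12]

pooled, se = pool_fixed_effect(effects, ses)
print(f"Pooled effect: {pooled:.2f} "
      f"(95% CI {pooled - 1.96 * se:.2f} to {pooled + 1.96 * se:.2f})")

# "Press of a button": when a new trial appears and meets the protocol's
# inclusion criteria, append its estimate and re-run the same pooling step.
```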

G Tsafnat, AG Dunn, P Glasziou, E Coiera (2013) The Automation of Systematic Reviews, BMJ 346:f139

A trade to fall back on

There’s a scene in “The Vartabedian Conundrum” episode of The Big Bang Theory where Sheldon says that his Aunt Marion gave him a stethoscope as a child because he should have “a trade to fall back on” if the theoretical physics thing didn’t work out.

Trust me, that’s relevant. But let’s go back a step first.

This year, I was lucky enough to be awarded a grant from the National Health and Medical Research Council (NHMRC). A modest project, but an exciting one. It was the first time I had applied for funding from the NHMRC. Also during this year, I proposed something interesting and unusual to the Australian Research Council (ARC) in the form of a Discovery Early Career Researcher Award (DECRA). It was knocked back. It was the second time I had applied to the ARC for funding; my affiliation is the Centre for Health Informatics in the Faculty of Medicine at UNSW.

Amongst research grant candidates from medical faculties across Australia, I wasn’t alone. Candidates and projects from medicine are only very rarely funded by the ARC. This is after their projects have been deemed “non-medical” and reviewed at considerable cost to the ARC. I suspect it would be a much more efficient use of taxpayers’ money if anyone with “health” or “medicine” in their affiliation were simply told not to apply to the ARC, even if their project is entirely non-medical.

Which is quite sad. Let me explain why.

It’s not the money. The money we can get from the NHMRC. And it’s not about being rejected. That’s to be expected when the likelihood of success is well under 20%. It’s the cross-fertilisation of disciplines that will be eroded by stopping brilliant researchers who currently work in medical domains (and there are plenty) from attempting projects that cross the disciplinary divide.

And I happen to know a bit about multi-disciplinary cross-fertilisation.

I’ve made (the start of) a career out of translating methods from physics into ecology, physics into medicine, theoretical computer science into ecology, sociology into ecology, ecology into medicine, theoretical computer science into sociology, and theoretical computer science into medicine. This sort of thing works so well precisely because fresh perspectives bring with them new solutions to old problems.

So why don’t we translate anything from medicine into other disciplines? Is it because, as Sheldon suggests, medicine is purely a trade to be applied? To suck up all the good research from elsewhere into an “applied” sinkhole?

I don’t think so.

There are plenty of examples of theory-driven research in medicine that could be translated into other disciplines. As an example, here’s the title of the unsuccessful grant I submitted:

“Polarisation and consensus: A computational investigation in networks of social influence and opinion dynamics”

The grant was supposed to be all about understanding how harmful opinions (think climate change denial, anti-vaccination, violent extremism) can propagate through networks with particular patterns (think of the introspective and disconnected networks that separate political ideals, for example). We already know a lot about how clinical evidence flows into decision-making for better or worse, and how distortions can create unusual consensus, so there’s a lot that we can do in the area.
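Since the grant was never funded, there is no published model to point to, but as a flavour of what “opinion dynamics on a network” means, here is a minimal sketch of a standard voter-style model on a clustered network. Everything in it (network structure, parameters, initial opinions) is invented for illustration.

```python
# A simple voter-style model: at each step a random node copies the opinion of
# a random neighbour. Two tight communities with few links between them stand
# in, crudely, for "introspective and disconnected" networks.

import random
import networkx as nx

random.seed(0)

# Two communities of 50 nodes each, dense inside, sparse between.
G = nx.planted_partition_graph(2, 50, p_in=0.2, p_out=0.01, seed=0)

# Initial opinions: community 0 holds opinion 0, community 1 holds opinion 1.
opinion = {n: (0 if n < 50 else 1) for n in G.nodes}

for step in range(10_000):
    node = random.choice(list(G.nodes))
    neighbours = list(G.neighbors(node))
    if neighbours:
        opinion[node] = opinion[random.choice(neighbours)]

share = sum(opinion.values()) / G.number_of_nodes()
print(f"Share holding opinion 1 after the simulation: {share:.2f}")
```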

Well, could have.

Instead I’ll be working on this project:

“Using collaboration networks to measure bias and inefficiency in the production and translation of evidence about cardiovascular risk”

Of course, I’m not unhappy. This is an interesting and exciting project, and many worthy projects go unfunded across the Category 1 spectrum. I’m just disappointed that the structures of the ARC and NHMRC continue to drive larger and larger wedges between medicine and other disciplines.

How long does it take for new prescription drugs to become mainstream?

You probably don’t want to hear your doctor proclaiming “I’m so indie, I was prescribing that *way* before it was cool.” Or maybe you do?

If you’re a hipster and you really need to know when a particular band stops being underground and teeters on the edge of being mainstream so you can only like it in the cool ironic way, then you would want to know how quickly it passes from the early adopters into the hands of the mainstream.

It’s the same for prescribed drugs in primary care – we want to know how long it takes for new prescription drugs to become part of mainstream practice.

tl;dr – we had a go at working out how long it takes for prescription drugs to be fully adopted in Australia, and published it here.

We already know quite a lot about how an individual prescriber makes a decision to change his or her behaviour for a new drug on the Pharmaceutical Benefits Scheme (PBS). Sometimes it has something to do with the evidence being produced in clinical trials and aggregated in systematic reviews. But often it is all about what the prescriber’s colleagues are saying and the opinions of influential people and companies.

It’s evidence of social contagion. And it’s been shown to be important for innovations in healthcare.

What we haven’t seen are good models for describing (or even better, predicting) the rate of adoption within a population the size of a country. So in a new paper in BMC Health Services Research I wrote about a well-studied model and its application to prescription volumes in Australian general practice. Together with some of my more senior colleagues, I applied a simple model to over a hundred drugs that have been introduced in Australia since 1996.

It turns out that, in Australia, your average sort of drug takes over 8 years to reach a steady level of prescriptions.

The model is arguably too simple. It assumes an initial external ‘push’, which falls away as social contagion grows. The problem is that these external pushes don’t all happen at once when a new drug is released onto the PBS, but more likely arrive as a series of perturbations corresponding to new evidence, new marketing efforts, other drugs, and changes and restrictions. So while the model produces some very accurate curves that correspond to the adoption we have seen historically, it wouldn’t be particularly good at predicting adoption based on early information alone.
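The post doesn’t name the model, but an external push that gives way to social contagion is the structure of the classic Bass diffusion model, so here is a minimal sketch of fitting that kind of curve. Treating the model as Bass is my assumption here, and the quarterly prescription counts below are invented (chosen to look vaguely like an eight-year adoption curve), not data from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

np.random.seed(0)

def bass_cumulative(t, m, p, q):
    """Cumulative adoptions at time t under the Bass model:
    m = eventual market size, p = innovation (the external push),
    q = imitation (the social contagion)."""
    e = np.exp(-(p + q) * t)
    return m * (1 - e) / (1 + (q / p) * e)

# Invented data: quarterly cumulative prescription counts for one drug.
t = np.arange(60)
observed = bass_cumulative(t, 1e6, 0.005, 0.20) + np.random.normal(0, 5e3, t.size)

(m, p, q), _ = curve_fit(bass_cumulative, t, observed,
                         p0=[5e5, 0.01, 0.3],
                         bounds=(1e-6, [1e8, 1.0, 1.0]))
print(f"market size {m:,.0f}, innovation p = {p:.3f}, imitation q = {q:.3f}")

# Time to reach 95% of the eventual steady level, from the fitted parameters
# (solving the cumulative Bass curve for F(t) = 0.95 analytically).
t95 = np.log(20 + 19 * q / p) / (p + q)
print(f"Quarters to 95% adoption: {t95:.1f} (about {t95 / 4:.1f} years)")
```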

For that, I think we need to create a strong link between the decision-making of individuals and the structure of the network through which diffusion of information and social contagion flows. I’ve started something like this already. I think we still have quite a way to go before we can work out why some drugs take over a decade and some are adopted within a couple of years.