The gold standard for determining whether a medical treatment works is the double-blind, placebo-controlled study. With a few exceptions, new drugs must pass a number of double-blind studies to be approved by the US Food and Drug Administration (FDA); conversely, when a drug already in use for a given purpose repeatedly fails to prove effective for that purpose in double-blind studies, it is eventually discredited.
Herbs and supplements do not need FDA approval, but they too are subjected to double-blind studies, and the results of those studies, whether positive or negative, can have a major impact on the public. Therefore, when an article published in the International Journal of Epidemiology in April 2007 exposed a serious deficiency in the entire body of double-blind study literature, it struck at the very heart of evidence-based medicine. Based on the findings of this article, the results of virtually all double-blind trials must now be viewed with skepticism.
Background on Double-blind, Placebo-controlled Studies
In a double-blind, placebo-controlled study, some of the participants are given the real treatment while others are given a fake treatment designed to appear as much as possible like the real treatment (the placebo control). The assignment to real or fake treatment groups is accomplished by flipping a coin or by a computer random number generator. Both participants and researchers are kept in the dark (blind) regarding who is receiving real treatment and who is receiving fake treatment. The full technical name for these studies, “randomized double-blind, placebo-controlled trial,” incorporates all of the elements mentioned above. This expression is often abbreviated as “RCT,” or randomized controlled trial. However, this abbreviation leaves out both placebo and blinding, and therefore we shall not use it here.
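The coin-flip assignment described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `assign_groups`, the group labels, and the participant IDs are hypothetical, and real trials typically use more elaborate schemes than an independent coin flip per participant.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Assign each participant to 'treatment' or 'placebo' by a simulated coin flip.

    Hypothetical helper for illustration; 'seed' only makes the example repeatable.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(["treatment", "placebo"]) for pid in participant_ids}


if __name__ == "__main__":
    # The allocation table would be held by a third party so that neither
    # participants nor researchers can see it until the study ends.
    allocation = assign_groups(["P001", "P002", "P003", "P004"], seed=7)
    for pid, group in allocation.items():
        print(pid, group)
```

In this sketch, blinding corresponds to keeping the returned table hidden from everyone who interacts with participants or evaluates their outcomes.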
The double-blind, placebo-controlled study was first conceived by German researchers in the 1950s as a way to minimize the power of suggestion and other confounding factors. By the 1960s, the medical scientific community had come to recognize that double-blind trials are the essential means of establishing treatment efficacy. However, it wasn’t until the 1970s that pharmaceuticals were routinely required to pass meaningful double-blind studies to obtain FDA approval. (Many drugs approved prior to this were “grandfathered” in, and may not work!)
At about the same time, double-blind studies of herbs and supplements began to appear sporadically in the literature, but such studies remained relatively uncommon until the late 1980s. Since then, the rate of publication of double-blind studies of natural products has grown at an astonishing pace. In the early 1990s, months could go by before a new double-blind study of a natural product was published; today, 15-20 such studies are published every week. Furthermore, while the typical study published in the early days of natural product testing involved ten or twenty participants, current studies commonly enroll more than a hundred participants. Some are much larger than that: studies of vitamin E, beta-carotene, and other antioxidants enrolled tens of thousands of participants.
Unfortunately, as natural treatments have begun to undergo systematic testing, many have failed to prove effective. For example, in the giant studies just mentioned, vitamin E and beta-carotene proved ineffective for preventing heart disease or cancer. Other stand-bys have also suffered falls from grace: garlic for high cholesterol, glucosamine for arthritis, and calcium supplements for osteoporosis, to name a few.
These negative trials have been discouraging for supporters of natural medicine. However, the article alluded to above casts doubt on these negative results—though it casts doubt on all positive results as well.
In April 2007, Danish researcher Asbjorn Hrobjartsson and his colleagues published a landmark article in the International Journal of Epidemiology. Essentially, they conducted a study of studies. Through extensive analysis of published studies augmented by interviews with some of the researchers who published those studies, Hrobjartsson documented that the vast majority of researchers who perform double-blind, placebo-controlled studies fail to carry out a central, essential task: that of determining whether the blind held firm. In other words, they did not check whether participants and observers remained unable to distinguish the real treatment from the fake treatment.
It would not have been difficult to answer this question; one can simply poll participants and observers and ask them to guess who received which treatment. If the guesses come out correct no more than about half the time, then it would be fair to conclude that the blind remained intact. However, researchers generally do not conduct such a poll, and therefore they generally do not know whether the blind remained intact.
But it is essential to know this. If blinding does not hold (if, for example, most people involved in the study figure out who's taking placebo and who's taking the real treatment because the real treatment is smelly), then the validity of the study is drastically and even fatally compromised.
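The polling check described above can be made concrete with a small Python sketch. It asks how likely it would be to see at least the observed number of correct guesses if everyone were guessing at random, with a 50% chance of being right. The function names and the 0.05 cutoff are assumptions made for illustration; the article itself says only that guesses should be correct "no more than about half the time."

```python
from math import comb


def binomial_upper_tail(correct, total, p=0.5):
    """Probability of at least `correct` right guesses out of `total`
    if each guess is right with probability p (pure chance when p = 0.5)."""
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(correct, total + 1)
    )


def blind_appears_intact(correct, total, alpha=0.05):
    """Hypothetical decision rule: treat the blind as intact unless the number
    of correct guesses would be surprising under random guessing.
    The alpha threshold is an assumed convention, not taken from the article."""
    return binomial_upper_tail(correct, total) >= alpha


if __name__ == "__main__":
    # 52 of 100 correct guesses is consistent with chance.
    print(blind_appears_intact(52, 100))
    # 80 of 100 correct guesses suggests the blind was broken.
    print(blind_appears_intact(80, 100))
```

A check like this only flags guessing that is better than chance; it does not by itself explain why participants or observers were able to tell the treatments apart.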
Why Is Blinding Important?
This is a very complicated subject, discussed in detail in the article, "Why Does This Database Depend on Double-Blind Studies?" Here we will mention only one reason: the issue of observer bias.
Suppose the researchers conducting a study want to prove that an herb or supplement doesn’t work. Such bias doesn’t matter much if the study they conduct is truly double-blind; since they cannot tell who is getting the real treatment and who is getting the placebo, the outcome of the study is insulated from their preferences. But once the blind is broken, this protection disappears. For example, researchers disinclined to believe that glucosamine can help arthritis may unconsciously underestimate or underreport benefits in patients that they know are taking glucosamine. This in turn could lead to a false outcome in the study, an apparent failure of a treatment that actually works.
Of course, this problem cuts the other way as well. If researchers are biased in favor of the natural product (or drug) being tested, their predilections will likely cause them to see benefits not actually present; this is only human nature. However, when researchers do not know who is getting the treatment and who is getting the placebo, their bias is frustrated. (Of course, bias can still get its hand in through manipulation of statistics or outright dishonesty, but that is a different subject.)
The problem of observer bias is just one of the confounding factors that double-blinding forestalls. And yet, in their article, Hrobjartsson and colleagues found that most researchers do not bother to poll their study participants to see if the blind held. To make matters even worse, among those who did conduct such a poll, fewer than half actually found that the blind had been maintained! This means that over half of all published double-blind studies are quite possibly invalid.
In one sense this is cheery news for supporters of natural medicine: all the recent negative studies regarding natural products may have been flawed by a broken blind. On the other hand, the very same research deficiency identified by Hrobjartsson means that positive studies involving natural products may be invalid as well.
In either case, this shows that the gold standard for medical research has become somewhat tarnished. It’s time to polish it up.