Tuesday, May 28, 2013

Risk, Odds, Hazard...More on The Language

For every 100 g of processed meat people eat, they are 16% more likely to develop colorectal cancer during their lives. For asthma sufferers, the odds of suffering an attack for those who took dupilumab in a recent trial were reduced by 87% relative to a placebo. What does all this mean, and how do we contextualize it? What is risk, and how does it differ from hazard? Truthfully, there are several ways to compare the effects of an exposure to some drug or substance, and the only one that's entirely intuitive is the one you're least likely to encounter unless you read the results section of a study.

When you see statistics like those above, and pretty much every story revealing the results of a study in public health will have them, each way of comparing risk elicits a different kind of reaction in a reader. I'll go back to the prospective cohort study suggesting that vegetarians are 1/3rd less likely to suffer from ischemic heart disease (IHD) than those who eat meat, because I think it's such a great example of how widely the interpretations can differ based upon which metric you use. According to this study, IHD was a pretty rare event; only 2.7% of over 44,500 individuals developed it at all. Among the meat-eaters, 1.6% developed IHD vs. 1.1% of vegetarians. If you simply subtract 1.1% from 1.6%, you might intuitively sense that eating meat didn't really add that much risk. Another way of putting it: out of every 1,000 people, 16 who eat meat will develop IHD vs. 11 who are vegetarian. This could be meaningful if you were able to extrapolate these results to an entire population of, say, 300 million people, where 1.5 million fewer cases of IHD would develop, but I think most epidemiologists would be very cautious about zooming out that far based upon one estimate from a single cohort study. Yet another way of looking at the effect is the "number needed to treat" (NNT), which refers to how many people would need to be vegetarian for one person to benefit. In this case, the answer is 200 (not the 20 I originally wrote; oops!). That means 199 out of every 200 people who decide to change their diet to cut out meat entirely wouldn't even benefit in terms of developing IHD during their lifetime.
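The arithmetic behind the NNT is simple enough to sketch in a few lines of Python. This uses only the incidence figures quoted above (1.6% vs. 1.1%); it's an illustration, not a reanalysis of the study:

```python
# Absolute risk reduction (ARR) and number needed to treat (NNT)
# from the incidence figures quoted above.
meat_eater_risk = 0.016   # 1.6% of meat eaters developed IHD
vegetarian_risk = 0.011   # 1.1% of vegetarians developed IHD

arr = meat_eater_risk - vegetarian_risk   # absolute risk reduction
nnt = 1 / arr                             # people treated per person benefiting

print(f"ARR: {arr:.1%}")   # 0.5%
print(f"NNT: {nnt:.0f}")   # 200
```

The NNT is just the reciprocal of the ARR, which is why a small absolute difference between groups translates into a large number of people "treated" per person benefiting.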

That 0.5% difference you get by subtracting the two totals is termed the "absolute risk reduction" (ARR) of being a vegetarian, at least based upon this study. However easy the ARR is to comprehend, studies are generally summarized with other measures instead, either relative risk or the odds ratio, depending on the type of study design used.

RR1 = A/(A+B); RR2 = C/(C+D); the "RR" reported in the media = RR1/RR2, where A = exposed cases, B = exposed non-cases, C = unexposed cases, and D = unexposed non-cases. (Good source for background on the maths!)

Relative risk (RR) is still pretty intuitive, and is a valuable comparison in prospective cohort studies where you have follow-up data on the vast majority of your enrollees. Using our study on vegetarians, the RR of developing IHD as a vegetarian is 0.69. It's a bit awkward to interpret this as meaning that vegetarians have 69% of the risk of developing IHD that meat eaters do, so the RR is subtracted from 1 to reflect the difference in probability between the two groups. This changes the result to say that vegetarians are 31% less likely to develop IHD. This is what almost every headline ran with, but compare this value to the NNT. Which one do you think helps you visualize the expected impact more? They're two ways of interpreting the same exact data, but if you were running a drug company, and a drug you spent $300M developing just went through a clinical trial with the exact same results as the vegetarian study, which one would you use in marketing your drug? I don't think you'd want to highlight that 200 people could start using this treatment before any one of them would show the intended benefit. You get a pretty different response by saying their risk is cut by 1/3rd.
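For concreteness, here is the same calculation as code, again using only the two incidence figures from the study. Continuing the sketch:

```python
# Relative risk (RR) and relative risk reduction (RRR) from the same data.
meat_eater_risk = 0.016   # 1.6% of meat eaters developed IHD
vegetarian_risk = 0.011   # 1.1% of vegetarians developed IHD

rr = vegetarian_risk / meat_eater_risk   # risk ratio, ~0.69
rrr = 1 - rr                             # the "31% lower risk" headline number

print(f"RR:  {rr:.2f}")   # 0.69
print(f"RRR: {rrr:.0%}")  # 31%
```

Same data as the ARR and NNT above, but packaged as a ratio rather than a difference, which is exactly why it makes a better headline.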

One other way to compare the effect of a treatment or exposure is the odds ratio (OR), which does a good job of approximating the RR when the outcome you're measuring is rare. Risk is a probability, meaning it falls somewhere between 0 and 1. Odds are the probability of an event happening divided by the probability of it not happening, so the odds of any event can go from 0 to infinity, and the difference between the two measures becomes pronounced when the outcome is common. Case-control studies use the OR because you don't have that follow-up on specific individuals. For our study on vegetarians, using a case-control design would mean I started with data on actual cases of IHD, and went back to look at previous survey records to compare which of those people were vegetarian vs. which ate meat. Because IHD is relatively rare, the OR found would likely be pretty similar to the RR found in the prospective cohort design. If I ran this case-control study and happened to find an OR of 0.69, I'm not saying vegetarians reduce their risk by 31% like I found in the cohort study; I'm saying the odds of having IHD are 31% less if you are a vegetarian. It seems like these should be pretty close, but evidence pretty strongly suggests that as the OR moves further from 1, and as the outcome gets more common, it diverges more and more from the RR, and becomes difficult to interpret. You see it confused for the RR not just in the media, but even in academic journals.
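A quick way to see this divergence is to compute both measures from a 2x2 table. The counts below are made up for illustration; the rare-outcome pair roughly mirrors the vegetarian study's incidence, and the common-outcome pair shows how the OR pulls away from the RR:

```python
def rr_and_or(a, b, c, d):
    """Compute (RR, OR) from a 2x2 table:
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    rr = (a / (a + b)) / (c / (c + d))   # ratio of risks
    odds_ratio = (a / b) / (c / d)       # ratio of odds
    return rr, odds_ratio

# Rare outcome (~1-2% incidence): OR tracks RR closely
rr_rare, or_rare = rr_and_or(11, 989, 16, 984)
print(f"rare:   RR={rr_rare:.2f}, OR={or_rare:.2f}")    # ~0.69 vs ~0.68

# Common outcome (40% vs 60% incidence): OR exaggerates the effect
rr_common, or_common = rr_and_or(40, 60, 60, 40)
print(f"common: RR={rr_common:.2f}, OR={or_common:.2f}")  # ~0.67 vs ~0.44
```

With the rare outcome the two numbers are nearly interchangeable; with the common one, quoting the OR as if it were an RR would nearly double the apparent effect.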

The higher the OR, the less easily you can compare it to RR (BMJ)
I don't really expect everyone to understand all this perfectly. The main takeaway is that there are multiple ways to compare people in medical studies, and even when they report more or less the same thing, each can have a very different impact on how you react to what you're reading. There's one other key point I want to make: risk and odds depend on exposure, while hazard does not. And let's face it, we do a terrible job of teaching people how to assess risk. Generally speaking, if I were to stop a random person on the street and ask them what risk means, I'm certain the vast majority would give me the definition of hazard.

To illustrate the difference, let's switch gears from our vegetarian study to my most recent post about the EWG's Dirty Dozen list. This is the perfect case of conflating risk and hazard. Pesticides are hazardous to people's health. Everyone acknowledges that. That is to say, if I worked on a farm and mixed the solutions of pesticides I spray on crops, I am doing something that is potentially dangerous to my health. How dangerous, though? Again, there are hazards to every pesticide we use, but the probability that glyphosate would negatively affect my health is much, much less than the probability that dieldrin would. Likewise, if I were to eat an apple with a trace amount of imidacloprid, the probability of the substance's hazard affecting me is close to 0. That is risk, and it is absolutely essential to understanding this topic. Technically speaking, any time I leave my house, there is a hazard of me getting mugged (I do live in Chicago, after all). However, my risk changes depending on the time of day, the neighborhood, and whether I'm by myself or not. It's exactly the same when assessing dietary or environmental factors in your food or your surroundings.

If you can remember this not so subtle difference, and remember that relative risk presented without other ways of looking at the same data can mislead, I think you have all the tools you need to better navigate the confusing landscape of healthcare-related media. It's pretty impossible to take an analysis seriously without this background, and it really did change the way I look at things for the better when I learned it. I hope it does the same for you.


  1. I thought Number Needed to Treat = 1/(Absolute Risk Reduction)? In this case 1/ARR = 1/(0.5%) = 1/(0.005) = 200 (rather than 20). Looks like an error resulting from mixing decimals and percents, unless I'm incorrect in understanding NNT.

    I like your blog, and appreciate your summaries of clinical statistics.

    1. Nope, you got it. That's exactly what I did wrong. I guess if I'm going to be off by an order of magnitude because of a careless mistake, I'd rather understate my case than overstate it? Thanks for the comment!