Researchers call for tougher standards for studies on obesity policies

When a new park is built, a tax is imposed on fast food, or soft drinks are banned in a school, public health researchers must often rely on "after the fact" observational studies to evaluate how such efforts affect obesity rates in a particular population, and to identify and measure which factors worked and which did not.

Such “natural experiments” are distinct from randomized controlled trials, the gold-standard design in which participants are randomly assigned either to receive an intervention or not.

In a summary of findings published yesterday in the Annals of Internal Medicine, Johns Hopkins researchers report that a review of nearly 300 studies purporting to evaluate the impact of anti-obesity programs, policies, and environmental changes suggests that while natural experiments have a lot to offer, the field needs to agree on some stricter standards for assessing the effects of broad public health policies.

"Our findings about the strengths and weaknesses of natural experiments are in part a call to action for better standards," says Wendy Bennett, MD, MPH, associate professor of medicine at the Johns Hopkins University School of Medicine. "We conclude that researchers need to use stronger study designs and more standardized reporting methods so that we can better identify policies and programs that work in our schools and communities.”

Bennett says an estimated 1.9 billion people worldwide are overweight or obese, a condition contributing to epidemics of heart disease, diabetes, high blood pressure and cancer. The financial costs of obesity are high, but so are the costs of public health interventions, she notes.

In an effort to get a better view of what works and doesn't work, Bennett and her colleagues identified 294 studies, including 156 (53 percent) natural experiments, reporting the effects of anti-obesity programs and policies. Of the 294 studies, 118 (40 percent) were classified as standard experimental studies, and 20 (7 percent) had unclear study designs.
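As a quick check on the arithmetic, the breakdown above can be reproduced in a few lines of Python. This is only an illustrative sketch: the counts come from the text, while the variable names and category labels are ours, not the researchers'.

```python
# Study counts as reported in the review (taken from the paragraph above).
# The dictionary keys are illustrative labels, not the authors' terminology.
counts = {
    "natural experiments": 156,
    "experimental studies": 118,
    "unclear design": 20,
}

# Total number of studies identified.
total = sum(counts.values())

# Share of each design, rounded to the nearest whole percent.
shares = {design: round(100 * n / total) for design, n in counts.items()}

print(f"total studies: {total}")
for design, pct in shares.items():
    print(f"{design}: {pct} percent")
```

The three categories account for all 294 studies, and the rounded shares match the percentages quoted above.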

For each study, two reviewers independently extracted data on the population characteristics, data sources, measures, analytical methods and risks of bias in study design and outcomes.

"It's important that we know what researchers are doing, what types of data they're using and what they're measuring when they're asking these questions about obesity interventions," says Bennett. "Then we can ask how we can improve on that so we have valid and trustworthy results."

Most studies, Bennett's team found, used large data sources such as national or statewide surveys, including the California Healthy Kids Survey and the National Health and Nutrition Examination Survey in adults. However, even when studies drew on multiple databases, those databases were rarely similar or linked together.

Outcomes studied in the natural experiments included population-wide changes in body weight, body mass index (BMI), physical activity, caloric intake, fruit and vegetable intake, sugar-sweetened beverage intake, and fast-food frequency. In natural experiments, the most common study design, used in more than a third of the studies, was to compare a group exposed to an intervention, such as a ban on sweetened school drinks, with a group not exposed to it. Another 31 percent of natural experiments compared the same population before and after a policy, program, or environmental change, most commonly at only two time points.

Of the classical experimental studies, 55 percent were rated as having a strong study design, and 61 percent adequately addressed confounding variables, meaning characteristics of a population or policy that were not considered in the design of the study and could have a hidden effect on the results. Overall, Bennett's team found, natural experiments often (64 percent) fell short in these areas.

"We often didn't see comparisons between exposed and unexposed groups to see if they were really appropriate to compare," says Bennett. "If one community or school had a policy change and the other didn't, were those communities or schools really comparable to begin with? We can't always tell from the study report."

Only 28 percent of natural experiments, for instance, were rated "strong" when it came to participant selection bias, and only 44 percent were considered "strong" when it came to addressing confounding variables. Overall, nearly 80 percent of natural experiments were rated as having a high risk of bias due to one or more study design factors.

"When researchers publish results, they need to do so in a way that makes clear what they measured and what the risk of bias is," says Bennett. "We need to establish better standards around reporting for natural experiment studies."

The study was based on research conducted under contract to the Agency for Healthcare Research and Quality (AHRQ). Funding was provided by the AHRQ and National Institutes of Health.

Bennett adds that public health researchers working in this area also need to improve data collection methods (self-reported phone or paper surveys are notoriously weak, for example) and strengthen the links between existing data sources; train future researchers in more depth in methods for natural experiments; and design a central clearinghouse for studies.

"Educators have the What Works Clearinghouse where you can go in and see what teaching methods and curricula are most effective," says Bennett. "For obesity control and prevention, we need something similar to catalog and critique the studies that are being done. We want to know which are the best studies showing effective policies and programs to reduce and prevent obesity in our communities."