5.2. Study limitations, risk of bias
5.2.1. Risk of bias assessment of individual studies should be done using a validated assessment tool
For the quality assessment of individual studies, specific tools were selected by the KCE.
There is no agreed KCE tool for observational studies. For now, we limit ourselves to a number of elements that need to be verified when appraising observational studies. Many assessment tools exist, but there is considerable disagreement in the scientific community about which items really matter. Moreover, observational studies are far more diverse than RCTs.
Study limitations in observational studies as identified by GRADE are:
- Failure to develop and apply appropriate eligibility criteria (inclusion of control population):
  - Under- or overmatching in case-control studies;
  - Selection of exposed and unexposed in cohort studies from different populations.
- Flawed measurement of both exposure and outcome:
  - Differences in measurement of exposure (e.g., recall bias in case-control studies);
  - Differential surveillance for outcome in exposed and unexposed in cohort studies.
- Failure to adequately control confounding:
  - Failure of accurate measurement of all known prognostic factors;
  - Failure to match for prognostic factors and/or lack of adjustment in statistical analysis.
- Incomplete follow-up.
5.2.2. Moving from individual risk of bias to a judgment about rating down for risk of bias across a body of evidence
Moving from risk of bias criteria for each individual study to a judgment about rating down for risk of bias across a group of studies addressing a particular outcome presents challenges. GRADE suggests the following principles:
- First, in deciding on the overall quality of evidence, one does not average across studies. For instance, if some studies have no serious limitations, some have serious limitations, and some have very serious limitations, one does not automatically rate quality down by one level because of an average rating of serious limitations. Rather, judicious consideration of the contribution of each study, with a general guide to focus on the high-quality studies, is warranted.
- Second, this judicious consideration requires evaluating the extent to which each study contributes toward the estimate of magnitude of effect. This contribution will usually reflect study sample size and number of outcome events: larger studies with many events will contribute more, and much larger studies with many more events will contribute much more.
- Third, one should be conservative in the judgment of rating down. That is, one should be confident that there is substantial risk of bias across most of the body of available evidence before one rates down for risk of bias.
- Fourth, the risk of bias should be considered in the context of other limitations. If, for instance, reviewers find themselves in a close-call situation with respect to two quality issues (risk of bias and, e.g., precision), we suggest rating down for at least one of the two.
- Fifth, notwithstanding the first four principles, reviewers will face close-call situations. They should acknowledge that they are in such a situation, make explicit why they think this is the case, and make the reasons for their ultimate judgment apparent.
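The second and third principles can be made concrete with a small tabulation. The following is a minimal sketch, not part of GRADE or of any KCE tool: it assumes that the number of outcome events is an acceptable proxy for a study's contribution, and the names `Study`, `contribution_by_risk`, `flag_for_discussion` and the 0.5 threshold are all hypothetical. It only supports the reviewers' judgment by showing how much of the evidence comes from studies with limitations; it does not replace that judgment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Study:
    name: str
    events: int        # number of outcome events (assumed proxy for contribution)
    risk_of_bias: str  # "none", "serious" or "very serious" (illustrative labels)


def contribution_by_risk(studies):
    """Return the share of all outcome events contributed by each risk-of-bias level."""
    total = sum(s.events for s in studies)
    if total == 0:
        raise ValueError("no outcome events across studies")
    shares = defaultdict(float)
    for s in studies:
        shares[s.risk_of_bias] += s.events / total
    return dict(shares)


def flag_for_discussion(studies, threshold=0.5):
    """Flag the body of evidence when studies with limitations dominate.

    The threshold is an arbitrary illustration of "most of the body of
    evidence"; GRADE does not prescribe a numeric cut-off.
    """
    shares = contribution_by_risk(studies)
    limited = shares.get("serious", 0.0) + shares.get("very serious", 0.0)
    return limited > threshold


# Hypothetical body of evidence: one large low-risk study, two small limited ones.
studies = [
    Study("A", 400, "none"),
    Study("B", 50, "serious"),
    Study("C", 50, "very serious"),
]
shares = contribution_by_risk(studies)
flag = flag_for_discussion(studies)
```

In this hypothetical example, the low-risk study contributes most of the events, so the sketch does not flag the body of evidence even though two of the three studies have limitations. This mirrors the first principle: a simple average or count across studies would give a different, and misleading, picture. Weights from a meta-analysis could be used instead of event counts where available.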
The GRADE principles listed above are summarized in the table below.