Maureen Hannley, PhD, is currently Chief of the Research Division of the Department of Otolaryngology and Communication Sciences at Medical College of Wisconsin and Research Consultant for the Triological Society. She formerly served as the Chief Research Officer of the American Academy of Otolaryngology-Head and Neck Surgery Foundation and has held positions at the National Institutes of Health, Stanford University Medical School, and Arizona State University.
September 2008
Do you cringe when another journal comes in, only to join the growing stack of still-unread back issues? Every clinician is familiar with the losing battle against the information overload imposed by the monthly arrival of the latest journal issues: Laryngoscope; Otolaryngology-Head and Neck Surgery; Otology and Neurotology; International Journal of Pediatric Otorhinolaryngology; American Journal of Rhinology; Archives of Otolaryngology; American Journal of Otolaryngology-Head and Neck Surgery… the list goes on and on, depending on practice specialty and/or membership in senior societies. This problem is not limited to otolaryngology: Some 20,000 biomedical periodicals publish around six million articles annually; this is supplemented by 17,000 biomedical books published each year, many of which are out of date before they even make it into print.[1]
It is not difficult to understand why most practitioners of medicine or science find that keeping current falls into the “someday” category, and stays there. A quick search of PubMed returned 47,000 references to head and neck cancer and more than 90,000 articles related to ear, nose, or throat surgery. Most clinicians will scan through a journal’s table of contents searching for papers relevant to their interests or practice focus.
Selecting carefully, then keeping a few key principles and features in mind as you read, will help you become a more critical reader, decide which papers have information reliable and valid enough to apply to your practice, and make better use of your valuable time.
Types of Studies
Clinical research papers can first be viewed as falling into a hierarchy of value, anchored on the low side by expert opinion and retrospective chart reviews or case series, and on the high side by randomized clinical trials or meta-analyses/systematic reviews of clinical trials. In between those two extremes, ranging from the highest value to the lowest, are: (1) cohort studies; (2) case-control studies; (3) cross-sectional studies; (4) practice patterns and demographic surveys; and (5) retrospective database analyses. Whereas randomized controlled clinical trials are an experimental study design, the cohort, case-control, and cross-sectional studies are examples of observational or analytical designs. Demographic surveys and database analyses would be categorized as descriptive studies.
The strategies that most people use to decide which paper to read are usually based on the title of the article; the author(s) and their institutions; or a quick review of the abstract. Unfortunately, none of these strategies is completely dependable in guiding the reader to a high-quality study. Titles may be uninformative or misleading (e.g., “One gene: four syndromes,” Nature 1994;367:319; “Zinc: the neglected nutrient,” Am J Otol 1989;10:56-60). Selecting by authors usually entails calling on the reputation of the authors and their institutions or placing greater emphasis on multi-institutional studies, neither of which guarantees a well-designed, well-conducted study. Although the structured abstract is now in common use, it does not allow for a critical examination of the analysis and is insufficient for drawing clinical conclusions apart from those offered by the author(s). It does, however, contain more information than one can derive from selecting on the basis of title or author alone.
Assessing the Major Parts of a Paper
Papers appearing in most peer-reviewed journals will have four major sections: the background or introduction; the methods; the results; and the discussion or conclusions. These sections should work together to allow you, in the end, to assess the three major attributes of a scholarly paper: its validity, its importance, and its reliability. A schematic depicting the sections of a paper, with their respective attributes, is presented in Figure 1.
Introduction and Background
First, are the clinical question of interest and the gap in knowledge the study is intended to address clearly identified in the beginning of the paper? Is the objective of the study or the hypothesis to be tested clearly stated? The hypothesis can be structured as a null hypothesis (which is the formal basis for testing statistical significance) or as an alternative hypothesis. Is the relevant literature fairly reviewed, up-to-date, with opposing viewpoints identified in an objective manner? If there is a prevailing conflict about the subject in the field, is there a statement about how this study may help to resolve it? This section’s purpose is to establish the relevance of the study for the reader and to review the past and current beliefs about the subject.
Methods and Procedure
Keeping in mind the study objectives and clinical question, the first question to ask is whether the study design is appropriate to achieve those objectives. A summary of the main categories of studies and the most appropriate study designs appears in Table 1. Next, you would want to know the sample size and how it was calculated. An adequate sample size, based on a power analysis, will be capable of detecting a true effect if one exists. Related to this issue is the sampling method: Was it a random sample, consecutive patients, patients with particular characteristics (in which case inclusion and exclusion criteria should be specified), and so on? A good paper will always indicate not only the sample size, but also the length of the study (including recruitment period, study period, and follow-up period) and the completion rate, that is, the percentage of patients in the original group who completed the follow-up. A completion rate of 80% or more indicates a study with reasonable validity.[2] The variables of interest should be specified, as well as the tools or instruments used to study them.
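To make the idea of a power analysis concrete, here is a minimal sketch in Python (3.8 or later). It is not a method taken from any particular paper: it assumes a two-group comparison of means with a two-sided test and uses the common normal-approximation formula. The function names, the 5-point effect, and the 10-point standard deviation are illustrative assumptions. The second function simply computes the completion rate as defined above.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect a difference in means.

    Normal-approximation formula for a two-sided, two-sample comparison:
        n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / effect) ** 2
    `effect` is the smallest difference worth detecting (an assumption the
    investigators must justify); `sd` is the expected standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)


def completion_rate(enrolled, completed):
    """Percentage of the original group that completed follow-up."""
    return 100.0 * completed / enrolled


# Hypothetical numbers, for illustration only.
n_needed = sample_size_per_group(effect=5, sd=10)
rate = completion_rate(enrolled=120, completed=102)
print(n_needed, "patients per group;", rate, "% completed follow-up")
```

A paper that reports its assumed effect size, variability, significance level, and power lets the reader repeat this kind of calculation and judge whether the stated sample size was adequate. Exact methods based on the t distribution give slightly larger numbers than this approximation.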
The procedure section should contain enough detail so that another person could replicate the study in its entirety. This doesn’t mean that this section should constitute a procedure manual; rather, details that could influence the outcome should be given, including type/approach of surgery, stimulation and recording parameters of electrophysiological tests, title and validation status of outcome or quality-of-life instruments, key laboratory tests or procedures, and any nonstandard diagnostic or intervention procedures. In this section, the astute reader will also be alert to potential sources of bias in the design and execution of the study, such as selection bias, subject recall bias, measurement bias, investigator bias, and the like. The purpose of the methods and procedure section is to allow the reader to assess the quality and validity of the study. If there are major flaws in the procedure, the results and conclusions are inevitably called into question.
Results
Arguably the most important question that will guide your assessment of the results section is whether the results reported are valid. In order to reach this conclusion, look for several features that will allow assessment of the power of the study and the precision with which it was conducted. The results section should begin with a description of the demographic features of the subjects: their numbers, age, gender, and ethnic distribution. Accompanying this information, measures of central tendency (mean or median), the variability (standard deviation), and the range of any baseline numerical data (e.g., tumor size, pure tone average, apnea/hypopnea score, etc.) should appear. If a disease entity is being studied, look for an operational definition of that disease, such as “For purposes of this study, chronic rhinosinusitis was defined as….” The same is true if levels of severity are an operational factor. If there is a comparison group (normal, different condition, different demographic, or different intervention), there should be some statement of how closely the groups were matched and whether their prognosis was the same at the beginning of the study. Pre- and post-intervention results should be reported, along with complication rates and adverse events.
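The baseline summary described above can be sketched in a few lines of Python. The data are made up for illustration, and the `summarize` helper is a hypothetical name, not part of any statistics package. The example shows why it helps when a paper reports both mean and median: a single outlying value pulls the mean well away from the median.

```python
import statistics


def summarize(values):
    """Return the descriptive statistics a results section should report."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "range": (min(values), max(values)),
    }


# Made-up baseline tumor sizes in cm; one large outlier skews the data.
tumor_sizes_cm = [1.2, 1.5, 1.8, 2.0, 2.1, 2.4, 9.5]
summary = summarize(tumor_sizes_cm)
for name, value in summary.items():
    print(name, value)
```

When the mean and median differ this much, the median and range usually describe the typical patient better than the mean and standard deviation do.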
The key results should be presented in either tabular or graphic form, with the important points highlighted in the text. Once again, measures of central tendency, variability, and range should be provided, and the type of statistical test (t-test, chi-square, ANOVA, Mann-Whitney U test, etc.) used to determine statistical significance should be specified. Usually 0.05 is stipulated as the de facto level of significance; bear in mind, however, that statistical significance may not necessarily denote clinical significance. To paraphrase Gertrude Stein, “A difference, to be a difference, must make a difference.” Significance values can be influenced by sample size (the larger the sample size, the greater the likelihood of finding a significant effect), the variance (the larger the variance, the smaller the likelihood of finding a significant effect), and the method of collecting the data.
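The influence of sample size and variance on significance can be illustrated with a short Python sketch. It uses a simple two-sample z-test on summary statistics as a stand-in for the t-test a real paper would more likely use; the 2-point difference, the standard deviations, and the 5-point minimal clinically important difference (`MCID`) are all hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumes two equal-sized groups with a common standard deviation.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


def clinically_meaningful(mean_diff, mcid):
    """True if the difference reaches the smallest change patients would notice."""
    return abs(mean_diff) >= mcid


MCID = 5.0           # hypothetical minimal clinically important difference
observed_diff = 2.0  # the same observed difference in every scenario

p_small = two_sided_p(observed_diff, sd=10.0, n_per_group=20)    # small study
p_large = two_sided_p(observed_diff, sd=10.0, n_per_group=2000)  # large study
p_noisy = two_sided_p(observed_diff, sd=30.0, n_per_group=2000)  # more variance

print("small study p:", p_small)
print("large study p:", p_large)
print("large, noisy study p:", p_noisy)
print("clinically meaningful:", clinically_meaningful(observed_diff, MCID))
```

The same 2-point difference is far from significant in the small study and highly significant in the large one, while greater variance weakens the result. In none of the scenarios does the difference reach the assumed clinical threshold, which is the sense in which a difference must make a difference.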
Discussion and Conclusions
Having examined the results as described and presented graphically, you will have formed your own opinion as to what conclusions can be drawn. Does the author reach the same conclusions, or are they overstated, or even erroneous? Insufficient follow-up time is an important source of error in assessing treatment effects. Have alternative explanations for the results been considered? Has the central hypothesis been accepted or rejected by the results of the study? What is the clinical significance of this finding? Where should research go from this point?
On a more personal basis, the reader will want to ask whether the reported results can be applied to the patients in his or her own practice. Were the study patients similar to the practice patients in demographics and case mix? Were all clinically important outcomes, or those that would be important in the practice, considered? Were the treatment benefits worth the likely costs and risks? The issues of validity and relevance are key in determining a paper’s clinical value to you and to the field.
The Bottom Line
Five simple questions will help you make a quick assessment of the quality of a paper:[3]
- Does the study have internal validity: was it designed so you can trust the findings?
- Does the study have external validity: was it designed so you can generalize the findings to your patients?
- Is the study important: what was the magnitude of the effect?
- Was the study reliable: if the study was repeated, is it likely that the same or similar results would be obtained?
- Was systematic bias avoided or minimized?
The time spent identifying high-quality papers, whether for teaching purposes, journal clubs, background for other papers or grant applications, or application to clinical practice, will be rewarded by critical reading that is more efficient and more dependably relevant to your clinical interests, and it may even improve your own publications!
References
1. Guyatt G, Rennie D (eds). Users’ Guides to the Medical Literature. Chicago: AMA Press, 2002.
2. Straus SE, Richardson WS, Glasziou P, Haynes RB. Evidence-based Medicine. How to Practice and Teach EBM (3rd ed). Edinburgh: Elsevier Churchill Livingstone, 2005.
3. Hurley Research Center. Critical appraisal of the medical literature. www.hurleymc.com/upload/docs/Medical%20Literature.ppt#256,1.
©2008 The Triological Society