Stephanie Diperna, MD
Department of Radiology, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania

Chest Radiograph Interpretation

Changed: Sun, 05/21/2017 - 18:10
Impact of clinical history on chest radiograph interpretation

The inclusion of clinical information in diagnostic testing may influence the interpretation of the clinical findings. Historical and clinical findings may focus the reader's attention on the relevant details, thereby improving the accuracy of the interpretation. However, such information may cause the reader to have preconceived notions about the results, biasing the overall interpretation.

The impact of clinical information on the interpretation of radiographic studies remains an issue of debate. Previous studies have found that clinical information improves the accuracy of radiographic interpretation for a broad range of diagnoses,[1, 2, 3, 4] whereas others do not show improvement.[5, 6, 7] Additionally, clinical information may serve as a distraction that leads to more false‐positive interpretations.[8] For this reason, many radiologists prefer to review radiographs without knowledge of the clinical scenario prompting the study to avoid focusing on the expected findings and potentially missing other important abnormalities.[9]

The chest radiograph (CXR) is the most commonly used diagnostic imaging modality. Nevertheless, poor agreement exists among radiologists in the interpretation of chest radiographs for the diagnosis of pneumonia in both adults and children.[10, 11, 12, 13, 14, 15] Recent studies have found a high degree of agreement among pediatric radiologists with implementation of the World Health Organization (WHO) criteria for standardized CXR interpretation for diagnosis of bacterial pneumonia in children.[16, 17, 18] In these studies, participants were blinded to the clinical presentation. Data investigating the impact of clinical history on CXR interpretation in the pediatric population are limited.[19]

We conducted this prospective case‐based study to evaluate the impact of clinical information on the reliability of radiographic diagnosis of pneumonia among children presenting to a pediatric emergency department (ED) with clinical suspicion of pneumonia.

METHODS

Study Subjects

Six board‐certified radiologists at 2 academic children's hospitals (Children's Hospital of Philadelphia [n = 3] and Boston Children's Hospital [n = 3]) interpreted the same 110 chest radiographs (100 original and 10 duplicates) on 2 separate occasions. Clinical information was withheld during the first interpretation. The intra‐ and inter‐rater reliability for the interpretation of these 110 radiographs without clinical information have been previously reported.[18] After a period of 6 months, the radiologists reviewed the radiographs with access to clinical information provided by the physician ordering the CXR. This clinical information included age, sex, clinical indication for obtaining the radiograph, relevant history, and physical examination findings. The radiologists did not have access to the patients' medical records. The radiologists varied with respect to the number of years practicing pediatric radiology (median, 8 years; range, 3-36 years).

Radiographs were selected from children who presented to the ED at Boston Children's Hospital with concern of pneumonia. We selected radiographs with a spectrum of respiratory disease processes encountered in a pediatric population. The final radiographs included 50 radiographs with a final reading in the medical record without suspicion for pneumonia and 50 radiographs with suspicion of pneumonia. In the latter group, 25 radiographs had a final reading suggestive of an alveolar infiltrate, and 25 radiographs had a final reading suggestive of an interstitial infiltrate. Ten duplicate radiographs were included.

Radiograph Interpretation

The radiologists interpreted both anterior‐posterior and lateral views for each subject. Digital Imaging and Communications in Medicine images were downloaded from a registry at Boston Children's Hospital, and were copied to DVDs that were provided to each radiologist. Standardized radiographic imaging software (eFilm Lite; Merge Healthcare, Chicago, Illinois) was used by each radiologist.

Each radiologist completed a study questionnaire for each radiograph (see Supporting Information, Appendix 1, in the online version of this article). The questionnaire utilized radiographic descriptors of primary endpoint pneumonia described by the WHO to standardize the radiographic diagnosis of pneumonia.[20, 21] No additional training was provided to the radiologists. The main outcome of interest was the presence or absence of an infiltrate. Among radiographs in which an infiltrate was identified, radiologists selected whether there was an alveolar infiltrate, interstitial infiltrate, or both. Alveolar infiltrate and interstitial infiltrate are defined on the study questionnaire (Appendix 1). A radiograph classified as having either an alveolar infiltrate or interstitial infiltrate (not atelectasis) was considered to have any infiltrate. Additional findings including air bronchograms, hilar adenopathy, pleural effusion, and location of abnormalities were also recorded.

Statistical Analysis

Inter‐rater reliability was assessed using the kappa statistic to determine the overall agreement among the 6 radiologists for each outcome (eg, presence or absence of alveolar infiltrate). The kappa statistic for more than 2 raters utilizes an analysis of variance approach.[22] To calculate 95% confidence intervals (CI) for kappa statistics with more than 2 raters, we employed a bootstrapping method with 1000 replications of samples equal in size to the study sample. Intra‐rater reliability was evaluated by examining the agreement within each radiologist upon review of 10 duplicate radiographs. We used the following benchmarks to classify the strength of agreement: poor (<0.0), slight (0-0.20), fair (0.21-0.40), moderate (0.41-0.60), substantial (0.61-0.80), almost perfect (0.81-1.0).[23] Negative kappa values represent agreement less than would be predicted by chance alone.[24, 25] To calculate the kappa, a value must be recorded in 3 of 4 of the following categories: negative to positive, positive to negative, concordant negative, and concordant positive reporting of pneumonia. If raters did not fulfill 3 categories, the kappa could not be calculated.
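As an illustration of the agreement statistics described above, the following Python sketch computes a multi‐rater kappa, a bootstrap interval, and the strength‐of‐agreement label. It is a minimal sketch under stated assumptions, not the analysis used in this study: it substitutes Fleiss' kappa for the analysis of variance estimator cited in the text,[22] uses a simple percentile interval, and the function names (fleiss_kappa, bootstrap_ci, strength) and the 0/1 coding of each reading are hypothetical.

```python
import random


def fleiss_kappa(ratings):
    """Fleiss' kappa for binary calls (assumed stand-in estimator).

    ratings: one list per radiograph, holding one 0/1 call per rater.
    Returns None when every call is identical (kappa is undefined).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    pos = [sum(row) for row in ratings]  # positive calls per radiograph
    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_obs = sum(
        (k * (k - 1) + (n_raters - k) * (n_raters - k - 1))
        / (n_raters * (n_raters - 1))
        for k in pos
    ) / n_items
    # Chance agreement from the pooled proportion of positive calls.
    p_pos = sum(pos) / (n_items * n_raters)
    p_exp = p_pos ** 2 + (1 - p_pos) ** 2
    if p_exp == 1.0:
        return None
    return (p_obs - p_exp) / (1 - p_exp)


def bootstrap_ci(ratings, reps=1000, alpha=0.05, seed=0):
    """Approximate percentile interval from resampling radiographs."""
    rng = random.Random(seed)
    n = len(ratings)
    stats = []
    for _ in range(reps):
        # Resample radiographs with replacement, same size as the sample.
        sample = [ratings[rng.randrange(n)] for _ in range(n)]
        k = fleiss_kappa(sample)
        if k is not None:  # skip degenerate resamples
            stats.append(k)
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi


def strength(kappa):
    """Map a kappa value to the benchmark labels listed in the text."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"
```

For example, strength(0.32) returns "fair" and strength(0.53) returns "moderate", matching the labels applied to the air bronchogram estimates in the Results.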

The intra‐rater concordance for identification of an alveolar infiltrate was calculated for each radiologist by comparing their reporting of alveolar infiltrate with and without clinical history for each of the 100 radiographs. Radiographs that an individual rater identified as no alveolar infiltrate when read without clinical history, but subsequently identified as alveolar infiltrate with clinical history, were categorized as negative to positive reporting of pneumonia with clinical history. Those that were identified as alveolar infiltrate but subsequently identified as no alveolar infiltrate were categorized as positive to negative reporting of pneumonia with clinical history. Those radiographs in which there was no change in identification of alveolar infiltrate with clinical information were categorized as concordant reporting of pneumonia.
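The categorization of paired readings described above can be sketched in Python as follows. This is an illustrative sketch only: the function names (change_category, paired_kappa) and the 0/1 coding are hypothetical, and the kappa shown is the standard 2‐reading Cohen formula, which is assumed here rather than taken from the study's analysis code. It returns no kappa when chance agreement equals 1 (for example, when both readings are negative for every radiograph).

```python
from collections import Counter


def change_category(without_history, with_history):
    """Label one radiograph from a rater's two readings (0 = no alveolar
    infiltrate, 1 = alveolar infiltrate)."""
    if without_history == with_history:
        return "concordant positive" if with_history else "concordant negative"
    return "negative to positive" if with_history else "positive to negative"


def paired_kappa(first_reads, second_reads):
    """Tally the four categories and compute Cohen's kappa for one rater.

    Returns (counts, kappa); kappa is None when it is undefined.
    """
    counts = Counter(
        change_category(a, b) for a, b in zip(first_reads, second_reads)
    )
    n = sum(counts.values())
    both_pos = counts["concordant positive"]
    both_neg = counts["concordant negative"]
    pos_to_neg = counts["positive to negative"]
    neg_to_pos = counts["negative to positive"]
    p_obs = (both_pos + both_neg) / n
    p_first = (both_pos + pos_to_neg) / n   # positive without history
    p_second = (both_pos + neg_to_pos) / n  # positive with history
    p_exp = p_first * p_second + (1 - p_first) * (1 - p_second)
    if p_exp == 1.0:
        return counts, None
    return counts, (p_obs - p_exp) / (1 - p_exp)
```

The number of discordant interpretations for a rater is the sum of the "negative to positive" and "positive to negative" counts.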

The study was approved by the institutional review boards at both children's hospitals.

RESULTS

Patient Sample

The radiographs were from patients ranging in age from 1 week to 19 years (median, 3.5 years; interquartile range, 1.6-6.0 years). Fifty (50%) patients were male.

Inter‐rater Reliability

The kappa coefficients of inter‐rater reliability between the radiologists across the 6 clinical measures of interest with and without access to clinical history are plotted in Figure 1. Reliability improved from fair (k = 0.32, 95% CI: 0.24 to 0.42) to moderate (k = 0.53, 95% CI: 0.43 to 0.64) for identification of air bronchograms with the addition of clinical history. Although there was an increase in kappa values for identification of any infiltrate, alveolar infiltrate, interstitial infiltrate, and pleural effusion, and a decrease in the kappa value for identification of hilar adenopathy with the addition of clinical information, there was substantial overlap of the 95% CIs, suggesting that inclusion of clinical history did not result in a statistically significant change in the reliability of these findings.

Figure 1
Inter‐rater reliability of radiologists (n = 6) evaluating chest radiographs with and without access to clinical history data in children presenting to the emergency department with suspected pneumonia (n = 100).

Intra‐rater Reliability

The estimates of intra‐rater reliability for the interpretation of the 10 duplicate images with and without clinical history are shown in Table 1. The intra‐rater reliability in the identification of alveolar infiltrate remained substantial to almost perfect for each rater with and without access to clinical history. Rater 1 had a decrease in intra‐rater reliability from almost perfect (k = 1.0, 95% CI: 1.0 to 1.0) to fair (k = 0.21, 95% CI: -0.43 to 0.85) in the identification of interstitial infiltrate with the addition of clinical history. This rater also had a decrease in agreement from almost perfect (k = 1.0, 95% CI: 1.0 to 1.0) to fair (k = 0.4, 95% CI: -0.16 to 0.96) in the identification of any infiltrate.

Table 1. Intra‐rater Reliability of Radiologists With and Without Access to Clinical History While Evaluating Chest Radiographs (n = 10) for Pneumonia in Children

                           Phase 1: No Clinical History     Phase 2: Access to Clinical History
                           Kappa     95% CI                 Kappa     95% CI
Any infiltrate
  Rater 1                  1.00      1.00 to 1.00           0.40      -0.16 to 0.96
  Rater 2                  0.60      0.10 to 1.00           0.58      0.07 to 1.00
  Rater 3                  0.80      0.44 to 1.00           0.80      0.44 to 1.00
  Rater 4                  1.00      1.00 to 1.00           0.78      0.39 to 1.00
  Rater 5                  N/A(a)                           -0.11     -0.36 to 0.14
  Rater 6                  1.00      1.00 to 1.00           1.00      1.00 to 1.00
Alveolar infiltrate
  Rater 1                  1.00      1.00 to 1.00           1.00      1.00 to 1.00
  Rater 2                  1.00      1.00 to 1.00           1.00      1.00 to 1.00
  Rater 3                  1.00      1.00 to 1.00           1.00      1.00 to 1.00
  Rater 4                  1.00      1.00 to 1.00           0.78      0.39 to 1.00
  Rater 5                  0.78      0.39 to 1.00           1.00      1.00 to 1.00
  Rater 6                  0.74      0.27 to 1.00           0.78      0.39 to 1.00
Interstitial infiltrate
  Rater 1                  1.00      1.00 to 1.00           0.21      -0.43 to 0.85
  Rater 2                  0.21      -0.43 to 0.85          -0.11     -0.36 to 0.14
  Rater 3                  0.74      0.27 to 1.00           0.78      0.39 to 1.00
  Rater 4                  N/A                              N/A
  Rater 5                  0.58      0.07 to 1.00           0.52      0.05 to 1.00
  Rater 6                  0.62      0.5 to 1.00            N/A(a)

NOTE: Abbreviations: CI, confidence interval; N/A, not applicable.
(a) Too few categories of agreement to calculate kappa. Both responses are negative for all 10 paired radiographs; kappa cannot be calculated.

Intra‐rater Concordance

The intra‐rater concordance of the radiologists for the identification of alveolar infiltrate during the interpretation of the 100 chest radiographs with and without access to clinical history is shown in Figure 2. The availability of clinical information impacted physicians differently in the evaluation of alveolar infiltrates. Raters 1, 4, and 6 appeared more likely to identify an alveolar infiltrate with access to the clinical information, whereas raters 3 and 5 appeared less likely to identify an alveolar infiltrate. Of the 100 films that were interpreted with and without clinical information, the mean number of discordant interpretations per rater was 10, with values ranging from 6 to 19 for the individual raters. Radiographs in which 3 or more raters changed their interpretation regarding the presence of an alveolar infiltrate are shown in Figure 3. For Figure 3D, 4 radiologists changed their interpretation from no alveolar infiltrate to alveolar infiltrate, and 1 radiologist changed from alveolar infiltrate to no alveolar infiltrate with the addition of clinical history.

Figure 2
Intra‐rater concordance of radiologists before and after access to clinical history while evaluating chest radiographs (n = 100) for alveolar infiltrate in children.
Figure 3
Chest radiographs of children in which 3 or more radiologists changed their interpretation in regard to the presence or absence of an alveolar infiltrate with the addition of clinical information. (A, B, and C) Three of 6 radiologists changed their interpretation. (D) Five of 6 radiologists changed their interpretation. (A) Female, 2 years old. (B) Male, 9 months old. (C) Male, 3 years old. (D) Male, 3 years old. The clinical history provided for (D) read as follows: “3‐year‐old male with cough and difficulty breathing. Rales at left base.”

Comment

We investigated the impact of the availability of clinical information on the reliability of chest radiographic interpretation in the diagnosis of pneumonia. There was improved inter‐rater reliability in the identification of air bronchograms with the addition of clinical information; however, clinical history did not have a substantial impact on the inter‐rater reliability of other findings. The addition of clinical information did not alter the intra‐rater reliability in the identification of alveolar infiltrate. Clinical history affected individual raters differently in their interpretation of alveolar infiltrate, with 3 raters more likely to identify an alveolar infiltrate and 2 raters less likely to identify an alveolar infiltrate.

Most studies addressing the impact of clinical history on radiographic interpretation evaluated accuracy. In many of these studies, accuracy was defined as the raters' agreement with the final interpretation of each film as documented in the medical record or their agreement with the interpretation of the radiologists selecting the cases.[1, 2, 3, 5, 6, 7] Given the known inter‐rater variability in radiographic interpretation,[10, 11, 12, 13, 14, 15] accuracy of a radiologist's interpretation cannot be appropriately assessed through agreement with their peers. Because a true measure of accuracy in the radiographic diagnosis of pneumonia can only be determined through invasive testing, such as lung biopsy, reliability serves as a more appropriate measure of performance. Inclusion of clinical information in chest radiograph interpretation has been shown to improve reliability in the radiographic diagnosis of a broad range of conditions.[15]

The primary outcome in this study was the identification of an infiltrate. Previous studies have noted consistent identification of the radiographic features that are most suggestive of bacterial pneumonia, such as alveolar infiltrate, and less consistent identification of other radiographic findings, including interstitial infiltrate.[18, 26, 27] Among the radiologists in this study, the addition of clinical information did not have a meaningful impact on the reliability of either of these findings, as there was substantial inter‐rater agreement for the identification of alveolar infiltrate and only slight agreement for the identification of interstitial infiltrate, both with and without clinical history. Additionally, intra‐rater reliability for the identification of alveolar infiltrate remained substantial to almost perfect for all 6 raters with the addition of clinical information.

Clinical information impacted the raters differently in their pattern of alveolar infiltrate identification, suggesting that radiologists may differ in their approach to incorporating clinical history in the interpretation of chest radiographs. The inclusion of clinical information may impact a radiologist's perception, leading to improved identification of abnormalities; however, it may also guide their decision making about the relevance of previously identified abnormalities.[28] Some radiologists may use clinical information to support or suggest possible radiographic findings, whereas others may use the information to challenge potential findings. This study did not address the manner in which the individual raters utilized the clinical history. There were also several radiographs in which the clinical information resulted in a change in the identification of an alveolar infiltrate by 3 or more raters, with as many as 5 of 6 raters changing their interpretation for 1 particular radiograph. These changes in identification of an infiltrate suggest that unidentified aspects of a history may be likely to influence a rater's interpretation of a radiograph. Nevertheless, these changes did not result in improved reliability and it is not possible to determine if these changes resulted in improved accuracy in interpretation.

This study had several limitations. First, radiographs were purposefully selected to encompass a broad spectrum of radiographic findings. Thus, the prevalence of pneumonia and other abnormal findings was artificially higher than typically observed among a cohort of children for whom pneumonia is considered. Second, the radiologists recruited for this study all practice in an academic children's hospital setting. These factors may limit the generalizability of our findings. However, we would expect these results to be generalizable to pediatric radiologists from other academic institutions. Third, this study does not meet the criteria of a balanced study design as defined by Loy and Irwig.[19] A study was characterized as balanced if half of the radiographs were read with and half without clinical information in each of the 2 reading sessions. The proposed benefit of such a design is to control for possible changes in ability or reporting practices of the raters that may have occurred between study periods. The use of a standardized reporting tool likely minimized changes in reporting practices. Also, it is unlikely that the ability or reporting practices of an experienced radiologist would change over the study period. Fourth, the radiologists interpreted the films outside of their standard workflow and utilized a standardized reporting tool that focused on the presence or absence of pneumonia indicators. These factors may have increased the radiologists' suspicion for pneumonia even in the absence of clinical information. This may have biased the results toward finding no difference in the identification of pneumonia with the addition of detailed clinical history. Thus, the inclusion of clinical information in radiograph interpretation in clinical practice may have greater impact on the identification of these pneumonia indicators than was found in this study.[29] Finally, reliability does not imply accuracy, and it is unknown if changes in the identification of pneumonia indicators led to more accurate interpretation with respect to the clinical or pathologic diagnosis of pneumonia.

In conclusion, we observed high intra‐ and inter‐rater reliability among radiologists in the identification of an alveolar infiltrate, the radiographic finding most suggestive of bacterial pneumonia.[16, 17, 18, 30] The addition of clinical information did not have a substantial impact on the reliability of its identification.

References
  1. Berbaum KS, Franken EA, Dorfman DD, et al. Tentative diagnoses facilitate the detection of diverse lesions in chest radiographs. Invest Radiol. 1986;21(7):532-539.
  2. Berbaum KS, Franken EA, Dorfman DD, Barloon TJ. Influence of clinical history upon detection of nodules and other lesions. Invest Radiol. 1988;23(1):48-55.
  3. Berbaum KS, Franken EA, Dorfman DD, Lueben KR. Influence of clinical history on perception of abnormalities in pediatric radiographs. Acad Radiol. 1994;1(3):217-223.
  4. Song KS, Song HH, Park SH, et al. Impact of clinical history on film interpretation. Yonsei Med J. 1992;33(2):168-172.
  5. Cooperstein LA, Good BC, Eelkema EA, et al. The effect of clinical history on chest radiograph interpretations in a PACS environment. Invest Radiol. 1990;25(6):670-674.
  6. Good BC, Cooperstein LA, DeMarino GB, et al. Does knowledge of the clinical history affect the accuracy of chest radiograph interpretation? AJR Am J Roentgenol. 1990;154(4):709-712.
  7. Quekel LG, Goei R, Kessels AG, Engelshoven JM. Detection of lung cancer on the chest radiograph: impact of previous films, clinical information, double reading, and dual reading. J Clin Epidemiol. 2001;54(11):1146-1150.
  8. Eldevik OP, Dugstad G, Orrison WW, Haughton VM. The effect of clinical bias on the interpretation of myelography and spinal computed tomography. Radiology. 1982;145(1):85-89.
  9. Griscom NT. A suggestion: look at the images first, before you read the history. Radiology. 2002;223(1):9-10.
  10. Albaum MN, Hill LC, Murphy M, et al. Interobserver reliability of the chest radiograph in community‐acquired pneumonia. PORT Investigators. Chest. 1996;110(2):343-350.
  11. Bloomfield FH, Teele RL, Voss M, Knight DB, Harding JE. Inter‐ and intra‐observer variability in the assessment of atelectasis and consolidation in neonatal chest radiographs. Pediatr Radiol. 1999;29(6):459-462.
  12. Gatt ME, Spectre G, Paltiel O, Hiller N, Stalnikowicz R. Chest radiographs in the emergency department: is the radiologist really necessary? Postgrad Med J. 2003;79(930):214-217.
  13. Hopstaken RM, Witbraad T, Engelshoven JM, Dinant GJ. Inter‐observer variation in the interpretation of chest radiographs for pneumonia in community‐acquired lower respiratory tract infections. Clin Radiol. 2004;59(8):743-752.
  14. Novack V, Avnon LS, Smolyakov A, Barnea R, Jotkowitz A, Schlaeffer F. Disagreement in the interpretation of chest radiographs among specialists and clinical outcomes of patients hospitalized with suspected pneumonia. Eur J Intern Med. 2006;17(1):43-47.
  15. Tudor GR, Finlay D, Taub N. An assessment of inter‐observer agreement and accuracy when reporting plain radiographs. Clin Radiol. 1997;52(3):235-238.
  16. Shimol BS, Dagan R, Givon‐Lavi N, et al. Evaluation of the World Health Organization criteria for chest radiographs for pneumonia diagnosis in children. Eur J Pediatr. 2011;171(2):369-374.
  17. Cherian T, Mulholland EK, Carlin JB, et al. Standardized interpretation of paediatric chest radiographs for the diagnosis of pneumonia in epidemiological studies. Bull World Health Organ. 2005;83(5):353-359.
  18. Neuman MI, Lee EY, Bixby S, et al. Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children. J Hosp Med. 2012;7(4):294-298.
  19. Loy CT, Irwig L. Accuracy of diagnostic tests read with and without clinical information: a systematic review. JAMA. 2004;292(13):1602-1609.
  20. Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. In: World Health Organization: Pneumonia Vaccine Trial Investigators' Group. Geneva: Department of Vaccine and Biologics; 2001.
  21. Hansen J, Black S, Shinefield H, et al. Effectiveness of heptavalent pneumococcal conjugate vaccine in children younger than 5 years of age for prevention of pneumonia: updated analysis using World Health Organization standardized interpretation of chest radiographs. Pediatr Infect Dis J. 2006;25(9):779-781.
  22. Landis JR, Koch GG. A one‐way components of variance model for categorical data. Biometrics. 1977;33:671-679.
  23. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
  24. Juurlink DN, Detsky AS. Kappa statistic. CMAJ. 2005;173(1):16.
  25. Kramer MS, Feinstein AR. Clinical biostatistics. LIV. The biostatistics of concordance. Clin Pharmacol Ther. 1981;29(1):111-123.
  26. Bartlett JG, Dowell SF, Mandell LA, File TM, Musher DM, Fine MJ. Practice guidelines for the management of community‐acquired pneumonia in adults. Infectious Diseases Society of America. Clin Infect Dis. 2000;31(2):347-382.
  27. Niederman MS, Mandell LA, Anzueto A, et al. Guidelines for the management of adults with community‐acquired pneumonia. Diagnosis, assessment of severity, antimicrobial therapy, and prevention. Am J Respir Crit Care Med. 2001;163(7):1730-1754.
  28. Berbaum KS, Franken EA. Commentary: does clinical history affect perception? Acad Radiol. 2006;13(3):402-403.
  29. Berbaum KS, Franken EA. The effect of clinical history on chest radiograph interpretations in a PACS environment. Invest Radiol. 1991;26(5):512-514.
  30. Korppi M, Kiekara O, Heiskanen‐Kosma T, Soimakallio S. Comparison of radiological findings and microbial aetiology of childhood pneumonia. Acta Paediatr. 1993;82(4):360-363.
Journal of Hospital Medicine 8(7):359-364

The inclusion of clinical information in diagnostic testing may influence the interpretation of the clinical findings. Historical and clinical findings may focus the reader's attention to the relevant details, thereby improving the accuracy of the interpretation. However, such information may cause the reader to have preconceived notions about the results, biasing the overall interpretation.

The impact of clinical information on the interpretation of radiographic studies remains an issue of debate. Previous studies have found that clinical information improves the accuracy of radiographic interpretation for a broad range of diagnoses,[1, 2, 3, 4] whereas others do not show improvement.[5, 6, 7] Additionally, clinical information may serve as a distraction that leads to more false‐positive interpretations.[8] For this reason, many radiologists prefer to review radiographs without knowledge of the clinical scenario prompting the study to avoid focusing on the expected findings and potentially missing other important abnormalities.[9]

The chest radiograph (CXR) is the most commonly used diagnostic imaging modality. Nevertheless, poor agreement exists among radiologists in the interpretation of chest radiographs for the diagnosis of pneumonia in both adults and children.[10, 11, 12, 13, 14, 15] Recent studies have found a high degree of agreement among pediatric radiologists with implementation of the World Health Organization (WHO) criteria for standardized CXR interpretation for diagnosis of bacterial pneumonia in children.[16, 17, 18] In these studies, participants were blinded to the clinical presentation. Data investigating the impact of clinical history on CXR interpretation in the pediatric population are limited.[19]

We conducted this prospective case‐based study to evaluate the impact of clinical information on the reliability of radiographic diagnosis of pneumonia among children presenting to a pediatric emergency department (ED) with clinical suspicion of pneumonia.

METHODS

Study Subjects

Six board‐certified radiologists at 2 academic children's hospitals (Children's Hospital of Philadelphia [n = 3] and Boston Children's Hospital [n = 3]) interpreted the same 110 chest radiographs (100 original and 10 duplicates) on 2 separate occasions. Clinical information was withheld during the first interpretation. The inter‐ and inter‐rater reliability for the interpretation of these 110 radiographs without clinical information have been previously reported.[18] After a period of 6 months, the radiologists reviewed the radiographs with access to clinical information provided by the physician ordering the CXR. This clinical information included age, sex, clinical indication for obtaining the radiograph, relevant history, and physical examination findings. The radiologists did not have access to the patients' medical records. The radiologists varied with respect to the number of years practicing pediatric radiology (median, 8 years; range, 336 years).

Radiographs were selected from children who presented to the ED at Boston Children's Hospital with concern of pneumonia. We selected radiographs with a spectrum of respiratory disease processes encountered in a pediatric population. The final radiographs included 50 radiographs with a final reading in the medical record without suspicion for pneumonia and 50 radiographs with suspicion of pneumonia. In the latter group, 25 radiographs had a final reading suggestive of an alveolar infiltrate, and 25 radiographs had a final reading suggestive of an interstitial infiltrate. Ten duplicate radiographs were included.

Radiograph Interpretation

The radiologists interpreted both anterior‐posterior and lateral views for each subject. Digital Imaging and Communications in Medicine images were downloaded from a registry at Boston Children's Hospital, and were copied to DVDs that were provided to each radiologist. Standardized radiographic imaging software (eFilm Lite; Merge Healthcare, Chicago, Illinois) was used by each radiologist.

Each radiologist completed a study questionnaire for each radiograph (see Supporting Information, Appendix 1, in the online version of this article). The questionnaire utilized radiographic descriptors of primary endpoint pneumonia described by the WHO to standardize the radiographic diagnosis of pneumonia.[20, 21] No additional training was provided to the radiologists. The main outcome of interest was the presence or absence of an infiltrate. Among radiographs in which an infiltrate was identified, radiologists selected whether there was an alveolar infiltrate, interstitial infiltrate, or both. Alveolar infiltrate and interstitial infiltrate are defined on the study questionnaire (Appendix 1). A radiograph classified as having either an alveolar infiltrate or interstitial infiltrate (not atelectasis) was considered to have any infiltrate. Additional findings including air bronchograms, hilar adenopathy, pleural effusion, and location of abnormalities were also recorded.

Statistical Analysis

Inter‐rater reliability was assessed using the kappa statistic to determine the overall agreement among the 6 radiologists for each outcome (eg, presence or absence of alveolar infiltrate). The kappa statistic for more than 2 raters utilizes an analysis of variance approach.[22] To calculate 95% confidence intervals (CI) for kappa statistics with more than 2 raters, we employed a bootstrapping method with 1000 replications of samples equal in size to the study sample. Intra‐rater reliability was evaluated by examining the agreement within each radiologist upon review of 10 duplicate radiographs. We used the following benchmarks to classify the strength of agreement: poor (<0.0), slight (00.20), fair (0.210.40), moderate (0.410.60), substantial (0.610.80), almost perfect (0.811.0).[23] Negative kappa values represent agreement less than would be predicted by chance alone.[24, 25] To calculate the kappa, a value must be recorded in 3 of 4 of the following categories: negative to positive, positive to negative, concordant negative, and concordant positive reporting of pneumonia. If raters did not fulfill 3 categories, the kappa could not be calculated.

The inter‐rater concordance for identification of an alveolar infiltrate was calculated for each radiologist by comparing their reporting of alveolar infiltrate with and without clinical history for each of the 100 radiographs. Radiographs that were identified by an individual rater as no alveolar infiltrate when read without clinical history, but those subsequently identified as alveolar infiltrate with clinical history were categorized as negative to positive reporting of pneumonia with clinical history. Those that were identified as alveolar infiltrate but subsequently identified as no alveolar infiltrate were categorized as positive to negative reporting of pneumonia with clinical history. Those radiographs in which there was no change in identification of alveolar infiltrate with clinical information were categorized as concordant reporting of pneumonia.

The study was approved by the institutional review boards at both children's hospitals.

RESULTS

Patient Sample

The radiographs were from patients ranging in age from 1 week to 19 years (median, 3.5 years; interquartile range, 1.66.0 years). Fifty (50%) patients were male.

Inter‐rater Reliability

The kappa coefficients of inter‐rater reliability between the radiologists across the 6 clinical measures of interest with and without access to clinical history are plotted in Figure 1. Reliability improved from fair (k = 0.32, 95% CI: 0.24 to 0.42) to moderate (k = 0.53, 95% CI: 0.43 to 0.64) for identification of air bronchograms with the addition of clinical history. Although there was an increase in kappa values for identification of any infiltrate, alveolar infiltrate, interstitial infiltrate, and pleural effusion, and a decrease in the kappa value for identification of hilar adenopathy with the addition of clinical information, there was substantial overlap of the 95% CIs, suggesting that inclusion of clinical history did not result in a statistically significant change in the reliability of these findings.

Figure 1
Inter‐rater reliability of radiologists (n = 6) evaluating chest radiographs with and without access to clinical history data in children presenting to the emergency department with suspected pneumonia (n = 100).

Intra‐rater Reliability

The estimates of intra‐rater reliability for the interpretation of the 10 duplicate images with and without clinical history are shown in Table 1. The intra‐rater reliability in the identification of alveolar infiltrate remained substantial to almost perfect for each rater with and without access to clinical history. Rater 1 had a decrease in intra‐rater reliability from almost perfect (k = 1.0, 95% CI: 1.0 to 1.0) to fair (k = 0.21, 95% CI: −0.43 to 0.85) in the identification of interstitial infiltrate with the addition of clinical history. This rater also had a decrease in agreement from almost perfect (k = 1.0, 95% CI: 1.0 to 1.0) to fair (k = 0.40, 95% CI: −0.16 to 0.96) in the identification of any infiltrate.

Table 1
Intra‐rater Reliability of Radiologists With and Without Access to Clinical History While Evaluating Chest Radiographs (n = 10) for Pneumonia in Children

                          Phase 1: No Clinical History         Phase 2: Access to Clinical History
                          Kappa    95% Confidence Interval     Kappa    95% Confidence Interval
Any infiltrate
  Rater 1                 1.00     1.00 to 1.00                0.40     −0.16 to 0.96
  Rater 2                 0.60     0.10 to 1.00                0.58     0.07 to 1.00
  Rater 3                 0.80     0.44 to 1.00                0.80     0.44 to 1.00
  Rater 4                 1.00     1.00 to 1.00                0.78     0.39 to 1.00
  Rater 5                 N/A(a)                               −0.11    −0.36 to 0.14
  Rater 6                 1.00     1.00 to 1.00                1.00     1.00 to 1.00
Alveolar infiltrate
  Rater 1                 1.00     1.00 to 1.00                1.00     1.00 to 1.00
  Rater 2                 1.00     1.00 to 1.00                1.00     1.00 to 1.00
  Rater 3                 1.00     1.00 to 1.00                1.00     1.00 to 1.00
  Rater 4                 1.00     1.00 to 1.00                0.78     0.39 to 1.00
  Rater 5                 0.78     0.39 to 1.00                1.00     1.00 to 1.00
  Rater 6                 0.74     0.27 to 1.00                0.78     0.39 to 1.00
Interstitial infiltrate
  Rater 1                 1.00     1.00 to 1.00                0.21     −0.43 to 0.85
  Rater 2                 0.21     −0.43 to 0.85               −0.11    −0.36 to 0.14
  Rater 3                 0.74     0.27 to 1.00                0.78     0.39 to 1.00
  Rater 4                 N/A                                  N/A
  Rater 5                 0.58     0.07 to 1.00                0.52     0.05 to 1.00
  Rater 6                 0.62     0.5 to 1.00                 N/A(a)

  • NOTE: Abbreviations: N/A, not applicable.
  • (a) Too few categories of agreement to calculate kappa. Both responses are negative for all 10 paired radiographs; kappa cannot be calculated.
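
The table footnote notes that kappa cannot be calculated when both reads are negative for all 10 paired radiographs. The following minimal Python sketch shows why, using the standard two‐rating Cohen's kappa for binary reads: when every read is negative, the chance‐expected agreement equals 1 and the denominator of kappa is zero. This is an illustrative assumption about the calculation, not the authors' software; the function name and the example reads are hypothetical.

```python
def cohen_kappa(first_reads, second_reads):
    """Cohen's kappa for two binary reads of the same images.

    Reads are coded 1 (finding present) or 0 (absent).
    Returns None when kappa is undefined (chance-expected agreement of 1).
    """
    n = len(first_reads)
    if n == 0 or n != len(second_reads):
        raise ValueError("reads must be non-empty and of equal length")
    # Observed proportion of images with identical reads.
    observed = sum(a == b for a, b in zip(first_reads, second_reads)) / n
    # Proportion of positive reads in each reading.
    p1 = sum(first_reads) / n
    p2 = sum(second_reads) / n
    # Agreement expected by chance from the marginal proportions.
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    if expected == 1.0:
        return None  # all reads fall in one category; kappa is 0/0
    return (observed - expected) / (1 - expected)


# Hypothetical duplicate-image reads for one rater (illustration only).
original = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
duplicate = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(original, duplicate)  # 0.28 / 0.38, about 0.74

# All-negative reads: kappa is undefined, matching the table footnote.
undefined = cohen_kappa([0] * 10, [0] * 10)
```

With only 10 paired images, a single changed read moves kappa substantially, which is consistent with the wide confidence intervals in the table.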

Intra‐rater Concordance

The intra‐rater concordance of the radiologists for the identification of alveolar infiltrate during the interpretation of the 100 chest radiographs with and without access to clinical history is shown in Figure 2. The availability of clinical information impacted physicians differently in the evaluation of alveolar infiltrates. Raters 1, 4, and 6 appeared more likely to identify an alveolar infiltrate with access to the clinical information, whereas raters 3 and 5 appeared less likely to identify an alveolar infiltrate. Of the 100 films that were interpreted with and without clinical information, the mean number of discordant interpretations per rater was 10, with values ranging from 6 to 19 for the individual raters. Radiographs in which 3 or more raters changed their interpretation regarding the presence of an alveolar infiltrate are shown in Figure 3. For Figure 3D, 4 radiologists changed their interpretation from no alveolar infiltrate to alveolar infiltrate, and 1 radiologist changed from alveolar infiltrate to no alveolar infiltrate with the addition of clinical history.

Figure 2
Intra‐rater concordance of radiologists before and after access to clinical history while evaluating chest radiographs (n = 100) for alveolar infiltrate in children.

Figure 3
Chest radiographs of children in which 3 or more radiologists changed their interpretation in regard to the presence or absence of an alveolar infiltrate with the addition of clinical information. (A, B, and C) Three of 6 radiologists changed their interpretation. (D) Five of 6 radiologists changed their interpretation. (A) Female, 2 years old. (B) Male, 9 months old. (C) Male, 3 years old. (D) Male, 3 years old. The clinical history provided for (D) read as follows: “3‐year‐old male with cough and difficulty breathing. Rales at left base.”

Comment

We investigated the impact of the availability of clinical information on the reliability of chest radiographic interpretation in the diagnosis of pneumonia. There was improved inter‐rater reliability in the identification of air bronchograms with the addition of clinical information; however, clinical history did not have a substantial impact on the inter‐rater reliability of other findings. The addition of clinical information did not alter the intra‐rater reliability in the identification of alveolar infiltrate. Clinical history affected individual raters differently in their interpretation of alveolar infiltrate, with 3 raters more likely to identify an alveolar infiltrate and 2 raters less likely to identify an alveolar infiltrate.

Most studies addressing the impact of clinical history on radiographic interpretation evaluated accuracy. In many of these studies, accuracy was defined as the raters' agreement with the final interpretation of each film as documented in the medical record or their agreement with the interpretation of the radiologists selecting the cases.[1, 2, 3, 5, 6, 7] Given the known inter‐rater variability in radiographic interpretation,[10, 11, 12, 13, 14, 15] accuracy of a radiologist's interpretation cannot be appropriately assessed through agreement with their peers. Because a true measure of accuracy in the radiographic diagnosis of pneumonia can only be determined through invasive testing, such as lung biopsy, reliability serves as a more appropriate measure of performance. Inclusion of clinical information in chest radiograph interpretation has been shown to improve reliability in the radiographic diagnosis of a broad range of conditions.[15]

The primary outcome in this study was the identification of an infiltrate. Previous studies have noted consistent identification of the radiographic features that are most suggestive of bacterial pneumonia, such as alveolar infiltrate, and less consistent identification of other radiographic findings, including interstitial infiltrate.[18, 26, 27] Among the radiologists in this study, the addition of clinical information did not have a meaningful impact on the reliability of either of these findings, as there was substantial inter‐rater agreement for the identification of alveolar infiltrate and only slight agreement for the identification of interstitial infiltrate, both with and without clinical history. Additionally, intra‐rater reliability for the identification of alveolar infiltrate remained substantial to almost perfect for all 6 raters with the addition of clinical information.

Clinical information impacted the raters differently in their pattern of alveolar infiltrate identification, suggesting that radiologists may differ in their approach to incorporating clinical history in the interpretation of chest radiographs. The inclusion of clinical information may impact a radiologist's perception, leading to improved identification of abnormalities; however, it may also guide their decision making about the relevance of previously identified abnormalities.[28] Some radiologists may use clinical information to support or suggest possible radiographic findings, whereas others may use the information to challenge potential findings. This study did not address the manner in which the individual raters utilized the clinical history. There were also several radiographs in which the clinical information resulted in a change in the identification of an alveolar infiltrate by 3 or more raters, with as many as 5 of 6 raters changing their interpretation for 1 particular radiograph. These changes in identification of an infiltrate suggest that unidentified aspects of a history may influence a rater's interpretation of a radiograph. Nevertheless, these changes did not result in improved reliability, and it is not possible to determine whether they resulted in improved accuracy in interpretation.

This study had several limitations. First, radiographs were purposefully selected to encompass a broad spectrum of radiographic findings. Thus, the prevalence of pneumonia and other abnormal findings was artificially higher than typically observed among a cohort of children for whom pneumonia is considered. Second, the radiologists recruited for this study all practice in an academic children's hospital setting. These factors may limit the generalizability of our findings. However, we would expect these results to be generalizable to pediatric radiologists from other academic institutions. Third, this study does not meet the criteria of a balanced study design as defined by Loy and Irwig.[19] A study is characterized as balanced if half of the radiographs are read with and half without clinical information in each of the 2 reading sessions. The proposed benefit of such a design is to control for possible changes in ability or reporting practices of the raters that may have occurred between study periods. The use of a standardized reporting tool likely minimized changes in reporting practices. Also, it is unlikely that the ability or reporting practices of an experienced radiologist would change over the study period. Fourth, the radiologists interpreted the films outside of their standard workflow and utilized a standardized reporting tool that focused on the presence or absence of pneumonia indicators. These factors may have increased the radiologists' suspicion for pneumonia even in the absence of clinical information. This may have biased the results toward finding no difference in the identification of pneumonia with the addition of detailed clinical history. Thus, the inclusion of clinical information in radiograph interpretation in clinical practice may have greater impact on the identification of these pneumonia indicators than was found in this study.[29] Finally, reliability does not imply accuracy, and it is unknown if changes in the identification of pneumonia indicators led to more accurate interpretation with respect to the clinical or pathologic diagnosis of pneumonia.

In conclusion, we observed high intra‐ and inter‐rater reliability among radiologists in the identification of an alveolar infiltrate, the radiographic finding most suggestive of bacterial pneumonia.[16, 17, 18, 30] The addition of clinical information did not have a substantial impact on the reliability of its identification.

References
  1. Berbaum KS, Franken EA, Dorfman DD, et al. Tentative diagnoses facilitate the detection of diverse lesions in chest radiographs. Invest Radiol. 1986;21(7):532–539.
  2. Berbaum KS, Franken EA, Dorfman DD, Barloon TJ. Influence of clinical history upon detection of nodules and other lesions. Invest Radiol. 1988;23(1):48–55.
  3. Berbaum KS, Franken EA, Dorfman DD, Lueben KR. Influence of clinical history on perception of abnormalities in pediatric radiographs. Acad Radiol. 1994;1(3):217–223.
  4. Song KS, Song HH, Park SH, et al. Impact of clinical history on film interpretation. Yonsei Med J. 1992;33(2):168–172.
  5. Cooperstein LA, Good BC, Eelkema EA, et al. The effect of clinical history on chest radiograph interpretations in a PACS environment. Invest Radiol. 1990;25(6):670–674.
  6. Good BC, Cooperstein LA, DeMarino GB, et al. Does knowledge of the clinical history affect the accuracy of chest radiograph interpretation? AJR Am J Roentgenol. 1990;154(4):709–712.
  7. Quekel LG, Goei R, Kessels AG, Engelshoven JM. Detection of lung cancer on the chest radiograph: impact of previous films, clinical information, double reading, and dual reading. J Clin Epidemiol. 2001;54(11):1146–1150.
  8. Eldevik OP, Dugstad G, Orrison WW, Haughton VM. The effect of clinical bias on the interpretation of myelography and spinal computed tomography. Radiology. 1982;145(1):85–89.
  9. Griscom NT. A suggestion: look at the images first, before you read the history. Radiology. 2002;223(1):9–10.
  10. Albaum MN, Hill LC, Murphy M, et al. Interobserver reliability of the chest radiograph in community‐acquired pneumonia. PORT Investigators. Chest. 1996;110(2):343–350.
  11. Bloomfield FH, Teele RL, Voss M, Knight DB, Harding JE. Inter‐ and intra‐observer variability in the assessment of atelectasis and consolidation in neonatal chest radiographs. Pediatr Radiol. 1999;29(6):459–462.
  12. Gatt ME, Spectre G, Paltiel O, Hiller N, Stalnikowicz R. Chest radiographs in the emergency department: is the radiologist really necessary? Postgrad Med J. 2003;79(930):214–217.
  13. Hopstaken RM, Witbraad T, Engelshoven JM, Dinant GJ. Inter‐observer variation in the interpretation of chest radiographs for pneumonia in community‐acquired lower respiratory tract infections. Clin Radiol. 2004;59(8):743–752.
  14. Novack V, Avnon LS, Smolyakov A, Barnea R, Jotkowitz A, Schlaeffer F. Disagreement in the interpretation of chest radiographs among specialists and clinical outcomes of patients hospitalized with suspected pneumonia. Eur J Intern Med. 2006;17(1):43–47.
  15. Tudor GR, Finlay D, Taub N. An assessment of inter‐observer agreement and accuracy when reporting plain radiographs. Clin Radiol. 1997;52(3):235–238.
  16. Shimol BS, Dagan R, Givon‐Lavi N, et al. Evaluation of the World Health Organization criteria for chest radiographs for pneumonia diagnosis in children. Eur J Pediatr. 2011;171(2):369–374.
  17. Cherian T, Mulholland EK, Carlin JB, et al. Standardized interpretation of paediatric chest radiographs for the diagnosis of pneumonia in epidemiological studies. Bull World Health Organ. 2005;83(5):353–359.
  18. Neuman MI, Lee EY, Bixby S, et al. Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children. J Hosp Med. 2012;7(4):294–298.
  19. Loy CT, Irwig L. Accuracy of diagnostic tests read with and without clinical information: a systematic review. JAMA. 2004;292(13):1602–1609.
  20. Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. In: World Health Organization: Pneumonia Vaccine Trial Investigators' Group. Geneva: Department of Vaccine and Biologics; 2001.
  21. Hansen J, Black S, Shinefield H, et al. Effectiveness of heptavalent pneumococcal conjugate vaccine in children younger than 5 years of age for prevention of pneumonia: updated analysis using World Health Organization standardized interpretation of chest radiographs. Pediatr Infect Dis J. 2006;25(9):779–781.
  22. Landis JR, Koch GG. A one‐way components of variance model for categorical data. Biometrics. 1977;33:671–679.
  23. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–174.
  24. Juurlink DN, Detsky AS. Kappa statistic. CMAJ. 2005;173(1):16.
  25. Kramer MS, Feinstein AR. Clinical biostatistics. LIV. The biostatistics of concordance. Clin Pharmacol Ther. 1981;29(1):111–123.
  26. Bartlett JG, Dowell SF, Mandell LA, File TM, Musher DM, Fine MJ. Practice guidelines for the management of community‐acquired pneumonia in adults. Infectious Diseases Society of America. Clin Infect Dis. 2000;31(2):347–382.
  27. Niederman MS, Mandell LA, Anzueto A, et al. Guidelines for the management of adults with community‐acquired pneumonia. Diagnosis, assessment of severity, antimicrobial therapy, and prevention. Am J Respir Crit Care Med. 2001;163(7):1730–1754.
  28. Berbaum KS, Franken EA. Commentary does clinical history affect perception? Acad Radiol. 2006;13(3):402–403.
  29. Berbaum KS, Franken EA. The effect of clinical history on chest radiograph interpretations in a PACS environment. Invest Radiol. 1991;26(5):512514.
  30. Korppi M, Kiekara O, Heiskanen‐Kosma T, Soimakallio S. Comparison of radiological findings and microbial aetiology of childhood pneumonia. Acta Paediatr. 1993;82(4):360363.
References
  1. Berbaum KS, Franken EA, Dorfman DD, et al. Tentative diagnoses facilitate the detection of diverse lesions in chest radiographs. Invest Radiol. 1986;21(7):532539.
  2. Berbaum KS, Franken EA, Dorfman DD, Barloon TJ. Influence of clinical history upon detection of nodules and other lesions. Invest Radiol. 1988;23(1):4855.
  3. Berbaum KS, Franken EA, Dorfman DD, Lueben KR. Influence of clinical history on perception of abnormalities in pediatric radiographs. Acad Radiol. 1994;1(3):217223.
  4. Song KS, Song HH, Park SH, et al. Impact of clinical history on film interpretation. Yonsei Med J. 1992;33(2):168172.
  5. Cooperstein LA, Good BC, Eelkema EA, et al. The effect of clinical history on chest radiograph interpretations in a PACS environment. Invest Radiol. 1990;25(6):670674.
  6. Good BC, Cooperstein LA, DeMarino GB, et al. Does knowledge of the clinical history affect the accuracy of chest radiograph interpretation?AJR Am J Roentgenol. 1990;154(4):709712.
  7. Quekel LG, Goei R, Kessels AG, Engelshoven JM. Detection of lung cancer on the chest radiograph: impact of previous films, clinical information, double reading, and dual reading. J Clin Epidemiol. 2001;54(11):11461150.
  8. Eldevik OP, Dugstad G, Orrison WW, Haughton VM. The effect of clinical bias on the interpretation of myelography and spinal computed tomography. Radiology. 1982;145(1):8589.
  9. Griscom NT. A suggestion: look at the images first, before you read the history. Radiology. 2002;223(1):910.
  10. Albaum MN, Hill LC, Murphy M, et al. Interobserver reliability of the chest radiograph in community‐acquired pneumonia. PORT Investigators. Chest. 1996;110(2):343350.
  11. Bloomfield FH, Teele RL, Voss M, Knight DB, Harding JE. Inter‐ and intra‐observer variability in the assessment of atelectasis and consolidation in neonatal chest radiographs. Pediatr Radiol. 1999;29(6):459462.
  12. Gatt ME, Spectre G, Paltiel O, Hiller N, Stalnikowicz R. Chest radiographs in the emergency department: is the radiologist really necessary?Postgrad Med J. 2003;79(930):214217.
  13. Hopstaken RM, Witbraad T, Engelshoven JM, Dinant GJ. Inter‐observer variation in the interpretation of chest radiographs for pneumonia in community‐acquired lower respiratory tract infections. Clin Radiol. 2004;59(8):743752.
  14. Novack V, Avnon LS, Smolyakov A, Barnea R, Jotkowitz A, Schlaeffer F. Disagreement in the interpretation of chest radiographs among specialists and clinical outcomes of patients hospitalized with suspected pneumonia. Eur J Intern Med. 2006;17(1):4347.
  15. Tudor GR, Finlay D, Taub N. An assessment of inter‐observer agreement and accuracy when reporting plain radiographs. Clin Radiol. 1997;52(3):235238.
  16. Shimol BS, Dagan R, Givon‐Lavi N, et al. Evaluation of the World Health Organization criteria for chest radiographs for pneumonia diagnosis in children. Eur J Pediatr. 2011;171(2):369374.
  17. Cherian T, Mulholland EK, Carlin JB, et al. Standardized interpretation of paediatric chest radiographs for the diagnosis of pneumonia in epidemiological studies. Bull World Health Organ. 2005;83(5):353359.
  18. Neuman MI, Lee EY, Bixby S, et al. Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children. J Hosp Med. 2012;7(4):294298.
  19. Loy CT, Irwig L. Accuracy of diagnostic tests read with and without clinical information: a systematic review. JAMA. 2004;292(13):16021609.
  20. Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. In:World Health Organization: Pneumonia Vaccine Trial Investigators' Group.Geneva: Department of Vaccine and Biologics;2001.
  21. Hansen J, Black S, Shinefield H, et al. Effectiveness of heptavalent pneumococcal conjugate vaccine in children younger than 5 years of age for prevention of pneumonia: updated analysis using World Health Organization standardized interpretation of chest radiographs. Pediatr Infect Dis J. 2006;25(9):779781.
  22. Landis JR, Koch GG. A one‐way components of variance model for categorical data. Biometrics. 1977;33:671679.
  23. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159174.
  24. Juurlink DN, Detsky AS. Kappa statistic. CMAJ. 2005;173(1):16.
  25. Kramer MS, Feinstein AR. Clinical biostatistics. LIV. The biostatistics of concordance. Clin Pharmacol Ther. 1981;29(1):111123.
  26. Bartlett JG, Dowell SF, Mandell LA, File TM, Musher DM, Fine MJ. Practice guidelines for the management of community‐acquired pneumonia in adults. Infectious Diseases Society of America. Clin Infect Dis. 2000;31(2):347382.
  27. Niederman MS, Mandell LA, Anzueto A, et al. Guidelines for the management of adults with community‐acquired pneumonia. Diagnosis, assessment of severity, antimicrobial therapy, and prevention. Am J Respir Crit Care Med. 2001;163(7):17301754.
  28. Berbaum KS, Franken EA. Commentary does clinical history affect perception?Acad Radiol. 2006;13(3):402403.
  29. Berbaum KS, Franken EA. The effect of clinical history on chest radiograph interpretations in a PACS environment. Invest Radiol. 1991;26(5):512514.
  30. Korppi M, Kiekara O, Heiskanen‐Kosma T, Soimakallio S. Comparison of radiological findings and microbial aetiology of childhood pneumonia. Acta Paediatr. 1993;82(4):360363.
Issue
Journal of Hospital Medicine - 8(7)
Page Number
359-364
Article Source

© 2012 Society of Hospital Medicine

Correspondence Location
Address for correspondence and reprint requests: Samir S. Shah, MD, 3333 Burnet Avenue, ML 9016, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229. E-mail: [email protected]

Reliability of CXR for Pneumonia

Article Type
Changed
Mon, 05/22/2017 - 18:55
Display Headline
Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children

The chest radiograph (CXR) is the most commonly used diagnostic imaging modality in children, and is considered to be the gold standard for the diagnosis of pneumonia. As such, physicians in developed countries rely on chest radiography to establish the diagnosis of pneumonia.[1, 2, 3] However, there are limited data investigating the reliability of this test for the diagnosis of pneumonia in children.[2, 4, 5, 6]

Prior investigations have noted poor overall agreement by emergency medicine, infectious diseases, and pulmonary medicine physicians, and even radiologists, in their interpretation of chest radiographs for the diagnosis of pneumonia.[2, 5, 7, 8, 9, 10] The World Health Organization (WHO) developed criteria to standardize CXR interpretation for the diagnosis of pneumonia in children for use in epidemiologic studies.[11] These standardized definitions of pneumonia have been formally evaluated by the WHO[6] and utilized in epidemiologic studies of vaccine efficacy,[12] but the overall reliability of these radiographic criteria has not been studied outside of these forums.

We conducted this prospective case‐based study to evaluate the reliability of the radiographic diagnosis of pneumonia among children presenting to a pediatric emergency department with clinical suspicion of pneumonia. We were primarily interested in assessing the overall reliability in CXR interpretation for the diagnosis of pneumonia, and identifying which radiographic features of pneumonia were consistently identified by radiologists.

MATERIALS AND METHODS

Study Subjects

We evaluated the reliability of CXR interpretation with respect to the diagnosis of pneumonia among radiologists. Six board‐certified radiologists at 2 academic children's hospitals (Children's Hospital of Philadelphia, Philadelphia, PA [n = 3] and Children's Hospital, Boston, Boston, MA [n = 3]) interpreted the same 110 chest radiographs in a blinded fashion. The radiologists varied with respect to the number of years practicing pediatric radiology (median 8 years, range 3‐36 years). Clinical information such as age, gender, clinical indication for obtaining the radiograph, history, and physical examination findings was not provided. Aside from the study form, which stated the WHO classification scheme for radiographic pneumonia, no other information or training was provided to the radiologists as part of this study.

Radiographs were selected from a population of children presenting to the emergency department at Children's Hospital, Boston, who had a radiograph obtained for concern of pneumonia. From this cohort, we selected children whose radiographs encompassed the spectrum of respiratory disease processes encountered in a pediatric population. The radiographs selected for review included 50 radiographs with a final reading in the medical record without suspicion for pneumonia, and 50 radiographs in which the diagnosis of pneumonia could not be excluded. In the latter group, 25 radiographs had a final reading suggestive of an alveolar infiltrate, and 25 radiographs had a final reading suggestive of an interstitial infiltrate. Ten duplicate radiographs were included to permit assessment of intra‐rater reliability.

Radiograph Interpretation

Radiologists at both sites interpreted the identical 110 radiographs (both anteroposterior [AP] and lateral views for each subject). Digital Imaging and Communications in Medicine (DICOM) images were downloaded from a registry at Children's Hospital, Boston, and were copied to DVDs which were provided to each radiologist. Standardized radiographic imaging software (eFilm Lite [Mississauga, Canada]) was used by each radiologist to view and interpret the radiographs.

Each radiologist completed a study questionnaire for each radiograph interpreted (see Supporting Appendix A in the online version of this article). The questionnaire utilized radiographic descriptors of primary end‐point pneumonia described by the WHO, which were developed to standardize the radiographic diagnosis of pneumonia.[11, 12] The main outcome of interest was the presence or absence of an infiltrate. Among radiographs in which an infiltrate was identified, radiologists selected whether there was an alveolar infiltrate, interstitial infiltrate, or both. An alveolar infiltrate was defined as a dense or fluffy opacity that occupies a portion or whole of a lobe, or of the entire lung, that may or may not contain air bronchograms.[11, 12] An interstitial infiltrate was defined by a lacy pattern involving both lungs, featuring peribronchial thickening and multiple areas of atelectasis.[11, 12] It also included minor patchy infiltrates that were not of sufficient magnitude to constitute consolidation, and small areas of atelectasis that in children may be difficult to distinguish from consolidation. Among interstitial infiltrates, radiologists were asked to distinguish infiltrate from atelectasis. A radiograph classified as having either an alveolar infiltrate or interstitial infiltrate (not atelectasis) was considered to have any infiltrate. Additional findings including air bronchograms, hilar adenopathy, pleural effusion, and location of abnormalities were also recorded.

Statistical Analysis

Inter‐rater reliability was assessed using the kappa statistic to determine the overall agreement between the 6 radiologists for each binary outcome (eg, presence or absence of alveolar infiltrate). To calculate 95% confidence intervals (CI) for kappa statistics with more than 2 raters, we employed a bootstrapping method with 1000 replications of samples equal in size to the study sample, using the kapci program as implemented by STATA software (version 10.1, STATA Corp, College Station, TX). Also, intra‐rater reliability was evaluated by examining the agreement within each radiologist upon review of 10 duplicate radiographs that had been randomly inserted into the case‐mix. We used the benchmarks proposed by Landis and Koch to classify the strength of agreement measured by the kappa statistic, as follows: poor (<0.0); slight (0‐0.20); fair (0.21‐0.40); moderate (0.41‐0.60); substantial (0.61‐0.80); almost perfect (0.81‐1.0).[13]
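The analysis described above can be illustrated with a minimal Python sketch. It is not the kapci program used in the study. It assumes Fleiss' kappa as the multi‐rater estimator and a simple percentile bootstrap over radiographs, neither of which is specified in the text; the function names and the toy data are hypothetical.

```python
import random


def fleiss_kappa(ratings):
    """Fleiss' kappa for binary ratings (assumed estimator, not from the article).

    ratings: list of items; each item is a list of 0/1 calls, one per rater.
    Returns None when every call is identical, because kappa is undefined then.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # Overall share of positive calls, used for the chance-agreement term.
    p_pos = sum(sum(item) for item in ratings) / (n_items * n_raters)
    p_chance = p_pos ** 2 + (1 - p_pos) ** 2
    if p_chance == 1.0:
        return None
    # Observed agreement: share of rater pairs that agree, averaged over items.
    pairs = n_raters * (n_raters - 1)
    p_obs = 0.0
    for item in ratings:
        pos = sum(item)
        neg = n_raters - pos
        p_obs += (pos * (pos - 1) + neg * (neg - 1)) / pairs
    p_obs /= n_items
    return (p_obs - p_chance) / (1 - p_chance)


def bootstrap_ci(ratings, reps=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample items with replacement, same sample size."""
    rng = random.Random(seed)
    n = len(ratings)
    estimates = []
    for _ in range(reps):
        sample = [ratings[rng.randrange(n)] for _ in range(n)]
        kappa = fleiss_kappa(sample)
        if kappa is not None:  # skip resamples where kappa is undefined
            estimates.append(kappa)
    if not estimates:
        return None
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


def landis_koch(kappa):
    """Strength-of-agreement label using the benchmarks quoted in the text."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical toy data: 4 radiographs, each read by 6 raters (1 = finding present).
toy = [[1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
toy_kappa = fleiss_kappa(toy)
toy_label = landis_koch(toy_kappa)
toy_ci = bootstrap_ci(toy)
```

Resampling whole radiographs, rather than individual ratings, keeps each radiograph's 6 readings together, which is one common way to respect the dependence among raters viewing the same image.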

The study was approved by the institutional review boards at Children's Hospital, Boston and Children's Hospital of Philadelphia.

RESULTS

Patient Sample

The sample of 110 radiographs was obtained from 100 children presenting to the emergency department at Children's Hospital, Boston, with concern of pneumonia. These patients ranged in age from 1 week to 19 years (median, 3.5 years; interquartile range [IQR], 1.6‐6.0 years). Fifty (50%) of these patients were male. As stated above, the sample comprised 50 radiographs with a final reading in the medical record without suspicion for pneumonia, and 50 radiographs in which the diagnosis of pneumonia could not be excluded. The 10 duplicate radiographs encompassed a similar spectrum of findings.

Inter‐Rater Reliability

The kappa coefficients of inter‐rater reliability between the radiologists across the 6 clinical measures of interest are displayed in Table 1. As shown, the most reliable measure was that of alveolar infiltrate (Figure 1), which attained a substantial degree of agreement between the radiologists. Two other measures, any infiltrate and pleural effusion, attained moderate reliability, while air bronchograms and hilar adenopathy were each classified as having fair reliability. However, interstitial infiltrate (Figure 2) was found to have the lowest kappa estimate, with a slight degree of reliability. When examining inter‐rater reliability among the radiologists separately from each institution, the pattern of results was similar.

Table 1
Inter‐Rater Reliability of Radiologists (n = 6) Evaluating Chest Radiographs in Children Presenting to the ED With Suspected Pneumonia (n = 100)

All Radiologists (n = 6)    Kappa    95% Confidence Interval
Any infiltrate              0.47     0.39, 0.56
Alveolar infiltrate         0.69     0.60, 0.78
Interstitial infiltrate     0.14     0.05, 0.23
Air bronchograms            0.32     0.24, 0.42
Hilar adenopathy            0.21     0.08, 0.39
Pleural effusion            0.45     0.29, 0.61

Abbreviation: ED, emergency department.

Figure 1
Chest radiograph (anteroposterior [AP] view) of a child with an opacity in the right middle lobe. For this image, all 6 radiologists classified the patient as having an alveolar infiltrate.
Figure 2
Chest radiograph (anteroposterior [AP] view) of a child demonstrating increased interstitial markings which are most prominent in the right middle and left upper lobes. For this image, 4 radiologists classified this radiograph as having an interstitial infiltrate, whereas 2 radiologists classified the patient as not having an interstitial infiltrate.

At least 4 of the 6 radiologists agreed on the presence or absence of an alveolar infiltrate for 95 of the 100 unique CXRs; all 6 radiologists agreed regarding the presence or absence of an alveolar infiltrate in 72 of the 100 unique CXRs. At least 4 of the 6 radiologists agreed on the presence or absence of any infiltrate and interstitial infiltrate 96% and 90% of the time, respectively. All 6 of the radiologists agreed on the presence or absence of any infiltrate and interstitial infiltrate 35% and 27% of the time, respectively.
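The raw agreement counts above can be tallied with a short sketch like the following. The function name and the toy readings are hypothetical, and the study's actual ratings are not reproduced here. With 6 raters and a binary outcome, "at least 4 agree" fails only on a 3‐to‐3 split; unlike kappa, these counts are not corrected for chance agreement.

```python
def agreement_counts(ratings, threshold=4):
    """Count items where at least `threshold` raters agree, and where all agree.

    ratings: list of items; each item is a list of 0/1 calls, one per rater.
    Agreement may be on presence (1) or on absence (0) of the finding.
    """
    n_raters = len(ratings[0])
    at_least = 0
    unanimous = 0
    for item in ratings:
        positives = sum(item)
        # Size of the larger camp, whichever call it made.
        largest_camp = max(positives, n_raters - positives)
        if largest_camp >= threshold:
            at_least += 1
        if largest_camp == n_raters:
            unanimous += 1
    return at_least, unanimous


# Hypothetical toy data: 5 radiographs, 6 raters each.
toy_reads = [
    [1, 1, 1, 1, 1, 1],  # unanimous, present
    [0, 0, 0, 0, 0, 0],  # unanimous, absent
    [1, 1, 1, 1, 0, 0],  # 4 of 6 agree
    [1, 1, 1, 0, 0, 0],  # 3-3 split, no majority of 4
    [0, 0, 0, 0, 0, 1],  # 5 of 6 agree
]
majority_count, unanimous_count = agreement_counts(toy_reads)
```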

Intra‐Rater Reliability

Estimates of intra‐rater reliability on the primary clinical outcomes (alveolar infiltrate, interstitial infiltrate, and any infiltrate) are found in Table 2. Across the 6 raters, the kappa estimates for alveolar infiltrate were all classified as substantial or almost perfect. The kappa estimates for interstitial infiltrate varied widely, ranging from fair to almost perfect, while for any infiltrate, reliability ranged from moderate to almost perfect.

Table 2
Intra‐Rater Reliability of Radiologists Evaluating Chest Radiographs (n = 10) for Pneumonia in Children

                            Kappa    95% Confidence Interval
Any infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   0.60     0.10, 1.00
  Rater 3                   0.80     0.44, 1.00
  Rater 4                   1.00     1.00, 1.00
  Rater 5                   n/a*
  Rater 6                   1.00     1.00, 1.00
Alveolar infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   1.00     1.00, 1.00
  Rater 3                   1.00     1.00, 1.00
  Rater 4                   1.00     1.00, 1.00
  Rater 5                   0.78     0.39, 1.00
  Rater 6                   0.74     0.27, 1.00
Interstitial infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   0.21     -0.43, 0.85
  Rater 3                   0.74     0.27, 1.00
  Rater 4                   n/a†
  Rater 5                   0.58     0.07, 1.00
  Rater 6                   0.62     0.5, 1.00

* Too few response categories were represented to facilitate the calculation of the kappa statistic.
† Both responses are negative for all 10 paired radiographs; kappa cannot be calculated.
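
The n/a entries in the intra‐rater table follow from how kappa is defined: when a rater gives the same response to every radiograph on both readings, chance agreement equals 1 and the statistic has a zero denominator. The sketch below illustrates this with Cohen's kappa for one rater's paired readings; the choice of Cohen's kappa is an assumption, since the text does not name the intra‐rater estimator, and the readings shown are hypothetical.

```python
def cohen_kappa(first_read, second_read):
    """Cohen's kappa for one rater's two binary readings of the same radiographs.

    Returns None when chance agreement is 1 (for example, all readings
    negative on both passes), because kappa cannot be calculated then.
    """
    n = len(first_read)
    # Observed agreement between the two readings.
    p_obs = sum(a == b for a, b in zip(first_read, second_read)) / n
    # Chance agreement from the marginal rate of positive calls on each pass.
    p1 = sum(first_read) / n
    p2 = sum(second_read) / n
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    if p_chance == 1.0:
        return None
    return (p_obs - p_chance) / (1 - p_chance)


# Hypothetical paired readings of 10 duplicate radiographs (1 = finding present).
first = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
second = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
paired_kappa = cohen_kappa(first, second)  # one disagreement out of 10

# All readings negative on both passes: kappa is undefined.
undefined_kappa = cohen_kappa([0] * 10, [0] * 10)
```

With only 10 pairs, a single discordant reading moves kappa substantially, which is consistent with the wide confidence intervals reported for the intra‐rater estimates.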

DISCUSSION

The chest radiograph serves as an integral component of the reference standard for the diagnosis of childhood pneumonia. Few prior studies have assessed the reliability of chest radiograph findings in children.[3, 5, 12, 14, 15] We found a high degree of agreement among radiologists for radiologic findings consistent with bacterial pneumonia when standardized interpretation criteria were applied. In this study, we identified radiographic features of pneumonia, such as alveolar infiltrate and pleural effusion, that were consistently identified by different radiologists reviewing the same radiograph and by the same radiologist reviewing the same radiograph. These data support the notion that radiographic features most suggestive of bacterial pneumonia are consistently identified by radiologists.[16, 17] There was less consistency in the identification of other radiographic findings, such as interstitial infiltrates, air bronchograms, and hilar lymphadenopathy.

Prior studies have found high levels of disagreement among radiologists in the interpretation of chest radiographs.[2, 3, 15, 18] Many of these prior studies emphasized variation in detection of radiographic findings that would not typically alter clinical management. We observed high intra‐rater and inter‐rater reliability among radiologists for the findings of alveolar infiltrate and pleural effusion. These are the radiographic findings most consistent with a bacterial etiologic agent for pneumonia.[19] Other studies have also found that the presence of an alveolar infiltrate is a reliable radiographic finding in children[18] and adults.[7, 9, 10] These findings support the WHO definition of primary endpoint pneumonia for use in epidemiologic studies.[4, 6, 11]

This study also confirms a previous report by Cherian et al. that findings of interstitial infiltrates, which are seen in many children with asthma, reactive airways disease, bronchiolitis, and viral infections, are less reliable.[6] This is not surprising considering that these patients often have radiographic findings due to small airway disease and atelectasis.[19, 20] The differentiation between atelectasis and interstitial infiltrate is difficult, particularly in young children. A prior study conducted among neonates observed wide variability in the interpretation of chest radiographs, and found that the differentiation of pneumonia from atelectasis was difficult for this patient population.[5] Decisions around antimicrobial treatment of children with radiographic findings of interstitial infiltrates should be made in the context of the clinical history and physical examination findings, and clinicians should realize that these radiographic features demonstrate poor reliability for the diagnosis of pneumonia.

Overall reliability for the presence of any infiltrate, and its converse, no infiltrate, was considered moderate. This is driven by the low reliability and variability around the radiographic diagnosis of interstitial infiltrates. Our findings are similar to those observed in adults with lower respiratory tract infections.[9] The low reliability in identification of interstitial infiltrates may explain why prior studies have demonstrated that the CXR results rarely change management in children who have radiographs performed for suspicion of pneumonia.[1, 21] Our study highlights the importance of quantifying CXR findings to include specific comments regarding the presence or absence of alveolar infiltrates, rather than the presence or absence of any infiltrate.

The WHO has developed definitions for the radiographic diagnosis of pneumonia, and these definitions have been utilized to help standardize the interpretation of chest radiographs for the conduct of epidemiological studies.[6, 11] Specifically, the definitions not only define the presence or absence of pneumonia, but also attempt to differentiate a primarily bacterial infection (consolidation or pleural effusion) from a viral or atypical presentation (interstitial pattern). Even under the best of circumstances, the differentiation of viral versus bacterial pneumonia is not always possible, and again, is often made by the treating physician by incorporating the clinical setting within which the radiograph was obtained.

This study had several limitations. Firstly, the included radiographs did not reflect the frequency with which certain radiographic findings would be identified in children evaluated for pneumonia in a pediatric emergency department setting. Radiographs were purposefully selected to encompass a broad spectrum of radiologic findings, including less common findings such as hilar lymphadenopathy and pleural effusions. Thus, the prevalence of pneumonia and other abnormal findings in this study was artificially higher than typically observed among a cohort of children for whom pneumonia is considered, a factor that may limit the generalizability of our results. Secondly, the clinical history was not provided to the radiologists to avoid bias by indication. For this study, we notified the radiologists that all radiographs were performed for clinical suspicion of pneumonia without providing details about the subjects' signs and symptoms. The absence of clinical history, however, does not mirror the real world scenario in which the interpretation of the chest radiograph is frequently made in the context of the clinical history. The relevance of this latter issue is unclear, as Tudor et al. found a nonstatistically significant improvement in the overall accuracy in chest radiograph interpretation when radiologists were provided clinical details.[10] The radiologists recruited for this study all practice in an academic children's hospital setting, and thus, the generalizability of our findings may be limited to this type of practice setting. Finally, reproducibility does not imply accuracy, and reliability in identifying specific findings does not necessarily lead to improved or different management. Thus, while the reliability of radiographic findings of alveolar infiltrate and pleural effusion is reassuringly high, the validity of these radiographic features for bacterial pneumonia is not known. Validity can only be ascertained through the use of invasive testing such as lung biopsy, as the yield from bacterial testing such as blood cultures is low, and the results of other studies such as viral testing of nasopharyngeal washings do not prove an etiologic cause of pneumonia.

CONCLUSIONS

Radiographic findings of alveolar infiltrates and pleural effusions are highly reliable among radiologists. Radiographic interpretation of interstitial infiltrates appears to be less reliable.

References
  1. Alario AJ, McCarthy PL, Markowitz R, et al. Usefulness of chest radiographs in children with acute lower respiratory tract disease. J Pediatr. 1987;111:187-193.
  2. Novack V, Avnon LS, Smolyakov A, et al. Disagreement in the interpretation of chest radiographs among specialists and clinical outcomes of patients hospitalized with suspected pneumonia. Eur J Intern Med. 2006;17:43-47.
  3. Stickler GB, Hoffman AD, Taylor WF. Problems in the clinical and roentgenographic diagnosis of pneumonia in young children. Clin Pediatr (Phila). 1984;23:398-399.
  4. WHO guidelines on detecting pneumonia in children. Lancet. 1991;338:1453-1454.
  5. Bloomfield FH, Teele RL, Voss M, et al. Inter‐ and intra‐observer variability in the assessment of atelectasis and consolidation in neonatal chest radiographs. Pediatr Radiol. 1999;29:459-462.
  6. Cherian T, Mulholland EK, Carlin JB, et al. Standardized interpretation of paediatric chest radiographs for the diagnosis of pneumonia in epidemiological studies. Bull World Health Organ. 2005;83:353-359.
  7. Albaum MN, Hill LC, Murphy M, et al. Interobserver reliability of the chest radiograph in community‐acquired pneumonia. PORT Investigators. Chest. 1996;110:343-350.
  8. Gatt ME, Spectre G, Paltiel O, et al. Chest radiographs in the emergency department: is the radiologist really necessary? Postgrad Med J. 2003;79:214-217.
  9. Hopstaken RM, Witbraad T, van Engelshoven JM, et al. Inter‐observer variation in the interpretation of chest radiographs for pneumonia in community‐acquired lower respiratory tract infections. Clin Radiol. 2004;59:743-752.
  10. Tudor GR, Finlay D, Taub N. An assessment of inter‐observer agreement and accuracy when reporting plain radiographs. Clin Radiol. 1997;52:235-238.
  11. Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. In: World Health Organization: Pneumonia Vaccine Trial Investigators' Group. Geneva: Department of Vaccine and Biologics; 2001.
  12. Hansen J, Black S, Shinefield H, et al. Effectiveness of heptavalent pneumococcal conjugate vaccine in children younger than 5 years of age for prevention of pneumonia: updated analysis using World Health Organization standardized interpretation of chest radiographs. Pediatr Infect Dis J. 2006;25:779-781.
  13. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
  14. Grossman LK, Caplan SE. Clinical, laboratory, and radiological information in the diagnosis of pneumonia in children. Ann Emerg Med. 1988;17:43-46.
  15. Johnson J, Kline JA. Intraobserver and interobserver agreement of the interpretation of pediatric chest radiographs. Emerg Radiol. 17:285-290.
  16. Bartlett JG, Dowell SF, Mandell LA, et al. Practice guidelines for the management of community‐acquired pneumonia in adults. Infectious Diseases Society of America. Clin Infect Dis. 2000;31:347-382.
  17. Niederman MS, Mandell LA, Anzueto A, et al. Guidelines for the management of adults with community‐acquired pneumonia. Diagnosis, assessment of severity, antimicrobial therapy, and prevention. Am J Respir Crit Care Med. 2001;163:1730-1754.
  18. Korppi M, Kiekara O, Heiskanen‐Kosma T, et al. Comparison of radiological findings and microbial aetiology of childhood pneumonia. Acta Paediatr. 1993;82:360-363.
  19. Kuhn JP, Slovis TL, Haller JO, eds. Caffey's Pediatric Diagnostic Imaging. 10th ed. Philadelphia, PA: Mosby; 2004.
  20. Mathews B, Shah S, Cleveland RH, et al. Clinical predictors of pneumonia among children with wheezing. Pediatrics. 2009;124:e29-e36.
  21. Spottswood SE, Liaw K, Hernanz‐Schulman M, et al. The clinical impact of the radiology report in wheezing and nonwheezing febrile children: a survey of clinicians. Pediatr Radiol. 2009;39:348-353.
Issue
Journal of Hospital Medicine - 7(4)
Page Number
294-298

The chest radiograph (CXR) is the most commonly used diagnostic imaging modality in children, and is considered to be the gold standard for the diagnosis of pneumonia. As such, physicians in developed countries rely on chest radiography to establish the diagnosis of pneumonia.13 However, there are limited data investigating the reliability of this test for the diagnosis of pneumonia in children.2, 46

Prior investigations have noted poor overall agreement by emergency medicine, infectious diseases, and pulmonary medicine physicians, and even radiologists, in their interpretation of chest radiographs for the diagnosis of pneumonia.2, 5, 710 The World Health Organization (WHO) developed criteria to standardize CXR interpretation for the diagnosis of pneumonia in children for use in epidemiologic studies.11 These standardized definitions of pneumonia have been formally evaluated by the WHO6 and utilized in epidemiologic studies of vaccine efficacy,12 but the overall reliability of these radiographic criteria have not been studied outside of these forums.

We conducted this prospective case‐based study to evaluate the reliability of the radiographic diagnosis of pneumonia among children presenting to a pediatric emergency department with clinical suspicion of pneumonia. We were primarily interested in assessing the overall reliability in CXR interpretation for the diagnosis of pneumonia, and identifying which radiographic features of pneumonia were consistently identified by radiologists.

MATERIALS AND METHODS

Study Subjects

We evaluated the reliability of CXR interpretation with respect to the diagnosis of pneumonia among radiologists. Six board‐certified radiologists at 2 academic children's hospitals (Children's Hospital of Philadelphia, Philadelphia, PA [n = 3] and Children's Hospital, Boston, Boston, MA [n = 3]) interpreted the same 110 chest radiographs in a blinded fashion. The radiologists varied with respect to the number of years practicing pediatric radiology (median 8 years, range 3‐36 years). Clinical information such as age, gender, clinical indication for obtaining the radiograph, history, and physical examination findings were not provided. Aside from the study form which stated the WHO classification scheme for radiographic pneumonia, no other information or training was provided to the radiologists as part of this study.

Radiographs were selected from a population of children presenting to the emergency department at Children's Hospital, Boston, who had a radiograph obtained for concern of pneumonia. From this cohort, we selected children whose radiographs encompassed the spectrum of respiratory disease processes encountered in a pediatric population. The radiographs selected for review included 50 radiographs with a final reading in the medical record without suspicion for pneumonia, and 50 radiographs in which the diagnosis of pneumonia could not be excluded. In the latter group, 25 radiographs had a final reading suggestive of an alveolar infiltrate, and 25 radiographs had a final reading suggestive of an interstitial infiltrate. Ten duplicate radiographs were included to permit assessment of intra‐rater reliability.

Radiograph Interpretation

Radiologists at both sites interpreted the identical 110 radiographs (both anteroposterior [AP] and lateral views for each subject). Digital Imaging and Communications in Medicine (DICOM) images were downloaded from a registry at Children's Hospital, Boston, and were copied to DVDs which were provided to each radiologist. Standardized radiographic imaging software (eFilm Lite [Mississauga, Canada]) was used by each radiologist to view and interpret the radiographs.

Each radiologist completed a study questionnaire for each radiograph interpreted (see Supporting Appendix A in the online version of this article). The questionnaire utilized radiographic descriptors of primary end‐point pneumonia described by the WHO, which were developed to standardize the radiographic diagnosis of pneumonia.11, 12 The main outcome of interest was the presence or absence of an infiltrate. Among radiographs in which an infiltrate was identified, radiologists selected whether there was an alveolar infiltrate, interstitial infiltrate, or both. An alveolar infiltrate was defined as a dense or fluffy opacity that occupies a portion or whole of a lobe, or of the entire lung, that may or may not contain air bronchograms.11, 12 An interstitial infiltrate was defined by a lacy pattern involving both lungs, featuring peribronchial thickening and multiple areas of atelectasis.11, 12 It also included minor patchy infiltrates that were not of sufficient magnitude to constitute consolidation, and small areas of atelectasis that in children may be difficult to distinguish from consolidation. Among interstitial infiltrates, radiologists were asked to distinguish infiltrate from atelectasis. A radiograph classified as having either an alveolar infiltrate or interstitial infiltrate (not atelectasis) was considered to have any infiltrate. Additional findings including air bronchograms, hilar adenopathy, pleural effusion, and location of abnormalities were also recorded.

Statistical Analysis

Inter‐rater reliability was assessed using the kappa statistic to determine the overall agreement between the 6 radiologists for each binary outcome (ie, presence or absence of alveolar infiltrate). To calculate 95% confidence intervals (CI) for kappa statistics with more than 2 raters, we employed a bootstrapping method with 1000 replications of samples equal in size to the study sample, using the kapci program as implemented by STATA software (version 10.1, STATA Corp, College Station, TX). Also, intra‐rater reliability was evaluated by examining the agreement within each radiologist upon review of 10 duplicate radiographs that had been randomly inserted into the case‐mix. We used the benchmarks proposed by Landis and Koch to classify the strength of agreement measured by the kappa statistic, as follows: poor (<0.0); slight (0‐0.20); fair (0.21‐0.40); moderate (0.41‐0.60); substantial (0.61‐0.80); almost perfect (0.81‐1.0).13
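The analysis described above can be illustrated with a minimal sketch. It assumes a Fleiss‐type multi‐rater kappa for a binary outcome and a simple percentile bootstrap over radiographs; the internals of the kapci program are not described in this article, so the function names (fleiss_kappa, landis_koch, bootstrap_ci) and the percentile method are illustrative assumptions, not the authors' code.

```python
import random


def fleiss_kappa(ratings):
    """Fleiss-type kappa for a binary outcome (assumed formula, not kapci itself).

    ratings: list of rows, one per radiograph; each row holds one 0/1 call per rater.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    per_subject = []
    total_pos = 0
    for row in ratings:
        pos = sum(row)
        neg = n_raters - pos
        total_pos += pos
        # Proportion of rater pairs that agree on this radiograph
        per_subject.append(
            (pos * (pos - 1) + neg * (neg - 1)) / (n_raters * (n_raters - 1))
        )
    p_bar = sum(per_subject) / n_subjects          # observed agreement
    p_pos = total_pos / (n_subjects * n_raters)    # overall share of positive calls
    p_e = p_pos ** 2 + (1 - p_pos) ** 2            # agreement expected by chance
    if p_e == 1.0:
        return float("nan")  # only one category used; kappa is undefined
    return (p_bar - p_e) / (1 - p_e)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch benchmark labels quoted in the text."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def bootstrap_ci(ratings, n_reps=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample radiographs with replacement (assumed method)."""
    rng = random.Random(seed)
    n = len(ratings)
    stats = []
    for _ in range(n_reps):
        sample = [ratings[rng.randrange(n)] for _ in range(n)]
        k = fleiss_kappa(sample)
        if k == k:  # drop NaN replicates where kappa is undefined
            stats.append(k)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper
```

Applied to the kappa values reported in Table 1, landis_koch(0.69) returns "substantial" and landis_koch(0.14) returns "slight", matching the labels used in the Results.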

The study was approved by the institutional review boards at Children's Hospital, Boston and Children's Hospital of Philadelphia.

RESULTS

Patient Sample

The sample of 110 radiographs was obtained from 100 children presenting to the emergency department at Children's Hospital, Boston, with concern of pneumonia. These patients ranged in age from 1 week to 19 years (median, 3.5 years; interquartile range [IQR], 1.6‐6.0 years). Fifty (50%) of these patients were male. As stated above, the sample comprised 50 radiographs with a final reading in the medical record without suspicion for pneumonia, and 50 radiographs in which the diagnosis of pneumonia could not be excluded. The 10 duplicate radiographs encompassed a similar spectrum of findings.

Inter‐Rater Reliability

The kappa coefficients of inter‐rater reliability between the radiologists across the 6 clinical measures of interest are displayed in Table 1. As shown, the most reliable measure was that of alveolar infiltrate (Figure 1), which attained a substantial degree of agreement between the radiologists. Two other measures, any infiltrate and pleural effusion, attained moderate reliability, while air bronchograms and hilar adenopathy were each classified as having fair reliability. However, interstitial infiltrate (Figure 2) was found to have the lowest kappa estimate, with a slight degree of reliability. When examining inter‐rater reliability among the radiologists separately from each institution, the pattern of results was similar.

Table 1. Inter‐Rater Reliability of Radiologists (n = 6) Evaluating Chest Radiographs in Children Presenting to the ED With Suspected Pneumonia (n = 100)

All Radiologists (n = 6)    Kappa    95% Confidence Interval
Any infiltrate              0.47     0.39, 0.56
Alveolar infiltrate         0.69     0.60, 0.78
Interstitial infiltrate     0.14     0.05, 0.23
Air bronchograms            0.32     0.24, 0.42
Hilar adenopathy            0.21     0.08, 0.39
Pleural effusion            0.45     0.29, 0.61

Abbreviation: ED, emergency department.

Figure 1
Chest radiograph (anteroposterior [AP] view) of a child with an opacity in the right middle lobe. For this image, all 6 radiologists classified the patient as having an alveolar infiltrate.
Figure 2
Chest radiograph (anteroposterior [AP] view) of a child demonstrating increased interstitial markings which are most prominent in the right middle and left upper lobes. For this image, 4 radiologists classified this radiograph as having an interstitial infiltrate, whereas 2 radiologists classified the patient as not having an interstitial infiltrate.

At least 4 of the 6 radiologists agreed on the presence or absence of an alveolar infiltrate for 95 of the 100 unique CXRs; all 6 radiologists agreed regarding the presence or absence of an alveolar infiltrate in 72 of the 100 unique CXRs. At least 4 of the 6 radiologists agreed on the presence or absence of any infiltrate and interstitial infiltrate 96% and 90% of the time, respectively. All 6 of the radiologists agreed on the presence or absence of any infiltrate and interstitial infiltrate 35% and 27% of the time, respectively.
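The agreement counts above can be tallied with a short sketch. The helper name agreement_counts and the 0/1 encoding of each radiologist's call are assumptions made for illustration.

```python
def agreement_counts(ratings, threshold=4):
    """Count radiographs where at least `threshold` raters, and where all raters, agree.

    ratings: list of rows, one per radiograph; each row holds one 0/1 call per rater.
    Agreement is on either presence (1) or absence (0) of the finding.
    """
    n_raters = len(ratings[0])
    at_least = 0
    unanimous = 0
    for row in ratings:
        pos = sum(row)
        largest_group = max(pos, n_raters - pos)  # size of the majority side
        if largest_group >= threshold:
            at_least += 1
        if largest_group == n_raters:
            unanimous += 1
    return at_least, unanimous
```

With 6 raters and a binary outcome, the only way to fall short of "at least 4 of 6" is an even 3 to 3 split, so this measure is high by construction and is best read alongside the unanimous count and the kappa estimates.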

Intra‐Rater Reliability

Estimates of intra‐rater reliability on the primary clinical outcomes (alveolar infiltrate, interstitial infiltrate, and any infiltrate) are found in Table 2. Across the 6 raters, the kappa estimates for alveolar infiltrate were all classified as substantial or almost perfect. The kappa estimates for interstitial infiltrate varied widely, ranging from fair to almost perfect, while for any infiltrate, reliability ranged from moderate to almost perfect.

Table 2. Intra‐Rater Reliability of Radiologists Evaluating Chest Radiographs (n = 10) for Pneumonia in Children

                            Kappa    95% Confidence Interval
Any infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   0.60     0.10, 1.00
  Rater 3                   0.80     0.44, 1.00
  Rater 4                   1.00     1.00, 1.00
  Rater 5                   n/a*
  Rater 6                   1.00     1.00, 1.00
Alveolar infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   1.00     1.00, 1.00
  Rater 3                   1.00     1.00, 1.00
  Rater 4                   1.00     1.00, 1.00
  Rater 5                   0.78     0.39, 1.00
  Rater 6                   0.74     0.27, 1.00
Interstitial infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   0.21     -0.43, 0.85
  Rater 3                   0.74     0.27, 1.00
  Rater 4                   n/a†
  Rater 5                   0.58     0.07, 1.00
  Rater 6                   0.62     0.5, 1.00

* Too few response categories were represented to facilitate the calculation of the kappa statistic.
† Both responses are negative for all 10 paired radiographs; kappa cannot be calculated.
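
The table notes state that kappa could not be calculated when a rater's paired readings used too few response categories. A minimal sketch shows why, assuming the intra‐rater estimate is a two‐reading Cohen's kappa (the article does not name the exact estimator, and cohen_kappa is a hypothetical helper): when every paired reading is negative, chance agreement equals 1 and the denominator vanishes.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for one rater's two binary readings of the same radiographs.

    first, second: lists of 0/1 calls of equal length.
    Returns None when chance-expected agreement is 1 (kappa undefined).
    """
    n = len(first)
    p_o = sum(a == b for a, b in zip(first, second)) / n  # observed agreement
    p1 = sum(first) / n                                   # positive rate, first read
    p2 = sum(second) / n                                  # positive rate, second read
    p_e = p1 * p2 + (1 - p1) * (1 - p2)                   # agreement expected by chance
    if p_e == 1.0:
        return None  # e.g. all 10 pairs negative: division by zero
    return (p_o - p_e) / (1 - p_e)


# All 10 duplicate pairs read as negative: kappa cannot be calculated
print(cohen_kappa([0] * 10, [0] * 10))
```

With only 10 duplicate radiographs per rater, a single discordant pair shifts kappa substantially, which is consistent with the wide confidence intervals in the table.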

DISCUSSION

The chest radiograph serves as an integral component of the reference standard for the diagnosis of childhood pneumonia. Few prior studies have assessed the reliability of chest radiograph findings in children.3, 5, 12, 14, 15 We found a high degree of agreement among radiologists for radiologic findings consistent with bacterial pneumonia when standardized interpretation criteria were applied. In this study, we identified radiographic features of pneumonia, such as alveolar infiltrate and pleural effusion, that were consistently identified by different radiologists reviewing the same radiograph and by the same radiologist reviewing the same radiograph. These data support the notion that radiographic features most suggestive of bacterial pneumonia are consistently identified by radiologists.16, 17 There was less consistency in the identification of other radiographic findings, such as interstitial infiltrates, air bronchograms, and hilar lymphadenopathy.

Prior studies have found high levels of disagreement among radiologists in the interpretation of chest radiographs.2, 3, 15, 18 Many of these prior studies emphasized variation in detection of radiographic findings that would not typically alter clinical management. We observed high intra‐rater and inter‐rater reliability among radiologists for the findings of alveolar infiltrate and pleural effusion. These are the radiographic findings most consistent with a bacterial etiologic agent for pneumonia.19 Other studies have also found that the presence of an alveolar infiltrate is a reliable radiographic finding in children18 and adults.7, 9, 10 These findings support the use of the WHO definition of primary endpoint pneumonia in epidemiologic studies.4, 6, 11

This study also confirms a previous report by Cherian et al. that interstitial infiltrates, the findings seen in many children with asthma, reactive airways disease, bronchiolitis, and viral infections, are less reliable.6 This is not surprising considering the fact that these patients often have radiographic findings due to small airway disease and atelectasis.19, 20 The differentiation between atelectasis and interstitial infiltrate is difficult, particularly in young children. A prior study conducted among neonates observed wide variability in the interpretation of chest radiographs, and found that the differentiation of pneumonia from atelectasis was difficult for this patient population.5 The decisions around antimicrobial treatment of children with radiographic findings of interstitial infiltrates should be made in the context of the clinical history and physical examination findings, and clinicians should realize that these radiographic features demonstrate poor reliability for the diagnosis of pneumonia.

Overall reliability for the presence of any infiltrate, and its converse, no infiltrate, was considered moderate. This is driven by the low reliability and variability around the radiographic diagnosis of interstitial infiltrates. Our findings are similar to those observed in adults with lower respiratory tract infections.9 The low reliability in identification of interstitial infiltrates may explain why prior studies have demonstrated that the CXR results rarely change management in children who have radiographs performed for suspicion of pneumonia.1, 21 Our study highlights the importance of quantifying CXR findings to include specific comments regarding the presence or absence of alveolar infiltrates, rather than the presence or absence of any infiltrate.

The WHO has developed definitions for the radiographic diagnosis of pneumonia, and these definitions have been utilized to help standardize the interpretation of chest radiographs for the conduct of epidemiological studies.6, 11 Specifically, the definitions not only define the presence or absence of pneumonia, but also attempt to differentiate a primarily bacterial infection (consolidation or pleural effusion) from a viral or atypical presentation (interstitial pattern). Even under the best of circumstances, the differentiation of viral versus bacterial pneumonia is not always possible, and again, is often made by the treating physician by incorporating the clinical setting within which the radiograph was obtained.

This study had several limitations. Firstly, the included radiographs did not reflect the frequency with which certain radiographic findings would be identified in children evaluated for pneumonia in a pediatric emergency department setting. Radiographs were purposefully selected to encompass a broad spectrum of radiologic findings, including less common findings such as hilar lymphadenopathy and pleural effusions. Thus, the prevalence of pneumonia and other abnormal findings in this study was artificially higher than typically observed among a cohort of children for whom pneumonia is considered, a factor that may limit the generalizability of our results. Secondly, the clinical history was not provided to the radiologists to avoid bias by indication. For this study, we notified the radiologists that all radiographs were performed for clinical suspicion of pneumonia without providing details about the subjects' signs and symptoms. The absence of clinical history, however, does not mirror the real‐world scenario in which the interpretation of the chest radiograph is frequently made in the context of the clinical history. The relevance of this latter issue is unclear, as Tudor et al. found a statistically nonsignificant improvement in the overall accuracy of chest radiograph interpretation when radiologists were provided clinical details.10 The radiologists recruited for this study all practice in an academic children's hospital setting, and thus, the generalizability of our findings may be limited to this type of practice setting. Finally, reproducibility does not imply accuracy, and reliability in identifying specific findings does not necessarily lead to improved or different management. Thus, while the reliability of radiographic findings of alveolar infiltrate and pleural effusion is reassuringly high, the validity of these radiographic features for bacterial pneumonia is not known. Validity can only be ascertained through the use of invasive testing such as lung biopsy, as the yield from bacterial testing such as blood cultures is low, and the results of other studies such as viral testing of nasopharyngeal washings do not prove an etiologic cause of pneumonia.

CONCLUSIONS

Radiographic findings of alveolar infiltrates and pleural effusions are highly reliable among radiologists. Radiographic interpretation of interstitial infiltrates appears to be less reliable.

References
  1. Alario AJ, McCarthy PL, Markowitz R, et al. Usefulness of chest radiographs in children with acute lower respiratory tract disease. J Pediatr. 1987;111:187-193.
  2. Novack V, Avnon LS, Smolyakov A, et al. Disagreement in the interpretation of chest radiographs among specialists and clinical outcomes of patients hospitalized with suspected pneumonia. Eur J Intern Med. 2006;17:43-47.
  3. Stickler GB, Hoffman AD, Taylor WF. Problems in the clinical and roentgenographic diagnosis of pneumonia in young children. Clin Pediatr (Phila). 1984;23:398-399.
  4. WHO guidelines on detecting pneumonia in children. Lancet. 1991;338:1453-1454.
  5. Bloomfield FH, Teele RL, Voss M, et al. Inter‐ and intra‐observer variability in the assessment of atelectasis and consolidation in neonatal chest radiographs. Pediatr Radiol. 1999;29:459-462.
  6. Cherian T, Mulholland EK, Carlin JB, et al. Standardized interpretation of paediatric chest radiographs for the diagnosis of pneumonia in epidemiological studies. Bull World Health Organ. 2005;83:353-359.
  7. Albaum MN, Hill LC, Murphy M, et al. Interobserver reliability of the chest radiograph in community‐acquired pneumonia. PORT Investigators. Chest. 1996;110:343-350.
  8. Gatt ME, Spectre G, Paltiel O, et al. Chest radiographs in the emergency department: is the radiologist really necessary? Postgrad Med J. 2003;79:214-217.
  9. Hopstaken RM, Witbraad T, van Engelshoven JM, et al. Inter‐observer variation in the interpretation of chest radiographs for pneumonia in community‐acquired lower respiratory tract infections. Clin Radiol. 2004;59:743-752.
  10. Tudor GR, Finlay D, Taub N. An assessment of inter‐observer agreement and accuracy when reporting plain radiographs. Clin Radiol. 1997;52:235-238.
  11. Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. In: World Health Organization: Pneumonia Vaccine Trial Investigators' Group. Geneva: Department of Vaccine and Biologics; 2001.
  12. Hansen J, Black S, Shinefield H, et al. Effectiveness of heptavalent pneumococcal conjugate vaccine in children younger than 5 years of age for prevention of pneumonia: updated analysis using World Health Organization standardized interpretation of chest radiographs. Pediatr Infect Dis J. 2006;25:779-781.
  13. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
  14. Grossman LK, Caplan SE. Clinical, laboratory, and radiological information in the diagnosis of pneumonia in children. Ann Emerg Med. 1988;17:43-46.
  15. Johnson J, Kline JA. Intraobserver and interobserver agreement of the interpretation of pediatric chest radiographs. Emerg Radiol. 17:285-290.
  16. Bartlett JG, Dowell SF, Mandell LA, et al. Practice guidelines for the management of community‐acquired pneumonia in adults. Infectious Diseases Society of America. Clin Infect Dis. 2000;31:347-382.
  17. Niederman MS, Mandell LA, Anzueto A, et al. Guidelines for the management of adults with community‐acquired pneumonia. Diagnosis, assessment of severity, antimicrobial therapy, and prevention. Am J Respir Crit Care Med. 2001;163:1730-1754.
  18. Korppi M, Kiekara O, Heiskanen‐Kosma T, et al. Comparison of radiological findings and microbial aetiology of childhood pneumonia. Acta Paediatr. 1993;82:360-363.
  19. Kuhn JP, Slovis TL, Haller JO, eds. Caffey's Pediatric Diagnostic Imaging. 10th ed. Philadelphia, PA: Mosby; 2004.
  20. Mathews B, Shah S, Cleveland RH, et al. Clinical predictors of pneumonia among children with wheezing. Pediatrics. 2009;124:e29-e36.
  21. Spottswood SE, Liaw K, Hernanz‐Schulman M, et al. The clinical impact of the radiology report in wheezing and nonwheezing febrile children: a survey of clinicians. Pediatr Radiol. 2009;39:348-353.
References
  1. Alario AJ,McCarthy PL,Markowitz R, et al.Usefulness of chest radiographs in children with acute lower respiratory tract disease.J Pediatr.1987;111:187193.
  2. Novack V,Avnon LS,Smolyakov A, et al.Disagreement in the interpretation of chest radiographs among specialists and clinical outcomes of patients hospitalized with suspected pneumonia.Eur J Intern Med.2006;17:4347.
  3. Stickler GB,Hoffman AD,Taylor WF.Problems in the clinical and roentgenographic diagnosis of pneumonia in young children.Clin Pediatr (Phila).1984;23:398399.
  4. WHO guidelines on detecting pneumonia in children.Lancet.1991;338:14531454.
  5. Bloomfield FH,Teele RL,Voss M, et al.Inter‐ and intra‐observer variability in the assessment of atelectasis and consolidation in neonatal chest radiographs.Pediatr Radiol.1999;29:459462.
  6. Cherian T,Mulholland EK,Carlin JB, et al.Standardized interpretation of paediatric chest radiographs for the diagnosis of pneumonia in epidemiological studies.Bull World Health Organ.2005;83:353359.
  7. Albaum MN,Hill LC,Murphy M, et al.Interobserver reliability of the chest radiograph in community‐acquired pneumonia. PORT Investigators.Chest.1996;110:343350.
  8. Gatt ME,Spectre G,Paltiel O, et al.Chest radiographs in the emergency department: is the radiologist really necessary?Postgrad Med J.2003;79:214217.
  9. Hopstaken RM,Witbraad T,van Engelshoven JM, et al.Inter‐observer variation in the interpretation of chest radiographs for pneumonia in community‐acquired lower respiratory tract infections.Clin Radiol.2004;59:743752.
  10. Tudor GR,Finlay D,Taub N.An assessment of inter‐observer agreement and accuracy when reporting plain radiographs.Clin Radiol.1997;52:235238.
  11. Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. In:World Health Organization: Pneumonia Vaccine Trial Investigators' Group.Geneva:Department of Vaccine and Biologics;2001.
  12. Hansen J,Black S,Shinefield H, et al.Effectiveness of heptavalent pneumococcal conjugate vaccine in children younger than 5 years of age for prevention of pneumonia: updated analysis using World Health Organization standardized interpretation of chest radiographs.Pediatr Infect Dis J.2006;25:779781.
  13. Landis JR,Koch GG.The measurement of observer agreement for categorical data.Biometrics.1977;33:159174.
  14. Grossman LK,Caplan SE.Clinical, laboratory, and radiological information in the diagnosis of pneumonia in children.Ann Emerg Med.1988;17:4346.
  15. Johnson J,Kline JA.Intraobserver and interobserver agreement of the interpretation of pediatric chest radiographs.Emerg Radiol.17:285290.
  16. Bartlett JG,Dowell SF,Mandell LA, et al.Practice guidelines for the management of community‐acquired pneumonia in adults. Infectious Diseases Society of America.Clin Infect Dis.2000;31:347382.
  17. Niederman MS,Mandell LA,Anzueto A, et al.Guidelines for the management of adults with community‐acquired pneumonia. Diagnosis, assessment of severity, antimicrobial therapy, and prevention.Am J Respir Crit Care Med.2001;163:17301754.
  18. Korppi M,Kiekara O,Heiskanen‐Kosma T, et al.Comparison of radiological findings and microbial aetiology of childhood pneumonia.Acta Paediatr.1993;82:360363.
  19. Kuhn JP, Slovis TL, Haller JO, eds.Caffey's Pediatric Diagnostic Imaging.10th ed.Philadelphia, PA:Mosby;2004.
  20. Mathews B,Shah S,Cleveland RH, et al.Clinical predictors of pneumonia among children with wheezing.Pediatrics.2009;124:e29e36.
  21. Spottswood SE,Liaw K,Hernanz‐Schulman M, et al.The clinical impact of the radiology report in wheezing and nonwheezing febrile children: a survey of clinicians.Pediatr Radiol.2009;39:348353.
Issue
Journal of Hospital Medicine - 7(4)
Page Number
294-298
Display Headline
Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children
Article Source

Copyright © 2011 Society of Hospital Medicine

Correspondence Location
Division of Emergency Medicine, Children's Hospital, Boston, 300 Longwood Ave, Boston, MA 02115