Is an elevated serum transferrin saturation associated with the development of diabetes?
- Diabetes is a common comorbid condition of hemochromatosis and is suggested to be a complication of untreated hemochromatosis.
- Diabetes does not seem to be a complication of hemochromatosis.
- Screening for and treatment of hemochromatosis are justified for several complications but not indicated as a way to prevent development of diabetes.
Hemochromatosis is an autosomal recessive abnormality of iron regulation that results in excessive intestinal absorption and cellular deposition of iron.1 Although hemochromatosis was once thought to be rare, many screening studies have established that it is among the most common inherited metabolic abnormalities.2-6 The College of American Pathologists has recommended population-based screening for hemochromatosis with the use of the serum transferrin saturation level.7 Although prevalent in the population, hemochromatosis is rarely diagnosed.8 The pathologic iron accumulation resulting in hemochromatosis affects many organs including the liver, pancreas, and heart.9-12 Because primary hemochromatosis is a common comorbid condition with diabetes,13-15 most work on the relation between hemochromatosis and diabetes has focused on screening patients with diabetes for hemochromatosis.16,17
Clinical reviews have stated that, because diabetes is a serious complication of hemochromatosis, screening patients without diabetes for hemochromatosis might be a useful strategy to decrease the likelihood that they will develop diabetes.15,18-20 However, there are few primary data to support this contention. There is some evidence to indicate that hemochromatosis has the pathogenic features of impaired insulin secretion and insulin resistance due to iron accumulation in the liver.21 One study indicated that, in individuals with hemochromatosis but neither cirrhosis nor diabetes (n = 7), phlebotomy treatment normalizes serum ferritin levels, acute insulin response to glucose, and glucose tolerance.22 In patients with hemochromatosis and newly diagnosed diabetes, phlebotomy did not affect glucose tolerance or insulin resistance. In a nationally representative cohort, we examined the likelihood that patients with an elevated serum transferrin saturation rate but no current diagnosis of diabetes would develop diabetes during 20 years of follow-up.
Methods
This retrospective cohort study followed individuals without a diagnosis of diabetes, aged 25 to 74 years at the time of the index interview. We used the National Health and Nutrition Examination Survey I (1971–1974; NHANES I) merged with the NHANES I Epidemiologic Followup Study (1992; NHEFS).
The NHANES I was conducted between 1971 and 1975 and allowed for representative estimates of the non-institutionalized civilian US population. The NHEFS is a national longitudinal study of individuals assessed at the NHANES I baseline. The NHEFS initial population included the 14,407 participants who were 25 to 74 years of age when first examined in NHANES I. More than 98% of the individuals in the initial NHANES I cohort were traced and supplied data for the NHEFS.
The follow-up information was gathered in 3 ways. Surviving subjects were interviewed. If the subject was deceased or alive but incapacitated, a slightly modified version of the subject questionnaire was administered to a proxy respondent. For individuals who had died in the period between the NHANES I index interview and the follow-up interview, information from death certificates was recorded. A total of 1,681 proxy respondents were interviewed in the NHEFS.
Serum transferrin saturation was measured in the original NHANES I. We defined elevated serum transferrin saturation as greater than 45%, greater than 50%, greater than 55%, greater than 60%, or greater than 62%. All of these cutoff values had previously been proposed or used in population-based studies of elevated serum transferrin saturation.4,5,23
Diabetes was operationalized as a positive response to the question, "Has a doctor ever told you that you have diabetes?" This question was asked in the original NHANES I and in each wave of the follow-up survey (1982–1984, 1986, 1987, and 1992). For individuals who could not participate, proxy respondents were queried. For individuals who died before the follow-up survey, we operationalized the development of diabetes as an ICD-9 diagnosis of 250.XX for underlying cause of death or any of the 20 other diagnoses listed on the death certificate.
We also assessed risk factors for diabetes available in the NHANES I, including obesity represented by a body mass index greater than 27 kg/m², race, sex, age, physician diagnosis of hypertension, and total serum cholesterol above 240 mg/dL as a way to increase our understanding of diabetes mellitus as a consequence of hemochromatosis.
Our index sample was limited to men and women 25 to 74 years of age in the NHANES I, who had a serum transferrin saturation rate recorded in the NHANES I, did not have diabetes at the initial index interview, and had information on the development of diabetes (n = 9724).
Data analysis
We used sampling weights to calculate prevalence estimates for the civilian noninstitutionalized US population. Because of the complex sampling design of the survey, we performed all analyses with SUDAAN.24
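As an illustration of what a sampling-weighted prevalence estimate involves, the following minimal Python sketch computes a weighted proportion. It is not the SUDAAN procedure used in the study: the function name and the example values are hypothetical, and the sketch does not reproduce the design-based variance estimation that a complex survey design requires.

```python
def weighted_prevalence(flags, weights):
    """Return the weighted proportion of records whose flag is true.

    flags   -- iterable of booleans (e.g., elevated transferrin saturation)
    weights -- iterable of positive sampling weights, one per record
    Both names are hypothetical; they are not NHANES I variable names.
    """
    flags = list(flags)
    weights = list(weights)
    if len(flags) != len(weights):
        raise ValueError("flags and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("the sum of the weights must be positive")
    # Each record contributes its weight to the numerator only if flagged.
    flagged = sum(w for f, w in zip(flags, weights) if f)
    return flagged / total


# Hypothetical example: three respondents, the last one representing
# twice as many people in the population as each of the others.
example = weighted_prevalence([True, False, False], [1.0, 1.0, 2.0])
```

A point estimate of this kind corresponds to the percentages reported for the population; standard errors and confidence intervals need software that accounts for the stratified, clustered sample.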
We initially computed unadjusted estimates of the likelihood of development of diabetes for different levels of elevated serum transferrin saturation between 1971 and 1974. We attempted to compute analyses for serum transferrin saturation levels of 60% and 62%, but the number of people who developed diabetes with those levels was so small (n
Because we could not determine whether individuals with elevated serum transferrin saturation received treatment for hemochromatosis during the time of the study, we computed a series of analyses assuming that different proportions of individuals received treatment during the time frame. Some national evidence has suggested that few individuals are diagnosed with hemochromatosis. In fact, in the 1996, 1997, and 1998 National Ambulatory Medical Care Surveys, there were 7 visits for hemochromatosis of 64,001 total evaluated visits (0.01% of visits).8 We recomputed the adjusted odds ratios for the samples of people with transferrin saturation rates greater than 45% and 50% after randomly selecting 10% of each group and treating those individuals as though they were undergoing therapeutic phlebotomies.
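The random-selection step of this sensitivity analysis can be sketched as follows. The text does not state how the selected individuals were handled in the models, so this sketch assumes, purely for illustration, that they are reclassified as not having elevated transferrin saturation before the odds ratios are recomputed. The function name, the fixed seed, and the example data are hypothetical.

```python
import random


def reassign_treated(elevated, fraction=0.10, seed=0):
    """Randomly pick a fraction of the elevated group and mark it as treated.

    elevated -- list of booleans, True if transferrin saturation exceeds
                the chosen cutoff at baseline
    fraction -- assumed proportion of the elevated group receiving treatment
    seed     -- fixed seed so the random selection is reproducible

    Assumption (not stated in the source): treated individuals are
    reclassified as not elevated in the returned list.
    """
    rng = random.Random(seed)
    elevated_idx = [i for i, e in enumerate(elevated) if e]
    n_treated = round(len(elevated_idx) * fraction)
    treated = set(rng.sample(elevated_idx, n_treated))
    return [e and i not in treated for i, e in enumerate(elevated)]


# Hypothetical cohort: 20 people above the cutoff, 80 below it.
baseline = [True] * 20 + [False] * 80
after_treatment = reassign_treated(baseline)
```

Under a different assumption, for example excluding the selected individuals from the analysis, only the final list comprehension would change.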
Results
Table 1 presents the baseline characteristics of the population and the characteristics measured during follow-up data collection. A substantial proportion of the adult population had elevated serum transferrin saturation greater than 45%, 50%, and 55%. The incidence of diagnosed diabetes in the cohort was 10.2%.
Among individuals with serum transferrin saturation levels greater than 45% at the NHANES I baseline, 8.9% developed diagnosed diabetes compared with 10.3% who did not have elevated serum transferrin saturation (P = .44). Similarly, among individuals with serum transferrin saturation levels greater than 50% at the NHANES I baseline, 8.1% developed diagnosed diabetes compared with 10.3% who did not have elevated serum transferrin saturation (P = .34); of individuals with transferrin saturation levels greater than 55%, 7.5% developed diabetes compared with 10.2% of those without elevated serum transferrin saturation (P = .38). Table 2 indicates that individuals with elevated transferrin saturation levels are not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. The lack of a significant relation is present in unadjusted and adjusted analyses.
When we reestimated the models assuming that 10% of the population with elevated serum transferrin saturation (> 45%) were successfully treated, individuals with elevated serum transferrin saturation were not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. This relation held in unadjusted and adjusted analyses. When we assumed that 10% of the population with elevated serum transferrin saturation at greater than 50% and greater than 55% were successfully treated, individuals with elevated serum transferrin saturation were not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. These relations remained consistent in unadjusted and adjusted analyses.
TABLE 1
Baseline characteristics of the population collected in National Health and Nutrition Examination Survey I
Characteristic | Value |
---|---|
Male sex | 47.7% |
Race | |
European American | 90.7% |
African American | 8.7% |
Other | 0.6% |
Age, mean ± standard error (y) | 47.1 ± 0.2 |
Obesity (body mass index ≥ 27) | 32.3% |
Transferrin saturation (cumulative) | |
> 45% | 8.0% |
> 50% | 4.6% |
> 55% | 2.7% |
> 60% | 1.7% |
> 62% | 1.4% |
Total serum cholesterol (>240 mg/dL) | 32.0% |
Hypertension (informed by physician) | 13.6% |
Collected in NHEFS follow-up | |
Developed diabetes | 10.2% |
NHEFS, National Health and Nutrition Examination Survey I (NHANES I) merged with the NHANES I Epidemiologic Followup Study.
TABLE 2
Unadjusted and adjusted odds of developing diabetes with an elevated serum transferrin saturation level
Transferrin saturation | Unadjusted OR (CI) | Adjusted* OR (CI) |
---|---|---|
Original data | | |
> 45% | 1.17 (0.78–1.75) | 0.89 (0.59–1.34) |
> 50% | 1.29 (0.73–2.29) | 0.95 (0.53–1.70) |
> 55% | 1.40 (0.60–3.27) | 1.03 (0.44–2.43) |
Assuming 10% treatment | | |
> 45% | 1.23 (0.8–1.9) | 0.94 (0.61–1.47) |
> 50% | 1.29 (0.7–2.39) | 0.96 (0.52–1.79) |
> 55% | 1.41 (0.56–3.58) | 1.05 (0.41–2.67) |
*Controlling for age, sex, race, hypercholesterolemia, obesity, and hypertension.
CI, 95% confidence interval; OR, odds ratio.
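For readers less familiar with the statistics in Table 2, the sketch below shows how an unadjusted odds ratio and an approximate 95% confidence interval can be computed from a 2 × 2 table with the standard log (Woolf) method. It is a simplified illustration only: it ignores the sampling weights and survey design handled by SUDAAN, so it would not reproduce the published values, and the counts in the example are hypothetical rather than taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with an approximate confidence interval.

    a -- exposed individuals who developed the outcome
    b -- exposed individuals who did not
    c -- unexposed individuals who developed the outcome
    d -- unexposed individuals who did not
    z -- normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to show the call.
or_value, ci_low, ci_high = odds_ratio_ci(10, 90, 10, 90)
```

An interval that includes 1, as every interval in Table 2 does, indicates no statistically significant difference in the odds of developing diabetes between the groups.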
Discussion
The findings of this study call into question the commonly held assumption that there is a causative relation between the presence of hemochromatosis and the subsequent development of diabetes mellitus. Although diabetes is a common comorbid condition with hemochromatosis,13,14 this may be due to the fact that both conditions are relatively common, not that one disease leads to the development of the other. In this longitudinal analysis, even when examining the likelihood of developing diabetes at different levels of transferrin saturation, the findings suggested that hemochromatosis does not lead to diabetes.
Could the findings of the current study be explained by the fact that people were treated for hemochromatosis, thus reducing the subsequent development of complications such as diabetes? This seems unlikely because few people with hemochromatosis are routinely identified, and even fewer are treated on a chronic basis. Indeed, the fact that few people with hemochromatosis are diagnosed and treated is the rationale for recent recommendations for screening asymptomatic persons. Further, in unadjusted and adjusted analyses of the current study, people with elevated transferrin saturation were no more likely to develop diabetes than people without elevated transferrin saturation, even after assuming that 10% of the population with elevated transferrin saturation (> 45%, > 50%, and > 55%) were successfully treated.
The findings of this study have implications for whether screening for hemochromatosis is worthwhile, assuming that prevention of diabetes is a goal. Hemochromatosis has many characteristics that make it attractive for screening: the disorder is common, it possesses a long asymptomatic phase, a simple screening test is available, and treatment is effective. To be a reasonable candidate for screening, the condition also needs to cause substantial morbidity or mortality, and treatment in the asymptomatic phase should be more effective than treatment initiated after the onset of symptoms.25 The findings of this study suggested that screening for and treatment of hemochromatosis are not worthwhile as a way to prevent diabetes. However, the relation between hemochromatosis, treatment, and the development of cirrhosis or hepatocellular carcinoma may warrant screening for hemochromatosis. There is preliminary evidence from an observational study that diagnosing patients with hemochromatosis in the precirrhotic stage and treating them with phlebotomy results in a normal life expectancy, whereas those diagnosed with hemochromatosis and cirrhosis have a shortened life expectancy and a high risk of liver cancer, even when iron depletion has been achieved.10
Our study had several limitations. First, the estimate from the NHANES I was based on an elevated serum transferrin saturation level. This is an appropriate first step in a diagnosis of hemochromatosis. Some investigators have recommended that elevated serum transferrin saturation levels should be confirmed with a second fasting level or an elevated ferritin level.26,27 Further, we did not have access to data from liver biopsy, which is considered the gold standard for diagnosing hemochromatosis. Thus, a single elevated transferrin saturation level may have resulted in overestimates of the prevalence of hemochromatosis in the study population. Second, the estimate also might have been affected by the use of the lower levels of serum transferrin saturation (> 45%, > 50%, or > 55%). Although using a more stringent level, eg, greater than 60%, might have strengthened the conclusions, so few people with this level developed diabetes that we could not accurately make a population estimate.
In summary, diabetes does not seem to be a likely complication of hemochromatosis as indicated by the presence of an elevated serum transferrin saturation. Consequently, cost-effectiveness models of screening for hereditary hemochromatosis may need to be reevaluated.
1. Feder JN, Penny DM, Irrinki A, et al. The hemochromatosis gene product complexes with the transferrin receptor and lowers its affinity for ligand binding. Proc Natl Acad Sci 1998;95:1472-7.
2. Looker AC, Johnson CL. Prevalence of elevated serum transferrin saturation in adults in the United States. Ann Intern Med 1998;129:940-5.
3. Baer DM, Simons JL, Staples RL, Rumore GJ, Morton CJ. Hemochromatosis screening in asymptomatic ambulatory men 30 years of age and older. Am J Med 1995;98:464-8.
4. McDonnell SM, Hover A, Gloe D, Ou C, Cogswell ME, Grummer-Strawn L. Population-based screening for hemochromatosis using phenotypic and DNA testing among employees of health maintenance organizations in Springfield, Missouri. Am J Med 1999;107:30-7.
5. Edwards CQ, Griffen LM, Goldgar D, Drummond C, Skolnick MH, Kushner JP. Prevalence of hemochromatosis among 11,065 presumably healthy blood donors. N Engl J Med 1988;318:1355-62.
6. Leggett BA, Halliday JW, Brown NN, Bryant S, Powell LW. Prevalence of haemochromatosis amongst asymptomatic Australians. Br J Haematol 1990;74:525-30.
7. Witte DL, Crosby WH, Edwards CQ, Fairbanks VF, Mitros FA. Practice guideline development task force of the College of American Pathologists. Hereditary hemochromatosis. Clin Chim Acta 1996;245:139-200.
8. Mainous AG III, Gill JM, Pearson WS. Should we screen for hemochromatosis? An examination of downstream effects on morbidity and mortality. Arch Intern Med 2002;162:1769-74.
9. Cogswell ME, McDonnell SM, Khoury MJ, Franks AL, Burke W, Brittenham G. Iron overload, public health, and genetics: evaluating the evidence for hemochromatosis screening. Ann Intern Med 1998;129:971-9.
10. Niederau C, Fischer R, Sonnenburg A, Stremmel W, Trampisch HJ, Strohmeyer G. Survival and causes of death in cirrhotic and noncirrhotic patients with primary hemochromatosis. N Engl J Med 1985;313:1256-62.
11. Adams PC, Speechley M, Kertesz AE. Long term survival analysis in hereditary hemochromatosis. Gastroenterology 1991;101:368-72.
12. Flynn D, Fairney A, Jackson D, Clayton B. Hormonal changes in thalassemia major. Arch Dis Child 1976;51:828-36.
13. Yang Q, McDonnell SM, Khoury MJ, Cono J, Parrish RG. Hemochromatosis associated mortality in the United States from 1979 to 1992: an analysis of multiple-cause mortality data. Ann Intern Med 1998;129:946-53.
14. Buysschaert M, Paris I, Selvais P, Hermans MP. Clinical aspects of diabetes secondary to idiopathic haemochromatosis in French speaking Belgium. Diabetes Metab 1997;23:308-13.
15. Powell LW, Jazwinska E, Halliday JW. Primary iron overload. In: Brock JH, Halliday JW, Pippard MJ, et al, eds. Iron Metabolism in Health and Disease. London: Saunders; 1994;228-70.
16. George DK, Evans RM, Crofton RW, Gunn IR. Testing for haemochromatosis in the diabetic clinic. Ann Clin Biochem 1995;32:521-6.
17. O’Brien T, Barrett B, Murray DM, Dinneen S, O’Sullivan DJ. Usefulness of biochemical screening of diabetic patients for hemochromatosis. Diabetes Care 1990;13:532-4.
18. Yaouanq JM. Diabetes and haemochromatosis: current concepts, management and prevention. Diabetes Metab 1995;21:319-29.
19. Bothwell TH, MacPhail AP. Hereditary hemochromatosis: etiologic, pathologic, and clinical aspects. Semin Hematol 1998;35:55-71.
20. Ober KP. Polyendocrine syndromes. In: Leahy JL, Clark NG, Cefalu WT, eds. Medical Management of Diabetes Mellitus. New York: Marcel Dekker; 2000;699-717.
21. Stremmel W, Niederau C, Berger M, Kley HK, Kruskemper HL, Strohmeyer G. Abnormalities in estrogen, androgen, and insulin metabolism in idiopathic hemochromatosis. Ann N Y Acad Sci 1988;526:209-23.
22. Hramiak IM, Finegood DT, Adams PC. Factors affecting glucose tolerance in hereditary hemochromatosis. Clin Invest Med 1997;20:110-8.
23. Edwards CQ, Kushner JP. Screening for hemochromatosis. N Engl J Med 1993;328:1616-20.
24. Shah BV, Barnwell BG, Hunt PN, LaVange LM. SUDAAN User’s Manual. Release 5.50. Research Triangle Park, NC: Research Triangle Institute; 1991.
25. McDonnell SM, Phatak PD, Felitti V, Hover A, McLaren GD. Screening for hemochromatosis in primary care settings. Ann Intern Med 1998;129:962-70.
26. Balan V, Baldus W, Fairbanks V, Michels V, Burritt M, Klee G. Screening for hemochromatosis: a cost effectiveness study based on 12,258 patients. Gastroenterology 1994;107:453-9.
27. Karlsson M, Ikkala E, Reunanen A, Takkunen H, Vuori E, Makinen J. Prevalence of hemochromatosis in Finland. Acta Med Scand 1988;224:385-90.
- Diabetes is a common comorbid condition of hemochromatosis and is suggested to be a complication of untreated hemochromatosis.
- Diabetes does not seem to be a complication of hemochromatosis.
- Screening for and treatment of hemochromatosis are justified for several complications but not indicated as a way to prevent development of diabetes.
Hemochromatosis is an autosomal recessive abnormality of iron regulation that results in excessive intestinal absorption and cellular deposition of iron.1 Although hemochromatosis was once thought to be rare, many screening studies have established that it is among the most common inherited metabolic abnormalities.2-6 The College of American Pathologists has recommended population-based screening for hemochromatosis with the use of the serum transferrin saturation level.7 Although prevalent in the population, hemochromatosis is rarely diagnosed.8 The pathologic iron accumulation resulting in hemochromatosis affects many organs including the liver, pancreas, and heart.9-12 Because primary hemochromatosis is a common comorbid condition with diabetes,13-15 most work on the relation between hemochromatosis and diabetes has focused on screening patients with diabetes for hemochromatosis.16,17
Clinical reviews have stated that, because diabetes is a serious complication of hemochromatosis, screening patients without diabetes for hemochromatosis might be a useful strategy to decrease the likelihood that they will develop diabetes.15,18-20 However, there are few primary data to support this contention. There is some evidence to indicate that hemochromatosis has the pathogenic features of impaired insulin secretion and insulin resistance due to iron accumulation in the liver.21 One study indicated that, in individuals with hemochromatosis but neither cirrhosis nor diabetes (n = 7), phlebotomy treatment normalizes serum ferritin levels, acute insulin response to glucose, and glucose tolerance.22 In patients with hemochromatosis and newly diagnosed diabetes, phlebotomy did not affect glucose tolerance or insulin resistance. In a nationally representative cohort, we examined the likelihood that patients with an elevated serum transferrin saturation rate but no current diagnosis of diabetes would develop diabetes during 20 years of follow-up.
Methods
This retrospective cohort study followed individuals without a diagnosis of diabetes, aged 25 to 74 years at the time of the index interview. We used the National Health and Nutrition Examination Survey I (1971–1974; NHANES I) merged with the NHANES I Epidemiologic Followup Study (1992; NHEFS).
The NHANES I was conducted between 1971 and 1975 and allowed for representative estimates of the non-institutionalized civilian US population. The NHEFS is a national longitudinal study of individuals assessed at the NHANES I baseline. The NHEFS initial population included the 14,407 participants who were 25 to 74 years of age when first examined in NHANES I. More than 98% of the individuals in the initial NHANES I cohort were traced and supplied data for the NHEFS.
The follow-up information was gathered in 3 ways. Surviving subjects were interviewed. If the subject was deceased or alive but incapacitated, a slightly modified version of the subject questionnaire was administered to a proxy respondent. For individuals who had died in the period between the NHANES I index interview and the follow-up interview, information from death certificates was recorded. A total of 1,681 proxy respondents was interviewed in the NHEFS.
Serum transferrin saturation was measured in the original NHANES I. We defined elevated serum transferrin saturation as greater than 45%, greater than 50%, greater than 55%, greater than 60%, or greater than 62%. All of these cutoff values had previously been proposed or used in population-based studies of elevated serum transferrin saturation.4,5,23
Diabetes was operationalized as a positive response to the question, Has a doctor ever told you that you have diabetes? This question was asked in the original NHANES I and in each wave of the follow-up survey (1982–1984, 1986, 1987, and 1992). For individuals who could not participate, proxy respondents were queried. In terms of individuals who died before the follow-up survey, we operationalized the development of diabetes as an ICD-9 diagnosis of 250.XX for underlying cause of death or any of the 20 other diagnoses listed on the death certificate.
We also assessed risk factors for diabetes available in the NHANES I, including obesity represented by a body mass index greater than 27 kg/m 2 , race, sex, age, physician diagnosis of hypertension, and total serum cholesterol above 240 mg/dL as a way to increase our understanding of diabetes mellitus as a consequence of hemochromatosis.
Our index sample was limited to men and women 25 to 74 years of age in the NHANES I, who had a serum transferrin saturation rate recorded in the NHANES I, did not have diabetes at the initial index interview, and had information on the development of diabetes (n = 9724).
Data analysis
We used sampling weights to calculate prevalence estimates for the civilian noninstitutionalized US population. Because of the complex sampling design of the survey, we performed all analyses with SUDAAN.24
We initially computed unadjusted estimates of the likelihood of development of diabetes for different levels of elevated serum transferrin saturation between 1971 and 1974. We attempted to compute analyses for serum transferrin saturation levels of 60% and 62%, but the number of people who developed diabetes with those levels was so small (n
Because we could not determine whether individuals with elevated serum transferrin saturation received treatment for hemochromatosis during the time of the study, we computed a series of analyses assuming that different proportions of individuals received treatment during the time frame. Some national evidence has suggested that few individuals are diagnosed with hemochromatosis. In fact, in the 1996, 1997, and 1998 National Ambulatory Medical Care Surveys, there were 7 visits for hemochromatosis of 64,001 total evaluated visits (0.01% of visits).8 We recomputed the adjusted odds ratios for the samples of people with transferrin saturation rates greater than 45% and 50% after randomly selecting 10% of each group and treating those individuals as though they were undergoing therapeutic phlebotomies.
Results
Table 1 presents the characteristics of the population with baseline characteristics and characteristics measured in the follow-up data collection. A substantial proportion of the adult population had elevated serum transferrin saturation greater than 45%, 50%, and 55%. The incidence of diagnosed diabetes in the cohort was 10.2%.
Among individuals with serum transferrin saturation levels greater than 45% at the NHANES I baseline, 8.9% developed diagnosed diabetes compared with 10.3% who did not have elevated serum transferrin saturation (P = .44). Similarly, among individuals with serum transferrin saturation levels greater than 50% at the NHANES I baseline, 8.1% developed diagnosed diabetes compared with 10.3% who did not have elevated serum transferrin saturation (P = .34); of individuals with transferrin saturation levels greater than 55%, 7.5% developed diabetes compared with 10.2% of those without elevated serum transferrin saturation (P = .38). Table 2 indicates that individuals with elevated transferrin saturation levels are not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. The lack of a significant relation is present in unadjusted and adjusted analyses.
When we reestimated the models assuming that 10% of the population with elevated serum transferrin saturation (> 45%) were successfully treated, individuals with elevated serum transferrin saturation were not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. This relation held in unadjusted and adjusted analyses. When we assumed that 10% of the population with elevated serum transferrin saturation at greater than 50% and greater than 55% were successfully treated, individuals with elevated serum transferrin saturation were not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. These relations remained consistent in unadjusted and adjusted analyses.
TABLE 1
Baseline characteristics of the population collected in National Health and Nutrition Examination Survey I
Characteristic | Value |
---|---|
Male sex | 47.7% |
Race | |
European American | 90.7% |
African American | 8.7% |
Other | 0.6% |
Age, mean ± standard error (y) | 47.1 ± 0.2 |
Obesity (body mass index ≥ 27) | 32.3% |
Transferrin saturation (cumulative) | |
> 45% | 8.0% |
> 50% | 4.6% |
> 55% | 2.7% |
> 60% | 1.7% |
> 62% | 1.4% |
Total serum cholesterol (>240 mg/dL) | 32.0% |
Hypertension (informed by physician) | 13.6% |
Collected in NHFES follow-up | |
Developed diabetes | 10.2% |
NHFES, National Health and Nutrition Examination Survey I (NHANES I) merged with the NHANES I Epidemiologic Followup Study. |
TABLE 2
Unadjusted and adjusted odds of developing diabetes with an elevated serum transferrin saturation level
Transferrin saturation | Unadjusted OR (CI) | Adjusted* OR (CI) |
---|---|---|
Original data | ||
> 45% | 1.17 (0.78–1.75) | 0.89 (0.59–1.34) |
> 50% | 1.29 (0.73–2.29) | 0.95 (0.53–1.70) |
> 55% | 1.40 (0.60–3.27) | 1.03 (0.44–2.43) |
Assuming 10% treatment | ||
> 45% | 1.23 (0.8–1.9) | 0.94 (0.61–1.47) |
> 50% | 1.29 (0.7–2.39) | 0.96 (0.52–1.79) |
> 55% | 1.41 (0.56–3.58) | 1.05 (0.41–2.67) |
*Controlling for age, sex, race, hypercholesterolemia, obesity, and hypertension. | ||
CI, 95% confidence interval; OR, odds ratio. |
Discussion
The findings of this study call into question the commonly held assumption that there is a causative relation between the presence of hemochromatosis and the subsequent development of diabetes mellitus. Although diabetes is a common comorbid condition with hemochromatosis,13,14 this may be due to the fact that both conditions are relatively common, not that one disease leads to the development of the other. In this longitudinal analysis, even when examining the likelihood of developing diabetes at different levels of transferrin saturation, the findings suggested that hemochromatosis does not lead to diabetes.
Could the findings of the current study be explained by the fact that people were treated for hemochromatosis, thus reducing the subsequent development of complications such as diabetes? This seems unlikely because few people with hemochromatosis are routinely identified, and even fewer are treated on a chronic basis. On the contrary, the phenomenon that few people with hemochromatosis are diagnosed and treated is the rationale for recent recommendations for screening asymptomatic persons. Further, in unadjusted and adjusted analyses of the current study, people with elevated transferrin saturation were no more likely to develop diabetes than people without elevated transferrin saturation, even after assuming that 10% of the population with elevated transferrin saturation (> 45%, > 50%, and > 55%) were successfully treated.
The findings of this study have implications for whether screening for hemochromatosis is worthwhile, assuming that prevention of diabetes is a goal. Hemochromatosis has many characteristics that make it attractive for screening: the disorder is common, it posseses a long asymptomatic phase, a simple screening test is available, and treatment is effective. To be a reasonable candidate for screening, the condition also needs to cause substantial morbidity or mortality, and treatment in the asymptomatic phase should be more effective than treatment initiated after the onset of symptoms.25 The findings of this study suggested that screening for and treatment of hemochromatosis are not worthwhile as a way to prevent diabetes. However, the relation between hemochromatosis, treatment, and the development of cirrhosis or hepatocellular carcinoma may warrant screening for hemochromatosis. There is preliminary evidence from an observational study that diagnosing patients with hemochromatosis in the precirrhotic stage and treating them with phlebotomy results in a normal life expectancy, whereas those diagnosed with hemochromatosis and cirrhosis have a shortened life expectancy and a high risk of liver cancer, even when iron depletion has been achieved.10
Our study had several limitations. First, the estimate from the NHANES I was based on an elevated serum transferrin saturation level. This is an appropriate first step in a diagnosis of hemochromatosis. Some investigators have recommended that elevated serum transferrin levels should be confirmed with a second fasting level or an elevated ferritin level.26,27 Further, we did not have access to liver biopsy data, which is considered the gold standard for diagnosing hemochromatosis. Thus, a single elevated transferrin saturation level may have resulted in overestimates of the prevalence of hemochromatosis in the study population. Second, the estimate also might have been affected by the use of the lower levels of serum transferrin saturation (> 45%, > 50%, or > 55%). Although using a more stringent level, eg, greater than 60%, might have strengthened the conclusions, so few people with this level developed diabetes that we could not accurately make a population estimate.
In summary, diabetes does not seem to be a likely complication of hemochromatosis as indicated by the presence of an elevated serum transferrin saturation. Consequently, cost-effectiveness models of screening for hereditary hemochromatosis may need to be reevaluated.
Methods
This retrospective cohort study followed individuals without a diagnosis of diabetes, aged 25 to 74 years at the time of the index interview. We used the National Health and Nutrition Examination Survey I (1971–1974; NHANES I) merged with the NHANES I Epidemiologic Followup Study (1992; NHEFS).
The NHANES I was conducted between 1971 and 1975 and allowed for representative estimates of the non-institutionalized civilian US population. The NHEFS is a national longitudinal study of individuals assessed at the NHANES I baseline. The NHEFS initial population included the 14,407 participants who were 25 to 74 years of age when first examined in NHANES I. More than 98% of the individuals in the initial NHANES I cohort were traced and supplied data for the NHEFS.
The follow-up information was gathered in 3 ways. Surviving subjects were interviewed. If the subject was deceased or alive but incapacitated, a slightly modified version of the subject questionnaire was administered to a proxy respondent. For individuals who had died in the period between the NHANES I index interview and the follow-up interview, information from death certificates was recorded. A total of 1,681 proxy respondents was interviewed in the NHEFS.
Serum transferrin saturation was measured in the original NHANES I. We defined elevated serum transferrin saturation as greater than 45%, greater than 50%, greater than 55%, greater than 60%, or greater than 62%. All of these cutoff values had previously been proposed or used in population-based studies of elevated serum transferrin saturation.4,5,23
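The cutoff definitions above can be expressed as a small sketch. The function and constant names are hypothetical and are not taken from the study's SUDAAN code; the only detail drawn from the text is that "elevated" means strictly greater than each cutoff.

```python
# Cutoffs (percent) named in the text; names below are illustrative assumptions.
CUTOFFS_PERCENT = (45, 50, 55, 60, 62)


def elevated_flags(transferrin_saturation_percent):
    """Return {cutoff: bool}; elevated means strictly greater than the cutoff."""
    return {
        cutoff: transferrin_saturation_percent > cutoff
        for cutoff in CUTOFFS_PERCENT
    }


# A hypothetical participant with a saturation of 57% is elevated at the
# 45%, 50%, and 55% cutoffs but not at 60% or 62%.
flags = elevated_flags(57.0)
```

Because the cutoffs are cumulative, anyone flagged at a higher cutoff is also flagged at every lower one.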
Diabetes was operationalized as a positive response to the question, “Has a doctor ever told you that you have diabetes?” This question was asked in the original NHANES I and in each wave of the follow-up survey (1982–1984, 1986, 1987, and 1992). For individuals who could not participate, proxy respondents were queried. For individuals who died before the follow-up survey, we operationalized the development of diabetes as an ICD-9 diagnosis of 250.XX for the underlying cause of death or any of the 20 other diagnoses listed on the death certificate.
We also assessed risk factors for diabetes available in the NHANES I, including obesity (body mass index greater than 27 kg/m²), race, sex, age, physician diagnosis of hypertension, and total serum cholesterol above 240 mg/dL, to better understand diabetes mellitus as a possible consequence of hemochromatosis.
Our index sample was limited to men and women 25 to 74 years of age in the NHANES I, who had a serum transferrin saturation rate recorded in the NHANES I, did not have diabetes at the initial index interview, and had information on the development of diabetes (n = 9724).
Data analysis
We used sampling weights to calculate prevalence estimates for the civilian noninstitutionalized US population. Because of the complex sampling design of the survey, we performed all analyses with SUDAAN.24
We initially computed unadjusted estimates of the likelihood of development of diabetes for different levels of elevated serum transferrin saturation between 1971 and 1974. We attempted to compute analyses for serum transferrin saturation levels of 60% and 62%, but the number of people who developed diabetes with those levels was so small (n
Because we could not determine whether individuals with elevated serum transferrin saturation received treatment for hemochromatosis during the time of the study, we computed a series of analyses assuming that different proportions of individuals received treatment during the time frame. Some national evidence has suggested that few individuals are diagnosed with hemochromatosis. In fact, in the 1996, 1997, and 1998 National Ambulatory Medical Care Surveys, there were 7 visits for hemochromatosis of 64,001 total evaluated visits (0.01% of visits).8 We recomputed the adjusted odds ratios for the samples of people with transferrin saturation rates greater than 45% and 50% after randomly selecting 10% of each group and treating those individuals as though they were undergoing therapeutic phlebotomies.
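One possible reading of this sensitivity analysis is sketched below: a random 10% of the group with elevated saturation is set aside as "treated," and the models are then refit on the regrouped data. The text does not say how the treated individuals were handled in the models, so the regrouping step, the function name, and the fixed seed are all assumptions made for illustration.

```python
import random


def reassign_treated(elevated_ids, fraction=0.10, seed=0):
    """Randomly pick `fraction` of the elevated group as hypothetically treated.

    Returns (treated, still_elevated) as two disjoint sets of identifiers.
    The seed is fixed only so that the sketch is reproducible.
    """
    ids = sorted(elevated_ids)
    rng = random.Random(seed)
    n_treated = round(len(ids) * fraction)
    treated = set(rng.sample(ids, n_treated))
    still_elevated = set(ids) - treated
    return treated, still_elevated


# Hypothetical identifiers for 200 people above a given cutoff.
treated, still_elevated = reassign_treated(range(200))
```

Repeating the draw with several seeds would show how much the reestimated odds ratios depend on which individuals happen to be selected.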
Results
Table 1 presents the characteristics of the population with baseline characteristics and characteristics measured in the follow-up data collection. A substantial proportion of the adult population had elevated serum transferrin saturation greater than 45%, 50%, and 55%. The incidence of diagnosed diabetes in the cohort was 10.2%.
Among individuals with serum transferrin saturation levels greater than 45% at the NHANES I baseline, 8.9% developed diagnosed diabetes compared with 10.3% who did not have elevated serum transferrin saturation (P = .44). Similarly, among individuals with serum transferrin saturation levels greater than 50% at the NHANES I baseline, 8.1% developed diagnosed diabetes compared with 10.3% who did not have elevated serum transferrin saturation (P = .34); of individuals with transferrin saturation levels greater than 55%, 7.5% developed diabetes compared with 10.2% of those without elevated serum transferrin saturation (P = .38). Table 2 indicates that individuals with elevated transferrin saturation levels are not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. The lack of a significant relation is present in unadjusted and adjusted analyses.
When we reestimated the models assuming that 10% of the population with elevated serum transferrin saturation (> 45%) were successfully treated, individuals with elevated serum transferrin saturation were not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. This relation held in unadjusted and adjusted analyses. When we assumed that 10% of the population with elevated serum transferrin saturation at greater than 50% and greater than 55% were successfully treated, individuals with elevated serum transferrin saturation were not significantly more likely to develop diagnosed diabetes than individuals without elevated serum transferrin saturation. These relations remained consistent in unadjusted and adjusted analyses.
TABLE 1
Baseline characteristics of the population collected in National Health and Nutrition Examination Survey I
Characteristic | Value |
---|---|
Male sex | 47.7% |
Race | |
European American | 90.7% |
African American | 8.7% |
Other | 0.6% |
Age, mean ± standard error (y) | 47.1 ± 0.2 |
Obesity (body mass index ≥ 27) | 32.3% |
Transferrin saturation (cumulative) | |
> 45% | 8.0% |
> 50% | 4.6% |
> 55% | 2.7% |
> 60% | 1.7% |
> 62% | 1.4% |
Total serum cholesterol (>240 mg/dL) | 32.0% |
Hypertension (informed by physician) | 13.6% |
Collected in NHEFS follow-up | |
Developed diabetes | 10.2% |
NHEFS, National Health and Nutrition Examination Survey I (NHANES I) merged with the NHANES I Epidemiologic Followup Study. |
TABLE 2
Unadjusted and adjusted odds of developing diabetes with an elevated serum transferrin saturation level
Transferrin saturation | Unadjusted OR (CI) | Adjusted* OR (CI) |
---|---|---|
Original data | ||
> 45% | 1.17 (0.78–1.75) | 0.89 (0.59–1.34) |
> 50% | 1.29 (0.73–2.29) | 0.95 (0.53–1.70) |
> 55% | 1.40 (0.60–3.27) | 1.03 (0.44–2.43) |
Assuming 10% treatment | ||
> 45% | 1.23 (0.8–1.9) | 0.94 (0.61–1.47) |
> 50% | 1.29 (0.7–2.39) | 0.96 (0.52–1.79) |
> 55% | 1.41 (0.56–3.58) | 1.05 (0.41–2.67) |
*Controlling for age, sex, race, hypercholesterolemia, obesity, and hypertension. | ||
CI, 95% confidence interval; OR, odds ratio. |
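For readers unfamiliar with the quantities in Table 2, the sketch below shows how an unadjusted odds ratio and a Wald 95% confidence interval are computed from a simple 2 × 2 table. It is an unweighted illustration only: the study's estimates came from survey-weighted analyses in SUDAAN, so this calculation would not reproduce the table, and the counts used are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: exposed, developed outcome     b: exposed, did not
    c: unexposed, developed outcome   d: unexposed, did not
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_wald_ci(10, 90, 20, 180)
```

An interval that spans 1, as every interval in Table 2 does, indicates no statistically significant association at the 5% level.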
Discussion
The findings of this study call into question the commonly held assumption that there is a causative relation between the presence of hemochromatosis and the subsequent development of diabetes mellitus. Although diabetes is a common comorbid condition with hemochromatosis,13,14 this may be due to the fact that both conditions are relatively common, not that one disease leads to the development of the other. In this longitudinal analysis, even when examining the likelihood of developing diabetes at different levels of transferrin saturation, the findings suggested that hemochromatosis does not lead to diabetes.
Could the findings of the current study be explained by the fact that people were treated for hemochromatosis, thus reducing the subsequent development of complications such as diabetes? This seems unlikely because few people with hemochromatosis are routinely identified, and even fewer are treated on a chronic basis. On the contrary, the phenomenon that few people with hemochromatosis are diagnosed and treated is the rationale for recent recommendations for screening asymptomatic persons. Further, in unadjusted and adjusted analyses of the current study, people with elevated transferrin saturation were no more likely to develop diabetes than people without elevated transferrin saturation, even after assuming that 10% of the population with elevated transferrin saturation (> 45%, > 50%, and > 55%) were successfully treated.
1. Feder JN, Penny DM, Irrinki A, et al. The hemochromatosis gene product complexes with the transferrin receptor and lowers its affinity for ligand binding. Proc Natl Acad Sci 1998;95:1472-7.
2. Looker AC, Johnson CL. Prevalence of elevated serum transferrin saturation in adults in the United States. Ann Intern Med 1998;129:940-5.
3. Baer DM, Simons JL, Staples RL, Rumore GJ, Morton CJ. Hemochromatosis screening in asymptomatic ambulatory men 30 years of age and older. Am J Med 1995;98:464-8.
4. McDonnell SM, Hover A, Gloe D, Ou C, Cogswell ME, Grummer-Strawn L. Population-based screening for hemochromatosis using phenotypic and DNA testing among employees of health maintenance organizations in Springfield, Missouri. Am J Med 1999;107:30-7.
5. Edwards CQ, Griffen LM, Goldgar D, Drummond C, Skolnick MH, Kushner JP. Prevalence of hemochromatosis among 11,065 presumably healthy blood donors. N Engl J Med 1988;318:1355-62.
6. Leggett BA, Halliday JW, Brown NN, Bryant S, Powell LW. Prevalence of haemochromatosis amongst asymptomatic Australians. Br J Haematol 1990;74:525-30.
7. Witte DL, Crosby WH, Edwards CQ, Fairbanks VF, Mitros FA. Practice guideline development task force of the College of American Pathologists. Hereditary hemochromatosis. Clin Chim Acta 1996;245:139-200.
8. Mainous AG III, Gill JM, Pearson WS. Should we screen for hemochromatosis? An examination of downstream effects on morbidity and mortality. Arch Intern Med 2002;162:1769-74.
9. Cogswell ME, McDonnell SM, Khoury MJ, Franks AL, Burke W, Brittenham G. Iron overload, public health, and genetics: evaluating the evidence for hemochromatosis screening. Ann Intern Med 1998;129:971-9.
10. Niederau C, Fischer R, Sonnenburg A, Stremmel W, Trampisch HJ, Strohmeyer G. Survival and causes of death in cirrhotic and noncirrhotic patients with primary hemochromatosis. N Engl J Med 1985;313:1256-62.
11. Adams PC, Speechley M, Kertesz AE. Long term survival analysis in hereditary hemochromatosis. Gastroenterology 1991;101:368-72.
12. Flynn D, Fairney A, Jackson D, Clayton B. Hormonal changes in thalassemia major. Arch Dis Child 1976;51:828-36.
13. Yang Q, McDonnell SM, Khoury MJ, Cono J, Parrish RG. Hemochromatosis associated mortality in the United States from 1979 to 1992: an analysis of multiple-cause mortality data. Ann Intern Med 1998;129:946-53.
14. Buysschaert M, Paris I, Selvais P, Hermans MP. Clinical aspects of diabetes secondary to idiopathic haemochromatosis in French speaking Belgium. Diabetes Metab 1997;23:308-13.
15. Powell LW, Jazwinska E, Halliday JW. Primary iron overload. In: Brock JH, Halliday JW, Pippard MJ, et al, eds. Iron Metabolism in Health and Disease. London: Saunders; 1994;228-70.
16. George DK, Evans RM, Crofton RW, Gunn IR. Testing for haemochromatosis in the diabetic clinic. Ann Clin Biochem 1995;32:521-6.
17. O’Brien T, Barrett B, Murray DM, Dinneen S, O’Sullivan DJ. Usefulness of biochemical screening of diabetic patients for hemochromatosis. Diabetes Care 1990;13:532-4.
18. Yaouanq JM. Diabetes and haemochromatosis: current concepts, management and prevention. Diabetes Metab 1995;21:319-29.
19. Bothwell TH, MacPhail AP. Hereditary hemochromatosis: etiologic, pathologic, and clinical aspects. Semin Hematol 1998;35:55-71.
20. Ober KP. Polyendocrine syndromes. In: Leahy JL, Clark NG, Cefalu WT, eds. Medical Management of Diabetes Mellitus. New York: Marcel Dekker; 2000;699-717.
21. Stremmel W, Niederau C, Berger M, Kley HK, Kruskemper HL, Strohmeyer G. Abnormalities in estrogen, androgen, and insulin metabolism in idiopathic hemochromatosis. Ann N Y Acad Sci 1988;526:209-23.
22. Hramiak IM, Finegood DT, Adams PC. Factors affecting glucose tolerance in hereditary hemochromatosis. Clin Invest Med 1997;20:110-8.
23. Edwards CQ, Kushner JP. Screening for hemochromatosis. N Engl J Med 1993;328:1616-20.
24. Shah BV, Barnwell BG, Hunt PN, LaVange LM. SUDAAN User’s Manual. Release 5.50. Research Triangle Park, NC: Research Triangle Institute; 1991.
25. McDonnell SM, Phatak PD, Felitti V, Hover A, McLaren GD. Screening for hemochromatosis in primary care settings. Ann Intern Med 1998;129:962-70.
26. Balan V, Baldus W, Fairbanks V, Michels V, Burritt M, Klee G. Screening for hemochromatosis: a cost effectiveness study based on 12,258 patients. Gastroenterology 1994;107:453-9.
27. Karlsson M, Ikkala E, Reunanen A, Takkunen H, Vuori E, Makinen J. Prevalence of hemochromatosis in Finland. Acta Med Scand 1988;224:385-90.
The Incontinence Quality of Life Instrument in a survey of primary care patients
OBJECTIVE: To assess the performance of the Incontinence Quality of Life (I-QOL) Instrument in measuring the impact of urinary incontinence on the quality of life of family medicine patients.
STUDY DESIGN: Postal survey. Multiple imputation of missing answers. Linear regression analysis of I-QOL predictors. Comparison by receiver operating characteristic of the I-QOL and the Short Form 12 (SF-12).
POPULATION: Women 45 years or older attending either of 2 family medicine clinics. Response rate was 605 (61%) of 992.
OUTCOMES MEASURED: Prevalence of stress, urge, and mixed incontinence. Scores on the I-QOL and SF-12 instruments.
RESULTS: Of the 605 respondents, 310 (51%) reported urinary incontinence in the month before the survey. At least 1 item was missing on 19% of the I-QOL scales and scores were imputed. The relation between I-QOL and the number of leakage episodes was nonlinear. I-QOL scores decreased with the number of episodes, the amount of leakage, and poorer general health. There was no association between the I-QOL and age, education, or the type of incontinence. The I-QOL was more sensitive than the SF-12 to the statement that “urinary incontinence is a problem.”
CONCLUSIONS: The I-QOL is a useful instrument for the investigation of incontinence-related quality of life in the community setting.
Urinary incontinence is a common problem among primary care patients.1 In recent years, patient perceptions of quality of life have become increasingly important in the evaluation of health conditions and their treatment.2 Specific instruments have been developed for the evaluation of the health-related quality of life of women reporting urinary incontinence.3,4 Wagner and colleagues developed a self-report quality of life measure specific to urinary incontinence (the Incontinence Quality of Life Instrument; I-QOL) that could be used as an outcome measure in clinical trials and patient care centers.5 The developers tested the instrument on a sample of 62 subjects and reported that the I-QOL was more sensitive than a generic instrument, such as the Short Form 36 (SF-36), in detecting differences between levels of self-perceived incontinence severity. In a follow-up study,6 incontinent women (141 with stress and 147 with mixed urinary incontinence) completed the I-QOL and comparative instruments at screening, pretreatment, and 4 follow-up visits during participation in a randomized trial assessing the efficacy of a medication for incontinence. Those investigators reported that, in the clinical trial, the I-QOL proved to be valid, reproducible, and responsive to treatment for urinary incontinence in women.
The aim of our study was to measure the prevalence of urinary incontinence and its impact on quality of life in a population of community-dwelling women. We selected the I-QOL and the Short Form 12 (SF-12)7 as specific and generic instruments, respectively, for the measurement of quality of life among incontinent women. In this report we describe the performance of the I-QOL in the community setting.
METHODS
Subjects
The subjects were women 45 years or older who attended either of 2 participating family medicine clinics in the city of Hamilton, Ontario. This was a postal survey using a modified Dillman method.8 The Dillman method calls for 3 mailings and a reminder postcard. Because of budgetary constraints we planned an initial mailing, followed by a reminder postcard and a second mailing to nonrespondents. Our budget permitted a sample size of about 1000, so that questionnaires were sent to all eligible patients attending the smaller clinic and to a random sample drawn from the roster of the larger clinic.
The initial mailing was sent to 1082 women. Ninety envelopes were returned undeliverable; in addition, 115 women returned the survey but selected the option of nonparticipation. The final response rate was 605 (61%) of 992.
Survey questions
Two questions inquired about the presence of incontinence: (1) “During the past month have you ever experienced urine loss (wet yourself) when coughing, laughing, or doing some other activity?” (2) “During the past month have you ever had to pee and then wet yourself before getting to the toilet?” We classified incontinence as “stress incontinence” if respondents replied “yes” to question 1; as “urge incontinence” if they replied “yes” to question 2; and as “mixed incontinence” if they responded “yes” to both. In addition, we asked, “Is wetting yourself a problem that interferes with your day-to-day activities or bothers you in other ways?” We also inquired about the number of daytime and nighttime leaking episodes in an average week and the amount of wetness (underwear or pad only, outer clothing, urine runs down legs, or pools on floor). The survey included 2 health-related quality of life instruments, the I-QOL and the SF-12,7 a generic instrument.
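The classification rule can be sketched as follows. The text lists the three labels without stating that they are mutually exclusive; the sketch assumes that a "yes" to both questions is labeled "mixed" only, and the function name is hypothetical.

```python
def classify_incontinence(stress_yes: bool, urge_yes: bool) -> str:
    """Classify from the two screening questions (assumed mutually exclusive).

    stress_yes: "yes" to question 1 (leakage with coughing, laughing, activity)
    urge_yes:   "yes" to question 2 (leakage before reaching the toilet)
    """
    if stress_yes and urge_yes:
        return "mixed"
    if stress_yes:
        return "stress"
    if urge_yes:
        return "urge"
    return "none"


label = classify_incontinence(stress_yes=True, urge_yes=False)
```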
Statistical analysis
Handling missing responses. Missing data are a common problem in survey research. Until recently, the only methods widely available for analyzing incomplete data focused on “removing” the missing values by ignoring subjects with incomplete information or by substituting plausible values (eg, means or regression predictions) for the missing items. These ad hoc methods, although simple to implement, have serious drawbacks,9 including the potential introduction of bias. In the past 2 decades, substantial progress has been made in developing statistical procedures for missing data. In an incomplete data set, the observed values provide indirect evidence about the likely values of the unobserved data, and one can use the available data to estimate the values of the missing data. Because any one estimate is uncertain, one may repeat the process a number of times and then average over these estimates for the missing values in the statistical analysis. Rubin developed the paradigm of multiple imputation, which carries out the averaging via simulation; each missing value is replaced by plausible values drawn from its predictive distribution.10 The variation among the imputed data sets reflects the uncertainty with which the missing values can be predicted from the observed ones. After performing identical analyses on each data set, the results are combined according to simple rules to produce overall estimates and standard errors that reflect missing-data uncertainty. We used the publicly available software program, Amelia,11 to generate 5 data sets containing imputed values for those subjects with missing values for I-QOL. Variables in the imputation model included age, education, type of incontinence, number of incontinent episodes, I-QOL, and scores on the physical and mental components of the SF-12.
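The "simple rules" for combining results across imputed data sets are commonly known as Rubin's rules. The sketch below pools one coefficient estimated separately in each imputed data set; it does not use or imitate Amelia, and the function name and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their squared standard errors.

    Total variance = mean within-imputation variance
                     + (1 + 1/m) * between-imputation variance.
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, denominator m - 1
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5


# Hypothetical coefficient and squared standard error from 5 imputed data sets.
estimate, std_error = pool_rubin(
    [-4.1, -3.8, -4.4, -4.0, -3.9],
    [0.64, 0.60, 0.66, 0.62, 0.63],
)
```

The between-imputation term is what makes the pooled standard error larger than that from any single completed data set, which is how the uncertainty due to missing answers enters the final result.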
Regression analysis. Logistic regression was used to investigate factors associated with incomplete responses to the I-QOL instrument. The relations between the I-QOL scores and predictor variables were modeled with multiple linear regression. Before the regression, we used generalized additive models,12 a method that uses data smoothers to graphically display the pattern of relationships, to explore the shape of nonlinear relations, and suggest linearizing transformations. Model checking included an analysis of residuals.13
An important measure of the impact of incontinence is whether or not subjects consider their incontinence to be a “problem.” To compare the performance of the I-QOL and generic quality of life measures in discriminating between women who found and did not find their incontinence to be a problem, we computed the area under receiver operating characteristic curves.14 Computations were done with the Stata statistical package.15
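The area under a receiver operating characteristic curve equals the probability that a randomly chosen subject from one group scores higher than a randomly chosen subject from the other, with ties counted as one half. The sketch below computes it directly from that definition; it is not the Stata routine used in the study, and the scores are hypothetical. Because lower I-QOL scores indicate poorer quality of life, the example uses 100 minus the I-QOL score so that higher values correspond to reporting a problem.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    wins = 0.0
    for positive in scores_positive:
        for negative in scores_negative:
            if positive > negative:
                wins += 1.0
            elif positive == negative:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical I-QOL scores, reversed so that higher means worse quality of life.
problem = [100 - s for s in (55, 62, 70, 81)]
no_problem = [100 - s for s in (78, 88, 93, 97)]
area = auc(problem, no_problem)
```

An area of 0.5 means the instrument discriminates no better than chance, and 1.0 means perfect separation of the two groups.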
RESULTS
Prevalence of incontinence
Of the 605 respondents, 310 (51%) reported urinary incontinence in the month before the survey. Table 1 shows the distribution of the respondents by age and incontinence status. Among our respondents, the prevalence of incontinence decreased slightly with age, a trend of borderline statistical significance (P = .08). Most surveys have reported that the prevalence of incontinence increases with age. We have no explanation for why this was not the case in our survey.
Incomplete responses to the I-QOL
The I-QOL is a questionnaire instrument with 22 items and the score is computed from all items. In such a situation, missing responses might be an important problem, leading to the reduction of sample size and study power. In our survey, 11 subjects (3.5%) replied to none of the I-QOL items; in addition, at least 1 item was missing for 49 other subjects. Thirty subjects missed 1 item, 4 missed 2 items, 3 missed 3 items, and 12 missed between 4 and 20 items each. The number of missing responses for the 22 questions ranged from 5% to 7%, with the exception of the statement, “I worry about having sex,” for which the missing rate was 13%. Even though the missing rate for individual items was no higher than 13%, only 250 (80.7%) of I-QOL scores were complete.
Table 2 shows the variables significantly associated with incomplete response in a logistic regression model. Older women and women who had not graduated from high school were less likely to return a completed instrument, as were women who reported only urge incontinence. In addition, women who reported that incontinence was a problem were less likely to complete all the questions. These associations suggested that omitting women with incomplete responses from the analyses of I-QOL might introduce selection bias. We managed this problem by using multiple imputation.
Associations of I-QOL with incontinence factors
The I-QOL is scored in the range of 0 to 100, with lower scores indicating lower quality of life. The mean value of I-QOL among respondents was 83 (range, 15-100). We anticipated that the I-QOL scores would vary systematically with incontinence-related factors and planned to investigate these relationships with a linear regression model. We suspected, however, that the relationships between I-QOL and the number of daytime and nighttime episodes of incontinence might be nonlinear. We thus explored these relations with nonparametric regression methods, and Figure 1 shows the relations as estimated with a generalized additive model.12 The method of generalized additive models is a computer-intensive one that makes no prior assumptions about the shape of the relations between the outcome and the explanatory variables. It fits smooth, arbitrarily shaped functions to the data by a method that is a generalization of the “moving average.” Figure 1 shows that the I-QOL decreased as the number of incontinence episodes increased, but then reached a plateau when the frequency of occurrence was about 1 episode per day. In other words, quality of life worsened as the number of incontinence episodes increased from less than once per week to once per day. However, once incontinence was occurring daily, there was no further decrease in the perceived quality of life as incontinence episodes became more frequent than once per day. We found that these plateau relations could be satisfactorily modeled with logarithmic transformations of the number of incontinence episodes; ie, log (number of incontinence episodes per week + 1), where the 1 has been added to avoid log(0), which is undefined mathematically.
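The transformation named above, log(number of incontinence episodes per week + 1), can be sketched as follows. The base of the logarithm is not stated in the text; the natural logarithm is assumed here, and the function name `log_episodes` is hypothetical.

```python
import math


def log_episodes(episodes_per_week):
    """Return log(episodes per week + 1); natural log is an assumption."""
    if episodes_per_week < 0:
        raise ValueError("episode count cannot be negative")
    # log1p(x) computes log(1 + x), so zero episodes maps to 0 rather than log(0).
    return math.log1p(episodes_per_week)


# Equal steps in episode count produce shrinking steps on the log scale,
# which is how the transformation approximates the plateau described above.
for n in [0, 1, 7, 14, 28]:
    print(n, round(log_episodes(n), 2))
```

Going from 0 to 7 episodes per week changes the transformed value far more than going from 7 to 14, mirroring the flattening seen once incontinence occurs about daily.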
The following candidate variables were chosen for inclusion in an initial regression model for the I-QOL scores: age, type of incontinence (stress, urge, mixed), whether incontinence is perceived as a “problem” (yes/no), log-transformed number of daytime and nighttime incontinence episodes, amount of wetting (wets outer clothing or runs down leg/wets pad or underpants only), self-reported health status (excellent to poor), and education. In this initial model, there was no significant association between I-QOL scores and age, type of incontinence, or education. These variables thus were not retained in the final model, which is shown in Table 3. The model provided a good fit to the data (F(8,190) = 43.1; P < .0001; R² = .64). In addition to presenting the results of the model that excluded subjects with incomplete I-QOL scores, Table 3 shows the results of the model that included these subjects by using imputed values. There were only minor differences between the models. The I-QOL score decreased as the number of daytime and nighttime incontinence episodes increased and as the amount of urine loss increased. After controlling for other variables, the I-QOL averaged 12 points lower among those women who considered their incontinence problematic. Further, there was a strong relation between the I-QOL and self-reported general health status.
DISCUSSION
Urinary incontinence is common among women in the community, but loss of bladder control is perceived quite differently by various respondents. In a survey of 36,500 Americans with incontinence, Jeter and Wagner reported that 17% described their incontinence as a major problem with important social implications,1 but the rest described it as a relatively minor problem with limited impact on their respective lifestyles. Self-administered quality of life instruments are valuable in measuring the impact of incontinence on the lives of subjects, in identifying subjects among whom interventions might have a beneficial impact on quality of life, and in following the natural history of incontinence and its treatment.
One of these instruments, the I-QOL, has been used in the clinical trial setting. We were impressed with reports of its performance and selected it for use in a postal survey of community dwelling patients of 2 family medicine clinics. It was immediately apparent that the 22-item length of the instrument posed problems in a postal survey because, even though the missing rate for individual items was no higher than 13%, only 80.7% of I-QOL scores were complete. We found that older, less educated subjects were more likely to return incomplete questionnaires. We did not contact subjects to obtain responses to the omitted questions. Rather, we chose to make use of responses to other questions to impute the missing I-QOL scores. Although the differential response rates suggested the possibility of selection bias, there was in fact little difference in the coefficients of regression models when using complete versus complete plus imputed data.
As reported by Patrick and colleagues in the clinical trial context,6 we found that the I-QOL score correlates strongly with physical measures of the extent of incontinence, including the number of incontinence episodes and the amount of wetting. Similar to those researchers, we found no relation between the I-QOL and age, education, or the type of incontinence. These are desirable properties for a condition-specific quality of life instrument; it is responsive to the impact of incontinence on the quality of life and not to “nuisance” variables. Another desirable characteristic was that the I-QOL correlates strongly with the statement, “wetting is a problem.” Figure 2 shows the relations between the I-QOL score and the probability that a respondent would state that incontinence is a problem. To understand Figure 2, consider that each subject will agree or disagree with the statement, “wetting is a problem.” Each subject thus appears on the graph at a probability level of 0 or 1. Because there is considerable overlap of the data points on the display, the point for each subject has been “jittered,” ie, given a random amount of vertical displacement to better display the density of data points in relation to the I-QOL. The curve on the plot is the fit of a “loess” smoother to the data points.16 This smoother is, in essence, a running weighted average of the proportion of subjects who reported that wetting is a problem. When the I-QOL score was low, there was a high probability that incontinence would be seen as a problem. The I-QOL score at which 50% of women found incontinence to be a problem was about 77, and the probability declined rapidly after that.
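The idea of a running weighted average of a 0/1 outcome can be sketched as below. This is a simplified kernel-weighted local mean, not a full loess fit (loess fits local regressions and chooses its span differently), and the function name `running_proportion`, the bandwidth, and the data are all hypothetical.

```python
def running_proportion(x, y, x0, bandwidth):
    """Tricube-weighted average of y over points within `bandwidth` of x0."""
    num = 0.0
    den = 0.0
    for xi, yi in zip(x, y):
        u = abs(xi - x0) / bandwidth
        if u < 1.0:
            w = (1.0 - u ** 3) ** 3  # tricube weight: largest at x0, zero at the edge
            num += w * yi
            den += w
    if den == 0.0:
        raise ValueError("no observations within the bandwidth of x0")
    return num / den


# Hypothetical I-QOL scores and answers to "wetting is a problem" (1 = yes).
iqol = [30, 45, 60, 70, 77, 85, 92, 98]
problem = [1, 1, 1, 1, 0, 0, 0, 0]

low = running_proportion(iqol, problem, x0=50, bandwidth=25)
high = running_proportion(iqol, problem, x0=90, bandwidth=25)
print(low, high)
```

Evaluating the function over a grid of I-QOL values traces out a curve of the estimated probability that incontinence is reported as a problem.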
There were significant correlations between the I-QOL score and the subscales of the SF-12 generic quality of life instrument (r = .49 for the mental component of the SF-12 and r = .33 for the physical component). We found, however, that the I-QOL was more sensitive than the generic instrument in identifying subjects who considered incontinence to be a problem. This is shown in the receiver operating characteristic curves displayed in Figure 3. The area under the I-QOL receiver operating characteristic curve was 0.84, compared with 0.63 for the mental component scale and 0.58 for the physical component scale of the SF-12. It is reassuring that a condition-specific instrument performs better than a generic one.
We note some limitations to this study. Estimates of prevalence in the published epidemiologic data on urinary incontinence often vary widely. These large variations are derived mainly from the wide range in definitions used, differences in survey setting, and methodology used. The overall prevalence rate of 51% reported in our survey is high. Further, the prevalence of incontinence among our respondents appeared to decrease slightly with age. One possible explanation and limitation of our study is that respondents self-selected to participate. The other limitation is that older, incontinent women likely were underrepresented in our sample due to institutionalization and survivor bias. The end result is that the prevalence of incontinence might have been overestimated for younger respondents (incontinent younger women more likely to respond) and underestimated for the older respondents (incontinent older women less likely to dwell in communities). The choice of our sampling frame and thus our ability to generalize the findings to other settings also might be viewed as problematic. This was not intended to be a community sample. We were interested in the impact of urinary incontinence on women attending family medicine clinics because these women are our patients and we have the opportunity to intervene to improve their quality of life.
Another limitation was that our questionnaire was available only in English. This might have affected our response rate and generalizability of these findings.
Despite these limitations, we believe the I-QOL is a useful instrument for the investigation of incontinence-related quality of life in the community and the clinical trial setting. We have begun to use the I-QOL among patients attending an incontinence clinic. We have found it to be well received and plan to report on its performance in this setting in the near future.
1. Jeter KF, Wagner DB. Incontinence in the American home. A survey of 36,500 people. J Am Geriatr Soc 1990;38:379-83.
2. Ware JE Jr. The status of health assessment 1994. Annu Rev Public Health 1995;16:327-54.
3. Shumaker SA, Wyman JF, Uebersax JS, McClish D, Fantl JA. Health-related quality of life measures for women with urinary incontinence: the Incontinence Impact Questionnaire and the Urogenital Distress Inventory. Continence Program in Women (CPW) Research Group. Qual Life Res 1994;3:291-306.
4. Uebersax JS, Wyman JF, Shumaker SA, McClish DK, Fantl JA. Short forms to assess life quality and symptom distress for urinary incontinence in women: the Incontinence Impact Questionnaire and the Urogenital Distress Inventory. Continence Program for Women Research Group. Neurourol Urodyn 1995;14:131-9.
5. Wagner TH, Patrick DL, Bavendam TG, Martin ML, Buesching DP. Quality of life of persons with urinary incontinence: development of a new measure. Urology 1996;47:67-71.
6. Patrick DL, Martin ML, Bushnell DM, et al. Quality of life of women with urinary incontinence: further development of the incontinence quality of life instrument (I-QOL). Urology 1999;53:71-6.
7. Ware J Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care 1996;34:220-33.
8. Dillman DA. Mail and Telephone Surveys: The Total Design Method. New York: John Wiley & Sons; 1978.
9. Schafer JL. Analysis of Incomplete Multivariate Data. London: Chapman & Hall; 1997.
10. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons; 1987.
11. Honaker J, Joseph A, King G, Scheve K, Singh N. Amelia: a program for missing data. 2001. Available at: http://gking.harvard.edu/stats/shtml. Accessed September 30, 2002.
12. Hastie TJ, Tibshirani RJ. Generalized Additive Models. London: Chapman & Hall; 1990.
13. Freund RJ, Wilson WJ. Regression Analysis. San Diego: Academic Press; 1998.
14. van Erkel AR, Pattynama PM. Receiver operating characteristic (ROC) analysis: basic principles and applications in radiology. Eur J Radiol 1998;27:88-94.
15. Stata Statistical Software. Version 7.0. College Station, TX: StataCorp; 2000.
16. Fox J. Multiple and Generalized Nonparametric Regression. Thousand Oaks, CA: Sage Publications; 2000.
Address reprint requests to Murray M. Finkelstein, MD, PhD, Family Medicine Centre, Suite 413, Mt Sinai Hospital, Toronto, ON M5G 1X5, Canada.
OBJECTIVE: To assess the performance of the Incontinence Quality of Life (I-QOL) Instrument in measuring the impact of urinary incontinence on the quality of life of family medicine patients.
STUDY DESIGN: Postal survey. Multiple imputation of missing answers. Linear regression analysis of I-QOL predictors. Comparison by receiver operating characteristic of the I-QOL and the Short Form 12 (SF-12).
POPULATION: Women 45 years or older attending either of 2 family medicine clinics. Response rate was 605 (61%) of 992.
OUTCOMES MEASURED: Prevalence of stress, urge, and mixed incontinence. Scores on the I-QOL and SF-12 instruments.
RESULTS: Of the 605 respondents, 310 (51%) reported urinary incontinence in the month before the survey. At least 1 item was missing on 19% of the I-QOL scales and scores were imputed. The relation between I-QOL and the number of leakage episodes was nonlinear. I-QOL scores decreased with the number of episodes, the amount of leakage, and poorer general health. There was no association between the I-QOL and age, education, or the type of incontinence. The I-QOL was more sensitive than the SF-12 to the statement that “urinary incontinence is a problem.”
CONCLUSIONS: The I-QOL is a useful instrument for the investigation of incontinence-related quality of life in the community setting.
Urinary incontinence is a common problem among primary care patients.1 In recent years, patient perceptions of quality of life have become increasingly important in the evaluation of health conditions and their treatment.2 Specific instruments have been developed for the evaluation of the health-related quality of life of women reporting urinary incontinence.3,4 Wagner and colleagues developed a self-report quality of life measure specific to urinary incontinence (the Incontinence Quality of Life Instrument; I-QOL) that could be used as an outcome measure in clinical trials and patient care centers.5 The developers tested the instrument on a sample of 62 subjects and reported that the I-QOL was more sensitive than a generic instrument, such as the Short Form 36 (SF-36), in detecting differences between levels of self-perceived incontinence severity. In a follow-up study,6 incontinent women (141 with stress and 147 with mixed urinary incontinence) completed the I-QOL and comparative instruments at screening, pretreatment, and 4 follow-up visits during participation in a randomized trial assessing the efficacy of a medication for incontinence. Those investigators reported that, in the clinical trial, the I-QOL proved to be valid, reproducible, and responsive to treatment for urinary incontinence in women.
The aim of our study was to measure the prevalence of urinary incontinence and its impact on quality of life in a population of community-dwelling women. We selected the I-QOL and the Short Form 12 (SF-12)7 as specific and generic instruments, respectively, for the measurement of quality of life among incontinent women. In this report we describe the performance of the I-QOL in the community setting.
METHODS
Subjects
The subjects were women 45 years or older who attended either of 2 participating family medicine clinics in the city of Hamilton, Ontario. This was a postal survey using a modified Dillman method.8 The Dillman method calls for 3 mailings and a reminder postcard. Because of budgetary constraints we planned an initial mailing, followed by a reminder postcard and a second mailing to nonrespondents. Our budget permitted a sample size of about 1000, so that questionnaires were sent to all eligible patients attending the smaller clinic and to a random sample drawn from the roster of the larger clinic.
The initial mailing was sent to 1082 women. Ninety envelopes were returned undeliverable; in addition, 115 women returned the survey but selected the option of nonparticipation. The final response rate was 605 (61%) of 992.
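The response-rate arithmetic above can be checked in a few lines. The variable names are hypothetical; the figures are those reported in the text, and the denominator is taken to be delivered questionnaires, with women who opted out still counted in it.

```python
mailed = 1082          # initial mailing
undeliverable = 90     # envelopes returned undeliverable
respondents = 605      # completed surveys

# Women who returned the survey but declined to participate remain in the
# denominator; only undeliverable envelopes are removed.
eligible = mailed - undeliverable
rate = 100 * respondents / eligible

print(eligible, round(rate))
```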
Survey questions
Two questions inquired about the presence of incontinence: (1) “During the past month have you ever experienced urine loss (wet yourself) when coughing, laughing, or doing some other activity?” (2) “During the past month have you ever had to pee and then wet yourself before getting to the toilet?” We classified incontinence as “stress incontinence” if respondents replied “yes” to question 1 only; as “urge incontinence” if they replied “yes” to question 2 only; and as “mixed incontinence” if they responded “yes” to both. In addition, we asked, “Is wetting yourself a problem that interferes with your day-to-day activities or bothers you in other ways?” We also inquired about the number of daytime and nighttime leaking episodes in an average week and the amount of wetness (underwear or pad only, outer clothing, urine runs down legs, or pools on floor). The survey included 2 health-related quality of life instruments, the I-QOL and the SF-12,7 a generic instrument.
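The classification rule for the two screening questions can be written as a small function. This is a sketch of the rule as described, not the authors' code; the function name `classify_incontinence` and the label "none" for women answering "no" to both questions are assumptions.

```python
def classify_incontinence(stress_yes, urge_yes):
    """Classify incontinence type from the answers to questions 1 and 2.

    stress_yes: answered "yes" to question 1 (loss with coughing, laughing, activity).
    urge_yes:   answered "yes" to question 2 (loss before reaching the toilet).
    """
    if stress_yes and urge_yes:
        return "mixed"
    if stress_yes:
        return "stress"
    if urge_yes:
        return "urge"
    return "none"  # assumed label for respondents reporting no incontinence


for answers in [(True, False), (False, True), (True, True), (False, False)]:
    print(answers, classify_incontinence(*answers))
```

Checking the "both" case first keeps the three categories mutually exclusive.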
Statistical analysis
Handling missing responses. Missing data are a common problem in survey research. Until recently, the only methods widely available for analyzing incomplete data focused on “removing” the missing values by ignoring subjects with incomplete information or by substituting plausible values (eg, means or regression predictions) for the missing items. These ad hoc methods, although simple to implement, have serious drawbacks,9 including the potential introduction of bias. In the past 2 decades, substantial progress has been made in developing statistical procedures for missing data. In an incomplete data set, the observed values provide indirect evidence about the likely values of the unobserved data, and one can use the available data to estimate the values of the missing data. Because any one estimate is uncertain, one may repeat the process a number of times and then average over these estimates for the missing values in the statistical analysis. Rubin developed the paradigm of multiple imputation, which carries out the averaging via simulation; each missing value is replaced by plausible values drawn from its predictive distribution.10 The variation among the imputations reflects the uncertainty with which the missing values can be predicted from the observed ones. After performing identical analyses on each data set, the results are combined according to simple rules to produce overall estimates and standard errors that reflect missing-data uncertainty. We used the publicly available software program, Amelia,11 to generate 5 data sets containing imputed values for those subjects with missing values for I-QOL. Variables in the imputation model included age, education, type of incontinence, number of incontinent episodes, I-QOL, and scores on the physical and mental components of the SF-12.
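The "simple rules" for combining results across imputed data sets are commonly known as Rubin's rules: the pooled estimate is the mean of the per-data-set estimates, and the pooled variance adds the average within-imputation variance to an inflated between-imputation variance. The sketch below illustrates this for a single regression coefficient; it is not the authors' code, and the function name `pool_rubin` and the five coefficient values are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine one coefficient across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficient for "incontinence is a problem" from 5 imputed data sets.
coefs = [-12.1, -11.8, -12.4, -12.0, -11.7]
ses = [1.5, 1.6, 1.5, 1.4, 1.6]

estimate, se = pool_rubin(coefs, ses)
print(round(estimate, 2), round(se, 2))
```

The pooled standard error is never smaller than the average within-imputation standard error, which is how the extra uncertainty from the missing values enters the analysis.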
OBJECTIVE: To assess the performance of the Incontinence Quality of Life (I-QOL) Instrument in measuring the impact of urinary incontinence on the quality of life of family medicine patients.
STUDY DESIGN: Postal survey. Multiple imputation of missing answers. Linear regression analysis of I-QOL predictors. Comparison by receiver operating characteristic of the I-QOL and the Short Form 12 (SF-12).
POPULATION: Women 45 years or older attending either of 2 family medicine clinics. Response rate was 605 (61%) of 992.
OUTCOMES MEASURED: Prevalence of stress, urge, and mixed incontinence. Scores on the I-QOL and SF-12 instruments.
RESULTS: Of the 605 respondents, 310 (51%) reported urinary incontinence in the month before the survey. At least 1 item was missing on 19% of the I-QOL scales and scores were imputed. The relation between I-QOL and the number of leakage episodes was nonlinear. I-QOL scores decreased with the number of episodes, the amount of leakage, and poorer general health. There was no association between the I-QOL and age, education, or the type of incontinence. The I-QOL was more sensitive than the SF-12 to the statement that “urinary incontinence is a problem.”
CONCLUSIONS: The I-QOL is a useful instrument for the investigation of incontinence-related quality of life in the community setting.
Urinary incontinence is a common problem among primary care patients.1 In recent years, patient perceptions of quality of life have become increasingly important in the evaluation of health conditions and their treatment.2 Specific instruments have been developed for the evaluation of the health-related quality of life of women reporting urinary incontinence.3,4 Wagner and colleagues developed a self-report quality of life measure specific to urinary incontinence (the Incontinence Quality of Life Instrument; I-QOL) that could be used as an outcome measure in clinical trials and patient care centers.5 The developers tested the instrument on a sample of 62 subjects and reported that the I-QOL was more sensitive than a generic instrument, such as the Short Form 36 (SF-36), in detecting differences between levels of self-perceived incontinence severity. In a follow-up study,6 incontinent women (141 with stress and 147 with mixed urinary incontinence) completed the I-QOL and comparative instruments at screening, pretreatment, and 4 follow-up visits during participation in a randomized trial assessing the efficacy of a medication for incontinence. Those investigators reported that, in the clinical trial, the I-QOL proved to be valid, reproducible, and responsive to treatment for urinary incontinence in women.
The aim of our study was to measure the prevalence of urinary incontinence and its impact on quality of life in a population of community-dwelling women. We selected the I-QOL and the Short Form 12 (SF-12)7 as specific and generic instruments, respectively, for the measurement of quality of life among incontinent women. In this report we describe the performance of the I-QOL in the community setting.
METHODS
Subjects
The subjects were women 45 years or older who attended either of 2 participating family medicine clinics in the city of Hamilton, Ontario. This was a postal survey using a modified Dillman method.8 The Dillman method calls for 3 mailings and a reminder postcard. Because of budgetary constraints we planned an initial mailing, followed by a reminder postcard and a second mailing to nonrespondents. Our budget permitted a sample size of about 1000, so that questionnaires were sent to all eligible patients attending the smaller clinic and to a random sample drawn from the roster of the larger clinic.
The initial mailing was sent to 1082 women. Ninety envelopes were returned undeliverable; in addition, 115 women returned the survey but selected the option of nonparticipation. The final response rate was 605 (61%) of 992.
Survey questions
Two questions inquired about the presence of incontinence: (1) “During the past month have you ever experienced urine loss (wet yourself) when coughing, laughing, or doing some other activity?” (2) “During the past month have you ever had to pee and then wet yourself before getting to the toilet?” We classified incontinence as “stress incontinence” if respondents replied “yes” to question 1; as “urge incontinence” if they replied “yes” to question 2; and as “mixed incontinence” if they responded “yes” to both. In addition, we asked, “Is wetting yourself a problem that interferes with your day-to-day activities or bothers you in other ways?” We also inquired about the number of daytime and nighttime leaking episodes in an average week and the amount of wetness (underwear or pad only, outer clothing, urine runs down legs, or pools on floor). The survey included 2 health-related quality of life instruments, the I-QOL and the SF-12,7 a generic instrument.
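The classification rule can be written out as a short sketch. The function name and the "continent" label are hypothetical and do not come from the survey; the only point of note is that a "yes" to both questions has to be tested first, because such a respondent also satisfies each single-question rule.

```python
def classify_incontinence(stress_yes: bool, urge_yes: bool) -> str:
    """Map the answers to questions 1 and 2 onto the categories in the text."""
    if stress_yes and urge_yes:
        return "mixed"      # "yes" to both questions
    if stress_yes:
        return "stress"     # "yes" to question 1 only
    if urge_yes:
        return "urge"       # "yes" to question 2 only
    return "continent"      # hypothetical label for "no" to both


print(classify_incontinence(True, False))
```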
Statistical analysis
Handling missing responses. Missing data is a common problem in survey research. Until recently, the only methods widely available for analyzing incomplete data focused on “removing” the missing values by ignoring subjects with incomplete information or by substituting plausible values (eg, means or regression predictions) for the missing items. These ad hoc methods, although simple to implement, have serious drawbacks,9 including the potential introduction of bias. In the past 2 decades, substantial progress has been made in developing statistical procedures for missing data. In an incomplete data set, the observed values provide indirect evidence about the likely values of the unobserved data, and one can use the available data to estimate the values of the missing data. Because any one estimate is uncertain, one may repeat the process a number of times and then average over these estimates for the missing values in the statistical analysis. Rubin developed the paradigm of multiple imputation, which carries out the averaging via simulation; each missing value is replaced by plausible values drawn from its predictive distribution.10 The variation among the imputations reflects the uncertainty with which the missing values can be predicted from the observed ones. After identical analyses are performed on each data set, the results are combined according to simple rules to produce overall estimates and standard errors that reflect missing-data uncertainty. We used the publicly available software program, Amelia,11 to generate 5 data sets containing imputed values for those subjects with missing values for I-QOL. Variables in the imputation model included age, education, type of incontinence, number of incontinent episodes, I-QOL, and scores on the physical and mental components of the SF-12.
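The text does not spell out the combining rules. The sketch below assumes the commonly cited form of Rubin's rules for a single regression coefficient: the pooled estimate is the mean across imputed data sets, and the pooled variance adds the average within-imputation variance to an inflated between-imputation variance. The function name and all numbers are hypothetical and are not results from the study.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool one coefficient across m imputed data sets (Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total ** 0.5       # estimate and pooled standard error


# Hypothetical coefficients and squared standard errors from 5 imputed data sets
est = [-12.1, -11.8, -12.4, -12.0, -11.7]
var = [4.0, 4.2, 3.9, 4.1, 4.0]
pooled, se = pool_estimates(est, var)
print(round(pooled, 2), round(se, 2))
```

The between-imputation term is what makes the pooled standard error larger than that of any single completed data set, which is how the uncertainty about the missing values is carried into the final analysis.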
Regression analysis. Logistic regression was used to investigate factors associated with incomplete responses to the I-QOL instrument. The relations between the I-QOL scores and predictor variables were modeled with multiple linear regression. Before the regression, we used generalized additive models,12 a method that uses data smoothers to graphically display the pattern of relationships, to explore the shape of nonlinear relations, and suggest linearizing transformations. Model checking included an analysis of residuals.13
An important measure of the impact of incontinence is whether or not subjects consider their incontinence to be a “problem.” To compare the performance of the I-QOL and generic quality of life measures in discriminating between women who found and did not find their incontinence to be a problem, we computed the area under receiver operating characteristic curves.14 Computations were done with the Stata statistical package.15
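As a rough illustration of what the area under a receiver operating characteristic curve measures, the sketch below computes it as the proportion of (problem, no problem) pairs of respondents in which the respondent reporting a problem has the lower score, with ties counted as one half. This is the rank-based equivalent of the area, not the Stata routine used in the study; the function name and the scores are hypothetical.

```python
def auc(scores_problem, scores_no_problem):
    """Area under the ROC curve when LOWER scores indicate a problem."""
    wins = 0.0
    for p in scores_problem:
        for n in scores_no_problem:
            if p < n:
                wins += 1.0     # correctly ordered pair
            elif p == n:
                wins += 0.5     # tie counts as half
    return wins / (len(scores_problem) * len(scores_no_problem))


# Hypothetical I-QOL scores, for illustration only
problem = [40, 55, 62, 80]
no_problem = [70, 85, 90, 95, 100]
area = auc(problem, no_problem)
print(area)
```

An area of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.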
RESULTS
Prevalence of incontinence
Of the 605 respondents, 310 (51%) reported urinary incontinence in the month before the survey. Table 1 shows the distribution of the respondents by age and incontinence status. Among our respondents, the prevalence of incontinence decreased slightly with age, a trend of borderline statistical significance (P = .08). Most surveys have reported that the prevalence of incontinence increases with age. We have no explanation for why this was not the case in our survey.
Incomplete responses to the I-QOL
The I-QOL is a questionnaire instrument with 22 items, and the score is computed from all items. In such a situation, missing responses might be an important problem, leading to the reduction of sample size and study power. In our survey, 11 subjects (3.5%) replied to none of the I-QOL items, and at least 1 item was missing for 49 other subjects. Thirty subjects missed 1 item, 4 missed 2 items, 3 missed 3 items, and 12 each missed 4 to 20 items. The proportion of missing responses for each of the 22 questions ranged from 5% to 7%, with the exception of the statement, “I worry about having sex,” for which the missing rate was 13%. Even though the missing rate for individual items was no higher than 13%, only 250 (80.7%) of I-QOL scores were complete.
Table 2 shows those variables significantly associated with incomplete response in a logistic regression model. Older women and women who had not graduated from high school were less likely to return a completed instrument, as were women who reported only urge incontinence. In addition, women who reported that incontinence was a problem were less likely to complete all the questions. These associations suggested that omitting women with incomplete responses from the analyses of I-QOL might introduce selection bias. We managed this problem by making use of the methods of multiple imputation.
Associations of I-QOL with incontinence factors
The I-QOL is scored in the range of 0 to 100, with lower scores indicating lower quality of life. The mean value of I-QOL among respondents was 83 (range, 15-100). We anticipated that the I-QOL scores would vary systematically with incontinence-related factors and planned to investigate these relationships with a linear regression model. We suspected, however, that the relationships between I-QOL and the number of daytime and nighttime episodes of incontinence might be nonlinear. We thus explored these relations with nonparametric regression methods, and Figure 1 shows the relations as estimated with a generalized additive model.12 The method of generalized additive models is a computer-intensive one that makes no prior assumptions about the shape of the relations between the outcome and the explanatory variables. It fits smooth, arbitrarily shaped functions to the data by a method that is a generalization of the “moving average.” Figure 1 shows that the I-QOL decreased as the number of incontinence episodes increased, but then reached a plateau when the frequency of occurrence was about 1 episode per day. In other words, quality of life worsened as the number of incontinence episodes increased from less than once per week to once per day. However, once incontinence was occurring daily, there was no further decrease in the perceived quality of life as incontinence episodes became more frequent than once per day. We found that these plateau relations could be satisfactorily modeled with logarithmic transformations of the number of incontinence episodes; ie, log (number of incontinence episodes per week + 1), where the 1 has been added to avoid log(0), which is undefined mathematically.
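A small sketch of the transformation shows why it suits a plateau relation: equal increases in episode count add less and less on the log scale. The base of the logarithm is not stated in the text; the natural logarithm is assumed here, and the function name is hypothetical.

```python
import math


def log_episodes(episodes_per_week: float) -> float:
    """log(episodes per week + 1); zero episodes maps to 0."""
    return math.log1p(episodes_per_week)


# From none to daily, twice daily, and four times daily
for n in (0, 7, 14, 28):
    print(n, round(log_episodes(n), 2))
```

On this scale the step from 0 to 7 episodes per week is much larger than the step from 7 to 14, which matches the flattening described for Figure 1.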
The following candidate variables were chosen for inclusion in an initial regression model for the I-QOL scores: age, type of incontinence (stress, urge, mixed), incontinence is perceived as a “problem” (yes/no), log-transformed number of daytime and nighttime incontinence episodes, amount of wetting (wets outer clothing or runs down leg/wets pad or underpants only), self-reported health status (excellent to poor), and education. In this initial model, there was no significant association between I-QOL scores and age, type of incontinence, or education. These variables thus were not retained in the final model, which is shown in Table 3. The model provided a good fit to the data (F(8,190) = 43.1; P < .0001; R² = .64). In addition to presenting the results of the model that excluded subjects with incomplete I-QOL scores, Table 3 shows the results of the model that included these subjects by using imputed values. There were only minor differences between the models. The I-QOL score decreased as the number of daytime and nighttime incontinence episodes increased and as the amount of urine loss increased. After controlling for other variables, the I-QOL averaged 12 points lower among those women who considered their incontinence problematic. Further, there was a strong relation between the I-QOL and self-reported general health status.
DISCUSSION
Urinary incontinence is common among women in the community, but loss of bladder control is perceived quite differently by various respondents. In a survey of 36,500 Americans with incontinence, Jeter and Wagner reported that 17% described their incontinence as a major problem with important social implications,1 but the rest described it as a relatively minor problem with limited impact on their respective lifestyles. Self-administered quality of life instruments are valuable in measuring the impact of incontinence on the lives of subjects, in identifying subjects among whom interventions might have a beneficial impact on quality of life, and in following the natural history of incontinence and its treatment.
One of these instruments, the I-QOL, has been used in the clinical trial setting. We were impressed with reports of its performance and selected it for use in a postal survey of community-dwelling patients of 2 family medicine clinics. It was immediately apparent that the 22-item length of the instrument posed problems in a postal survey because, even though the missing rate for individual items was no higher than 13%, only 80.7% of I-QOL scores were complete. We found that older, less educated subjects were more likely to return incomplete questionnaires. We did not contact subjects to obtain responses to the omitted questions. Rather, we chose to make use of responses to other questions to impute the missing I-QOL scores. Although the differential response rates suggested the possibility of selection bias, there was in fact little difference in the coefficients of regression models when using complete versus complete plus imputed data.
As reported by Patrick and colleagues in the clinical trial context,6 we found that the I-QOL score correlates strongly with physical measures of the extent of incontinence, including the number of incontinence episodes and the amount of wetting. Similar to those researchers, we found no relation between the I-QOL and age, education, or the type of incontinence. These are desirable properties for a condition-specific quality of life instrument; it is responsive to the impact of incontinence on the quality of life and not to “nuisance” variables. Another desirable characteristic was that the I-QOL correlates strongly with the statement, “wetting is a problem.” Figure 2 shows the relations between the I-QOL score and the probability that a respondent would state that incontinence is a problem. To understand Figure 2, consider that each subject will agree or disagree with the statement, “wetting is a problem.” Each subject thus appears on the graph at a probability level of 0 or 1. Because there is considerable overlap of the data points on the display, the point for each subject has been “jittered,” ie, given a random amount of vertical displacement to better display the density of data points in relation to the I-QOL. The curve on the plot is the fit of a “loess” smoother to the data points.16 This smoother is, in essence, a running weighted average of the proportion of subjects who reported that wetting is a problem. When the I-QOL score was low, there was a high probability that incontinence would be seen as a problem. The I-QOL score at which 50% of women found incontinence to be a problem was about 77, and the probability declined rapidly after that.
There were significant correlations between the I-QOL score and the subscales of the SF-12 generic quality of life instrument (r = .49 for the mental component of the SF-12 and r = .33 for the physical component). We found, however, that the I-QOL was more sensitive than the generic instrument in identifying subjects who considered incontinence to be a problem. This is shown in the receiver operating characteristic curves displayed in Figure 3. The area under the I-QOL receiver operating characteristic curve was 0.84, compared with 0.63 for the mental component scale and 0.58 for the physical component scale of the SF-12. It is reassuring that a condition-specific instrument performs better than a generic one.
We note some limitations to this study. Estimates of prevalence in the published epidemiologic data on urinary incontinence often vary widely. These large variations are derived mainly from the wide range in definitions used, differences in survey setting, and methodology used. The overall prevalence rate of 51% reported in our survey is high. Further, the prevalence of incontinence among our respondents appeared to decrease slightly with age. One possible explanation and limitation of our study is that respondents self-selected to participate. The other limitation is that older, incontinent women likely were underrepresented in our sample due to institutionalization and survivor bias. The end result is that the prevalence of incontinence might have been overestimated for younger respondents (incontinent younger women more likely to respond) and underestimated for the older respondents (incontinent older women less likely to dwell in communities). The choice of our sampling frame and thus our ability to generalize the findings to other settings also might be viewed as problematic. This was not intended to be a community sample. We were interested in the impact of urinary incontinence on women attending family medicine clinics because these women are our patients and we have the opportunity to intervene to improve their quality of life.
Another limitation was that our questionnaire was available only in English. This might have affected our response rate and generalizability of these findings.
Despite these limitations, we believe the I-QOL is a useful instrument for the investigation of incontinence-related quality of life in the community and the clinical trial setting. We have begun to use the I-QOL among patients attending an incontinence clinic. We have found it to be well received and plan to report on its performance in this setting in the near future.
1. Jeter KF, Wagner DB. Incontinence in the American home. A survey of 36,500 people. J Am Geriatr Soc 1990;38:379-83.
2. Ware JE Jr. The status of health assessment 1994. Annu Rev Public Health 1995;16:327-54.
3. Shumaker SA, Wyman JF, Uebersax JS, McClish D, Fantl JA. Health-related quality of life measures for women with urinary incontinence: the Incontinence Impact Questionnaire and the Urogenital Distress Inventory. Continence Program in Women (CPW) Research Group. Qual Life Res 1994;3:291-306.
4. Uebersax JS, Wyman JF, Shumaker SA, McClish DK, Fantl JA. Short forms to assess life quality and symptom distress for urinary incontinence in women: the Incontinence Impact Questionnaire and the Urogenital Distress Inventory. Continence Program for Women Research Group. Neurourol Urodyn. 1995;14:131-9.
5. Wagner TH, Patrick DL, Bavendam TG, Martin ML, Buesching DP. Quality of life of persons with urinary incontinence: development of a new measure. Urology 1996;47:67-71.
6. Patrick DL, Martin ML, Bushnell DM, et al. Quality of life of women with urinary incontinence: further development of the incontinence quality of life instrument (I-QOL). Urology 1999;53:71-6.
7. Ware JE Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care 1996;34:220-33.
8. Dillman DA. Mail and Telephone Surveys: The Total Design Method. New York: John Wiley & Sons; 1978.
9. Schafer JL. Analysis of Incomplete Multivariate Data. London: Chapman & Hall; 1997.
10. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons; 1987.
11. Honaker J, Joseph A, King G, Scheve K, Singh N. Amelia: a program for missing data. 2001. Available at: http://gking.harvard.edu/stats/shtml. Accessed September 30, 2002.
12. Hastie TJ, Tibshirani RJ. Generalized Additive Models. London: Chapman & Hall; 1990.
13. Freund RJ, Wilson WJ. Regression Analysis. San Diego: Academic Press; 1998.
14. van Erkel AR, Pattynama PM. Receiver operating characteristic (ROC) analysis: basic principles and applications in radiology. Eur J Radiol 1998;27:88-94.
15. Stata Statistical Software. Version 7.0. College Station, TX: StataCorp; 2000.
16. Fox J. Multiple and Generalized Nonparametric Regression. Thousand Oaks, CA: Sage Publications; 2000.
Address reprint requests to Murray M. Finkelstein, MD, PhD, Family Medicine Centre, Suite 413, Mt Sinai Hospital, Toronto, ON M5G 1X5, Canada. E-mail: [email protected].
How should low-density lipoprotein cholesterol concentration be determined?
- The C-LDL-C remains the method of choice for LDL-C determination.
- The D-LDL-C has not been adequately standardized and was not used in the clinical trials which were the basis for the current NCEP-ATP III recommendations.
- Some D-LDL-C assays may give significantly different results from those of the C-LDL-C.
- Some D-LDL-C assays do not perform well in hypertriglyceridemia, the very situation for which they are advocated.
- Use of the D-LDL-C increases cost without evidence of benefit.
The National Cholesterol Education Program Adult Treatment Panel III Report (NCEP-ATP III) has identified low-density lipoprotein cholesterol (LDL-C) as the primary target of therapy.1,2 The Friedewald calculated LDL-C (C-LDL-C) is the preferred method2,3 and is calculated with the following equation:
LDL-C = TC – HDL-C – TG/5
where TC is total cholesterol concentration, HDL-C is high-density lipoprotein cholesterol concentration, and TG is triglyceride concentration. The complete NCEP-ATP III report has indicated that methods to directly measure LDL-C (D-LDL-C) in the non-fasting state have been developed and will grow in use but require careful quality control.2 Our VA hospital clinical laboratory is 1 of 10 hospitals in the South Central VA Health Care Network that routinely reports D-LDL-C rather than C-LDL-C levels to clinicians. Telephone calls to 4 other research and clinical laboratories found that all are using D-LDL-C to some extent.
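The Friedewald equation can be sketched as a small function, with all concentrations in mg/dL. The function name and the example panel are hypothetical; returning no value when triglycerides exceed 400 mg/dL is an illustrative choice based on the threshold this article gives for the calculated value becoming less reliable.

```python
from typing import Optional


def friedewald_ldl(tc: float, hdl: float, tg: float) -> Optional[float]:
    """Calculated LDL-C (mg/dL) as TC - HDL-C - TG/5.

    Returns None when TG is above 400 mg/dL (illustrative guard).
    """
    if tg > 400:
        return None
    return tc - hdl - tg / 5


# Hypothetical fasting lipid panel, for illustration only
ldl = friedewald_ldl(tc=200, hdl=50, tg=150)  # 200 - 50 - 30
print(ldl)
```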
D-LDL-C assays correlate variably with C-LDL-C measurements used in research studies.4-17 The purported advantages of such measurements are that fasting is not required and that D-LDL-C may be determined in patients with serum triglyceride levels greater than 400 mg/dL, when the C-LDL-C is less reliable. However, clinical trials demonstrating benefit of lowering LDL-C with drug therapy used the C-LDL-C.18-22 Only the recently reported Heart Protection Study used a non-fasting D-LDL-C.23 Thus, it is important in practicing evidence-based medicine to demonstrate that the D-LDL-C measurements are comparable to those of the C-LDL-C. The present study determined how the D-LDL-C correlated with C-LDL-C and how such a correlation would affect treatment decisions based on the NCEP-ATP III guidelines.
Methods
Data from all patients with a lipid panel during a single week were analyzed. Patients with triglyceride levels above 1000 mg/dL were excluded. Thirty-four patients with triglyceride levels between 400 and 1000 mg/dL were analyzed separately. A C-LDL-C was determined and compared with the D-LDL-C in all 464 patients. Total cholesterol, triglyceride, and HDL-C measurements were done with an autoanalyzer. D-LDL-C was measured with Sigma Diagnostics EZ LDL Cholesterol, procedure 358 (Sigma, St. Louis, MO). Linear regression was performed using Microsoft Excel (Microsoft Corporation, Redmond, WA).
Results
The samples in this study represented the expected distribution of LDL-C concentrations seen in a clinical practice of predominantly male veterans. Of the 464 patient samples with triglyceride levels below 400 mg/dL, the mean C-LDL-C was 123 mg/dL. Twenty-eight percent had a C-LDL-C below 100 mg/dL, 32% had a C-LDL-C of 100 to 129.9 mg/dL, 24% had a C-LDL-C of 130 to 159.9 mg/dL, 12% had a C-LDL-C of 160 to 189.9 mg/dL, and 4% had a C-LDL-C above 190 mg/dL.
The Figure shows the correlation between the C-LDL-C and D-LDL-C in all patients with triglyceride levels below 400 mg/dL. Although there is a strong correlation between the C-LDL-C and D-LDL-C (r = .86), the regression line does not go through 0. A C-LDL-C of 100 mg/dL or lower is the NCEP-ATP III goal for patients with known coronary heart disease (CHD) and other clinical forms of atherosclerotic disease, diabetes, or multiple risk factors that confer a 10-year risk for CHD greater than 20%.1 At this cutoff for C-LDL-C, the D-LDL-C derived from the regression line is 118 mg/dL. At a C-LDL-C of 160 mg/dL, the 2 values are comparable; at a C-LDL-C of 190 mg/dL, the D-LDL-C is slightly lower at 182 mg/dL. This is demonstrated graphically in the Figure by a dashed line indicating a perfect correlation between the 2 methods. The Figure also displays vertical and horizontal lines through an LDL-C of 100 mg/dL, the level above which drug therapy is likely to be started or increased in patients with CHD or CHD risk equivalents. This partition illustrates those patients who would require treatment when using the 100 mg/dL treatment goal by the C-LDL-C, the D-LDL-C, neither, or both. This is also shown in the Table, which shows the number of patients who would be treated with the LDL-C cutoffs for treatment recommended by NCEP-ATP III. At an LDL cutoff of 100 mg/dL, 60 patients (13% of total) would be treated with the D-LDL-C and not the C-LDL-C, whereas only 2 patients (<1%) would be treated with the C-LDL-C and not with the D-LDL-C. The results are similar when using a 130 mg/dL cutoff for treatment. Thus, treatment decisions based on the D-LDL-C result in many patients being treated who would not have been treated when using the C-LDL-C.
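The fitted regression coefficients are not reported. As a rough check only, the sketch below back-calculates the straight line passing through the 2 values read off the regression line in the text (a D-LDL-C of 118 mg/dL at a C-LDL-C of 100 mg/dL, and 182 mg/dL at 190 mg/dL) and asks where that line meets the line of identity. The function name is hypothetical, and the result inherits the rounding of the reported values.

```python
def line_through(p1, p2):
    """Slope and intercept of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


# (C-LDL-C, D-LDL-C) pairs in mg/dL, taken from the text
slope, intercept = line_through((100, 118), (190, 182))

d_at_160 = intercept + slope * 160     # implied D-LDL-C at a C-LDL-C of 160
crossover = intercept / (1 - slope)    # where D-LDL-C equals C-LDL-C

print(round(slope, 2), round(intercept, 1), round(d_at_160, 1), round(crossover, 1))
```

Under these assumptions the slope is below 1 and the intercept is well above 0, so the implied line sits above the line of identity at lower LDL-C values and meets it near 160 mg/dL, in keeping with the statement that the 2 values are comparable there.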
To determine whether triglyceride concentration influences treatment decisions by either method of LDL-C measurement, similar correlations and analyses were done on the data according to the following triglyceride groupings: <100 mg/dL, 100 to 199 mg/dL, 200 to 299 mg/dL, 300 to 399 mg/dL, and >400 mg/dL. This was further evaluated by plotting triglyceride vs D-LDL-C and triglyceride vs C-LDL-C (data not shown). Whereas the C-LDL-C showed no correlation with triglyceride, the D-LDL-C showed a statistically significant correlation with triglyceride concentrations (r = .27), indicating that D-LDL-C increases at higher triglyceride levels. This suggested an influence of triglyceride on the D-LDL-C assay. This has been reported by others in 3 of 4 different D-LDL-C assays including the Sigma assay.15 However, alterations in treatment possibilities when using the D-LDL-C are present at all triglyceride concentrations.
FIGURE 1
Direct vs calculated LDL-C (mg/dL)
TABLE
Effect of LDL assay by LDL treatment cutoff*
| LDL cutoff for treatment, mg/dL | Patients who might be treated, n (%): calculated LDL | Patients who might be treated, n (%): direct LDL | Additional patients who might be treated, n (%): calculated, not direct, LDL | Additional patients who might be treated, n (%): direct, not calculated, LDL |
|---|---|---|---|---|
| >100 | 334 (72) | 393 (85) | 2 (<1) | 60 (13) |
| >130 | 185 (40) | 237 (51) | 2 (<1) | 55 (12) |
| >160 | 71 (15) | 87 (19) | 6 (1) | 21 (5) |

*N = 464 patients. LDL, low-density lipoprotein.
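The partition behind the Table can be sketched as a count of paired results falling on either side of a treatment cutoff. The function name, the category labels, and the paired values below are hypothetical; the study's patient-level data are not reproduced here.

```python
from collections import Counter


def partition(pairs, cutoff=100):
    """Count patients above the cutoff by calculated LDL-C, direct LDL-C,
    both, or neither. `pairs` holds (calculated, direct) values in mg/dL."""
    counts = Counter()
    for calculated, direct in pairs:
        c_high, d_high = calculated > cutoff, direct > cutoff
        if c_high and d_high:
            counts["both"] += 1
        elif d_high:
            counts["direct only"] += 1
        elif c_high:
            counts["calculated only"] += 1
        else:
            counts["neither"] += 1
    return counts


# Hypothetical paired results, for illustration only
sample = [(95, 112), (140, 150), (88, 90), (104, 99), (98, 121)]
result = partition(sample)
print(dict(result))
```

The "direct only" and "calculated only" counts correspond to the 2 right-hand columns of the Table, that is, patients whose treatment decision would change with the choice of assay.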
Discussion
LDL-C has been identified in the NCEP-ATP III as the primary target of therapy. Treatment recommendations for high LDL-C are based on low, moderate, or high risk for CHD, with treatment goals of 160, 130, and 100 mg/dL, respectively.1,2
These evidence-based recommendations rely on data from clinical trials demonstrating prevention of CHD events by lowering LDL-C, all of which, with the exception of the recently reported Heart Protection Study, used the C-LDL-C.18-23 Thus, important treatment decisions depend on this estimated LDL-C, and systematic deviations from the C-LDL-C will affect treatment decisions and cost.
The Sigma EZ LDL D-LDL-C assay in our hospital produces higher LDL-C levels than the C-LDL-C in a range of 100 to 160 mg/dL, the range of most common concern to clinicians. This results in inappropriate treatment or intensification of treatment according to the NCEP-ATP III guidelines. The D-LDL-C was higher than the C-LDL-C at all triglyceride levels, but the error was greater for hypertriglyceridemia, the very situation for which it has been advocated.
Previous publications using a D-LDL-C assay have emphasized the correlation between the D-LDL-C assay and research LDL-C determinations rather than the correlation with the C-LDL-C.5-8 Other investigators have observed a similar tendency for higher D-LDL-C than C-LDL-C measurements at an LDL-C of 100 mg/dL17 and a positive bias at higher triglyceride levels.14 Although C-LDL-C was often performed, data similar to those shown in the Figure, ie, the simple correlation between the D-LDL-C and C-LDL-C, have not been presented. Two very recent reviews have suggested caution in routinely implementing the D-LDL-C assays and pointed out the considerable variation from one assay to another.14,15 Laboratories often change their assay method; in fact, our hospital laboratory has recently changed to a different D-LDL-C method.
Physicians and institutions should be cautious about using a D-LDL-C method as a substitute for the C-LDL-C. First, it has not been standardized in large populations and, with the exception of the recent Heart Protection Study,23 has not been used in large clinical trials demonstrating the benefits of lowering LDL-C. Although the C-LDL-C has been recommended by the NCEP-ATP III,2 the Executive Summary of these guidelines did not address the method for measuring LDL-C.1 Second, cost is increased from the additional therapy and performing the D-LDL-C assay. Third, the major reasons proposed for using a D-LDL-C assay (lack of need for a fasting specimen and usefulness at triglyceride > 400 mg/dL) may not be valid or relevant. Variation in the LDL-C due to hypertriglyceridemia occurs with the D-LDL-C. In addition, the NCEP-ATP III report emphasized triglyceride and recommended a fasting lipid panel including total cholesterol, triglycerides, and HDL-C.1,2 One limitation of this study is the inclusion of predominantly male veterans. There may be populations, not considered in this study, that have an abnormal lipoprotein composition that significantly affects the C-LDL-C.
Conclusions
The C-LDL-C should remain the method of choice for LDL-C determinations because (1) this assay was used in clinical trials documenting the benefits of cholesterol-lowering therapy and (2) use of the D-LDL-C increases cost without evidence of benefit. Further studies are needed to standardize the direct LDL-C assays, and outcome trials using these assays need to be performed.
ACKNOWLEDGMENTS
This work was supported in part by the Biomedical Research Foundation of Arkansas and the Central Arkansas Veterans Healthcare System.
1. Executive summary of the Third Report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III). JAMA 2001;285:2486-97.
2. National Cholesterol Education Program. Third Report of the Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III), full report, manuscript version, 2001. Available at: www.nhlbi.nih.gov/guidelines/cholesterol/atp3_rpt.htm. Accessed April 2, 2002.
3. Friedewald WT, Levy RI, Fredrickson DS. Estimation of the concentration of low-density lipoprotein cholesterol in plasma without use of the preparative ultracentrifuge. Clin Chem 1972;18:499-502.
4. Cobbaert C, Broodman I, Swart GR, et al. Performance of a direct, immunoseparation based LDL-cholesterol method compared to Friedewald calculation and a polyvinyl sulphate precipitation method. Eur J Clin Chem Clin Biochem 1995;33:417-24.
5. Jialal I, Hirany SV, Devaraj S, et al. Comparison of an immunoprecipitation method for direct measurement of LDL-cholesterol with beta-quantification (ultracentrifugation). Am J Clin Pathol 1995;104:76-81.
6. Nauck M, Graziani MS, Bruton D, et al. Analytical and clinical performance of a detergent based homogeneous LDL-cholesterol assay: a multicenter evaluation. Clin Chem 2000;46:506-14.
7. Hirany S, Li D, Jialal I. A more valid measurement of low-density lipoprotein cholesterol in diabetic patients. Am J Med 1997;102:48-53.
8. Whiting MJ, Shephard MDS, Tallis GA. Measurement of plasma LDL cholesterol in patients with diabetes. Diabetes Care 1997;20:12-4.
9. McNamara JR, Cole TG, Contois JH, et al. Immunoseparation method for measuring low density lipoprotein cholesterol directly from serum evaluated. Clin Chem 1995;41:232-40.
10. Pisani T, Gebski CP, Leary ET, et al. Accurate direct determination of low-density lipoprotein cholesterol using an immunoseparation reagent and enzymatic cholesterol assay. Arch Pathol Lab Med 1995;119:1127-35.
11. Maitra A, Hirany SV, Jialal I. Comparison of two assays for measuring LDL cholesterol. Clin Chem 1997;43:1040-7.
12. Yu HH, Markowitz R, De Ferranti SD, et al. Direct measurement of LDL-C in children: performance of two surfactant-based methods in a general pediatric population. Clin Biochem 2000;95:89-95.
13. Sakaue T, Hirano T, Yoshino G, et al. Reactions of direct LDL-cholesterol assays with pure LDL fraction and IDL: comparison of three homogeneous methods. Clin Chim Acta 2000;295:97-106.
14. Nauck M, Warnick GR, Rifai N. Methods for measurement of LDL-cholesterol: a critical assessment of direct measurement by homogeneous assays versus calculation. Clin Chem 2002;48:236-54.
15. Miller WG, Waymack PP, Anderson FP, Ethridge SF, Jayne EC. Performance of four homogeneous direct methods for LDL-cholesterol. Clin Chem 2002;48:489-98.
16. Smets EML, Pequerlaux NCV, Blaton V, Goldschmidt HMJ. Analytical performance of a direct assay for LDL-cholesterol. Clin Chem Lab Med 2001;39:270-80.
17. Yu HH, Ginsburg GS, Harris N, Rifai N. Evaluation and clinical application of a direct low-density lipoprotein cholesterol assay in normolipidemic and hyperlipidemic adults. Am J Cardiol 1997;80:1295-99.
18. Shepherd J, Cobbe SM, Ford I, et al. Prevention of coronary heart disease with pravastatin in men with hypercholesterolemia. N Engl J Med 1995;333:1301-7.
19. Downs JR, Clearfield M, Weis S, et al. Primary prevention of acute coronary events with lovastatin in men and women with average cholesterol levels: results of AFCAPS/TexCAPS. Air Force/Texas Coronary Atherosclerosis Prevention Study. JAMA 1998;279:1615-22.
20. Scandinavian Simvastatin Survival Study Group. Design and baseline results of the Scandinavian Simvastatin Survival Study of patients with stable angina and/or previous myocardial infarction. Am J Cardiol 1993;71:393-400.
21. Long-term Intervention with Pravastatin in Ischaemic Disease (LIPID) Study Group. Prevention of cardiovascular events and death with pravastatin in patients with coronary heart disease and a broad range of initial cholesterol levels. N Engl J Med 1998;339:1349-57.
22. Sacks FM, Pfeffer MA, Moye LA, et al. The effect of pravastatin on coronary events after myocardial infarction in patients with average cholesterol levels. N Engl J Med 1996;335:1001-9.
23. MRC/BHF Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of cholesterol lowering with simvastatin in 20,536 high-risk individuals: a randomized placebo-controlled trial. Lancet 2002;360:7-22.
- The C-LDL-C remains the method of choice for LDL-C determination.
- The D-LDL-C has not been adequately standardized and was not used in the clinical trials which were the basis for the current NCEP-ATP III recommendations.
- Some D-LDL-C assays may give significantly different results from those of the C-LDL-C.
- Some D-LDL-C assays do not perform well in hypertriglyceridemia, the very situation for which they are advocated.
- Use of the D-LDL-C increases cost without evidence of benefit.
The National Cholesterol Education Program Adult Treatment Panel III Report (NCEP-ATP III) has identified low-density lipoprotein cholesterol (LDL-C) as the primary target of therapy.1,2 The Friedewald calculated LDL-C (C-LDL-C) is the preferred method2,3 and is calculated with the following equation:
LDL-C = TC − HDL-C − TG/5
where TC is total cholesterol concentration, HDL-C is high-density lipoprotein cholesterol concentration, and TG is triglyceride concentration. The complete NCEP-ATP III report has indicated that methods to directly measure LDL-C (D-LDL-C) in the non-fasting state have been developed and will grow in use but require careful quality control.2 Our VA hospital clinical laboratory is 1 of 10 hospitals in the South Central VA Health Care Network that routinely reports D-LDL-C rather than C-LDL-C levels to clinicians. Telephone calls to 4 other research and clinical laboratories found that all are using D-LDL-C to some extent.
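The Friedewald equation above can be sketched as a short function. This is only an illustration: it assumes all concentrations are in mg/dL, and the function name, the `tg_limit` parameter, and the choice to return `None` above 400 mg/dL (the level at which the calculated value is described as less reliable) are assumptions made for the example, not part of the study's methods.

```python
from typing import Optional


def friedewald_ldl(tc: float, hdl: float, tg: float,
                   tg_limit: float = 400.0) -> Optional[float]:
    """Calculated LDL-C (mg/dL) as TC - HDL-C - TG/5.

    Returns None when triglycerides exceed tg_limit, because the
    calculation is considered less reliable in that range
    (hypothetical convention chosen for this sketch).
    """
    if tg > tg_limit:
        return None
    return tc - hdl - tg / 5.0


# Hypothetical patient: TC 200, HDL-C 50, TG 150 (all mg/dL)
print(friedewald_ldl(tc=200, hdl=50, tg=150))  # 120.0
```

With a triglyceride level of 500 mg/dL the same call would return `None` rather than a number.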
D-LDL-C assays correlate variably with C-LDL-C measurements used in research studies.4-17 The purported advantages of such measurements are that fasting is not required and that D-LDL-C may be determined in patients with serum triglyceride levels greater than 400 mg/dL, when the C-LDL-C is less reliable. However, clinical trials demonstrating benefit of lowering LDL-C with drug therapy used the C-LDL-C.18-22 Only the recently reported Heart Protection Study used a non-fasting D-LDL-C.23 Thus, it is important in practicing evidence-based medicine to demonstrate that the D-LDL-C measurements are comparable to those of the C-LDL-C. The present study determined how the D-LDL-C correlated with C-LDL-C and how such a correlation would affect treatment decisions based on the NCEP-ATP III guidelines.
Methods
Data from all patients with a lipid panel during a single week were analyzed. Patients with triglyceride levels above 1000 mg/dL were excluded. Thirty-four patients with triglyceride levels between 400 and 1000 mg/dL were analyzed separately. A C-LDL-C was determined and compared with the D-LDL-C in all 464 patients. Total cholesterol, triglyceride, and HDL-C measurements were done with an autoanalyzer. D-LDL-C was measured with Sigma Diagnostics EZ LDL Cholesterol, procedure 358 (Sigma, St. Louis, MO). Linear regression was performed using Microsoft Excel (Microsoft Corporation, Redmond, WA).
Results
The samples in this study represented the expected distribution of LDL-C concentrations seen in a clinical practice of predominantly male veterans. Of the 464 patient samples with triglyceride levels below 400 mg/dL, the mean C-LDL-C was 123 mg/dL. Twenty-eight percent had a C-LDL-C below 100 mg/dL, 32% had a C-LDL-C of 100 to 129.9 mg/dL, 24% had a C-LDL-C of 130 to 159.9 mg/dL, 12% had a C-LDL-C of 160 to 189.9 mg/dL, and 4% had a C-LDL-C above 190 mg/dL.
The Figure shows the correlation between the C-LDL-C and D-LDL-C in all patients with triglyceride levels below 400 mg/dL. Although there is a strong correlation between the C-LDL-C and D-LDL-C (r = .86), the regression line does not go through 0. A C-LDL-C of 100 mg/dL or lower is the NCEP-ATP III goal for patients with known coronary heart disease (CHD) and other clinical forms of atherosclerotic disease, diabetes, or multiple risk factors that confer a 10-year risk for CHD greater than 20%.1 At this cutoff for C-LDL-C, the D-LDL-C derived from the regression line is 118 mg/dL. At a C-LDL-C of 160 mg/dL, the 2 values are comparable; at a C-LDL-C of 190 mg/dL, the D-LDL-C is slightly lower at 182 mg/dL. This is demonstrated graphically in the Figure by a dashed line indicating a perfect correlation between the 2 methods. The Figure also displays vertical and horizontal lines through an LDL-C of 100 mg/dL, the level above which drug therapy is likely to be started or increased in patients with CHD or CHD risk equivalents. This partition illustrates those patients who would require treatment when using the 100 mg/dL treatment goal by the C-LDL-C, the D-LDL-C, neither, or both. This is also shown in the Table, which gives the number of patients who would be treated at the LDL-C cutoffs for treatment recommended by NCEP-ATP III. At an LDL cutoff of 100 mg/dL, 60 patients (13% of total) would be treated with the D-LDL-C and not the C-LDL-C, whereas only 2 patients (<1%) would be treated with the C-LDL-C and not with the D-LDL-C. The results are similar when using a 130 mg/dL cutoff for treatment. Thus, treatment decisions based on the D-LDL-C result in many patients being treated who would not have been treated when using the C-LDL-C.
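The regression coefficients themselves are not reported, but the two quoted points on the line (a D-LDL-C of 118 mg/dL at a C-LDL-C of 100 mg/dL, and 182 mg/dL at 190 mg/dL) imply an approximate slope and intercept. The sketch below is a reconstruction from those rounded values only, so the resulting numbers are rough estimates and not the fitted coefficients; the helper name `line_through` is hypothetical.

```python
def line_through(p1, p2):
    """Slope and intercept of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# (C-LDL-C, D-LDL-C) pairs quoted in the text, in mg/dL
slope, intercept = line_through((100.0, 118.0), (190.0, 182.0))

# Predicted D-LDL-C at a C-LDL-C of 160 mg/dL
d_at_160 = slope * 160.0 + intercept

# C-LDL-C at which the implied line crosses the line of identity (D = C)
crossover = intercept / (1.0 - slope)

print(round(slope, 2), round(intercept, 1))     # 0.71 46.9
print(round(d_at_160, 1), round(crossover, 1))  # 160.7 162.3
```

Under these assumptions the implied line has a slope below 1 and a positive intercept, which agrees with the statements that the 2 methods are comparable near 160 mg/dL and that the D-LDL-C reads higher below that level and lower above it.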
To determine whether triglyceride concentration influences treatment decisions by either method of LDL-C measurement, similar correlations and analyses were done on the data according to the following triglyceride groupings: <100 mg/dL, 100 to 199 mg/dL, 200 to 299 mg/dL, 300 to 399 mg/dL, and >400 mg/dL. This was further evaluated by plotting triglyceride vs D-LDL-C and triglyceride vs C-LDL-C (data not shown). Whereas the C-LDL-C showed no correlation with triglyceride, the D-LDL-C showed a statistically significant correlation with triglyceride concentrations (r = .27), indicating that D-LDL-C increases at higher triglyceride levels. This suggested an influence of triglyceride on the D-LDL-C assay. This has been reported by others in 3 of 4 different D-LDL-C assays including the Sigma assay.15 However, alterations in treatment possibilities when using the D-LDL-C are present at all triglyceride concentrations.
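The stratified analysis described above can be sketched as follows. The study used Microsoft Excel; this Python version is only an illustration, the function names are hypothetical, the sample values are invented, and placing a triglyceride level of exactly 400 mg/dL in the highest group is an assumption, since the stated groupings leave that value unassigned.

```python
import math


def tg_stratum(tg: float) -> str:
    """Assign a triglyceride level (mg/dL) to one of the study's groupings."""
    if tg < 100:
        return "<100"
    if tg < 200:
        return "100-199"
    if tg < 300:
        return "200-299"
    if tg < 400:
        return "300-399"
    return ">=400"  # assumption: 400 itself is placed in the top group


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented triglyceride and D-LDL-C values (mg/dL), for illustration only
tg_values = [80.0, 150.0, 220.0, 310.0, 390.0]
d_ldl_values = [105.0, 118.0, 112.0, 131.0, 127.0]

strata = [tg_stratum(tg) for tg in tg_values]
r = pearson_r(tg_values, d_ldl_values)
print(strata)
print(round(r, 2))
```

A positive `r` for triglyceride vs D-LDL-C, with no comparable correlation for the C-LDL-C, is the pattern the text reports.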
FIGURE 1
Direct vs calculated LDL-C (mg/dL)
TABLE
Effect of LDL assay by LDL treatment cutoff*
LDL cutoff for treatment, mg/dL | Patients who might be treated, calculated LDL, n (%) | Patients who might be treated, direct LDL, n (%) | Additional patients who might be treated, calculated but not direct LDL, n (%) | Additional patients who might be treated, direct but not calculated LDL, n (%) |
---|---|---|---|---|
>100 | 334 (72) | 393 (85) | 2 (<1) | 60 (13) |
>130 | 185 (40) | 237 (51) | 2 (<1) | 55 (12) |
>160 | 71 (15) | 87 (19) | 6 (1) | 21 (5) |
*N = 464 patients. LDL, low-density lipoprotein.
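The cross-classification summarized in the Table can be sketched as a simple count over paired measurements. The function name and the sample pairs below are hypothetical and serve only to show the logic; they are not study data. The comparison is strict (above the cutoff), matching the ">100" notation of the Table.

```python
from collections import Counter


def discordance(pairs, cutoff):
    """Count patients above a treatment cutoff by each LDL-C method.

    pairs: iterable of (calculated_ldl, direct_ldl) values in mg/dL.
    """
    counts = Counter()
    for c_ldl, d_ldl in pairs:
        by_calculated = c_ldl > cutoff
        by_direct = d_ldl > cutoff
        if by_calculated and by_direct:
            counts["both"] += 1
        elif by_direct:
            counts["direct_only"] += 1
        elif by_calculated:
            counts["calculated_only"] += 1
        else:
            counts["neither"] += 1
    return counts


# Invented (C-LDL-C, D-LDL-C) pairs in mg/dL, for illustration only
sample = [(95, 112), (105, 121), (88, 92), (101, 99), (140, 150)]
result = discordance(sample, cutoff=100)
print(dict(result))
```

In this invented sample, 2 patients exceed the cutoff by both methods and 1 each falls into the direct-only, calculated-only, and neither categories.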
Discussion
LDL-C has been identified in the NCEP-ATP III as the primary target of therapy. Treatment recommendations for high LDL-C are based on low, moderate, or high risk for CHD, with treatment goals of 160, 130, and 100 mg/dL, respectively.1,2
These evidence-based recommendations rely on data from clinical trials demonstrating prevention of CHD events by lowering LDL-C, all of which, with the exception of the recently reported Heart Protection Study, used the C-LDL-C.18-23 Thus, important treatment decisions depend on this estimated LDL-C, and systematic deviations from the C-LDL-C will affect treatment decisions and cost.
The Sigma EZ LDL D-LDL-C assay in our hospital produces higher LDL-C levels than the C-LDL-C in a range of 100 to 160 mg/dL, the range of most common concern to clinicians. This results in inappropriate treatment or intensification of treatment according to the NCEP-ATP III guidelines. The D-LDL-C was higher than the C-LDL-C at all triglyceride levels, but the error was greater in hypertriglyceridemia, the very situation for which it has been advocated.
Previous publications using a D-LDL-C assay have emphasized the correlation between the D-LDL-C assay and research LDL-C determinations rather than the correlation with the C-LDL-C.5-8 Other investigators have observed a similar tendency for D-LDL-C measurements to exceed C-LDL-C measurements at an LDL-C of 100 mg/dL,17 and a positive bias at higher triglyceride levels.14 Although C-LDL-C was often performed, data similar to those shown in the Figure, ie, the simple correlation between the D-LDL-C and C-LDL-C, have not been presented. Two very recent reviews have suggested caution in routinely implementing the D-LDL-C assays and pointed out the considerable variation from one assay to another.14,15 Laboratories often change their assay method; in fact, our hospital laboratory has recently changed to a different D-LDL-C method.
Physicians and institutions should be cautious about using a D-LDL-C method as a substitute for the C-LDL-C. First, it has not been standardized in large populations and, with the exception of the recent Heart Protection Study,23 has not been used in large clinical trials demonstrating the benefits of lowering LDL-C. Although the C-LDL-C has been recommended by the NCEP-ATP III,2 the Executive Summary of these guidelines did not address the method for measuring LDL-C.1 Second, cost is increased both by the additional therapy and by performing the D-LDL-C assay. Third, the major reasons proposed for using a D-LDL-C assay (lack of need for a fasting specimen and usefulness at triglyceride > 400 mg/dL) may not be valid or relevant. Variation in the LDL-C due to hypertriglyceridemia occurs with the D-LDL-C. In addition, the NCEP-ATP III report emphasized triglyceride and recommended a fasting lipid panel including total cholesterol, triglycerides, and HDL-C.1,2 One limitation of this study is the inclusion of predominantly male veterans. There may be populations, not considered in this study, that have an abnormal lipoprotein composition that significantly affects the C-LDL-C.
Conclusions
The C-LDL-C should remain the method of choice for LDL-C determinations because (1) this assay was used in clinical trials documenting the benefits of cholesterol-lowering therapy and (2) use of the D-LDL-C increases cost without evidence of benefit. Further studies are needed to standardize the direct LDL-C assays, and outcome trials using these assays need to be performed.
ACKNOWLEDGMENTS
This work was supported in part by the Biomedical Research Foundation of Arkansas and the Central Arkansas Veterans Healthcare System.
- The C-LDL-C remains the method of choice for LDL-C determination.
- The D-LDL-C has not been adequately standardized and was not used in the clinical trials which were the basis for the current NCEP-ATP III recommendations.
- Some D-LDL-C assays may give significantly different results from those of the C-LDL-C.
- Some D-LDL-C assays do not perform well in hypertriglyceridemia, the very situation for which they are advocated.
- Use of the D-LDL-C increases cost without evidence of benefit.
The National Cholesterol Education Program Adult Treatment Panel III Report (NCEP-ATP III) has identified low-density lipoprotein cholesterol (LDL-C) as the primary target of therapy.1,2The Friedewald calculated LDL-C (C-LDL-C) is the preferred method2,3 and is calculated with the following equation:
LDL-C =TC – HDL-C – TG/5
where TC is total cholesterol concentration, HDL-C is high-density lipoprotein cholesterol concentration, and TG is triglyceride concentration. The complete NCEP-ATP III report has indicated that methods to directly measure LDL-C (D-LDL-C) in the non-fasting state have been developed and will grow in use but require careful quality control.2Our VA hospital clinical laboratory is 1 of 10 hospitals in the South Central VA Health Care Network that routinely reports D-LDL-C rather than C-LDL-C levels to clinicians. Telephone calls to 4 other research and clinical laboratories found that all are using D-LDL-C to some extent.
D-LDL-C assays correlate variably with C-LDL-C measurements used in research studies.4-17The purported advantages of such measurements are that fasting is not required and that D-LDL-C may be determined in patients with serum triglyceride levels greater than 400 mg/dL when the C-HDL-C and are less reliable. However, clinical trials demonstrating benefit of lowering LDL-C with drug therapy used the C-LDL-C.18-22Only the recently reported Heart Protection Study used a non-fasting D-LDL-C.23Thus, it is important in practicing evidence-based medicine to demonstrate that the D-LDL-C measurements are comparable to those of the C-LDL-C. The present study determined how the D-LDL-C correlated with C-LDL-C and how such a correlation would affect treatment decisions based on the NCEP-ATP III guidelines.
Methods
Data from all patients with a lipid panel during a single week were analyzed. Patients with triglyceride levels above 1000 mg/dL were excluded. Thirty-four patients with triglyceride levels between 400 and 1000 mg/dL were analyzed separately. A C-LDL-C was determined and compared with the D-LDL-C in all 464 patients. Total cholesterol, triglyceride, and HDL-C measurements were done with an autoanalyzer. D-LDL-C was measured with Sigma Diagnostics EZ LDL Cholesterol, procedure 358 (Sigma, St. Louis, MO). Linear regression was performed using Microsoft Excel (Microsoft Corporation, Redmond, WA).
Results
The samples in this study represented the expected distribution of LDL-C concentrations seen in a clinical practice of predominantly male veterans. Of the 464 patient samples with triglyceride levels below 400 mg/dL, the mean C-LDL-C was 123 mg/dL. Twenty-eight percent had a C-LDL-C below 100 mg/dL, 32% had a C-LDL-C of 100 to 129.9 mg/dL, 24% had a C-LDL-C of 130 to 159.9 mg/dL, 12% had a C-LDL-C of 160 to 189.9 mg/dL, and 4% had a C-LDL-C above 190 mg/dL.
The Figure shows the correlation between the C-LDL-C and D-LDL-C in all patients with triglyceride levels below 400 mg/dL. Although there is a strong correlation between the C-LDL-C and D-LDL-C (r = .86), the regression line does not go through 0. A C-LDL-C of 100 mg/dL or lower is the NCEP-ATP III goal for patients with known coronary heart disease (CHD) and other clinical forms of atherosclerotic disease, diabetes, or multiple risk factors that confer a 10-year risk for CHD greater than 20%.1At this cutoff for C-LDL-C, the D-LDL-C derived from the regression line is 118 mg/dL. At a C-LDL-C of 160 mg/dL, the 2 values are comparable; at a C-LDL-C of 190 mg/dL, the D-LDL-C is slightly lower at 182 mg/dL. This is demonstrated graphically in the Figure by a dashed line indicating a perfect correlation between the 2 methods. The Figure also displays vertical and horizontal lines through an LDL-C of 100 mg/dL, the level above which drug therapy is likely to be started or increased in patients with CHD or CHD risk equivalents. This partition illustrates those patients who would require treatment when using the 100 mg/dL treatment goal by the C-LDL-C, the D-LDLC, neither, or both. This is also shown in the (Table, which shows the number of patients who would be treated with the LDL-C cutoffs for treatment recommended by NCEP-ATP III. At an LDL cutoff of 100 mg/dL, 60 patients (13% of total) would be treated with the D-LDL-C and not the C-LDL-C, whereas only 2 patients (<1%) would be treated with the C-LDL-C and not with the D-LDL-C. The results are similar when using a 130 mg/dL cutoff for treatment. Thus, treatment decisions based on the D-LDL-C results in many patients being treated who would not have been treated when using the C-LDL-C.
To determine whether triglyceride concentration influences treatment decisions by either method of LDL-C measurement, similar correlations and analyses were done on the data according to the following triglyceride groupings: <100 mg/dL, 100 to 199 mg/dL, 200 to 299 mg/dL, 300 to 399 mg/dL, and >400 mg/dL. This was further evaluated by plotting triglyceride vs D-LDL-C and triglyceride vs C-LDL-C (data not shown). Whereas the C-LDL-C showed no correlation with triglyceride, the DLDL-C showed a statistically significant correlation with triglyceride concentrations (r = .27), indicating that D-LDL-C increases at higher triglyceride levels. This suggested an influence of triglyceride on the D-LDL-C assay. This has been reported by others in 3 of 4 different D-LDL-C assays including the Sigma assay.15 However, alterations in treatment possibilities when using the D-LDL-C are present at all triglyceride concentrations.
FIGURE 1
Direct vs calculated LDL-C (mg/dL)
TABLE
Effect of LDL assay by LDL treatment cutoff*
Patients who might be treated, n (%) | Additional patients who might be treated, n (%) | |||
---|---|---|---|---|
LDL cutoff for treatment, mg/dL | Calculated LDL | Direct LDL | Calculated, not direct, LDL | Direct, not calculated, LDL |
>100 | 334 (72) | 393 (85) | 2 (<1) | 60 (13) |
>130 | 185 (40) | 237 (51) | 2 (<1) | 55 (12) |
>160 | 71 (15) | 87 (19) | 6 (1) | 21 (5) |
*N = 464 patients. | ||||
LDL, low-density lipoprotein. |
Discussion
LDL-C has been identified in the NCEP-ATP III as the primary target of therapy. Treatment recommendations for high LDL-C are based on low, moderate, or high risk for CHD, with treatment goals of 160, 130, and 100 mg/dL, respectively.1,2
These evidence-based recommendations rely on data from clinical trials demonstrating prevention of CHD events by lowering LDL-C, all of which, with the exception of the recently reported Heart Protection Study, used the CLDL-C.18-23 Thus, important treatment decisions depend on this estimated LDL-C, and systematic deviations from the C-LDL-C will affect treatment decisions and cost.
The Sigma EZ LDL D-LDL-C assay in our hospital produces higher LDL-C levels than the C-LDL-C in a range of 100 to 160 mg/dL, the range of most common concern to clinicians. This results in inappropriate treatment or intensification in treatment according to the NCEP-ATP III guidelines. The DLDL-C was higher than the C-LDL-C at all triglyceride levels, but the error was greater for hypertriglyceridemia, the very situation for which it has been advocated.
Previous publications using a D-LDL-C assay have emphasized the correlation between the DLDL-C assay and research LDL-C determinations rather than the correlation with the C-LDL-C.5-8 Other investigators have observed a similar tendency for higher D-LDL-C than C-LDL-C measurements at an LDL-C of 100 mg/dL17 and a positive bias at higher triglyceride levels.14 Although C-LDLC was often performed, data similar to those shown in the Figure, ie, the simple correlation between the D-LDL-C and C-LDL-C, have not been presented. Two very recent reviews have suggested caution in routinely implementing the D-LDL-C assays and pointed out the considerable variation from one assay to another.14,15 Laboratories often change their assay method; in fact, our hospital laboratory has recently changed to a different DLDL-C method.
Physicians and institutions should be cautious about using a D-LDL-C method as a substitute for the C-LDL-C. First, it has not been standardized in large populations and, with the exception of the recent Heart Protection Study,23 has not been used in large clinical trials demonstrating the benefits of lowering LDL-C. Although the C-LDL-C has been recommended by the NCEP-ATP III,2 the Executive Summary of these guidelines did not address the method for measuring LDL-C.1 Second, cost is increased from the additional therapy and performing the D-LDL-C assay. Third, the major reasons proposed for using a D-LDL-C assay (lack of need for a fasting specimen and usefulness at triglyceride > 400 mg/dL) may not be valid or relevant. Variation in the LDL-C due to hypertriglyceridemia occurs with the D-LDL-C. In addition, the NCEP-ATP III report emphasized triglyceride and recommended a fasting lipid panel including total cholesterol, triglycerides, and HDL-C.1,2 One limitation of this study is the inclusion of predominantly male veterans. There may be populations, not considered in this study, that have an abnormal lipoprotein composition that significantly affects the C-LDL-C.
Conclusions
The C-LDL-C should remain the method of choice for LDL-C determinations because (1) this assay was used in clinical trials documenting the benefits of cholesterol-lowering therapy and (2) use of the D-LDL-C increases cost without evidence of benefit. Further studies are needed to standardize the direct LDL-C assays, and outcome trials using these assays need to be performed.
ACKNOWLEDGMENTS
This work was supported in part by the Biomedical Research Foundation of Arkansas and the Central Arkansas Veterans Healthcare System.
1. Executive summary of the Third Report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III). JAMA 2001;285:2486-97.
2. National Cholesterol Education Program. Third Report of the Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III), full report, manuscript version, 2001. Available at: www.nhlbi.nih.gov/guide-lines/cholesterol/atp3_rpt.htm. Accessed April 2, 2002.
3. Friedewald WT, Levy RI, Fredrickson DS. Estimation of the concentration of low-density lipoprotein cholesterol in plasma without use of the preparative ultracentrifuge. Clin Chem 1972;18:499-502.
4. Cobbaert C, Broodman I, Swart GR, et al. Performance of a direct, immunoseparation based LDL-cholesterol method compared to Friedewald calculation and a polyvinyl sulphate precipitation method. Eur J Clin Chem Clin Biochem. 1995;33:417-24.
5. Jialal I, Hirany SV, Devaraj S, et al. Comparison of an immunoprecipitation method for direct measurement of LDL- cholesterol with beta-quantification (ultracentrifugation). Am J Clin Pathol 1995;104:76-81.
6. Nauck M, Graziani MS, Bruton D. et al: Analytical and clinical performance of a detergent based homogeneous LDL-cholesterol assay: a multicenter evaluation Clin Chem 2000;46:506-14.
7. Hirany S, Li D, Jialal I. A more valid measurement of low-density lipoprotein cholesterol in diabetic patients Am J Med 1997;102:48-53.
8. Whiting MJ, Shephard MDS, Tallis GA. Measurement of plasma LDL cholesterol in patients with diabetes Diabetes Care. 1997;20:12-4.
9. McNamara JR, Cole TG, Contois JH, et al. Immunoseparation method for measuring low density lipoprotein cholesterol directly from serum evaluated. Clin Chem 1995;41:232-40.
10. Pisani T, Gebski CP, Leary ET, et al. Accurate direct determination of lowdensity lipoprotein cholesterol using an immunoseparation reagent and enzymatic cholesterol assay. Arch Pathol Lab Med 1995;119:1127-35.
11. Maitra A, Hirany SV, Jialal I. Comparison of two assays for measuring LDL cholesterol Clin Chem 1997;43:1040-7.
12. Yu HH, Markowitz R, De Ferranti SD, et al. Direct measurement of LDL-C in children performance of two surfactant-based methods in a general pediatric population. Clin Biochem 2000;95:89-95.
13. Sakaue T, Hirano T, Yoshino G, et al. Reactions of direct LDL-cho-lesterol assays with pure LDL fraction and IDL-comparison of three homogeneous methods. Clin Chim Acta 2000;295:97-106.
14. Nauck M, Warnick GR, Rifai N. Methods for measurement of LDL-cholesterol a critical assessment of direct measurement by homogeneous assays versus calculation. Clin Chem 2002;48:236-54.
15. Miller WG, Waymack PP, Anderson FP, Ethridge SF, Jayne EC. Performance of four homogeneous direct methods for LDL-cholesterol. Clin Chem 2002;48:489-98.
16. Smets EML, Pequerlaux NCV, Blaton V, Goldschmidt HMJ. Analytical performance of a direct assay for LDL-cholesterol. Clin Chem Lab Med 2001;39:270-80.
17. Yu HH, Ginsburg GS, Harris N, Rifai N. Evaluation and clinical application of a direct low-density lipoprotein cholesterol assay in normolipidemic and hyperlipidemic adults. Am J Cardiol 1997;80:1295-99.
18. Shepherd J, Cobbe SM, Ford I, et al. Prevention of coronary heart disease with pravastatin in men with hypercholesterolemia. N Engl J Med 1995;333:1301-7.
19. Downs JR, Clearfield M, Weis S, et al. Primary prevention of acute coronary events with lovastatin in men and women with average cholesterol levels: results of AFCAPS/TexCAPS. Air Force/Texas Coronary Atherosclerosis Prevention Study. JAMA 1998;279:1615-22.
20. Scandinavian Simvastatin Survival Study Group. Design and baseline results of the Scandinavian Simvastatin Survival Study of patients with stable angina and/or previous myocardial infarction Am J Cardiol 1993;71:393-400.
21. Long-term Intervention with Pravastatin in Ischaemic Disease (LIPID) Study Group. Prevention of cardiovascular events and death with pravastatin in patients with coronary heart disease and a broad range of initial cholesterol levels N Engl J Med 1998;339:1349-57.
22. Sacks FM, Pfeffer MA, Moye LA, et al. The effect of pravastatin on coronary events after myocardial infarction in patients with average cholesterol levels. N Engl J Med 1996;335:1001-9.
23. MRC/BHF Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of cholesterol lowering with simvastatin in 20,536 high-risk individuals: a randomized placebo-controlled trial Lancet 2002;360:7-22.
1. Executive summary of the Third Report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III). JAMA 2001;285:2486-97.
2. National Cholesterol Education Program. Third Report of the Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III), full report, manuscript version, 2001. Available at: www.nhlbi.nih.gov/guide-lines/cholesterol/atp3_rpt.htm. Accessed April 2, 2002.
3. Friedewald WT, Levy RI, Fredrickson DS. Estimation of the concentration of low-density lipoprotein cholesterol in plasma without use of the preparative ultracentrifuge. Clin Chem 1972;18:499-502.
4. Cobbaert C, Broodman I, Swart GR, et al. Performance of a direct, immunoseparation based LDL-cholesterol method compared to Friedewald calculation and a polyvinyl sulphate precipitation method. Eur J Clin Chem Clin Biochem. 1995;33:417-24.
5. Jialal I, Hirany SV, Devaraj S, et al. Comparison of an immunoprecipitation method for direct measurement of LDL- cholesterol with beta-quantification (ultracentrifugation). Am J Clin Pathol 1995;104:76-81.
6. Nauck M, Graziani MS, Bruton D. et al: Analytical and clinical performance of a detergent based homogeneous LDL-cholesterol assay: a multicenter evaluation Clin Chem 2000;46:506-14.
7. Hirany S, Li D, Jialal I. A more valid measurement of low-density lipoprotein cholesterol in diabetic patients Am J Med 1997;102:48-53.
8. Whiting MJ, Shephard MDS, Tallis GA. Measurement of plasma LDL cholesterol in patients with diabetes Diabetes Care. 1997;20:12-4.
9. McNamara JR, Cole TG, Contois JH, et al. Immunoseparation method for measuring low density lipoprotein cholesterol directly from serum evaluated. Clin Chem 1995;41:232-40.
10. Pisani T, Gebski CP, Leary ET, et al. Accurate direct determination of low-density lipoprotein cholesterol using an immunoseparation reagent and enzymatic cholesterol assay. Arch Pathol Lab Med 1995;119:1127-35.
11. Maitra A, Hirany SV, Jialal I. Comparison of two assays for measuring LDL cholesterol. Clin Chem 1997;43:1040-7.
12. Yu HH, Markowitz R, De Ferranti SD, et al. Direct measurement of LDL-C in children: performance of two surfactant-based methods in a general pediatric population. Clin Biochem 2000;95:89-95.
13. Sakaue T, Hirano T, Yoshino G, et al. Reactions of direct LDL-cholesterol assays with pure LDL fraction and IDL: comparison of three homogeneous methods. Clin Chim Acta 2000;295:97-106.
14. Nauck M, Warnick GR, Rifai N. Methods for measurement of LDL-cholesterol: a critical assessment of direct measurement by homogeneous assays versus calculation. Clin Chem 2002;48:236-54.
15. Miller WG, Waymack PP, Anderson FP, Ethridge SF, Jayne EC. Performance of four homogeneous direct methods for LDL-cholesterol. Clin Chem 2002;48:489-98.
16. Smets EML, Pequerlaux NCV, Blaton V, Goldschmidt HMJ. Analytical performance of a direct assay for LDL-cholesterol. Clin Chem Lab Med 2001;39:270-80.
17. Yu HH, Ginsburg GS, Harris N, Rifai N. Evaluation and clinical application of a direct low-density lipoprotein cholesterol assay in normolipidemic and hyperlipidemic adults. Am J Cardiol 1997;80:1295-99.
18. Shepherd J, Cobbe SM, Ford I, et al. Prevention of coronary heart disease with pravastatin in men with hypercholesterolemia. N Engl J Med 1995;333:1301-7.
19. Downs JR, Clearfield M, Weis S, et al. Primary prevention of acute coronary events with lovastatin in men and women with average cholesterol levels: results of AFCAPS/TexCAPS. Air Force/Texas Coronary Atherosclerosis Prevention Study. JAMA 1998;279:1615-22.
20. Scandinavian Simvastatin Survival Study Group. Design and baseline results of the Scandinavian Simvastatin Survival Study of patients with stable angina and/or previous myocardial infarction. Am J Cardiol 1993;71:393-400.
21. Long-term Intervention with Pravastatin in Ischaemic Disease (LIPID) Study Group. Prevention of cardiovascular events and death with pravastatin in patients with coronary heart disease and a broad range of initial cholesterol levels. N Engl J Med 1998;339:1349-57.
22. Sacks FM, Pfeffer MA, Moye LA, et al. The effect of pravastatin on coronary events after myocardial infarction in patients with average cholesterol levels. N Engl J Med 1996;335:1001-9.
23. MRC/BHF Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of cholesterol lowering with simvastatin in 20,536 high-risk individuals: a randomized placebo-controlled trial. Lancet 2002;360:7-22.
Cancer risk assessment from family history: Gaps in primary care practice
OBJECTIVE: To determine whether an adequate amount of family history is being collected and recorded by family practitioners to appropriately identify patients at increased risk for cancer.
STUDY DESIGN: Retrospective chart audit.
POPULATION: Charts from 500 randomly chosen patients, 40 to 60 years of age, were audited. Of those charts, 400 were from a large academic family practice and 50 charts each were from 2 small community family practices in the greater Philadelphia area.
OUTCOMES MEASURED: General features of family history taking were recorded, including presence of a family history and date when recorded, evidence of updated family history data, and presence of a genogram. Cancer features recorded included mention of family history of cancer or colon polyps and, if positive, identification of which relative was affected, site of cancer, and age of diagnosis or death.
RESULTS: Most charts (89%) had some family history information recorded, and 55% listed a family history of cancer, either positive or negative. Of the 356 relatives affected with cancer, an age of diagnosis was documented in only 8%; of 183 first-degree relatives with cancer, only 7% had a documented age of diagnosis. Two percent of all charts had any mention of a family history of colon polyps. Sixty-five percent of family histories were recorded at the first visit, and only 35% had any updated family history information.
CONCLUSIONS: The number and type of family histories currently being recorded by family practitioners are not adequate to fully assess familial risk of cancer. New strategies will need to be developed to better prepare providers for risk-based clinical decision making.
Taking a family history is a significant component of providing comprehensive primary care because family history provides key psychosocial and medical risk information.1,2 As our understanding of the genetic basis for disease grows, obtaining an accurate and complete family history is likely to gain increasing relevance as a vital source of data to guide counseling and testing. Patients who have a first-degree relative with a colon neoplasm or prostate cancer are advised to screen differently from those who do not.3 Failure to gather accurate or complete family data prevents the clinician from providing advice that is consistent with screening guidelines. Understanding how primary care clinicians gather family history data is necessary to identify gaps in current performance and to develop strategies to bridge these gaps.
Several studies have used physician self-report to assess the current level of taking a family history in primary care. In 1 study, 90% of surveyed physicians stated that they obtain a family history of cancer from their patients, with 77% to 80% inquiring about a family history of colorectal cancer in their patients who are at least 40 years of age.4 In another study, 63% to 85% of responding physicians reported obtaining a family history of cancer from 76% to 100% of their patients.5 Respondents in 1 study reported obtaining family histories of colorectal cancer in only 30.7% of patients, breast cancer in 48.4% of patients, and coronary disease and hypertension in 94.3% of patients.6 However, data on actual performance of family history taking are sparse.7,8 The physicians in the Direct Observation of Primary Care study obtained a family history during 51% of new visits and 22% of established visits. A genogram was present in 11% of charts, and documentation of a family history of breast cancer or colorectal cancer was found in 40% of charts. Further analysis from this study showed that providers who more frequently obtained and recorded family history information performed more preventive care services for their patients.9
First-degree relatives of cancer patients appear to be interested in and receptive to information about their risk and in the possibility of genetic counseling.10-14 In fact, many patients overestimate their likelihood of getting cancer based on family history10; primary care providers thus have the opportunity to counsel and relieve anxiety in their patients. Family history is an important tool to define risk and guide referral, counseling, and testing.
This article presents the findings of a descriptive study of the review of 500 charts from 3 different family practice offices. We documented general family history components and completeness related to a family history of cancer, including whether enough family history information was collected to appropriately identify patients at increased risk for cancer.
Methods
Data were collected from 500 patient charts from 3 family practice offices in the greater Philadelphia area. Patients were included if they had been in the practice for at least 1 year, had made a minimum of 2 visits between June 1, 1997, and June 1, 2000, and were between the ages of 40 and 60 years at 1 of those visits. Fifty charts were selected by using a random starting point at each of 2 small (1 to 3 providers) private practices, and 400 charts were randomly selected from all eligible charts in a large academic practice (more than 60 providers). The large practice had nearly 1500 patients who fit the selection criteria, and the total population did not differ from the random sample in mean age, sex, or race (P > .05). Total population characteristics were not available for the 2 smaller practices.
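The selection procedure described above can be sketched in code. This is a minimal illustration and not the authors' actual procedure: the function names, the fixed sampling interval, the chart identifiers, and the size of the small practice are assumptions introduced here. The source states only that a random starting point was used at each small practice and that 400 charts were randomly selected from nearly 1500 eligible charts at the large practice.

```python
import random

def systematic_sample(chart_ids, n, rng):
    """Select n charts at a fixed interval from a random starting point.

    The fixed-interval scheme is an assumption; the source mentions only
    a random starting point.
    """
    if n > len(chart_ids):
        raise ValueError("sample size exceeds number of eligible charts")
    step = len(chart_ids) // n               # interval between selected charts
    start = rng.randrange(len(chart_ids))    # random starting point
    # Wrap around the end of the list so every start yields n distinct charts.
    return [chart_ids[(start + i * step) % len(chart_ids)] for i in range(n)]

def simple_random_sample(chart_ids, n, rng):
    """Select n charts uniformly at random without replacement."""
    return rng.sample(chart_ids, n)

rng = random.Random(0)  # fixed seed so the illustration is reproducible

# Hypothetical chart identifiers; 1500 approximates the "nearly 1500"
# eligible charts at the large practice, and 200 is an invented size
# for one small practice.
large_practice = [f"A{i:04d}" for i in range(1500)]
small_practice = [f"B{i:03d}" for i in range(200)]

sample_large = simple_random_sample(large_practice, 400, rng)
sample_small = systematic_sample(small_practice, 50, rng)
```

Drawing without replacement keeps any chart from being audited twice, which matters because every percentage reported in the Results uses the 500 audited charts as its denominator.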
Family history data were collected from progress notes and designated family history spaces on flow sheets or chart covers. Family histories consisting only of “none” or “noncontributory” were counted only when they clearly referred to a specific condition. Family births or deaths recorded in a psychosocial context also were not counted as part of the family history unless death from a specific disease was mentioned. The first dated family history was considered to be first for the purpose of this study. Date of birth, race, sex, current primary care provider, if any, and the date first seen in the practice were recorded. The current primary care provider was determined by the physician seen for the majority of recent visits and/or notations in the chart at acute visits. Data collected for the primary care providers included practice site, sex, level of training, and years in practice. Variables collected from family history information included date of first family history; date of most recent family history; presence of a genogram; presence of a patient-completed family history self-questionnaire; whether any mention was made, positive or negative, of cancer or colon polyps; and whether there was a positive family history of cancer or colon polyps. For individuals with a positive family history of cancer or colon polyps, all details recorded in the chart were abstracted; these included site of cancer or polyp, relationship to patient, age at diagnosis, and age at death. The data were entered into an Access 97 database and stored separately from chart number identifiers. All analyses and tests were done in SAS version 6.12.
Results
Demographic data from the 500 patients whose charts were audited are presented in Table 1. Ninety-seven percent of patients had a primary care provider; these providers comprised 60 physicians and 3 nurse practitioners. No significant associations were seen between practice site, sex, or level of training of the provider and the presence of family history information in the chart.
Most patients (89%) had some family history information in the chart and 63% had a genogram. This did not differ by sex or race of the patient. Fifty-seven percent of patients supplied family history information at the first visit to the office; 59% of these patients had no family history data recorded on subsequent visits. Only 31% of charts had updated family history information (Table 2). For patients who had been in the same practice for at least 5 years and had some family history in the chart, 20% had some updated information within the past 3 years.
Of the 500 charts, 276 (55%) recorded a family history of cancer, positive or negative. Two hundred fifteen patients (43%) had a positive family history of cancer, with a total of 356 relatives affected. The site of cancer was listed for 88% of all family member cancers, with breast, colon, lung, and prostate being the most common cancer locations. The specific relative was identified in 92% of cases, with most being first (51%) or second (37%) degree. Although degree of relative and cancer location were usually recorded, age at diagnosis was listed for only 8% of affected relatives, and age at death was identified for 19% of relatives with cancer.
For patients with affected first-degree relatives, the group with the greatest clinical significance, primary care providers identified the location of the cancer in 93% of cases but listed the ages at diagnosis and death in only 7% and 31%, respectively (Table 3). Only 7 medical records (1.4%) had any mention of a family history of polyps; of these, 5 (1%) were positive. None listed an age at diagnosis. Five patients in our study met the American Society of Clinical Oncology criteria to be evaluated for genetic breast and ovarian cancer syndrome, and no patients met the criteria for hereditary nonpolyposis colon cancer.
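The percentages reported so far in the Results can be checked against the raw counts given in the text. The short sketch below recomputes them from the 500-chart denominator; the dictionary labels and the helper name `percent` are illustrative choices made here, while every count comes from the paragraphs above.

```python
TOTAL_CHARTS = 500  # charts audited across the 3 practices

# Counts reported in the Results; labels are illustrative.
counts = {
    "any mention of a family history of cancer": 276,
    "positive family history of cancer": 215,
    "any mention of a family history of polyps": 7,
    "positive family history of polyps": 5,
}

def percent(count, total=TOTAL_CHARTS):
    """Return count as a percentage of total."""
    return 100.0 * count / total

for label, count in counts.items():
    print(f"{label}: {count}/{TOTAL_CHARTS} = {percent(count):.1f}%")

# Mean number of affected relatives per chart with a positive cancer history
# (356 affected relatives across 215 charts).
relatives_per_positive_chart = 356 / 215
print(f"affected relatives per positive chart: {relatives_per_positive_chart:.2f}")
```

The recomputed values (55.2%, 43.0%, 1.4%, and 1.0%) agree with the rounded figures in the text, and each positive chart lists roughly 1.7 affected relatives on average.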
The 2 community practices intermittently used patient self-administered medical intake questionnaires. In our sample, 31 of 500 patients (6%) had a questionnaire in the chart. All patients who completed questionnaires had family history data in their charts. Use of a questionnaire was associated with a greater likelihood that the physician recorded the age of diagnosis for a relative with cancer, although this did not reach significance (20% vs 7%).
Discussion
Despite our finding that providers are documenting family histories in most charts, very few are recording the age of diagnosis in relatives diagnosed with cancer. Age of diagnosis plays a critical role in determining screening recommendations and identifying patients with possible genetic syndromes. For example, the Amsterdam criteria used to identify families with hereditary nonpolyposis colon cancer include knowing whether 1 of 3 relatives with colorectal cancer was diagnosed at younger than 50 years. Breast and ovarian cancer syndrome should be suspected when breast and/or ovarian cancer is diagnosed in 2 first-degree relatives younger than 50 years.15
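To illustrate why a missing age at diagnosis blocks risk assessment, the sketch below encodes only the simplified breast and ovarian cancer rule stated above (2 first-degree relatives diagnosed at younger than 50 years). It is an assumption-laden illustration, not a clinical tool: the `Relative` record, the function name, and the three-valued return are introduced here, and the full published criteria contain further conditions that are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Relative:
    degree: int                             # 1 = first-degree, 2 = second-degree
    cancer_site: str                        # e.g. "breast", "ovary", "colon"
    age_at_diagnosis: Optional[int] = None  # None when the chart omits the age

def early_onset_pair(relatives: List[Relative],
                     sites=("breast", "ovary"),
                     cutoff: int = 50) -> Optional[bool]:
    """Simplified check: at least 2 first-degree relatives diagnosed before cutoff.

    Returns True or False when the chart supports a decision, and None
    when missing ages make the rule impossible to evaluate.
    """
    candidates = [r for r in relatives
                  if r.degree == 1 and r.cancer_site in sites]
    confirmed = sum(1 for r in candidates
                    if r.age_at_diagnosis is not None
                    and r.age_at_diagnosis < cutoff)
    unknown = sum(1 for r in candidates if r.age_at_diagnosis is None)
    if confirmed >= 2:
        return True
    if confirmed + unknown >= 2:
        return None   # the undocumented ages could tip the decision either way
    return False

# A chart that records the site and relative but not the age cannot be classified.
complete = [Relative(1, "breast", 45), Relative(1, "ovary", 48)]
missing_age = [Relative(1, "breast", 45), Relative(1, "breast")]
late_onset = [Relative(1, "breast", 45), Relative(1, "breast", 62)]
```

With these records, `early_onset_pair(complete)` is `True`, `early_onset_pair(late_onset)` is `False`, and `early_onset_pair(missing_age)` is `None`. The last case corresponds to the large majority of charts in this study, where the site and relative were recorded but the age at diagnosis was not.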
Most physicians obtain family histories at initial visits. If the patient’s initial visit is for symptom- or disease-related care, an opportunity to gather family history data may be lost. New tools to consistently capture comprehensive family history data at this first visit may be beneficial. Patient self-administered intake questionnaires may prove valuable in this respect, but only 6% of charts in our study contained such a questionnaire, so we cannot draw conclusions about its impact. We did observe a trend toward gathering more complete family history data in patients who used a questionnaire.
There are no clear guidelines regarding when to update a family history. We arbitrarily chose 3 years as a reasonable period for primary care providers to explore changes in family history status. Updating at any subsequent visit was recorded for 35% of patient charts in which a family history was initially taken. It is not clear that primary care providers are documenting changes in family history in any systematic way. Opportunistic updating likely occurs when a new diagnosis of serious disease in a close family member produces anxiety, stress, or concern in the patient. The value of updating family history and the ideal interval to reexamine family history are unknown.
Several conditions would need to be met for family history updating to have value. (1) A close relative must have developed an important illness in the interval since the last family history was recorded or the update must discover family information that was previously missed. (2) The illness must have a familial component that affects the estimate of the risk of the identified patient. (3) The clinician would need to be aware of the updated information. (4) The clinician must change recommendations to the patient based on this new information. Discovering a new family history of colonic neoplasm satisfies these conditions. Process measures that have the potential to improve updating include adding an update of family history as an item on a preventive care flow sheet or using periodic self-administered patient questionnaires. Whether adequate improvements in health care would occur to justify these changes in process will need to be studied. If any updating has value, determining the appropriate intervals for systematic updates deserves attention.
Charts in this study included a genogram 63% of the time, a substantially higher proportion than the 11% noted in the Direct Observation of Primary Care study.9 This discrepancy may be explained by differences in practice types because 1 study suggested a higher genogram use in academic medical centers than in community practices.16 The genogram has been cited as an attractive and efficient way to document family history,17,18 but over one fourth of the charts that contained family history in our study used the more cumbersome narrative form. Many geneticists predict that our ability to apply genetic testing will grow dramatically over the next decade. Optimal application of this new knowledge will rely on the health care system’s capacity to accurately identify risk based on assessment of family history. The 3-generation pedigree is likely to be a key tool in finding individuals who may benefit from testing. However, there is currently no standardized education in family history taking in many undergraduate and graduate medical education programs.19
Although patients with a first-degree relative with a history of polyps diagnosed at younger than 60 years are considered to be at increased risk for colorectal cancer,3 providers infrequently asked about a family history of polyps. This may reflect a recent finding that only 36% of primary care providers recommend screening at the age of 40 years for their patients with a family history of polyps in relatives younger than 60 years.20 In fact, family history data do not consistently influence behavior: in the same study, gastroenterologists asked about a family history of polyps 93% of the time, but only 37% recommended earlier screening in those with such a history.
The study is limited in its use of only 3 primary care practices, 1 of which was a large academic family practice. However, because the charts of 63 different clinicians were represented, a range of educational backgrounds and personal philosophies toward family history taking was included. Most patients (97%) had a clearly identified primary care provider and patients had been members of the practice for an average of 7.6 years. The sample was specifically chosen to review the charts of individuals who had been enrolled in the same practice for at least 1 year. This may explain the higher rates of family history taking found in this study compared with previously published studies. Given that the vast majority of charts did contain some family history, it is even more compelling that age at diagnosis of cancer was inadequately recorded. This study reflected only what was documented in the patient chart and not direct observation of physician behavior regarding the family history. It is likely that physicians are not recording all responses to inquiries about family history, although the extent of this underreporting is unknown.
Conclusion
Findings in this chart review study are consistent with previous work showing that the quantity and type of family history currently being recorded in primary care charts are not adequate to fully assess familial risk. Bridging the gap between recommendations and actual practice will demand interventions to alter primary care practice or the introduction of new models to gather and analyze family data. Further research is also needed to evaluate the impact of improved family history taking on health care costs and outcomes.
Acknowledgments
The authors thank Howard Rabinowitz, MD, for providing helpful suggestions in the development and execution of this project and Aliza Mansolino for the preparation of the manuscript.
1. Brownson RC, Davis JR, Simms SG, Kern TG, Harmon RG. Cancer control knowledge and priorities among primary care physicians. J Cancer Educ 1993;8:35-41.
2. Emery J, Rose P. Expanding the role of the family history in primary care (editorial). Br J Gen Pract 1999;49(441):260-1.
3. Smith RA, von Eschenbach AC, Wender RC, et al. American Cancer Society guidelines for the early detection of cancer: update of early detection guidelines for prostate, colorectal, and endometrial cancers. CA Cancer J Clin 2001;51:38-75.
4. Polednak AP. Screening for colorectal cancer by primary-care physicians in Long Island (New York) and Connecticut. Cancer Detect Prev 1989;13:301-9.
5. Acton RT, Burst NM, Casebeer L, et al. Knowledge, attitudes, and behaviors of Alabama’s primary care physicians regarding cancer genetics. Acad Med 2000;75:850-2.
6. Summerton N, Garrood PV. The family history in family practice: a questionnaire study. Fam Pract 1997;14:285-8.
7. Del Mar C, Lowe JB, Adkins P, Arnold E. What is the quality of general practitioner records in Australia? Aust Fam Phys 1996;suppl 1:S21-5.
8. Medalie JH, Zyzanski SJ, Langa D, Stange KC. The family in family practice: is it a reality? J Fam Pract 1998;46:390-6.
9. Medalie JH, Zyzanski SJ, Goodwin MA, Stange KC. Two physician styles of focusing on the family. J Fam Pract 2000;49:209-15.
10. de Bock GH, Perk DC, Oosterwijk JC, Hageman GC, Kievit J, Springer MP. Women worried about their familial breast cancer risk: a study on genetic advice in general practice. Fam Pract 1997;14:40-3.
11. Graham ID, Logan DM, Hughes-Benzie R, et al. How interested is the public in genetic testing for colon cancer susceptibility? Report of a cross-sectional population survey. Cancer Prev Control 1998;2:167-72.
12. Bosompra K, Flynn BS, Ashikaga T, Rairikar CJ, Worden JK, Solomon LJ. Likelihood of undergoing genetic testing for cancer risk: a population-based study. Prev Med 2000;30:155-66.
13. Kinney AY, Choi YA, DeVellis B, Kobetz E, Millikan RC, Sandler RS. Interest in genetic testing among first-degree relatives of colorectal cancer patients. Am J Prev Med 2000;18:249-52.
14. Petersen GM, Larkin E, Codori AM, et al. Attitudes toward colon cancer gene testing: survey of relatives of colon cancer patients. Cancer Epidemiol Biomarkers Prev 1999;8(4 pt 2):337-44.
15. Statement of the American Society of Clinical Oncology: genetic testing for cancer susceptibility, adopted on February 20, 1996. J Clin Oncol 1996;14:1730-6.
16. Rogers J, Halloway R. Completion rate and reliability of the self-administered genogram. Fam Pract 1990;7:149-51.
17. Rogers J, Durkin M, Kelly K. The family genogram: an underutilized clinical tool. N J Med 1985;82:887-92.
18. Rogers J, Durkin M. The semi-structured genogram interview: I. Protocol, II. Evaluation. Fam Syst Med 1984;2:176-87.
19. Shore WB, Wilkie HA, Croughan-Minihane M. Family of origin genograms: evaluation of a teaching program for medical students. Fam Med 1994;26:238-43.
20. Physicians lax in screening patients with family history of adenomatous polyps. Reuters Health 2000;May 22.
Address reprint requests to Randa Sifri, MD, Department of Family Medicine, 1015 Walnut Street, Suite 401, Philadelphia, PA 19107. E-mail: [email protected].
OBJECTIVE: To determine whether an adequate amount of family history is being collected and recorded by family practitioners to appropriately identify patients at increased risk for cancer.
STUDY DESIGN: Retrospective chart audit.
POPULATION: Charts from 500 randomly chosen patients, 40 to 60 years of age, were audited. Of those charts, 400 were from a large academic family practice and 50 charts each were from 2 small community family practices in the greater Philadelphia area.
OUTCOMES MEASURED: General features of family history taking were recorded, including presence of a family history and date when recorded, evidence of updated family history data, and presence of a genogram. Cancer features recorded included mention of family history of cancer or colon polyps and, if positive, identification of which relative was affected, site of cancer, and age of diagnosis or death.
RESULTS: Most charts (89%) had some family history information recorded, and 55% listed a family history of cancer, either positive or negative. Of the 356 relatives affected with cancer, an age of diagnosis was documented in only 8%; of 183 first-degree relatives with cancer, only 7% had a documented age of diagnosis. Two percent of all charts had any mention of a family history of colon polyps. Sixty-five percent of family histories were recorded at the first visit, and only 35% had any updated family history information.
CONCLUSIONS: The number and type of family histories currently being recorded by family practitioners are not adequate to fully assess familial risk of cancer. New strategies will need to be developed to better prepare providers for risk-based clinical decision making.
Taking a family history is a significant component of providing comprehensive primary care because family history provides key psychosocial and medical risk information.1,2 As our understanding of the genetic basis for disease grows, obtaining an accurate and complete family history is likely to gain increasing relevance as a vital source of data to guide counseling and testing. Patients who have a first-degree relative with a colon neoplasm or prostate cancer are advised to screen differently from those who do not.3 Failure to gather accurate or complete family data prevents the clinician from providing advice that is consistent with screening guidelines. Understanding how primary care clinicians gather family history data is necessary to identify gaps in current performance and to develop strategies to bridge these gaps.
Several studies have used physician self-report to assess the current level of taking a family history in primary care. In 1 study, 90% of surveyed physicians stated that they obtain a family history of cancer from their patients, with 77% to 80% inquiring about a family history of colorectal cancer in their patients who are at least 40 years of age.3 In another study, 63% to 85% of responding physicians reported obtaining a family history of cancer from 76% to 100% of their patients.5 Respondents in 1 study reported obtaining family histories of colorectal cancer in only 30.7% of patients, breast cancer in 48.4% of patients, and coronary disease and hypertension in 94.3% of patients.6 However, data on actual performance of family history taking are sparse.7,8 The physicians in the Direct Observation of Primary Care study obtained a family history during 51% of new visits and 22% of established visits. A genogram was present in 11% of charts, and documentation of a family history of breast cancer or colorectal cancer was found in 40% of charts. Further analysis from this study showed that providers who more frequently obtained and recorded family history information performed more preventive care services for their patients.9
First-degree relatives of cancer patients appear to be interested in and receptive to information about their risk and in the possibility of genetic counseling.10-14 In fact, many patients overestimate their likelihood of getting cancer based on family history10; primary care providers thus have the opportunity to counsel and relieve anxiety in their patients. Family history is an important tool to define risk and guide referral, counseling, and testing.
This article presents the findings of a descriptive study of the review of 500 charts from 3 different family practice offices. We documented general family history components and completeness related to a family history of cancer, including whether enough family history information was collected to appropriately identify patients at increased risk for cancer.
Methods
Data were collected from 500 patient charts from 3 family practice offices in the greater Philadelphia area. Patients who were in the practice for at least 1 year, made a minimum of 2 visits between June 1, 1997 and June 1, 2000, and were between the ages of 40 and 60 years at 1 of those visits were included. Fifty charts were selected by using a random starting point at each of 2 small (1 to 3 providers) private practices, and 400 charts were randomly selected from all eligible charts in a large academic practice (more than 60 providers). The large practice had nearly 1500 patients who fit the selection criteria, and the total population did not differ from the random sample in mean age, sex, or race (P < .05). Total population characteristics were not available for the 2 smaller practices.
Family history data were collected from progress notes and designated family history spaces on flow sheets or chart covers. Family histories consisting only of “none” or “noncontributory” were counted only when they clearly referred to a specific condition. Family births or deaths recorded in a psychosocial context also were not counted as part of the family history unless death from a specific disease was mentioned. The first dated family history was considered to be first for the purpose of this study. Date of birth, race, sex, current primary care provider, if any, and the date first seen in the practice were recorded. The current primary care provider was determined by the physician seen for the majority of recent visits and/or notations in the chart at acute visits. Data collected for the primary care providers included practice site, sex, level of training, and years in practice. Variables collected from family history information included: date of first family history; date of most recent family history; presence of a genogram, presence of a patient-completed family history self-questionnaire; whether any mention was made, positive or negative, of cancer or colon polyps; and whether there was a positive family history of cancer or colon polyps. For individuals with a positive family history of cancer or colon polyps, all details recorded in the chart were abstracted; these included site of cancer or polyp, relationship to patient, age at diagnosis, and age at death. The data were entered into an Access 97 database and stored separately from chart number identifiers. All analyses and tests were done in SAS version 6.12.
Results
Demographic data from the 500 patients whose charts were audited are presented in Table 1. Ninety-seven percent of patients had a primary care provider, which included 60 physicians and 3 nurse practitioners. No significant associations were seen with practice site, sex, or level of training of the provider and the presence of family history information in the chart.
Most patients (89%) had some family history information in the chart and 63% had a genogram. This did not differ by sex or race of the patient. Fifty-seven percent of patients supplied family history information at the first visit to the office; 59% of these patients had no family history data recorded on subsequent visits. Only 31% of charts had updated family history information Table 2. For patients who had been in the same practice for at least 5 years and had some family history in the chart, 20% had some updated information within the past 3 years.
Of the 500 charts, 276 (55%) recorded a family history of cancer, positive or negative. Two hundred fifteen patients (43%) had a positive family history of cancer, with a total of 356 relatives affected. The site of cancer was listed for 88% of all family member cancers, with breast, colon, lung, and prostate being the most common cancer locations. The specific relative was identified in 92% of cases, with most being first (51%) or second (37%) degree. Although degree of relative and cancer location were usually recorded, age at diagnosis was listed for only 8% of affected relatives, and age at death was identified for 19% of relatives with cancer.
For patients with affected first-degree relatives, the group with the greatest clinical significance, primary care providers identified the location of the cancer in 93% of cases but listed the ages at diagnosis and death in only 7% and 31%, respectively Table 3. Only 7 medical records (1.4%) had any mention of a family history of polyps; of these, 5 (1%) were positive. None listed an age at diagnosis. Five patients in our study met the American Society of Clinical Oncology criteria to be evaluated for genetic breast and ovarian cancer syndrome, and no patients met the criteria for hereditary nonpolyposis colon cancer.
The 2 community practices intermittently used patient self-administered medical intake questionnaires. In our sample, 31 of 500 patients (6%) had a questionnaire in the chart. All patients who completed questionnaires had family history data in their charts. Use of a questionnaire was associated with a greater likelihood that the physician recorded the age of diagnosis for a relative with cancer, although this did not reach significance (20% vs 7%).
Discussion
Despite our finding that providers are documenting family histories in most charts, very few are recording the age of diagnosis in relatives diagnosed with cancer. Age of diagnosis plays a critical role in determining screening recommendations and identifying patients with possible genetic syndromes. For example, the Amsterdam criteria used to identify families with hereditary nonpolyposis colon cancer include knowing whether 1 of 3 relatives with colorectal cancer was diagnosed at younger than 50 years. Breast and ovarian cancer syndrome should be suspected when breast and/or ovarian cancer is diagnosed in 2 first-degree relatives younger than 50 years.15
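The dependence on age at diagnosis can be illustrated with a minimal sketch. It encodes only the single sentence above about breast and ovarian cancer syndrome (2 first-degree relatives diagnosed at younger than 50 years), not the full American Society of Clinical Oncology criteria, and the names `Relative` and `suspect_breast_ovarian_syndrome` are hypothetical, introduced here for illustration. The point is that a missing age leaves the rule undecidable rather than negative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Relative:
    """One affected relative as abstracted from a chart (hypothetical structure)."""
    degree: int                             # 1 = first-degree, 2 = second-degree
    cancer_site: str                        # e.g. "breast", "ovarian", "colon"
    age_at_diagnosis: Optional[int] = None  # None when the chart omits the age


def suspect_breast_ovarian_syndrome(relatives: List[Relative]) -> Optional[bool]:
    """Simplified rule: 2 first-degree relatives with breast and/or ovarian
    cancer diagnosed at younger than 50 years.

    Returns True or False when the recorded ages settle the question, and
    None when missing ages make the rule impossible to evaluate.
    """
    candidates = [
        r for r in relatives
        if r.degree == 1 and r.cancer_site in ("breast", "ovarian")
    ]
    confirmed = sum(
        1 for r in candidates
        if r.age_at_diagnosis is not None and r.age_at_diagnosis < 50
    )
    unknown = sum(1 for r in candidates if r.age_at_diagnosis is None)

    if confirmed >= 2:
        return True
    if confirmed + unknown >= 2:
        return None  # could be met, but the chart lacks the ages to tell
    return False
```

With ages recorded for only 7% of affected first-degree relatives, most charts with two such relatives would fall into the undetermined branch of a rule like this.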
Most physicians obtain family histories at initial visits. If the patient’s initial visit is for symptom- or disease-related care, an opportunity to gather family history data may be lost. New tools to consistently capture comprehensive family history data at this first visit may be beneficial. Patient self-administered intake questionnaires may prove valuable in this respect, but only 6% of charts in our study contained such a questionnaire, so we cannot draw conclusions about its impact. We did observe a trend toward gathering more complete family history data in patients who used a questionnaire.
There are no clear guidelines regarding when to update a family history. We arbitrarily chose 3 years as a reasonable period for primary care providers to explore changes in family history status. Updating at any subsequent visit was recorded for 35% of patient charts in which a family history was initially taken. It is not clear that primary care providers are documenting changes in family history in any systematic way. Opportunistic updating likely occurs when a new diagnosis of serious disease in a close family member produces anxiety, stress, or concern in the patient. The value of updating family history and the ideal interval to reexamine family history are unknown.
Several conditions would need to be met for family history updating to have value. (1) A close relative must have developed an important illness in the interval since the last family history was recorded or the update must discover family information that was previously missed. (2) The illness must have a familial component that affects the estimate of the risk of the identified patient. (3) The clinician would need to be aware of the updated information. (4) The clinician must change recommendations to the patient based on this new information. Discovering a new family history of colonic neoplasm satisfies these conditions. Process measures that have the potential to improve updating include adding an update of family history as an item on a preventive care flow sheet or using periodic self-administered patient questionnaires. Whether adequate improvements in health care would occur to justify these changes in process will need to be studied. If any updating has value, determining the appropriate intervals for systematic updates deserves attention.
Charts in this study included a genogram 63% of the time, a significant increase over the 11% noted in the Direct Observation of Primary Care study.9 This discrepancy may be explained by differences in practice types because 1 study suggested a higher genogram use in academic medical centers than in community practices.16 The genogram has been cited as an attractive and efficient way to document family history,17,18 but over one fourth of the charts that contained family history in our study used the more cumbersome narrative form. Many geneticists predict that our ability to apply genetic testing will grow dramatically over the next decade. Optimal application of this new knowledge will rely on the health care system’s capacity to accurately identify risk based on assessment of family history. The 3-generation pedigree is likely to be a key tool in finding individuals who may benefit from testing. However, there is currently no standardized education in family history taking in many undergraduate and graduate medical education programs.19
Although patients with a first-degree relative with a history of polyps diagnosed at younger than 60 years are considered to be at increased risk for colorectal cancer,3 providers infrequently asked about a family history of polyps. This may reflect a recent finding that only 36% of primary care providers recommend screening at the age of 40 years for their patients with a family history of polyps in relatives younger than 60 years.20 In fact, family history data do not consistently influence behavior: in the same study, gastroenterologists asked about a family history of polyps 93% of the time, but only 37% recommended earlier screening in those with such a history.
The study is limited in its use of only 3 primary care practices, 1 of which was a large academic family practice. However, because the charts of 63 different clinicians were represented, a range of educational backgrounds and personal philosophies toward family history taking was included. Most patients (97%) had a clearly identified primary care provider and patients had been members of the practice for an average of 7.6 years. The sample was specifically chosen to review the charts of individuals who had been enrolled in the same practice for at least 1 year. This may explain the higher rates of family history taking found in this study compared with previously published studies. Given that the vast majority of charts did contain some family history, it is even more compelling that age at diagnosis of cancer was inadequately recorded. This study reflected only what was documented in the patient chart and not direct observation of physician behavior regarding the family history. It is likely that physicians are not recording all responses to inquiries about family history, although the extent of this underreporting is unknown.
CONCLUSION
Findings in this chart review study are consistent with previous work showing that the quantity and type of family history currently being recorded in primary care charts are not adequate to fully assess familial risk. Bridging the gap between recommendations and actual practice will demand interventions to alter primary care practice or the introduction of new models to gather and analyze family data. Further research is also needed to evaluate the impact of improved family history taking on health care costs and outcomes.
Acknowledgments
The authors thank Howard Rabinowitz, MD, for providing helpful suggestions in the development and execution of this project and Aliza Mansolino for the preparation of the manuscript.
OBJECTIVE: To determine whether an adequate amount of family history is being collected and recorded by family practitioners to appropriately identify patients at increased risk for cancer.
STUDY DESIGN: Retrospective chart audit.
POPULATION: Charts from 500 randomly chosen patients, 40 to 60 years of age, were audited. Of those charts, 400 were from a large academic family practice and 50 charts each were from 2 small community family practices in the greater Philadelphia area.
OUTCOMES MEASURED: General features of family history taking were recorded, including presence of a family history and date when recorded, evidence of updated family history data, and presence of a genogram. Cancer features recorded included mention of family history of cancer or colon polyps and, if positive, identification of which relative was affected, site of cancer, and age of diagnosis or death.
RESULTS: Most charts (89%) had some family history information recorded, and 55% listed a family history of cancer, either positive or negative. Of the 356 relatives affected with cancer, an age of diagnosis was documented in only 8%; of 183 first-degree relatives with cancer, only 7% had a documented age of diagnosis. Two percent of all charts had any mention of a family history of colon polyps. Sixty-five percent of family histories were recorded at the first visit, and only 35% had any updated family history information.
CONCLUSIONS: The number and type of family histories currently being recorded by family practitioners are not adequate to fully assess familial risk of cancer. New strategies will need to be developed to better prepare providers for risk-based clinical decision making.
Taking a family history is a significant component of providing comprehensive primary care because family history provides key psychosocial and medical risk information.1,2 As our understanding of the genetic basis for disease grows, obtaining an accurate and complete family history is likely to gain increasing relevance as a vital source of data to guide counseling and testing. Patients who have a first-degree relative with a colon neoplasm or prostate cancer are advised to screen differently from those who do not.3 Failure to gather accurate or complete family data prevents the clinician from providing advice that is consistent with screening guidelines. Understanding how primary care clinicians gather family history data is necessary to identify gaps in current performance and to develop strategies to bridge these gaps.
Several studies have used physician self-report to assess the current level of taking a family history in primary care. In 1 study, 90% of surveyed physicians stated that they obtain a family history of cancer from their patients, with 77% to 80% inquiring about a family history of colorectal cancer in their patients who are at least 40 years of age.3 In another study, 63% to 85% of responding physicians reported obtaining a family history of cancer from 76% to 100% of their patients.5 Respondents in 1 study reported obtaining family histories of colorectal cancer in only 30.7% of patients, breast cancer in 48.4% of patients, and coronary disease and hypertension in 94.3% of patients.6 However, data on actual performance of family history taking are sparse.7,8 The physicians in the Direct Observation of Primary Care study obtained a family history during 51% of new visits and 22% of established visits. A genogram was present in 11% of charts, and documentation of a family history of breast cancer or colorectal cancer was found in 40% of charts. Further analysis from this study showed that providers who more frequently obtained and recorded family history information performed more preventive care services for their patients.9
First-degree relatives of cancer patients appear to be interested in and receptive to information about their risk and in the possibility of genetic counseling.10-14 In fact, many patients overestimate their likelihood of getting cancer based on family history10; primary care providers thus have the opportunity to counsel and relieve anxiety in their patients. Family history is an important tool to define risk and guide referral, counseling, and testing.
This article presents the findings of a descriptive study of the review of 500 charts from 3 different family practice offices. We documented general family history components and completeness related to a family history of cancer, including whether enough family history information was collected to appropriately identify patients at increased risk for cancer.
Methods
Data were collected from 500 patient charts from 3 family practice offices in the greater Philadelphia area. Patients who were in the practice for at least 1 year, made a minimum of 2 visits between June 1, 1997 and June 1, 2000, and were between the ages of 40 and 60 years at 1 of those visits were included. Fifty charts were selected by using a random starting point at each of 2 small (1 to 3 providers) private practices, and 400 charts were randomly selected from all eligible charts in a large academic practice (more than 60 providers). The large practice had nearly 1500 patients who fit the selection criteria, and the total population did not differ from the random sample in mean age, sex, or race (P < .05). Total population characteristics were not available for the 2 smaller practices.
Family history data were collected from progress notes and designated family history spaces on flow sheets or chart covers. Family histories consisting only of “none” or “noncontributory” were counted only when they clearly referred to a specific condition. Family births or deaths recorded in a psychosocial context also were not counted as part of the family history unless death from a specific disease was mentioned. The first dated family history was considered to be first for the purpose of this study. Date of birth, race, sex, current primary care provider, if any, and the date first seen in the practice were recorded. The current primary care provider was determined by the physician seen for the majority of recent visits and/or notations in the chart at acute visits. Data collected for the primary care providers included practice site, sex, level of training, and years in practice. Variables collected from family history information included: date of first family history; date of most recent family history; presence of a genogram; presence of a patient-completed family history self-questionnaire; whether any mention was made, positive or negative, of cancer or colon polyps; and whether there was a positive family history of cancer or colon polyps. For individuals with a positive family history of cancer or colon polyps, all details recorded in the chart were abstracted; these included site of cancer or polyp, relationship to patient, age at diagnosis, and age at death. The data were entered into an Access 97 database and stored separately from chart number identifiers. All analyses and tests were done in SAS version 6.12.
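As an illustration only, the abstracted variables listed above could be organized as a record like the following. This is a sketch and not the structure of the Access 97 database used in the study; every class, field, and method name is a hypothetical choice made for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AffectedRelative:
    """Details abstracted for one relative with cancer or colon polyps."""
    relationship: str                       # relationship to patient, e.g. "mother"
    site: str                               # site of cancer or polyp
    age_at_diagnosis: Optional[int] = None  # None when not recorded in the chart
    age_at_death: Optional[int] = None      # None when not recorded in the chart


@dataclass
class ChartAbstraction:
    """Family history variables abstracted from one patient chart."""
    first_family_history: Optional[date]    # date of first dated family history
    latest_family_history: Optional[date]   # date of most recent family history
    has_genogram: bool = False
    has_self_questionnaire: bool = False
    cancer_or_polyps_mentioned: bool = False  # any mention, positive or negative
    affected_relatives: List[AffectedRelative] = field(default_factory=list)

    def has_positive_history(self) -> bool:
        """True when at least one affected relative was abstracted."""
        return len(self.affected_relatives) > 0

    def was_updated(self) -> bool:
        """True when a family history entry is dated after the first one."""
        if self.first_family_history is None or self.latest_family_history is None:
            return False
        return self.latest_family_history > self.first_family_history
```

Keeping the ages optional rather than defaulting them to 0 makes the unrecorded case explicit, which is the quantity the audit set out to measure.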
Results
Demographic data from the 500 patients whose charts were audited are presented in Table 1. Ninety-seven percent of patients had a primary care provider, which included 60 physicians and 3 nurse practitioners. No significant associations were seen with practice site, sex, or level of training of the provider and the presence of family history information in the chart.
1. Brownson RC, Davis JR, Simms SG, Kern TG, Harmon RG. Cancer control knowledge and priorities among primary care physicians. J Cancer Educ 1993;8:35-41.
2. Emery J, Rose P. Expanding the role of the family history in primary care (editorial). Br J Gen Pract 1999;49(441):260-1.
3. Smith RA, von Eschenbach AC, Wender RC, et al. American Cancer Society guidelines for the early detection of cancer: update of early detection guidelines for prostate, colorectal, and endometrial cancers. CA Cancer J Clin 2001;51:38-75.
4. Polednak AP. Screening for colorectal cancer by primary-care physicians in Long Island (New York) and Connecticut. Cancer Detect Prev 1989;13:301-9.
5. Acton RT, Burst NM, Casebeer L, et al. Knowledge, attitudes, and behaviors of Alabama’s primary care physicians regarding cancer genetics. Acad Med 2000;75:850-2.
6. Summerton N, Garrood PV. The family history in family practice: a questionnaire study. Fam Pract 1997;14:285-8.
7. Del Mar C, Lowe JB, Adkins P, Arnold E. What is the quality of general practitioner records in Australia? Aust Fam Phys 1996;suppl 1:S21-5.
8. Medalie JH, Zyzanski SJ, Langa D, Stange KC. The family in family practice: is it a reality? J Fam Pract 1998;46:390-6.
9. Medalie JH, Zyzanski SJ, Goodwin MA, Stange KC. Two physician styles of focusing on the family. J Fam Pract 2000;49:209-15.
10. de Bock GH, Perk DC, Oosterwijk JC, Hageman GC, Kievit J, Springer MP. Women worried about their familial breast cancer risk-a study on genetic advice in general practice. Fam Pract 1997;14:40-3.
11. Graham ID, Logan DM, Hughes-Benzie R, et al. How interested is the public in genetic testing for colon cancer susceptibility? Report of a cross-sectional population survey. Cancer Prev Control 1998;2:167-72.
12. Bosompra K, Flynn BS, Ashikaga T, Rairikar CJ, Worden JK, Solomon LJ. Likelihood of undergoing genetic testing for cancer risk: a population-based study. Prev Med 2000;30:155-66.
13. Kinney AY, Choi YA, DeVellis B, Kobetz E, Millikan RC, Sandler RS. Interest in genetic testing among first-degree relatives of colorectal cancer patients. Am J Prev Med 2000;18:249-52.
14. Petersen GM, Larkin E, Codori AM, et al. Attitudes toward colon cancer gene testing: survey of relatives of colon cancer patients. Cancer Epidemiol Biomarkers Prev 1999;8(4 pt 2):337-44.
15. Statement of the American Society of Clinical Oncology: genetic testing for cancer susceptibility, adopted on February 20, 1996. J Clin Oncol 1996;14:1730-6.
16. Rogers J, Halloway R. Completion rate and reliability of the self-administered genogram. Fam Pract 1990;7:149-51.
17. Rogers J, Durkin M, Kelly K. The family genogram: an underutilized clinical tool. N J Med 1985;82:887-92.
18. Rogers J, Durkin M. The semi-structured genogram interview: I. Protocol, II. Evaluation. Fam Syst Med 1984;2:176-87.
19. Shore WB, Wilkie HA, Croughan-Minihane M. Family of origin genograms: evaluation of a teaching program for medical students. Fam Med 1994;26:238-43.
20. Physicians lax in screening patients with family history of adenomatous polyps. Reuters Health 2000;May 22.
Address reprint requests to Randa Sifri, MD, Department of Family Medicine, 1015 Walnut Street, Suite 401, Philadelphia, PA 19107. E-mail: [email protected].
Management of the low-grade abnormal Pap smear: What are women’s preferences?
- Any of several approaches may be used in managing women who have low-grade Pap smear abnormalities.
- Women’s preferences for a particular management approach to an abnormal Pap smear vary widely.
- Asking patients specific questions about their desire to avoid procedures and tolerance for uncertainty may help to clarify preferences.
The management of women who have low-grade cytologic abnormalities—including atypical squamous cells (ASC) and low-grade squamous intraepithelial lesions (LSIL)—is controversial.1-4 Without strong evidence favoring a single approach, some clinicians recommend immediate colposcopy to obtain a definitive diagnosis and to exclude the presence of a high-grade lesion, while others recommend observation with serial Pap smears, given the tendency for many low-grade lesions to regress spontaneously.5,6 Immediate colposcopy has the advantage of giving a patient a relatively rapid assessment of the nature and extent of her cervical dysplasia; however, the procedure is uncomfortable, and overall management may not be affected. Observation with serial Pap smears may avoid an invasive procedure, but it may also cause anxiety as time passes without a definitive diagnosis.
Eliciting and understanding patient preferences is an important part of clinical decision making. The clinician provides the best available information on the probability of clinical outcomes and the implications of each for the patient’s health. But only the patient knows what these outcomes mean to her well-being (also called “utility”).
Given the clinical disagreement over how to proceed with abnormal ASC and LSIL Pap smear results, the decision should be influenced by a patient’s preference, informed by knowledge of outcomes and costs of alternative approaches. It is unclear which approach women prefer, and whether women’s preferences for specific protocols are associated with sociodemographic characteristics. To understand better how women weigh these trade-offs, we evaluated the preferences of a diverse group of women for contrasting management approaches to the evaluation of a hypothetical low-grade abnormal Pap smear result.
Methods
Study population
Study participants were recruited from the waiting rooms of 5 family planning clinics in Northern California’s Central Valley. Women were eligible for the study if they were 18 years of age or older, or, if minors, they were emancipated and could thus provide informed consent. Potential subjects were excluded if they spoke neither English nor Spanish or if they had never had a Pap smear. The study protocol and informed consent procedures were reviewed and approved by the University of California, Davis, Human Subjects Committee.
Instruments and outcome measures
Interviews were conducted in English or Spanish. Information regarding demographic characteristics, level of education, past experiences with abnormal Pap smears and cervical cancer, and self-rated religiosity was collected with a self-administered questionnaire. The primary outcome measures were utilities (quantified preferences for specific health states) for 6 different scenarios. These were assessed by the standard gamble (SG) method, described in more detail below.7
Possible utility scores range from 0 to 1. A score of 0 represents immediate death; a score of 1 represents full (or ideal) health for the rest of one’s life. Because the scenarios under consideration in this study did not involve any meaningful level of risk of death, we expected utility scores for the scenarios to cluster toward the upper end of the scale. As a result, a measurement instrument based on an “immediate death” versus “full health” scale would be unable to discriminate between different scenarios. To avoid this problem, a scale was used in which the lower end point was a non-death state unambiguously less preferred than each of the scenarios under consideration.8 We used “invasive cervical cancer requiring hysterectomy” as the lower end point (utility of 0) contrasted with “full health with all normal Pap smears” (utility of 1) to generate the original score (SG Dys). In a separate standard gamble, subjects rated invasive cervical cancer versus immediate death (SG Ca), so that all utilities could be converted to the standard scale, using the formula: (1 – SG Ca) (SG Dys) + SG Ca.
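The rescaling formula above can be illustrated with a short sketch. The function name and the example ratings are hypothetical and chosen only for illustration; the arithmetic is the formula given in the text.

```python
def to_death_scale(sg_dys: float, sg_ca: float) -> float:
    """Convert a scenario utility measured on the cancer-to-full-health scale
    (SG Dys) to the standard death-to-full-health scale, given the subject's
    utility for invasive cervical cancer (SG Ca)."""
    if not (0.0 <= sg_dys <= 1.0 and 0.0 <= sg_ca <= 1.0):
        raise ValueError("utilities must lie between 0 and 1")
    return (1.0 - sg_ca) * sg_dys + sg_ca


# Hypothetical subject: the scenario is rated 0.80 against cancer,
# and cancer is rated 0.60 against immediate death.
adjusted = to_death_scale(0.80, 0.60)
print(round(adjusted, 2))  # 0.92
```

A scenario rated 0 on the original scale maps to the subject's own utility for cancer, and a scenario rated 1 stays at 1, so the adjusted scores are compressed toward the upper end of the standard scale.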
The 6 scenarios rated in the study are shown in Table 1. The scenarios represent 3 sets of progressively more invasive interventions for a low-grade abnormal Pap smear: (1) resolution, representing spontaneous regression with treatment not required; (2) a low-grade abnormality requiring treatment with cryotherapy; (3) a more severe abnormality requiring a cervical cone biopsy. Following either spontaneous resolution or treatment, all scenarios assumed the abnormality was resolved. For each of the 3 results, a management strategy based on observation with serial Pap smears was applied in 1 instance, and a strategy of early colposcopy was applied in the other instance, resulting in 6 different pathways to the ultimate outcome: a normal Pap smear. The time frame for these scenarios was 18–36 months.
Trained interviewers used a standardized approach to elicit preferences from each subject. Subjects were read a description of all the procedures involved in the scenarios. Descriptions were accompanied by cards summarizing each procedure in pictures and words, and included information about the possibility of progression and spontaneous regression of the Pap smear abnormality. Subjects were encouraged to ask questions at any point during the interview. Procedure descriptions are available from the authors on request.
TABLE 1
Clinical scenarios classified by management approach and required treatment*
Spontaneous resolution | Cryotherapy | Cone biopsy | |
---|---|---|---|
Observation | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal |
↓ | ↓ | ↓ | |
Pap smear: normal | Pap smear: low-grade abnormal | Pap smear: normal | |
↓ | ↓ | ↓ | |
2 Pap smears every 6 months: normal | Pap smear: low-grade abnormal | Pap smear: normal | |
↓ | ↓ | ||
Pap smear: low-grade abnormal | Pap smear: normal | ||
↓ | ↓ | ||
Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | ||
↓ | ↓ | ||
Biopsy: low-grade abnormal | Biopsy: abnormal with ? ECC | ||
↓ | ↓ | ||
Cryotherapy at 1 month | Cone biopsy at 1 month: moderately abnormal cells | ||
↓ | ↓ | ||
3 Pap smears every 6 months: normal | Cure with cone biopsy | ||
↓ | |||
3 Pap smears every 6 months: normal | |||
Early colposcopy | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal |
↓ | ↓ | ↓ | |
Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | |
↓ | ↓ | ↓ | |
Biopsy: normal | Biopsy: abnormal with ? ECC | Biopsy: abnormal with ? ECC | |
↓ | ↓ | ↓ | |
Second colposcopy and biopsy | Cone biopsy at 1 month | Cone biopsy at 1 month | |
↓ | ↓ | ↓ | |
Biopsy: normal | Biopsy: moderately abnormal | Biopsy: moderately abnormal | |
↓ | ↓ | ↓ | |
2 Pap smears every 6 months: normal | Cure with cone biopsy | Cure with cone biopsy | |
↓ | ↓ | ↓ | |
Pap smear: low-grade abnormal | Colposcopy: normal | Colposcopy: normal | |
↓ | ↓ | ↓ | |
Colposcopy and biopsy at 1 month | 2 Pap smears every 6 months: normal | 2 Pap smears every 6 months: normal | |
↓ | |||
Biopsy: low-grade abnormal | |||
↓ | |||
Colposcopy: normal | |||
↓ | |||
2 Pap smears every 6 months: normal | |||
*Intervals are 6 months unless specified otherwise. ECC, endocervical curettage. |
Standard gamble
Subjects were asked their preference between the certainty of the scenario under consideration and an uncertain prospect of either having cervical cancer treated by hysterectomy or full health. A probability wheel was used as a visual aid.9 The probability of cervical cancer was altered until the subject was indifferent between the certain scenario and the uncertain prospect. Once all 6 scenarios had been scored, each subject was asked about her preference between the certainty of cervical cancer treated by hysterectomy and the uncertain prospect of immediate death or full health, using the same method.
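Under the usual expected-utility reading of the standard gamble, the utility of the certain scenario equals the expected utility of the gamble at the point of indifference. The sketch below shows that step; the function name and the 10% example are hypothetical, and the end points (0 for the worse outcome, 1 for full health) follow the scale described above.

```python
def utility_at_indifference(p_worse: float,
                            u_worse: float = 0.0,
                            u_best: float = 1.0) -> float:
    """Expected utility of the gamble at the indifference probability.

    p_worse is the probability of the worse outcome (here, cervical cancer
    treated by hysterectomy) at which the subject is indifferent between
    the gamble and the certain scenario.
    """
    if not 0.0 <= p_worse <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return p_worse * u_worse + (1.0 - p_worse) * u_best


# Hypothetical subject who is indifferent at a 10% chance of cancer.
print(round(utility_at_indifference(0.10), 2))  # 0.9
```

The more risk of the worse outcome a subject will accept to escape the scenario, the lower the utility she assigns to it.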
At the end of the interview, both the subject and the interviewer completed evaluation forms that included ratings of how well the subject understood the standard gamble rating exercises. In addition, subjects were classified a priori as confused if they placed a higher utility on scenario 3 (observation for a long period followed by cone biopsy), which represented the longest period of uncertainty followed by the most invasive procedure, than on scenario 1 (a single mildly abnormal Pap smear evaluated by observation, which then resolved spontaneously), which represented the absence of any invasive procedure.
Statistical analysis
Descriptive statistics were generated for ratings of each scenario for the entire group and with the confused subjects removed. Confused subjects included those who reported they found the interview “very confusing,” those who were recorded by the interviewer as finding the interview “very confusing,” and those whose rankings met the criteria for subject confusion, as described above. Means, standard deviations, medians, and percentiles were calculated for each scenario. The mean differences in adjusted standard gamble ratings between paired scenarios were evaluated using a t distribution. Multiple regression analyses were used to explore how much between-subject variation in the standard gamble scores was explained by the variables listed above.
A simple decision tree (Figure 1) was constructed to contrast preferences for an observational approach vs early colposcopy. Outcome probabilities were derived from meta-analyses of the medical literature,5 from observational data obtained at the same Northern California family planning clinics,10 and, for cone biopsy outcomes, from expert opinion obtained using a modified Delphi process.11 Utilities were assigned to the decision tree based on the standard gamble results. Women having 2 consecutive low-grade abnormal Pap results followed by a normal Pap result were assigned the same utility value as that for women with a single abnormal result. Analysis of the tree, including 1-way and 2-way sensitivity analysis of key variables, was conducted with Data 3.5.
Results
One hundred seventy interviews were completed. Characteristics of the interview subjects are shown in Table 2. A total of 22 subjects were designated “confused.” Analyses including the confused subjects did not alter the pattern of results, but the range in responses was larger. All analyses are presented here with confused subjects removed (n = 148).
Median ratings with 25th–75th percentiles for the paired scenarios rated by the standard gamble are shown as box plots in Figure 2. Mean adjusted scores, standard deviations, and mean differences in scores between paired scenarios are shown in Table 3. Notable findings include the following. (1) For each scenario, the range of responses by either rating method was very large. (2) Mean differences in utilities for observation vs early colposcopy were small. (3) For the paired scenarios in which the outcome was spontaneous resolution, observation was preferred (P = .01); in the paired scenarios in which the outcome was cryotherapy, early colposcopy was preferred (P = .02). (4) In the multiple regression analyses for each scenario, age, education, ethnicity, religiosity, and having known someone with cervical cancer together explained only a small amount of the variability between subjects (range for R², .09–.22).
The decision model with baseline probabilities is shown in Figure 1. The model was simplified to exclude the outcome of cervical cancer, which is a very rare outcome for women with ASC or LSIL cervical smears who have adequate follow-up.5 In the baseline analysis, the overall utility of early colposcopy was slightly favored over the overall utility of the observation approach (utility of observation = 0.932; utility of early colposcopy = 0.940).
Sensitivity analysis examines the effect of varying elements of the model on the outcome. In sensitivity analyses of probabilities, the early colposcopy branch was favored, but the differences were small. The maximum difference in utilities between branches was 0.012 in these sensitivity analyses. In 1-way sensitivity analysis of branch utilities, threshold utility values to favor the observation branch were 0.986 for spontaneous resolution after observation and 0.898 for early colposcopy. Threshold values for cryotherapy were 0.938 for observation and 0.938 for early colposcopy.
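The logic of the baseline comparison and of a 1-way threshold analysis can be sketched as follows. The utilities are the rounded means from Table 3, but the outcome probabilities are hypothetical placeholders (the baseline probabilities appear only in Figure 1), and the sketch assumes the same probabilities in both branches. It therefore does not reproduce the reported values; all names are illustrative.

```python
# Mean adjusted utilities from Table 3 (rounded as published).
UTILITIES = {
    "observation": {"resolution": 0.96, "cryotherapy": 0.93, "cone_biopsy": 0.91},
    "early_colposcopy": {"resolution": 0.93, "cryotherapy": 0.95, "cone_biopsy": 0.92},
}

# HYPOTHETICAL outcome probabilities, for illustration only.
# They are not the baseline values shown in Figure 1.
PROBS = {"resolution": 0.3, "cryotherapy": 0.6, "cone_biopsy": 0.1}


def expected_utility(utilities, probs):
    """Probability-weighted mean utility of one branch of the tree."""
    return sum(probs[outcome] * utilities[outcome] for outcome in probs)


def threshold_utility(branch, outcome, other_branch, probs=PROBS):
    """Utility of `outcome` in `branch` at which the two branches are equal."""
    target = expected_utility(UTILITIES[other_branch], probs)
    rest = sum(probs[o] * UTILITIES[branch][o] for o in probs if o != outcome)
    return (target - rest) / probs[outcome]


eu_obs = expected_utility(UTILITIES["observation"], PROBS)
eu_colpo = expected_utility(UTILITIES["early_colposcopy"], PROBS)
threshold = threshold_utility("observation", "resolution", "early_colposcopy")

print(f"observation {eu_obs:.3f}, early colposcopy {eu_colpo:.3f}")
print(f"observation is favored once its resolution utility exceeds {threshold:.3f}")
```

Because each branch's expected utility is linear in any single utility, the threshold can be solved directly rather than searched for; varying a probability instead of a utility follows the same pattern.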
TABLE 2
Characteristics of study subjects (n = 170)
Characteristics | n (%) |
---|---|
Mean age (range), y | 26 (14–53) |
Education | |
Less than high school | 58 (34%) |
High school | 77 (45%) |
Some college | 35 (20%) |
Ethnicity | |
African American | 21 (12%) |
Caucasian | 84 (49%) |
Latina | 46 (27%) |
Other | 21 (12%) |
Interview language, Spanish | 15 (9%) |
Prior colposcopy | 23 (14%) |
Moderately or very religious | 64 (38%) |
Knows someone with cervical cancer | 43 (25%) |
TABLE 3
Adjusted standard gamble values and paired differences* (n=148)
Management Strategy | ||||
---|---|---|---|---|
Short-term outcome | Observation Mean (SD) | Early colposcopy Mean (SD) | Difference | P value (2 sided) |
Spontaneous resolution | .96 (.13) | .93 (.20) | .03 (.15) | .01 |
Cryotherapy | .93 (.17) | .95 (.14) | -.02 (.11) | .02 |
Cone biopsy | .91 (.21) | .92 (.16) | -.02 (.17) | .23 |
*Adjusted to scale so that immediate death had a utility of 0 and “full health with all normal Pap smears” had a utility of 1. |
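As a rough check on the paired comparisons in Table 3, the t statistic for each pair can be computed from the published summary values. This is a sketch under stated assumptions: it uses the rounded means and standard deviations as printed, so it can only approximate the authors' calculation, and the names are illustrative.

```python
import math

N = 148  # subjects remaining after confused subjects were removed

# (mean difference, SD of the difference) as printed in Table 3,
# rounded to 2 decimal places in the source.
PAIRED_DIFFERENCES = {
    "spontaneous_resolution": (0.03, 0.15),
    "cryotherapy": (-0.02, 0.11),
    "cone_biopsy": (-0.02, 0.17),
}


def paired_t(mean_diff: float, sd_diff: float, n: int) -> float:
    """t statistic for a paired comparison: mean difference / standard error."""
    return mean_diff / (sd_diff / math.sqrt(n))


t_values = {name: paired_t(m, s, N) for name, (m, s) in PAIRED_DIFFERENCES.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f} on {N - 1} degrees of freedom")
```

With 147 degrees of freedom, the 2-sided 5% critical value is about 1.98. The statistics for spontaneous resolution and cryotherapy exceed it in absolute value and the one for cone biopsy does not, which agrees with the pattern of the reported P values. Exact P values are not reproduced because the inputs are rounded.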
FIGURE 1
Decision model comparing observation with early colposcopy*
FIGURE 2
Distribution of individual utilities as assessed by the standard gamble*
Discussion
We found wide variation in women’s preferences for management approaches to a low-grade abnormal Pap smear result. The range of responses was very large and the variation between individuals rating the same scenario was substantially greater than the variation in mean ratings between different scenarios. Measured subject characteristics explained only a small proportion of the observed variation, indicating that other unmeasured factors contributed substantially to the variation. Although 25% of subjects stated they knew someone with cervical cancer, this high percentage seems improbable and more likely reflects knowledge of someone who had an abnormal Pap smear.
The decision model displayed a small preference for immediate colposcopy. This may be related to preference for quicker resolution of the concern about cancer, although it involves more procedures. Small changes in utilities for spontaneous resolution and cryotherapy influenced the model to prefer observation. For cryotherapy, these utility values were within 1 standard deviation of the mean.
Our finding of a wide variation in preferences is supported by other patient preference studies,12-14 including 2 on this subject. Ferris et al assessed triage preferences for the evaluation and management of ASC and LSIL.13 They used a questionnaire with a sample of 968 women who presented for care at obstetrics and gynecology and family practice clinics. They found that more women preferred repeat Pap smear when the index smear was ASC, and more women preferred colposcopy when the index smear was LSIL. Among a group of 136 Canadian women with atypia or LSIL referred for colposcopy, Meana et al found that 64% preferred early colposcopy, while 17% preferred observation and 17% had no strong preference.14
The factors contributing to patient preferences are complex. Differences in preferences may be influenced by knowledge and understanding of the disease and possible interventions, risk aversion, access to services, socioeconomics, cultural background, and other factors. While one patient may be most interested in establishing a definitive diagnosis and undergoing treatment as soon as possible, another may place priority on avoiding invasive or uncomfortable procedures. How differences in patient preferences influence clinical choices is highlighted by the work of Kuppermann et al.15 These investigators found that utilities for outcomes of prenatal diagnostic testing predicted subsequent testing behavior.
Our findings are limited by our use of a convenience sample of women attending family planning clinics. They may not be representative of women’s preferences in general, or even those of women attending family planning clinics. Outcomes in our study were specified during the preference assessment process; in real decision making, the outcome is always unknown at the time the decision is made. We did not include HPV typing as an option in our clinical scenarios. While HPV typing may have a role for triage of ASC,6,16 it appears not to be useful in management of LSIL.17
Cost-effectiveness analysis would offer important information about which management approach might be favored in the context of resource allocation. For decision making by individual patients and doctors, however, decision analysis is often more relevant. In this case, the “preferred” decision is very sensitive to patient utilities, emphasizing the need for clear physician-patient communication.
Strengths of our study include the diversity of the subjects, the formal process for preference assessment, and the paired scenarios, which allow assessment of preferences for a single management decision, in which 2 separate paths lead to an equivalent ultimate outcome. Our findings are consistent with previous work showing that the sequence of events leading to an outcome will influence utilities for the outcome.18
Application to clinical practice
How might our findings be translated into clinical practice? In clinical situations where different approaches to management are unlikely to result in substantial outcome differences (a “toss-up”), patient preferences are a key aspect of the decision-making process.19 For women with low-grade Pap abnormalities, several diagnostic options are available and no single option is strongly supported by evidence to offer better outcomes. Our study indicates that no single option is preferred by most women. Under these conditions, engaging the patient in the decision-making process may produce better health outcomes.20 Clinicians should anticipate highly varied preferences, and will need to adopt a flexible approach. Not all patients will want to be actively involved in the decision process, but the desire for information is nearly universal. Flexible use of the questions in Table 4 may help patients to define their preferences and will likely improve their satisfaction and adherence to the treatment plan.
TABLE 4
Questions for patients with an abnormal Pap smear
What is your understanding of what it means for you to have an abnormal Pap smear showing _____________? |
There are different options for the next step. Would you like to be involved in deciding which option is preferred for your case? |
What questions do you have about these options? |
How important is it to you to have a definite answer as soon as possible? |
How do you feel about undergoing colposcopy? |
Would you prefer to have a follow-up Pap smear in ____ months, which might avoid a colposcopy, or would you prefer to have a colposcopy sooner? |
Acknowledgments
The authors thank the staff of Planned Parenthood Mar Monte East for their assistance with subject recruitment and interviews.
1. Woolf SH. Screening for cervical cancer. In: Goldbloom RB, Lawrence RS, eds. Preventing disease: beyond the rhetoric. New York: Springer-Verlag, 1990:319–23.
2. Kurman RJ, Henson DE, Herbst AL, Noller KL, Schiffman MH. Interim guidelines for management of abnormal cervical cytology. The 1992 National Cancer Institute Workshop. JAMA 1994;271:1866-69.
3. Miller AB, Anderson G, Brisson J, Laidlaw J, Le Pitre N, Malcolmson P, et al. Report of a national workshop on screening for cancer of the cervix. Can Med Assoc J 1991;145:1301-25.
4. American College of Obstetricians and Gynecologists. Cervical cytology: evaluation and management of abnormalities. ACOG technical bulletin no. 183. Washington, DC: American College of Obstetricians and Gynecologists, 1993.
5. Melnikow J, Nuovo J, Willan AR, Chan BK, Howell LP. Natural history of cervical squamous intraepithelial lesions: A meta-analysis. Obstet Gynecol 1998;92:727-34.
6. Wright TC, Cox JT, Massad LS, Twiggs LB, Wilkinson EJ. Consensus guidelines for the management of women with cervical cytological abnormalities. JAMA 2002;287:2120-29.
7. Drummond MF, O’Brien BJ, Stoddart GL, Torrance GW, eds. Methods for the economic evaluation of health care programs. 2nd edition. New York: Oxford University Press, 1997.
8. Torrance G. Measurement of health state utilities for economic appraisal: A review. J Health Econ 1986;5:1-30.
9. Furlong W, Feeny D, Torrance GW, Barr R, Horsman J. Guide to design and development of health state utility instrumentation. Centre for Health Economics and Policy Development. Working Paper Series # 90-9. Hamilton, McMaster University, 1990.
10. Melnikow J, Nuovo J, Paliescheskey M, Stewart GK, Howell L, Green B. Detection of high grade cervical dysplasia: Impact of age and Bethesda system-related follow-up criteria. Diagnostic Cytopathol 1997;17:321-25.
11. Fink A, Kosecoff J, Chassin M, Brook RH. Consensus methods: Characteristics and guidelines for use. Am J Pub Health 1984;74:979-83.
12. Nease RF, Kneeland T, O’Connor GT, Sumner W, Lumpkins C, Shaw L, et al. Variation in patient utilities for outcomes of the management of chronic stable angina. JAMA 1995;273:1185-90.
13. Ferris DG, Kriegel D, Cole L, Litaker M, Woodward L. Women’s triage and management preferences for cervical cytologic reports demonstrating atypical squamous cells of undetermined significance and low grade squamous intraepithelial lesions. Arch Fam Med 1997;6:348-53.
14. Meana M, Stewart DE, Lickrish GM, Murphy J, Rosen B. Patient preference for the management of mildly abnormal Papanicolaou smears. J Women’s Health and Gender Based Medicine 1999;8:941-7.
15. Kuppermann M, Nease RF, Learman LA, Gates E, Posner SF, Washington AE. How do women value Down syndrome-affected birth and miscarriage? The thirty-five-year-old question. Med Decis Making 1998;18:468.
16. Solomon D, Schiffman M, Tarone R. Comparison of three management strategies for patients with atypical squamous cells of undetermined significance: baseline results from a randomized trial. J Natl Cancer Inst 2001;93(4):252-3.
17. The atypical squamous cells of undetermined significance/low grade squamous intraepithelial lesions triage study (ALTS) group. Human papillomavirus testing for triage of women with cytologic evidence of low-grade squamous intra-epithelial lesions: baseline data from a randomized trial. J Natl Cancer Inst 2000;92:397-402.
18. Kuppermann M, Shiboski S, Feeny D, Elkin E, Washington AE. Can preference scores for discrete states be used to derive preference scores for an entire path of events? An application to prenatal diagnosis. Med Decis Making 1997;17:42-55.
19. Kassirer JP, Pauker SG. The toss up. N Engl J Med 1981;305:1457-9.
20. Kaplan SH, Greenfield S, Ware JE, Jr. Assessing the effects of physician patient interactions on the outcomes of care. Med Care 1989;27 (Suppl 3):S110-27.
- Any of several approaches may be used in managing women who have low-grade Pap smear abnormalities.
- Women’s preferences for a particular management approach to an abnormal Pap smear vary widely.
- Asking patients specific questions about their desire to avoid procedures and tolerance for uncertainty may help to clarify preferences.
The management of women who have low-grade cytologic abnormalities—including atypical squamous cells (ASC) and low-grade squamous intraepithelial lesions (LSIL)—is controversial.1-4 Without strong evidence favoring a single approach, some clinicians recommend immediate colposcopy to obtain a definitive diagnosis and to exclude the presence of a high-grade lesion, while others recommend observation with serial Pap smears, given the tendency for many low-grade lesions to regress spontaneously.5,6 Immediate colposcopy has the advantage of giving a patient a relatively rapid assessment of the nature and extent of her cervical dysplasia; however, the procedure is uncomfortable, and overall management may not be affected. Observation with serial Pap smears may avoid an invasive procedure, but it may also cause anxiety as time passes without a definitive diagnosis.
Eliciting and understanding patient preferences is an important part of clinical decision making. The clinician provides the best available information on the probability of clinical outcomes and the implications of each for the patient’s health. But only the patient knows what these outcomes mean to her well-being (also called “utility”).
Given the clinical disagreement over how to proceed with abnormal ASC and LSIL Pap smear results, the decision should be influenced by a patient’s preference, informed by knowledge of outcomes and costs of alternative approaches. It is unclear which approach women prefer, and whether women’s preferences for specific protocols are associated with sociodemographic characteristics. To understand better how women weigh these trade-offs, we evaluated the preferences of a diverse group of women for contrasting management approaches to the evaluation of a hypothetical low-grade abnormal Pap smear result.
Methods
Study population
Study participants were recruited from the waiting rooms of 5 family planning clinics in Northern California’s Central Valley. Women were eligible for the study if they were 18 years of age or older, or, if minors, they were emancipated and could thus provide informed consent. Potential subjects were excluded if they spoke neither English nor Spanish or if they had never had a Pap smear. The study protocol and informed consent procedures were reviewed and approved by the University of California, Davis, Human Subjects Committee.
Instruments and outcome measures
Interviews were conducted in English or Spanish. Information regarding demographic characteristics, level of education, past experiences with abnormal Pap smears and cervical cancer, and self-rated religiosity was collected with a self-administered questionnaire. The primary outcome measures were utilities (quantified preferences for specific health states) for 6 different scenarios. These were assessed by the standard gamble (SG) method, described in more detail below.7
Possible utility scores range from 0 to 1. A score of 0 represents immediate death; a score of 1 represents full (or ideal) health for the rest of one’s life. Because the scenarios under consideration in this study did not involve any meaningful level of risk of death, we expected utility scores for the scenarios to cluster toward the upper end of the scale. As a result, a measurement instrument based on an “immediate death” versus “full health” scale would be unable to discriminate between different scenarios. To avoid this problem, a scale was used in which the lower end point was a non-death state unambiguously less preferred than each of the scenarios under consideration.8 We used “invasive cervical cancer requiring hysterectomy” as the lower end point (utility of 0) contrasted with “full health with all normal Pap smears” (utility of 1) to generate the original score (SG Dys). In a separate standard gamble, subjects rated invasive cervical cancer versus immediate death (SG Ca), so that all utilities could be converted to the standard scale, using the formula: (1 – SG Ca) (SG Dys) + SG Ca.
The 6 scenarios rated in the study are shown in Table 1. The scenarios represent 3 sets of progressively more invasive interventions for a low-grade abnormal Pap smear: (1) resolution, representing spontaneous regression with treatment not required; (2) a low-grade abnormality requiring treatment with cryotherapy; (3) a more severe abnormality requiring a cervical cone biopsy. Following either spontaneous resolution or treatment, all scenarios assumed the abnormality was resolved. For each of the 3 results, a management strategy based on observation with serial Pap smears was applied in 1 instance, and a strategy of early colposcopy was applied in the other instance, resulting in 6 different pathways to the ultimate outcome; a normal Pap smear. The time frame for these scenarios was 18–36 months.
Trained interviewers used a standardized approach to elicit preferences from each subject. Subjects were read a description of all the procedures involved in the scenarios. Descriptions were accompanied by cards summarizing each procedure in pictures and words, and included information about the possibility of progression and spontaneous regression of the Pap smear abnormality. Subjects were encouraged to ask questions at any point during the interview. Procedure descriptions are available from the authors on request.
TABLE 1
Clinical scenarios classified by management approach and required treatment*
Spontaneous resolution | Cryotherapy | Cone biopsy | |
---|---|---|---|
Observation | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal |
↓ | ↓ | ↓ | |
Pap smear: normal | Pap smear: low-grade abnormal | Pap smear: normal | |
↓ | ↓ | ↓ | |
2 Pap smears every 6 months: normal | Pap smear: low-grade abnormal | Pap smear: normal | |
↓ | ↓ | ||
Pap smear: low-grade abnormal | Pap smear: normal | ||
↓ | ↓ | ||
Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | ||
↓ | ↓ | ||
Biopsy: low-grade abnormal | Biopsy: abnormal with ? ECC | ||
↓ | ↓ | ||
Cryotherapy at 1 month | Cone biopsy at 1 month: moderately abnoramal cells | ||
↓ | ↓ | ||
3 Pap smears every 6 months: normal | Cure with cone biopsy | ||
↓ | |||
3 Pap smears every 6 months: normal | |||
Early colposcopy | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal |
↓ | ↓ | ↓ | |
Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | |
↓ | ↓ | ↓ | |
Biopsy: normal | Biopsy: abnormal with ? ECC | Biopsy: abnormal with ? ECC | |
↓ | ↓ | ↓ | |
Second colposcopy and biopsy | Cone biopsy at 1 month | Cone biopsy at 1 month | |
↓ | ↓ | ↓ | |
Biopsy: normal | Biopsy: moderately abnormal | Biopsy: moderately abnormal | |
↓ | ↓ | ↓ | |
2 Pap smears every 6 months: normal | Cure with cone biopsy | Cure with cone biopsy | |
↓ | ↓ | ↓ | |
Pap smear: low-grade abnormal | Colposcopy: normal | Colposcopy: normal | |
↓ | ↓ | ↓ | |
Colposcopy and biopsy at 1 month | 2 Pap smears every 6 months: normal | 2 Pap smears every 6 months: normal | |
↓ | |||
Biopsy: low-grade abnormal | |||
↓ | |||
Colposcopy: normal | |||
↓ | |||
2 Pap smears every 6 months: normal | |||
*Intervals are 6 months unless specified otherwise. ECC, endocervical curettage. |
Standard gamble
Subjects were asked their preference between the certainty of the scenario under consideration and an uncertain prospect of either having cervical cancer treated by hysterectomy or full health. A probability wheel was used as visual aid.9 The probability of cervical cancer was altered until the subject was indifferent between the certain scenario and the uncertain prospect. Once all 6 scenarios had been scored, each subject was asked about her preference between the certainty of cervical cancer treated by hysterectomy and the uncertain prospect of immediate death or full health, using the same method.
At the end of the interview, both the subject and the interviewer completed evaluation forms including ratings of how well the subject understood the standard gamble rating exercises. Subject confusion was also defined a priori as those placing a higher utility on scenario 3 (observation for a long period followed by cone biopsy), which represented the longest period of uncertainty followed by the most invasive procedure, than on scenario 1 (a single mildly abnormal Pap smear evaluated by observation which then resolved spontaneously), which represented the absence of any invasive procedure.
Statistical analysis
Descriptive statistics were generated for ratings of each scenario for the entire group and with the confused subjects removed. Confused subjects included those who reported they found the interview “very confusing,” those who were recorded by the interviewer as finding the interview “very confusing,” and those whose rankings met the criteria for subject confusion, as described above. Means, standard deviations, medians, and percentiles were calculated for each scenario. The mean differences in adjusted standard gamble ratings between paired scenarios was evaluated using a t distribution. Multiple regression analyses were used to explore how much between-subject variation in the standard gamble scores was explained by the variables listed above.
A simple decision tree (Figure 1) was constructed to contrast preferences for an observational approach vs early colposcopy. Outcome probabilities were derived from meta-analyses of the medical literature,5 from observational data obtained at the same Northern California family planning clinics,10 and, for cone biopsy outcomes, from expert opinion obtained using a modified Delphi process.11 Utilities were assigned to the decision tree based on the standard gamble results. Women having 2 consecutive low-grade abnormal Pap results followed by a normal Pap result were assigned the same utility value as that for women with a single abnormal result. Analysis of the tree, including 1-way and 2-way sensitivity analysis of key variables, was conducted with Data 3.5.
Results
One hundred seventy interviews were completed. Characteristics of the interview subjects are shown in Table 2. A total of 22 subjects were designated “confused.” Analyses including the confused subjects did not alter the pattern of results, but the range in responses was larger. All analyses are presented here with confused subjects removed (n = 148).
Median ratings with 25th–75th percentiles for the paired scenarios rated by the standard gamble are shown as box plots in Figure 2. Mean adjusted scores, standard deviations, and mean differences in scores between paired scenarios are shown in Table 3. Notable findings include the following. (1) For each scenario, the range of responses by either rating method was very large. (2) Mean differences in utilities for observation vs early colposcopy were small. (3) For the paired scenarios in which the outcome was spontaneous resolution, observation was preferred (P = .01); in the paired scenarios in which the outcome was cryotherapy, early colposcopy was preferred (P = .02). (4) In the multiple regression analyses for each scenario, age, education, ethnicity, religiosity, and having known someone with cervical cancer together explained only a small amount of the variability between subjects (range for R2, .09–.22).
The decision model with baseline probabilities is shown in Figure 1. The model was simplified to exclude the outcome of cervical cancer, which is a very rare outcome for women with ASC or LSIL cervical smears who have adequate follow-up.5 In the baseline analysis, the overall utility of early colposcopy was slightly favored over the overall utility of the observation approach (utility of observation = 0.932; utility of early colposcopy = 0.940).
Sensitivity analysis examines the effect of varying elements of the model on the outcome. In sensitivity analyses of probabilities, the early colposcopy branch was favored, but the differences were small. The maximum difference in utilities between branches was 0.012 in these sensitivity analyses. In 1-way sensitivity analysis of branch utilities, threshold utility values to favor the observation branch were 0.986 for spontaneous resolution after observation and 0.898 for early colposcopy. Threshold values for cryotherapy were 0.938 for observation and 0.938 for early colposcopy.
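The roll-back and 1-way threshold calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the branch probabilities are hypothetical placeholders (the baseline probabilities appear only in Figure 1, which is not reproduced in the text), the utilities are the Table 3 means, and the function names are invented here. The output is therefore not expected to reproduce the reported utilities (0.932 and 0.940) or the reported thresholds.

```python
# Sketch of expected-utility roll-back and a 1-way threshold analysis.
# PROBS is a HYPOTHETICAL placeholder, not a value from the study.
PROBS = {"resolution": 0.3, "cryotherapy": 0.6, "cone": 0.1}

# Mean adjusted standard gamble utilities taken from Table 3.
UTIL_OBS = {"resolution": 0.96, "cryotherapy": 0.93, "cone": 0.91}
UTIL_COLPO = {"resolution": 0.93, "cryotherapy": 0.95, "cone": 0.92}


def expected_utility(probs, utils):
    """Probability-weighted mean utility of one branch of the tree."""
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(p * utils[outcome] for outcome, p in probs.items())


def threshold_utility(probs, utils, outcome, target):
    """Utility of `outcome` at which this branch's expected utility equals `target`."""
    p = probs[outcome]
    if p == 0:
        raise ValueError("outcome has zero probability; no threshold exists")
    rest = sum(q * utils[k] for k, q in probs.items() if k != outcome)
    return (target - rest) / p


eu_obs = expected_utility(PROBS, UTIL_OBS)
eu_colpo = expected_utility(PROBS, UTIL_COLPO)

# Utility for spontaneous resolution after observation at which the
# two branches tie; above this value observation would be favored.
u_star = threshold_utility(PROBS, UTIL_OBS, "resolution", eu_colpo)

print(f"observation: {eu_obs:.3f}, early colposcopy: {eu_colpo:.3f}")
print(f"threshold utility (resolution after observation): {u_star:.3f}")
```

Because expected utility is linear in each outcome utility, the threshold can be solved for directly rather than found by stepping through a range of values.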
TABLE 2
Characteristics of study subjects (n = 170)
Characteristics | n (%) |
---|---|
Mean age (range), y | 26 (14–53) |
Education | |
Less than high school | 58 (34%) |
High school | 77 (45%) |
Some college | 35 (20%) |
Ethnicity | |
African American | 21 (12%) |
Caucasian | 84 (49%) |
Latina | 46 (27%) |
Other | 21 (12%) |
Interview language, Spanish | 15 (9%) |
Prior colposcopy | 23 (14%) |
Moderately or very religious | 64 (38%) |
Knows someone with cervical cancer | 43 (25%) |
TABLE 3
Adjusted standard gamble values and paired differences* (n = 148)
Short-term outcome | Observation, mean (SD) | Early colposcopy, mean (SD) | Difference, mean (SD) | P value (2-sided) |
---|---|---|---|---|
Spontaneous resolution | .96 (.13) | .93 (.20) | .03 (.15) | .01 |
Cryotherapy | .93 (.17) | .95 (.14) | -.02 (.11) | .02 |
Cone biopsy | .91 (.21) | .92 (.16) | -.02 (.17) | .23 |
*Adjusted to scale so that immediate death had a utility of 0 and “full health with all normal Pap smears” had a utility of 1. |
FIGURE 1
Decision model comparing observation with early colposcopy
FIGURE 2
Distribution of individual utilities as assessed by the standard gamble
Discussion
We found wide variation in women’s preferences for management approaches to a low-grade abnormal Pap smear result. The range of responses was very large and the variation between individuals rating the same scenario was substantially greater than the variation in mean ratings between different scenarios. Measured subject characteristics explained only a small proportion of the observed variation, indicating that other unmeasured factors contributed substantially to the variation. Although 25% of subjects stated they knew someone with cervical cancer, this high percentage seems improbable and more likely reflects knowledge of someone who had an abnormal Pap smear.
The decision model displayed a small preference for immediate colposcopy. This may be related to preference for quicker resolution of the concern about cancer, although it involves more procedures. Small changes in utilities for spontaneous resolution and cryotherapy influenced the model to prefer observation. For cryotherapy, these utility values were within 1 standard deviation of the mean.
Our finding of a wide variation in preferences is supported by other patient preference studies,12-14 including 2 on this subject. Ferris et al assessed triage preferences for the evaluation and management of ASC and LSIL.13 They used a questionnaire with a sample of 968 women who presented for care at obstetrics and gynecology and family practice clinics. They found that more women preferred repeat Pap smear when the index smear was ASC, and more women preferred colposcopy when the index smear was LSIL. Among a group of 136 Canadian women with atypia or LSIL referred for colposcopy, Meana et al found that 64% preferred early colposcopy, while 17% preferred observation and 17% had no strong preference.14
The factors contributing to patient preferences are complex. Differences in preferences may be influenced by knowledge and understanding of the disease and possible interventions, risk aversion, access to services, socioeconomics, cultural background, and other factors. While 1 patient may be most interested in establishing a definitive diagnosis and undergoing treatment as soon as possible, another may place priority on avoiding invasive or uncomfortable procedures. How differences in patient preferences influence clinical choices is highlighted by the work of Kuppermann et al.15 These investigators found that utilities for outcomes of prenatal diagnostic testing predicted subsequent testing behavior.
Our findings are limited by our use of a convenience sample of women attending family planning clinics. They may not be representative of women’s preferences in general, or even those of women attending family planning clinics. Outcomes in our study were specified during the preference assessment process; in real decision making, the outcome is always unknown at the time the decision is made. We did not include HPV typing as an option in our clinical scenarios. While HPV typing may have a role for triage of ASC,6,16 it appears not to be useful in management of LSIL.17
Cost-effectiveness analysis would offer important information about which management approach might be favored in the context of resource allocation. For decision making by individual patients and doctors, however, decision analysis is often more relevant. In this case, the “preferred” decision is very sensitive to patient utilities, emphasizing the need for clear physician-patient communication.
Strengths of our study include the diversity of the subjects, the formal process for preference assessment, and the paired scenarios, which allow assessment of preferences for a single management decision, in which 2 separate paths lead to an equivalent ultimate outcome. Our findings are consistent with previous work showing that the sequence of events leading to an outcome will influence utilities for the outcome.18
Application to clinical practice
How might our findings be translated into clinical practice? In clinical situations where different approaches to management are unlikely to result in substantial outcome differences (a “toss-up”), patient preferences are a key aspect of the decision-making process.19 For women with low-grade Pap abnormalities, several diagnostic options are available and no single option is strongly supported by evidence to offer better outcomes. Our study indicates that no single option is preferred by most women. Under these conditions, engaging the patient in the decision-making process may produce better health outcomes.20 Clinicians should anticipate highly varied preferences, and will need to adopt a flexible approach. Not all patients will want to be actively involved in the decision process, but the desire for information is nearly universal. Flexible use of the questions in Table 4 may help patients to define their preferences and will likely improve their satisfaction and adherence to the treatment plan.
TABLE 4
Questions for patients with an abnormal Pap smear
What is your understanding of what it means for you to have an abnormal Pap smear showing _____________? |
There are different options for the next step. Would you like to be involved in deciding which option is preferred for your case? |
What questions do you have about these options? |
How important is it to you to have a definite answer as soon as possible? |
How do you feel about undergoing colposcopy? |
Would you prefer to have a follow-up Pap smear in ____ months, which might avoid a colposcopy, or would you prefer to have a colposcopy sooner? |
Acknowledgments
The authors thank the staff of Planned Parenthood Mar Monte East for their assistance with subject recruitment and interviews.
- Any of several approaches may be used in managing women who have low-grade Pap smear abnormalities.
- Women’s preferences for a particular management approach to an abnormal Pap smear vary widely.
- Asking patients specific questions about their desire to avoid procedures and tolerance for uncertainty may help to clarify preferences.
The management of women who have low-grade cytologic abnormalities—including atypical squamous cells (ASC) and low-grade squamous intraepithelial lesions (LSIL)—is controversial.1-4 Without strong evidence favoring a single approach, some clinicians recommend immediate colposcopy to obtain a definitive diagnosis and to exclude the presence of a high-grade lesion, while others recommend observation with serial Pap smears, given the tendency for many low-grade lesions to regress spontaneously.5,6 Immediate colposcopy has the advantage of giving a patient a relatively rapid assessment of the nature and extent of her cervical dysplasia; however, the procedure is uncomfortable, and overall management may not be affected. Observation with serial Pap smears may avoid an invasive procedure, but it may also cause anxiety as time passes without a definitive diagnosis.
Eliciting and understanding patient preferences is an important part of clinical decision making. The clinician provides the best available information on the probability of clinical outcomes and the implications of each for the patient’s health. But only the patient knows what these outcomes mean to her well-being (also called “utility”).
Given the clinical disagreement over how to proceed with abnormal ASC and LSIL Pap smear results, the decision should be influenced by a patient’s preference, informed by knowledge of outcomes and costs of alternative approaches. It is unclear which approach women prefer, and whether women’s preferences for specific protocols are associated with sociodemographic characteristics. To understand better how women weigh these trade-offs, we evaluated the preferences of a diverse group of women for contrasting management approaches to the evaluation of a hypothetical low-grade abnormal Pap smear result.
Methods
Study population
Study participants were recruited from the waiting rooms of 5 family planning clinics in Northern California’s Central Valley. Women were eligible for the study if they were 18 years of age or older, or, if minors, they were emancipated and could thus provide informed consent. Potential subjects were excluded if they spoke neither English nor Spanish or if they had never had a Pap smear. The study protocol and informed consent procedures were reviewed and approved by the University of California, Davis, Human Subjects Committee.
Instruments and outcome measures
Interviews were conducted in English or Spanish. Information regarding demographic characteristics, level of education, past experiences with abnormal Pap smears and cervical cancer, and self-rated religiosity was collected with a self-administered questionnaire. The primary outcome measures were utilities (quantified preferences for specific health states) for 6 different scenarios. These were assessed by the standard gamble (SG) method, described in more detail below.7
Possible utility scores range from 0 to 1. A score of 0 represents immediate death; a score of 1 represents full (or ideal) health for the rest of one’s life. Because the scenarios under consideration in this study did not involve any meaningful level of risk of death, we expected utility scores for the scenarios to cluster toward the upper end of the scale. As a result, a measurement instrument based on an “immediate death” versus “full health” scale would be unable to discriminate between different scenarios. To avoid this problem, a scale was used in which the lower end point was a non-death state unambiguously less preferred than each of the scenarios under consideration.8 We used “invasive cervical cancer requiring hysterectomy” as the lower end point (utility of 0) contrasted with “full health with all normal Pap smears” (utility of 1) to generate the original score (SG Dys). In a separate standard gamble, subjects rated invasive cervical cancer versus immediate death (SG Ca), so that all utilities could be converted to the standard scale, using the formula: (1 – SG Ca) (SG Dys) + SG Ca.
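As a rough illustration of the rescaling formula above, the following Python sketch applies it to a single pair of ratings. The function name and the example values are assumptions made for illustration; they are not study data.

```python
def to_standard_scale(sg_dys: float, sg_ca: float) -> float:
    """Rescale a utility from the cancer-to-full-health scale to the
    death-to-full-health scale.

    sg_dys: scenario utility with "invasive cervical cancer requiring
            hysterectomy" anchored at 0 and full health at 1.
    sg_ca:  utility of invasive cervical cancer with immediate death
            anchored at 0 and full health at 1.
    """
    for name, value in (("sg_dys", sg_dys), ("sg_ca", sg_ca)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # Formula from the text: (1 - SG Ca)(SG Dys) + SG Ca
    return (1.0 - sg_ca) * sg_dys + sg_ca


# Illustrative values only (not taken from the study).
example = to_standard_scale(sg_dys=0.80, sg_ca=0.60)
print(round(example, 3))
```

With these illustrative inputs the adjusted utility is about 0.92. A scenario rated 0 on the original scale maps to SG Ca and a scenario rated 1 stays at 1, which is why adjusted scores cluster near the top of the standard scale whenever SG Ca is high.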
The 6 scenarios rated in the study are shown in Table 1. The scenarios represent 3 sets of progressively more invasive interventions for a low-grade abnormal Pap smear: (1) resolution, representing spontaneous regression with treatment not required; (2) a low-grade abnormality requiring treatment with cryotherapy; (3) a more severe abnormality requiring a cervical cone biopsy. Following either spontaneous resolution or treatment, all scenarios assumed the abnormality was resolved. For each of the 3 results, a management strategy based on observation with serial Pap smears was applied in 1 instance, and a strategy of early colposcopy was applied in the other instance, resulting in 6 different pathways to the ultimate outcome: a normal Pap smear. The time frame for these scenarios was 18–36 months.
Trained interviewers used a standardized approach to elicit preferences from each subject. Subjects were read a description of all the procedures involved in the scenarios. Descriptions were accompanied by cards summarizing each procedure in pictures and words, and included information about the possibility of progression and spontaneous regression of the Pap smear abnormality. Subjects were encouraged to ask questions at any point during the interview. Procedure descriptions are available from the authors on request.
TABLE 1
Clinical scenarios classified by management approach and required treatment*
Spontaneous resolution | Cryotherapy | Cone biopsy | |
---|---|---|---|
Observation | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal |
↓ | ↓ | ↓ | |
Pap smear: normal | Pap smear: low-grade abnormal | Pap smear: normal | |
↓ | ↓ | ↓ | |
2 Pap smears every 6 months: normal | Pap smear: low-grade abnormal | Pap smear: normal | |
↓ | ↓ | ||
Pap smear: low-grade abnormal | Pap smear: normal | ||
↓ | ↓ | ||
Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | ||
↓ | ↓ | ||
Biopsy: low-grade abnormal | Biopsy: abnormal with ? ECC | ||
↓ | ↓ | ||
Cryotherapy at 1 month | Cone biopsy at 1 month: moderately abnormal cells | ||
↓ | ↓ | ||
3 Pap smears every 6 months: normal | Cure with cone biopsy | ||
↓ | |||
3 Pap smears every 6 months: normal | |||
Early colposcopy | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal | Pap smear: low-grade abnormal |
↓ | ↓ | ↓ | |
Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | Colposcopy and biopsy at 1 month | |
↓ | ↓ | ↓ | |
Biopsy: normal | Biopsy: abnormal with ? ECC | Biopsy: abnormal with ? ECC | |
↓ | ↓ | ↓ | |
Second colposcopy and biopsy | Cone biopsy at 1 month | Cone biopsy at 1 month | |
↓ | ↓ | ↓ | |
Biopsy: normal | Biopsy: moderately abnormal | Biopsy: moderately abnormal | |
↓ | ↓ | ↓ | |
2 Pap smears every 6 months: normal | Cure with cone biopsy | Cure with cone biopsy | |
↓ | ↓ | ↓ | |
Pap smear: low-grade abnormal | Colposcopy: normal | Colposcopy: normal | |
↓ | ↓ | ↓ | |
Colposcopy and biopsy at 1 month | 2 Pap smears every 6 months: normal | 2 Pap smears every 6 months: normal | |
↓ | |||
Biopsy: low-grade abnormal | |||
↓ | |||
Colposcopy: normal | |||
↓ | |||
2 Pap smears every 6 months: normal | |||
*Intervals are 6 months unless specified otherwise. ECC, endocervical curettage. |
Standard gamble
Subjects were asked their preference between the certainty of the scenario under consideration and an uncertain prospect of either having cervical cancer treated by hysterectomy or full health. A probability wheel was used as visual aid.9 The probability of cervical cancer was altered until the subject was indifferent between the certain scenario and the uncertain prospect. Once all 6 scenarios had been scored, each subject was asked about her preference between the certainty of cervical cancer treated by hysterectomy and the uncertain prospect of immediate death or full health, using the same method.
At the end of the interview, both the subject and the interviewer completed evaluation forms including ratings of how well the subject understood the standard gamble rating exercises. Subject confusion was also defined a priori as those placing a higher utility on scenario 3 (observation for a long period followed by cone biopsy), which represented the longest period of uncertainty followed by the most invasive procedure, than on scenario 1 (a single mildly abnormal Pap smear evaluated by observation which then resolved spontaneously), which represented the absence of any invasive procedure.
Statistical analysis
Descriptive statistics were generated for ratings of each scenario for the entire group and with the confused subjects removed. Confused subjects included those who reported they found the interview “very confusing,” those who were recorded by the interviewer as finding the interview “very confusing,” and those whose rankings met the criteria for subject confusion, as described above. Means, standard deviations, medians, and percentiles were calculated for each scenario. The mean differences in adjusted standard gamble ratings between paired scenarios were evaluated using a t distribution. Multiple regression analyses were used to explore how much between-subject variation in the standard gamble scores was explained by the variables listed above.
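The exclusion rule and the paired comparison described above might be sketched as follows. This is a minimal illustration and not the authors' analysis code: the `Subject` record, its field names, and the sample values are hypothetical, and only the t statistic and degrees of freedom are computed. A P value would additionally require the t distribution, for example via `scipy.stats.ttest_rel`.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev


@dataclass
class Subject:
    obs: float                        # adjusted utility, observation scenario
    colpo: float                      # adjusted utility, paired early colposcopy scenario
    self_very_confusing: bool         # subject rated the interview "very confusing"
    interviewer_very_confusing: bool  # interviewer recorded "very confusing"
    scenario3_above_scenario1: bool   # a priori ranking criterion was met


def is_confused(s: Subject) -> bool:
    """A subject is excluded if any of the three criteria applies."""
    return (s.self_very_confusing
            or s.interviewer_very_confusing
            or s.scenario3_above_scenario1)


def paired_t(subjects):
    """Mean paired difference, its SD, the t statistic, and degrees of freedom."""
    diffs = [s.obs - s.colpo for s in subjects]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least 2 pairs")
    d_bar, sd = mean(diffs), stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return d_bar, sd, d_bar / (sd / sqrt(n)), n - 1


# Hypothetical ratings for illustration only.
subjects = [
    Subject(0.9, 0.8, False, False, False),
    Subject(0.9, 0.9, False, False, False),
    Subject(1.0, 0.8, False, False, False),
    Subject(0.9, 0.8, False, False, False),
    Subject(0.5, 0.9, True, False, False),   # excluded as confused
]
kept = [s for s in subjects if not is_confused(s)]
d_bar, sd, t, df = paired_t(kept)
print(f"n = {len(kept)}, mean difference = {d_bar:.3f}, t = {t:.3f}, df = {df}")
```

Pairing matters here because each subject rated both scenarios, so the test is applied to within-subject differences rather than to two independent groups.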
1. Woolf SH. Screening for cervical cancer. In: Goldbloom RB, Lawrence RS, eds. Preventing disease: beyond the rhetoric. New York: Springer-Verlag, 1990:319–23.
2. Kurman RJ, Henson DE, Herbst AL, Noller KL, Schiffman MH. Interim guidelines for management of abnormal cervical cytology. The 1992 National Cancer Institute Workshop. JAMA 1994;271:1866-69.
3. Miller AB, Anderson G, Brisson J, Laidlaw J, Le Pitre N, Malcolmson P, et al. Report of a national workshop on screening for cancer of the cervix. Can Med Assoc J 1991;145:1301-25.
4. American College of Obstetricians and Gynecologists. Cervical cytology: evaluation and management of abnormalities. ACOG technical bulletin no. 183. Washington, DC: American College of Obstetricians and Gynecologists, 1993.
5. Melnikow J, Nuovo J, Willan AR, Chan BK, Howell LP. Natural history of cervical squamous intraepithelial lesions: A meta-analysis. Obstet Gynecol 1998;92:727-34.
6. Wright TC, Cox JT, Massad LS, Twiggs LB, Wilkinson EJ. Consensus guidelines for the management of women with cervical cytological abnormalities. JAMA 2002;287:2120-29.
7. Drummond MF, O’Brien BJ, Stoddart GL, Torrance GW, eds. Methods for the economic evaluation of health care programs. 2nd edition. New York: Oxford University Press, 1997.
8. Torrance G. Measurement of health state utilities for economic appraisal: A review. J Health Econ 1986;5:1-30.
9. Furlong W, Feeny D, Torrance GW, Barr R, Horsman J. Guide to design and development of health state utility instrumentation. Centre for Health Economics and Policy Development. Working Paper Series # 90-9. Hamilton, McMaster University, 1990.
10. Melnikow J, Nuovo J, Paliescheskey M, Stewart GK, Howell L, Green B. Detection of high grade cervical dysplasia: Impact of age and Bethesda system-related follow-up criteria. Diagnostic Cytopathol 1997;17:321-25.
11. Fink A, Kosecoff J, Chassin M, Brook RH. Consensus methods: Characteristics and guidelines for use. Am J Pub Health 1984;74:979-83.
12. Nease RF, Kneeland T, O’Connor GT, Sumner W, Lumpkins C, Shaw L, et al. Variation in patient utilities for outcomes of the management of chronic stable angina. JAMA 1995;273:1185-90.
13. Ferris DG, Kriegel D, Cole L, Litaker M, Woodward L. Women’s triage and management preferences for cervical cytologic reports demonstrating atypical squamous cells of undetermined significance and low grade squamous intraepithelial lesions. Arch Fam Med 1997;6:348-53.
14. Meana M, Stewart DE, Lickrish GM, Murphy J, Rosen B. Patient preference for the management of mildly abnormal Papanicolaou smears. J Women’s Health and Gender Based Medicine 1999;8:941-7.
15. Kuppermann M, Nease RF, Learman LA, Gates E, Posner SF, Washington AE. How do women value Down syndrome-affected birth and miscarriage? The thirty-five-year-old question. Med Decis Making 1998;18:468.
16. Solomon D, Schiffman M, Tarone R. Comparison of three management strategies for patients with atypical squamous cells of undetermined significance: baseline results from a randomized trial. J Natl Cancer Inst 2001;93(4):252-3.
17. The atypical squamous cells of undetermined significance/low grade squamous intraepithelial lesions triage study (ALTS) group. Human papillomavirus testing for triage of women with cytologic evidence of low-grade squamous intra-epithelial lesions: baseline data from a randomized trial. J Natl Cancer Inst 2000;92:397-402.
18. Kuppermann M, Shiboski S, Feeny D, Elkin E, Washington AE. Can preference scores for discrete states be used to derive preference scores for an entire path of events? An application to prenatal diagnosis. Med Decis Making 1997;17:42-55.
19. Kassirer JP, Pauker SG. The toss up. N Engl J Med 1981;305:1457-9.
20. Kaplan SH, Greenfield S, Ware JE, Jr. Assessing the effects of physician patient interactions on the outcomes of care. Med Care 1989;27 (Suppl 3):S110-27.
Do written action plans improve patient outcomes in asthma? An evidence-based analysis
- Most studies of asthma self-management do not permit retrospective isolation of the independent effects of a written action plan or peak flow meter use.
- Studies designed to isolate the effect of these self-care activities are generally underpowered or prone to systematic bias.
- Available evidence suggests that peak flow meters and written action plans do not have a large impact on outcomes when applied to the general population of asthmatics.
- These interventions are most likely to have beneficial effect when applied to selected populations, particularly patients with high baseline utilization.
Self-management skills are widely promoted by health plans and specialty societies with the expectation that they will improve care. The 1997 National Heart, Lung, and Blood Institute guidelines on treating asthma emphasize self-management,1 although they do not recommend specific programs. To maximize therapeutic effectiveness, it would be useful to know which components of patient self-management improve outcomes. Written action plans and peak flow meters are commonly used in asthma self-management programs. While these are simple, low-cost interventions for an individual, the aggregate cost for the entire population of asthmatics may be high.2
Much literature has accumulated on the effectiveness of providing asthma education alone and on programs that actively engage patients in their own care. Several systematic reviews have found that providing educational information alone has had little effect on asthma outcomes.3-5 There is evidence, though, that self-management activities are more effective than educational information alone. A recent Cochrane review of 24 trials found that self-management with regular practitioner review reduces hospitalizations and emergency room visits.6 This review did not identify specific components contributing to improved outcomes. In contrast to the aforementioned studies on patient education, a large case-control study of children in the Kaiser Permanente System7 found that written action plans were associated with lower rates of hospitalization and emergency room use. However, such observational studies are often subject to confounding and are not sufficient to establish a cause-effect relationship between written action plans and improved outcomes.
We report on a systematic review that attempts to isolate the independent effect of a written action plan on asthma outcomes. We address two key questions:
- Compared with medical management alone, does the addition of a written asthma action plan (with or without peak flow meter use) improve outcomes?
- Compared with a written action plan based on symptoms, does a written action plan based on peak flow monitoring improve outcomes?
Methods
This study is part of a broader evidence report on the management of chronic asthma prepared for the Agency for Healthcare Research and Quality.8 Complete details of the methodology are available in the full report8 (http://www.ahcpr.gov/clinic/epcix.htm).
Literature search and study selection
We performed a comprehensive literature search from 1980 to August 2000 using MEDLINE, Embase, the Cochrane Library, and a hand search of recent bibliographies. The search was limited to full-length, peer-reviewed articles with an English abstract. Two independent reviewers carried out each step of study selection and data abstraction. Disagreements were resolved by consensus of the two reviewers or, if necessary, by the decision of a third reviewer.
Initial study selection was limited to comparative full-length reports or abstracts in peer-reviewed medical journals, with at least 25 evaluable children or adults per arm, treated for at least 12 weeks. Relevant comparisons included a written action plan and no written action plan; a written action plan based on peak flow readings and a written action plan based on symptoms. Study designs varied: clinical trials, cohort comparisons, case-control analyses, cross-sectional evaluations, and before-after comparisons. Specific components of the management plan had to be described.
Relevant outcomes included measures of inpatient and outpatient utilization, lung function, symptoms, rescue medication or oral steroid use, and quality of life. Outcomes of greatest interest were utilization parameters, as the goals of self-management usually focus on improving these outcomes.
These initial selection criteria yielded many studies that were confounded by multiple asthma management interventions and thus did not isolate the comparisons of interest. Therefore, the research team collectively determined the study design features that would best isolate the effects of written action plans and used them as new criteria in a second round of study selection. The studies thus selected satisfied 4 criteria: (1) randomization of patients; (2) for comparisons of a written action plan with no written action plan, delivery of the same interventions to experimental and control groups, except that the experimental group also received a written action plan; (3) for comparisons of peak flow-based with symptom-based plans, delivery of the same interventions to both groups, except that one group received a written action plan based on peak flow meter readings and the comparison group received a written action plan based on symptom monitoring; and (4) inclusion of a written action plan that met our specified definition.
A written action plan, by our definition, had two components: an algorithm that identified specific clinical indicators signaling the need for adjustments in medication; and specific instructions on how to adjust medications in response to such indicators. Many publications lacked sufficient detail on the written plan, so a brief survey was sent to the primary author of each of the 36 studies. If no response was obtained (36%), the article was excluded only when it was clear from the publication that our definition was not met.
Assessment of study quality
High-quality studies were randomized controlled trials that met the 3 domains of study quality that have been demonstrated empirically to impact effect size: concealment of treatment allocation; double-blinding; and minimization of exclusion bias.9,10 However, we doubted the feasibility of double-blinding a written asthma plan intervention, and so relaxed this requirement. We considered exclusion bias to be minimized when a study either reported intent-to-treat analysis or excluded fewer than 10% of subjects from analysis, with the ratio of subjects excluded from each arm being less than 2:1.
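The exclusion-bias rule above can be written as a short check. This is only a sketch under stated assumptions: the function name `exclusion_bias_minimized` and its arguments are hypothetical, the 10% threshold is applied to all randomized subjects pooled across arms, and the 2:1 ratio is read as a ratio of per-arm exclusion rates (the text does not say whether rates or raw counts were compared).

```python
def exclusion_bias_minimized(intent_to_treat, randomized, excluded):
    """Apply the review's exclusion-bias rule to one trial.

    intent_to_treat: True if the trial reported an intent-to-treat analysis.
    randomized, excluded: per-arm counts, e.g. (100, 100) and (4, 5).
    """
    if intent_to_treat:
        return True
    # Assumption: the 10% limit applies to exclusions pooled over all arms.
    if sum(excluded) / sum(randomized) >= 0.10:
        return False
    # Assumption: the 2:1 limit compares exclusion rates, not raw counts.
    rates = [e / n for e, n in zip(excluded, randomized)]
    highest, lowest = max(rates), min(rates)
    if highest == 0:
        return True   # nobody was excluded from any arm
    if lowest == 0:
        return False  # exclusions in one arm only: the ratio is unbounded
    return highest / lowest < 2.0
```

For example, a trial with 100 patients per arm that excluded 4 and 5 patients would pass under these assumptions, whereas one that excluded 2 and 6 would fail on the ratio test.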
To more fully evaluate study design issues that may be particularly important in asthma research,11,12 we constructed asthma-specific quality indicators in consultation with an expert panel. Controls for potential confounders of treatment effect included establishing reversibility of airway obstruction, controlling for other medication use, reporting compliance, and addressing seasonality. In addition, a priori reporting of power calculations and accounting for exclusions and withdrawals were judged to be study quality characteristics pertinent to this body of evidence.
Data analysis
We constructed evidence tables for the outcomes of interest, and performed a qualitative synthesis of the data. Meta-analysis was not appropriate due to wide discrepancies in the patient populations studied, the interventions employed, and measurement and reporting of outcomes.
Results
Our literature search yielded a total of 4578 citations. Of these, 36 studies met the initial selection criteria. Many of these qualifying studies, however, were confounded by multiple asthma management interventions applied inconsistently across treatment arms. For example, a common confounder was review of and change in long-term medication use in the treatment group, but not in the control group. This necessitated a refinement in our selection criteria to focus on studies that largely isolated the effect of written action plans.13-21 This step yielded a final evidence base of 9 randomized controlled trials with a total enrollment of 1501 patients.
Table 1 summarizes the characteristics, interventions, and outcomes of the 9 studies. Two studies were 3-arm trials,16,17 which raised the total number of comparisons among the 9 studies to 11. The largest study was the Grampian Asthma Study of Integrated Care (n=569),14 a community study conducted in the UK. Enrollment in the other 8 studies ranged from 43 to 64 patients per arm. Treatment duration ranged from 24 to 52 weeks.
None of the studies met our definition of high quality. In fact, no study met any of the generic quality criteria—none was blinded, none described concealment of allocation, and all excluded more than 10% of subjects. Furthermore, none reported an intention-to-treat analysis. Thus these trials were prone to withdrawal bias as well as overestimation of treatment effect due to lack of allocation concealment.
No study met the majority of asthma-specific indicators (Table 1). Of the 9 studies, only 5 met any asthma-specific indicator. Three reported prospective power calculations,13-15 but 2 of these substantially overestimated the expected effect.13,15 Two studies established reversibility;14,17 2 controlled for other medication use;13,15 and 2 reported compliance.17,21 Thus, the studies were also prone to a type II error (failing to detect a true effect) and to potential confounding of outcomes.
We performed sample power calculations for hospitalizations (Table 2), derived from baseline rates reported in 4 studies14,16-18 and standard deviations reported in 2.14,17 A study with 250 patients per arm could detect a reduction of 50% or more in hospitalization, given a control rate of at least 0.2 hospitalizations/patient/year. In actuality, GRASSIC,14 which is the largest available trial (N=569), had baseline hospitalization rates of 0.12 and 0.13. With this baseline rate, over 700 patients per arm are required, higher than the actual enrollment in GRASSIC. The other studies in this review would be adequately powered to detect a 50% difference only in the setting of even higher baseline utilization (eg, 0.30 hospitalizations/patient/year).
Table 3 displays utilization outcomes for the 11 comparisons in the 9 trials. In 5 studies (N=1019), medical management with a written action plan was compared with medical management without a written action plan.13-17 Two trials (N=185) compared a peak flow meter plus a written action plan with a peak flow meter and no written action plan.18,19 In 4 studies (N=393), a written action plan based on peak flow monitoring was compared with a written action plan based on symptoms.16,17,20,21
TABLE 1
Study characteristics
Study | Patient population | Study arms | Intervention components | Outcomes reported | Asthma quality indicators met |
---|---|---|---|---|---|
Optimal medical management vs. optimal medical management + PFM action plan | | | | | |
Jones 1995 (ref 13) | Inclusions: patients using ICS. Exclusions: patients on oral steroids or using peak flow meters at home. Mean age: 29.5 years. Severity level: mild–moderate | Usual care; PFM action plan | Usual care: SxD, FU. PFM action plan: AP, PF, SxD, FU | Ut, LF, Sx | Pow, Med |
Drummond 1994 (GRASSIC) (ref 14) | Inclusion: FEV1 reversibility 20% or greater. Exclusions: patients who already owned a PFM. Mean age: 50.8 years. Severity level: mild–severe | Usual care; PFM action plan | Usual care: FU. PFM action plan: AP, PF, FU | Ut, LF, Med, Ex | Pow, Rev |
Ayres 1995 (ref 15) | Inclusions: maximum PEF variability, 0.15%; minimum nights/week with symptoms, 3; minimum use of ICS or sodium cromoglycate, 3 months. Mean age: 45 years. Severity level: moderate–severe | Usual care; PFM action plan | Usual care: SxD, FU. PFM action plan: AP, PF, SxD, FU | LF, Sx, Ex | Pow, Med |
Cowie 1997 (ref 16) | Inclusions: treatment for an exacerbation of asthma in an ER or asthma clinic; history of receiving urgent treatment for asthma in the previous 12 months. Mean age: 37.8 years. Severity level: mild–severe | Usual care; PFM action plan | Usual care: Ed, SxD, FU. PFM action plan: AP, PF, Ed, SxD, FU | Ut, PF, Med, Ex | None |
Cote 1997 (ref 17) | Inclusions: postbronchodilator FEV1, 85-100% of predicted; PEF, at minimum, 85% of predicted; minimum PEF variability, 0%; Methacholine. Exclusions: patients having previously taken an asthma educational program. Mean age: 36.5 years. Severity level: mild | Usual care; PFM action plan | Usual care: Ed. PFM action plan: Ed, Cn, AP, PF | Ut, LF, Med | Exc, Rev, Com |
Usual care + PFM use alone vs. usual care + PFM action plan | | | | | |
Ignacio-Garcia 1995 (ref 18) | Inclusions: patients from outpatient asthma clinic with asthma for 2 years. Mean age: 41.9 years. Severity level: mild–severe | Usual care + PFM; usual care + PFM action plan | Usual care + PFM: PF, SxD, FU. Usual care + PFM action plan: PF, AP, Ed, SxD, FU | Ut, LF, Med | None |
Charlton 1994 (ref 19) | Inclusion: patients with inpatient or outpatient visit for asthma. Mean age: 6.5 years. Severity level: mild–moderate | Usual care + PFM; usual care + PFM action plan | Usual care + PFM: PF, Ed, SxD, FU. Usual care + PFM action plan: PF, AP, Ed, SxD, FU | Ut, Sx, Med, Ex | None |
PFM action plan vs. symptom action plan | | | | | |
Turner 1998 (ref 20) | Inclusions: maximum methacholine PC20, 7.9; using ICS. Exclusions: previous PFM use; significant comorbid conditions. Mean age: 34.1 years. Severity level: mild–severe | Symptom action plan; PFM action plan | Symptom action plan: AP, Ed, SxD, Cn, BM, EM. PFM action plan: PF, AP, Ed, SxD, Cn, BM, EM | Ut, LF, Sx, Med | Exc, Com |
Charlton 1990 (ref 21) | Inclusions: patients on repeat prescribing register. Mean age: NR. Severity level: mild–severe (?) | Symptom action plan; PFM action plan | Symptom action plan: AP, Ed, FU. PFM action plan: PF, AP, Ed, FU, Cn | Ut, Med | None |
Cowie 1997 (ref 16) | Inclusions: treatment for an exacerbation of asthma in an ER or asthma clinic; history of receiving urgent treatment for asthma in the previous 12 months | Symptom action plan; PFM action plan | Symptom action plan: AP, Ed, SxD, FU. PFM action plan: AP, PF, Ed, SxD, FU | Ut, PF, Med, Ex | None |
Cote 1997 (ref 17) | Inclusions: postbronchodilator FEV1, 85-100% of predicted; PEF, at minimum, 85% of predicted; minimum PEF variability, 0%; Methacholine. Exclusions: previous enrollment in an asthma educational program | Symptom action plan; PFM action plan | Symptom action plan: Ed, AP. PFM action plan: Ed, Cn, AP, PF | Ut, LF, Med | Exc, Rev, Com |
Eligibility criteria: ICS = inhaled corticosteroid; FEV1 = forced expiratory volume in 1 second; PEF = peak expiratory flow; PFM = peak flow meter; ER = emergency room; PC20 = 20% fall in FEV1; NR = not reported.
Intervention components: PF = Peak flow meter; AP = Written action plan; Ed = Education; SxD = Symptom diary; FU = Follow-up visits; Cn = Counseling; BM = Behavior modification; EM = Environmental modification.
Outcomes: Ut = Utilization measures; LF = Lung function measurements; Sx = Symptom-based measurements; Med = Medication use; Ex = Exacerbations of asthma.
Asthma quality indicators: Exc = Accounted for excluded patients; Pow = Reported power calculations; Rev = Established reversibility of airway obstruction; Med = Controlled for other medication use; Com = Reported compliance; Sea = Addressed seasonality.
TABLE 2
Power calculations for hospitalizations per patient per year
Assumed control mean | Possible treatment mean | % decrease | N needed per study arm |
---|---|---|---|
0.10 | 0.075 | 25 | 3077 |
0.10 | 0.05 | 50 | 770 |
0.10 | 0.025 | 75 | 342 |
0.20 | 0.15 | 25 | 770 |
0.20 | 0.10 | 50 | 193 |
0.20 | 0.05 | 75 | 86 |
0.30 | 0.225 | 25 | 342 |
0.30 | 0.15 | 50 | 86 |
0.30 | 0.075 | 75 | 38 |
Studies were identified that contained baseline rates of hospitalizations/patient/year, or information that allowed calculation of this parameter (Drummond, Abdalla, Beattie et al., 1994; Cote, Cartier, Robichaud et al., 1997; Cowie, Revitt, Underwood et al., 1997; Ignacio-Garcia and Gonzalez-Santos, 1995). Baseline rates of hospitalization varied in these studies from 0.04 to 0.29/patient/year. Standard deviations for this outcome were available in only two studies: Cote, Cartier, Robichaud et al. (1997) reported an SD of 0.30 for this variable, and an SD of 0.35 was calculated from the confidence intervals reported in GRASSIC (Drummond, Abdalla, Beattie et al., 1994). For the calculations, the more conservative SD estimate of 0.35 was used.
The number of patients per study arm was estimated for 80 percent power at the 5 percent significance level using control arm means of 0.10, 0.20, and 0.30 hospitalizations/patient/year. The expected reduction in this variable was tested along a spectrum from 25 to 75 percent.
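The sample sizes in Table 2 can be reproduced with the standard normal-approximation formula for comparing two means, n per arm = 2(z₁₋α/₂ + z₁₋β)² SD² / Δ², where Δ is the absolute difference between the control and treatment means. The sketch below is illustrative only. The report does not state which formula was used, so it assumes a two-sided test, equal arm sizes, a common SD of 0.35 (the value given in the table notes), and rounding up to the next whole patient. The function name `n_per_arm` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(control_mean, pct_decrease, sd=0.35, alpha=0.05, power=0.80):
    """Patients needed per arm to detect a given % drop in the mean
    number of hospitalizations/patient/year (normal approximation)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided 5% significance
    z_beta = std_normal.inv_cdf(power)           # 80% power
    delta = control_mean * pct_decrease / 100    # absolute difference in means
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Print the Table 2 grid: control means of 0.10, 0.20, 0.30
# crossed with reductions of 25%, 50%, and 75%.
for control in (0.10, 0.20, 0.30):
    for pct in (25, 50, 75):
        print(control, pct, n_per_arm(control, pct))
```

Under these assumptions the function returns the same N values as Table 2, for example 770 per arm for a 50% reduction from a control mean of 0.10 and 193 per arm from a control mean of 0.20. Because the required N scales with 1/Δ², halving the baseline rate roughly quadruples the sample size needed to detect the same relative reduction.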
Written action plan versus no written action plan
All 5 studies used a peak flow meter-based written action plan. All reported utilization outcomes, but the types and units of measurement were not consistent across studies (Table 3). Additionally, 4 studies reported on symptoms,13-16 and 3 reported lung function outcomes.13-15
With one notable exception, there were no statistically significant differences in outcomes among groups. Cowie et al16 reported an 11-fold decrease in total emergency room visits for the group using a peak-flow action plan (5 vs 55, P = .02), and also reported a reduction in hospitalizations of a similar magnitude (2 vs 12) that did not reach statistical significance. However, this study suffers from notable flaws that diminish confidence in the results. It is a post-intervention comparison among groups that neither compares change from baseline nor incorporates baseline values as covariates in the analysis. Moreover, baseline utilization data were provided by patient recall and not corroborated by medical records. Variability in baseline utilization rates was substantially larger in the peak flow group than in the control group. This suggests that a subset of very high frequency users may have been over-represented in the peak flow group, and the reduction in emergency room visits may be concentrated in this subset.
Peak-flow meter-based written action plan versus peak flow meter with no written action plan
Two studies18,19 addressed the independent effect of a written action plan when added to peak flow self-monitoring (Table 3). Charlton19 reported no significant group differences for main outcomes, while Ignacio-Garcia18 reported large and statistically significant differences in most of the outcomes, favoring the group that used the written action plan.
The Ignacio-Garcia study, however, suffers from notable flaws suggesting the results may be attributable to bias. The sole participating physician, not blinded to treatment assignment, was highly involved in all phases of patient assessment, monitoring, and treatment. There was evidence of baseline differences between the two groups. A total of 25% of patients were withdrawn after randomization, and an unexplained decline in lung function occurred in the control group. Thus, the potential for selection bias, withdrawal bias, and ascertainment bias limits confidence in the results of this study.
Symptom-based written action plan compared with peak flow-based written action plan
In 4 studies,16,17,20,21 reported outcomes were generally equivalent between groups and comparisons were not statistically significant, with one exception (Table 3). The 3-arm study by Cowie et al16 reported a striking reduction in the total number of emergency room visits with a peak flow meter-based written action plan compared with a symptom-based written action plan (5 versus 45, P
Discussion
The objective of this systematic review was to assess the independent effects of 2 specific components commonly included in asthma self-management plans—a written action plan and a peak flow meter. Few studies, however, are designed to permit reviewers to isolate the effects of these components. Moreover, the studies we reviewed did not clearly identify the population expected to benefit from interventions or specify the primary outcomes of interest; nor was the level of clinically meaningful improvement prospectively defined.
Most of the trials we reviewed, including the largest community study of 569 patients, did not demonstrate improved outcomes. The 2 trials that reported statistically significant results favoring a peak flow-based written action plan suffer from notable flaws suggesting the results may be attributable to bias. In the other 7 trials, there was little difference in outcomes between groups. However, these studies had insufficient power to detect group differences or confidently conclude equivalence between groups.
Thus, available evidence is insufficient to demonstrate that asthma outcomes are improved by use of a written asthma action plan, with or without peak flow monitoring. While this body of literature does not establish that these interventions are ineffective, it suggests they will not have a large effect on outcomes when applied to the general asthmatic population. The application of written action plans to all asthmatics indiscriminately may be a wasteful use of resources. This systematic review also questions the validity of written action plans as an indicator of asthma quality of care, or as a means to achieve quality improvement.
This analysis also highlights several obstacles to assessing the effects of disease management interventions. First, while the impact of whole intervention programs can be evaluated in controlled trials, it may be unfeasible to isolate each component of such programs and subject it to a rigorous analysis. Furthermore, as a behavioral intervention, the general principle of engaging patients in self-management may be more important than the specific components of these programs. Finally, compared with the optimization of medications (most obviously initiation of inhaled steroids), the impact of written action plans is likely to be relatively small, particularly on lung function or symptom control.
Future clinical trials should be done selectively, aimed at producing rigorous results that can improve the effectiveness of self-management interventions. Further study is warranted for specific subpopulations, such as those with higher baseline severity of illness or those with high baseline utilization rates. Available data suggest that, if there is benefit to be gained from self-management interventions, it will most likely be seen among these patients. Specific components of self-management that might be tested individually are those that are relatively high-cost, resource intensive, or risky for the patient.
Existing trials have tended to overestimate the effects of action plan-based interventions, and so have invested resources in results that are inadequate for optimizing self-management strategies. Future trials need to estimate realistically the expected impact of each intervention, and to specify the primary outcomes of interest and their baseline frequencies. They should be large enough to detect a difference if one exists, or to confidently conclude that the intervention is ineffective.
Attention to these principles will help to advance our knowledge in this area most efficiently and to ultimately improve the quality of care for the entire population of patients with asthma.
Acknowledgments
We acknowledge Kathleen Ziegler, Pharm.D, and Claudia Bonnell, RN, MSL, for their assistance in the research and preparation of this manuscript.
1. National Heart, Lung and Blood Institute. Expert panel report 2: guidelines for the diagnosis and management of asthma. Bethesda, MD: National Institutes of Health; 1997. NIH publication 97-4051.
2. Ruffin RE, Pierce RJ. Peak flow monitoring—which asthmatics, when, and how? Aust N Z J Med 1994;24:519-20.
3. Devine EC. Meta-analysis of the effects of psychoeducational care in adults with asthma. Res Nursing Health 1996;19:367-76.
4. Bernard-Bonnin AC, Stachenko S, Bonin D, et al. Self-management teaching programs and morbidity of pediatric asthma: a meta-analysis. J Allergy Clin Immunol 1995;95(1 Pt 1):34-41.
5. Gibson PG, Coughlan J, Wilson AJ, et al. Limited (information only) patient education programs for adults with asthma. Cochrane Database Syst Rev 2000;(2):CD001005.
6. Gibson PG, Coughlan J, Wilson AJ, et al. Self-management education and regular practitioner review for adults with asthma. Cochrane Database Syst Rev 2000;(2):CD001117.
7. Lieu TA, Quesenberry CP, Jr, Capra AM, et al. Outpatient management practices associated with reduced risk of pediatric asthma hospitalization and emergency department visits. Pediatrics 1997;100(3 Pt 1):334-41.
8. Lefevre F, Piper M, Mark D, et al. Management of Chronic Asthma. AHRQ evidence report, contract number 290-97-001-5, 2001, http://www.ahcpr.gov/clinic/epcix.htm.
9. Mulrow CD, Oxman AD, editors. Cochrane Collaboration Handbook. Available in the Cochrane Library [database on disk and CD-ROM]. The Cochrane Collaboration; Issue 1. Oxford: Update Software; 1997.
10. Schulz KF, Chalmers I, Hayes RJ, et al. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273(5):408-12.
11. Berlin JA, Rennie D. Measuring the quality of trials: the quality of the quality scales. JAMA 1999;282(11):1083-5.
12. Juni P, Witschi A, Bloch R, et al. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA 1999;282(11):1054-60.
13. Jones KP, Mullee MA, Middleton M, et al. Peak flow based asthma self-management: a randomised controlled study in general practice. British Thoracic Society Research Committee. Thorax 1995;50(8):851-7.
14. Drummond N, Abdalla M, Beattie JAG, et al. Effectiveness of routine self monitoring of peak flow in patients with asthma. Grampian Asthma Study of Integrated Care (GRASSIC). BMJ 1994 Feb. 26;308(6928):564-7.
15. Ayres JG, Campbell LM. A controlled assessment of an asthma self-management plan involving a budesonide dose regimen. OPTIONS Research Group. Eur Respir J 1996;886-92.
16. Cowie RL, Revitt SG, Underwood MF, et al. The effect of a peak flow-based action plan in the prevention of exacerbations of asthma. Chest 1997;112(6):1534-8.
17. Cote J, Cartier A, Robichaud P, et al. Influence on asthma morbidity of asthma education programs based on self-management plans following treatment optimization. Am J Respir Crit Care Med 1997;155(5):1509-14.
18. Ignacio-Garcia JM, Gonzalez-Santos P. Asthma self-management education program by home monitoring of peak expiratory flow. Am J Respir Crit Care Med 1995;151(2 Pt 1):353-9.
19. Charlton I, Antoniou AG, Atkinson J, et al. Asthma at the interface: bridging the gap between general practice and a district general hospital. Arch Dis Child 1994;70(4):313-8.
20. Turner MO, Taylor D, Bennett R, et al. A randomized trial comparing peak expiratory flow and symptom self-management plans for patients with asthma attending a primary care clinic. Am J Respir Crit Care Med 1998;157(2):540-6.
21. Charlton I, Charlton G, Broomfield J, et al. Evaluation of peak flow and symptoms only self-management plans for control of asthma in general practice. BMJ 1990;301(6765):1355-9.
- Most studies of asthma self-management do not permit retrospective isolation of the independent effects of a written action plan or peak flow meter use.
- Studies designed to isolate the effect of these self-care activities are generally underpowered or prone to systematic bias.
- Available evidence suggests that peak flow meters and written action plans do not have a large impact on outcomes when applied to the general population of asthmatics.
- These interventions are most likely to have beneficial effect when applied to selected populations, particularly patients with high baseline utilization.
Self-management skills are widely promoted by health plans and specialty societies with the expectation that they will improve care. The 1997 National Heart, Lung, and Blood Institute guidelines on treating asthma emphasize self-management,1 although they do not recommend specific programs. To maximize therapeutic effectiveness, it would be useful to know which components of patient self-management improve outcomes. Written action plans and peak flow meters are commonly used in asthma self-management programs. While these are simple, low-cost interventions for an individual, the aggregate cost for the entire population of asthmatics may be high.2
Much literature has accumulated on the effectiveness of providing asthma education alone and on programs that actively engage patients in their own care.Several systematic reviews have found that providing educational information alone has had little effect on asthma outcomes.3-5 There is evidence, though, that self-management activities are more effective than educational information alone. A recent Cochrane review of 24 trials found that self-management with regular practitioner review reduces hospi-talizations and emergency room visits.6 This review did not identify specific components contributing to improved outcomes. In contrast to the aforementioned studies on patient education, a large case-control study of children in the Kaiser Permanente System,7 found that written action plans were associated with lower rates of hospitalization and emergency room use. However, such observational studies often include confounding factors and are not sufficient to establish a cause-effect relationship between written action plans and improved outcomes.
We report on a systematic review that attempts to isolate the independent effect of a written action plan on asthma outcomes. We address two key questions:
- Compared with medical management alone, does the addition of a written asthma action plan (with or without peak flow meter use) improve outcomes?
- Compared with a written action plan based on symptoms, does a written action plan based on peak flow monitoring improve outcomes?
Methods
This study is part of a broader evidence report on the management of chronic asthma prepared for the Agency for Healthcare Research and Quality.8 Complete details of the methodology are available in the full report8 (http://www.ahcpr.gov/clinic/epcix.htm).
Literature search and study selection
We performed a comprehensive literature search from 1980 to August 2000 using MEDLINE, Embase, the Cochrane Library, and a hand search of recent bibliographies. The search was limited to full-length, peer-reviewed articles with an English abstract. Two independent reviewers carried out each step of study selection and data abstraction. Disagreements were resolved by consensus of the two reviewers or, if necessary, by the decision of a third reviewer.
Initial study selection was limited to comparative full-length reports or abstracts in peer-reviewed medical journals, with at least 25 evaluable children or adults per arm, treated for at least 12 weeks. Relevant comparisons included a written action plan and no written action plan; a written action plan based on peak flow readings and a written action plan based on symptoms. Study designs varied: clinical trials, cohort comparisons, case-control analyses, cross-sectional evaluations, and before-after comparisons. Specific components of the management plan had to be described.
Relevant outcomes included measures of inpatient and outpatient utilization, lung function, symptoms, rescue medication or oral steroid use, and quality of life. Outcomes of greatest interest were utilization parameters, as the goals of self-management usually focus on improving these outcomes.
These initial selection criteria yielded many studies that were confounded by multiple asthma management interventions and thus did not isolate the comparisons of interest. Therefore, the research team collectively determined the study design features that would best isolate the effects of written action plans and used them as new criteria in a second round of study selection. The studies thus selected satisfied 4 criteria: (1) randomization of patients; (2) delivery of the same interventions to experimental and control groups, except that the experimental group also received a written action plan; (3) delivery of the same interventions to experimental and control groups, except that one group received a written action plan based on peak flow meter readings, and the comparison group received a written action plan based on symptom monitoring; and (4) inclusion of a written action plan that met our specified definition.
A written action plan, by our definition, had two components: an algorithm that identified specific clinical indicators signaling the need for adjustments in medication; and specific instructions on how to adjust medications in response to such indicators. Many publications lacked sufficient detail on the written plan, so a brief survey was sent to the primary author of each of the 36 studies. If no response was obtained (36%), the article was excluded only when it was clear from the publication that our definition was not met.
Assessment of study quality
High-quality studies were randomized controlled trials that met the 3 domains of study quality that have been demonstrated empirically to impact effect size: concealment of treatment allocation; double-blinding; and minimization of exclusion bias.9,10 However, we doubted the feasibility of double-blinding a written asthma plan intervention, and so relaxed this requirement. We considered exclusion bias to be minimized when a study either reported intent-to-treat analysis or excluded fewer than 10% of subjects from analysis, with the ratio of subjects excluded from each arm being less than 2:1.
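The exclusion-bias rule above can be read as a simple decision procedure. The sketch below is only an illustration of that reading, not the reviewers' actual instrument: the function name, its parameters, and the handling of an arm with zero exclusions are assumptions introduced here.

```python
def exclusion_bias_minimized(intent_to_treat, enrolled, excluded_by_arm):
    """Illustrative reading of the exclusion-bias criterion.

    intent_to_treat: True if the study reported an intent-to-treat analysis.
    enrolled:        total number of randomized subjects.
    excluded_by_arm: number of subjects excluded from analysis in each arm.
    """
    # An intent-to-treat analysis satisfies the criterion on its own.
    if intent_to_treat:
        return True

    # Otherwise, fewer than 10% of all subjects may be excluded from analysis.
    if sum(excluded_by_arm) / enrolled >= 0.10:
        return False

    # The ratio of exclusions between arms must be less than 2:1.
    most, fewest = max(excluded_by_arm), min(excluded_by_arm)
    if most == 0:
        return True   # nothing excluded in any arm
    if fewest == 0:
        return False  # assumption: an unbounded ratio fails the 2:1 test
    return most / fewest < 2.0


# Hypothetical two-arm trial: 200 enrolled, 6 and 5 subjects excluded.
print(exclusion_bias_minimized(False, 200, [6, 5]))
```

In this reading, a trial that excludes 12 and 5 subjects out of 200 would fail despite a low overall exclusion rate, because the exclusions are unbalanced across arms.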
To more fully evaluate study design issues that may be particularly important in asthma research,11,12 we constructed asthma-specific quality indicators in consultation with an expert panel. Controls for potential confounders of treatment effect included establishing reversibility of airway obstruction, controlling for other medication use, reporting compliance, and addressing seasonality. In addition, a priori reporting of power calculations and accounting for exclusions and withdrawals were judged to be study quality characteristics pertinent to this body of evidence.
Data analysis
We constructed evidence tables for the outcomes of interest, and performed a qualitative synthesis of the data. Meta-analysis was not appropriate due to wide discrepancies in the patient populations studied, the interventions employed, and measurement and reporting of outcomes.
Results
Our literature search yielded a total of 4578 citations. Of these, 36 studies met the initial selection criteria. Many of these qualifying studies, however, were confounded by multiple asthma management interventions applied inconsistently across treatment arms. For example, a common confounder was review of and change in long-term medication use in the treatment group, but not in the control group. This necessitated a refinement in our selection criteria to focus on studies that largely isolated the effect of written action plans.13-21 This step yielded a final evidence base of 9 randomized controlled trials with a total enrollment of 1501 patients.
Table 1 summarizes the characteristics, interventions, and outcomes of the 9 studies. Two studies were 3-arm trials,16,17 which raised the total number of comparisons among the 9 studies to 11. The largest study was the Grampian Asthma Study of Integrated Care (n=569),14 a community study conducted in the UK. Enrollment in the other 8 studies ranged from 43 to 64 patients per arm. Treatment duration ranged from 24 to 52 weeks.
None of the studies met our definition of high quality. In fact, no study met any of the generic quality criteria—none was blinded, none described concealment of allocation, and all excluded more than 10% of subjects. Furthermore, none reported an intention-to-treat analysis. Thus these trials were prone to withdrawal bias as well as overestimation of treatment effect due to lack of allocation concealment.
No study met the majority of asthma-specific indicators (Table 1). Of the 9 studies, only 5 met any asthma-specific indicator. Three reported prospective power calculations,13-15 but 2 of these substantially overestimated the expected effect.13,15 Two studies established reversibility;14,17 2 controlled for other medication use;13,15 and 2 reported compliance.17,21 Thus, the studies were also prone to a type II error (failing to detect a true effect) and to potential confounding of outcomes.
We performed sample power calculations for hospitalizations (Table 2), derived from baseline rates reported in 4 studies14,16-18 and standard deviations reported in 2.14,17 A study with 250 patients per arm could detect a reduction of 50% or more in hospitalization, given a control rate of at least 0.2 hospitalizations/patient/year. In actuality, GRASSIC,14 which is the largest available trial (N=569), had baseline hospitalization rates of 0.12 and 0.13. With this baseline rate, over 700 patients per arm are required, higher than the actual enrollment in GRASSIC. The other studies in this review would be adequately powered to detect a 50% difference only in the setting of even higher baseline utilization (eg, 0.30 hospitalizations/patient/year).
Table 3 displays utilization outcomes for the 11 comparisons in the 9 trials. In 5 studies (N=1019), medical management with a written action plan was compared with medical management without a written action plan.13-17 Two trials (N=185) compared a peak flow meter plus a written action plan with a peak flow meter and no written action plan.18,19 In 4 studies (N=393), a written action plan based on peak flow monitoring was compared with a written action plan based on symptoms.
TABLE 1
Study characteristics
Study | Patient population | Study arms | Intervention components | Outcomes reported | Asthma quality indicators met |
---|---|---|---|---|---|
Optimal medical management vs. optimal medical management + PFM action plan | | | | | |
Jones 199514 | Inclusions: patients using ICS. Exclusions: patients on oral steroids or using peak flow meters at home. Mean age: 29.5 years. Severity level: Mild–moderate | Usual care; PFM action plan | Usual care: SxD, FU; PFM action plan: AP, PF, SxD, FU | Ut, LF, Sx | Pow, Med |
Drummond 1994 (GRASSIC)15 | Inclusion: FEV1 reversibility 20% or greater. Exclusions: patients who already owned a PFM. Mean age: 50.8 years. Severity level: Mild–severe | Usual care; PFM action plan | Usual care: FU; PFM action plan: AP, PF, FU | Ut, LF, Med, Ex | Pow, Rev |
Ayres 199516 | Inclusions: maximum PEF variability, 0.15%; minimum nights/week with symptoms, 3; minimum use of ICS or sodium cromoglycate, 3 months. Mean age: 45 years. Severity level: Moderate–severe | Usual care; PFM action plan | Usual care: SxD, FU; PFM action plan: AP, PF, SxD, FU | LF, Sx, Ex | Pow, Med |
Cowie 199713 | Inclusions: treatment for an exacerbation of asthma in an ER asthma clinic; history of receiving urgent treatment for asthma in the previous 12 months. Mean age: 37.8 years. Severity level: Mild–severe | Usual care; PFM action plan | Usual care: Ed, SxD, FU; PFM action plan: AP, PF, Ed, SxD, FU | Ut, PF, Med, Ex | None |
Cote 199717 | Inclusions: FEV1 postbronchodilator 85-100% of predicted; PEF, at minimum, 85% of predicted; minimum PEF variability, 0%; Methacholine. Exclusions: patients having previously taken an asthma educational program. Mean age: 36.5 years. Severity level: Mild | Usual care; PFM action plan | Usual care: Ed; PFM action plan: Ed, Cn, AP, PF | Ut, LF, Med | Exc, Rev, Com |
Usual care + PFM use alone vs. usual care + PFM action plan | | | | | |
Ignacio-Garcia 199518 | Inclusions: patients from outpatient asthma clinic with asthma for 2 years. Mean age: 41.9 years. Severity level: Mild–severe | Usual care + PFM; Usual care + PFM action plan | Usual care + PFM: PF, SxD, FU; Usual care + PFM action plan: PF, AP, Ed, SxD, FU | Ut, LF, Med | None |
Charlton 199419 | Inclusion: patients with inpatient or outpatient visit for asthma. Mean age: 6.5 years. Severity level: Mild–moderate | Usual care + PFM; Usual care + PFM action plan | Usual care + PFM: PF, Ed, SxD, FU; Usual care + PFM action plan: PF, AP, Ed, SxD, FU | Ut, Sx, Med, Ex | None |
PFM action plan vs. Symptom action plan | | | | | |
Turner 199820 | Inclusions: maximum methacholine PC20, 7.9; using ICS. Exclusions: previous PFM use; significant comorbid conditions. Mean age: 34.1 years. Severity level: Mild–severe | Symptom action plan; PFM action plan | Symptom action plan: AP, Ed, SxD, Cn, BM, EM; PFM action plan: PF, AP, Ed, SxD, Cn, BM, EM | Ut, LF, Sx, Med | Exc, Com |
Charlton 199021 | Inclusions: patients on repeat prescribing register. Mean age: NR. Severity level: Mild–severe (?) | Symptom action plan; PFM action plan | Symptom action plan: AP, Ed, FU; PFM action plan: PF, AP, Ed, FU, Cn | Ut, Med | None |
Cowie 199716 | Inclusions: treatment for an exacerbation of asthma in an ER or asthma clinic; history of receiving urgent treatment for asthma in the previous 12 months | Symptom action plan; PFM action plan | Symptom action plan: AP, Ed, SxD, FU; PFM action plan: AP, PF, Ed, SxD, FU | Ut, PF, Med, Ex | None |
Cote 199717 | Inclusions: FEV1 postbronchodilator, 85-100% of predicted; PEF, at minimum, 85% of predicted; minimum PEF variability, 0%; Methacholine. Exclusions: previous enrollment in an asthma educational program | Symptom action plan; PFM action plan | Symptom action plan: Ed, AP; PFM action plan: Ed, Cn, AP, PF | Ut, LF, Med | Exc, Rev, Com |
Eligibility criteria: ICS = inhaled corticosteroid; FEV1 = forced expiratory volume in 1 second; PEF = peak expiratory flow; PFM = peak flow meter; ER = emergency room; PC20 = provocative concentration causing a 20% fall in FEV1.
Intervention components: PF = peak flow meter; AP = written action plan; Ed = education; SxD = symptom diary; FU = follow-up visits; Cn = counseling; BM = behavior modification; EM = environmental modification.
Outcomes: Ut = utilization measures; LF = lung function measurements; Sx = symptom-based measurements; Med = medication use; Ex = exacerbations of asthma.
Asthma quality indicators: Exc = accounted for excluded patients; Pow = reported power calculations; Rev = established reversibility of airway obstruction; Med = controlled for other medication use; Com = reported compliance; Sea = addressed seasonality.
TABLE 2
Power calculations for hospitalizations per patient per year
Assumed control mean | Possible treatment mean | % decrease | N needed per study arm |
---|---|---|---|
0.10 | 0.075 | 25 | 3077 |
0.10 | 0.05 | 50 | 770 |
0.10 | 0.025 | 75 | 342 |
0.20 | 0.15 | 25 | 770 |
0.20 | 0.10 | 50 | 193 |
0.20 | 0.05 | 75 | 86 |
0.30 | 0.225 | 25 | 342 |
0.30 | 0.15 | 50 | 86 |
0.30 | 0.075 | 75 | 38 |
Studies were identified that contained baseline rates of hospitalizations/patient/year, or information that allowed calculation of this parameter (Drummond, Abdalla, Beattie et al., 1994; Cote, Cartier, Robichaud et al., 1997; Cowie, Revitt, Underwood et al., 1997; Ignacio-Garcia and Gonzalez-Santos, 1995). Baseline rates of hospitalization in these studies varied from 0.04 to 0.29/patient/year. Standard deviations for this outcome were available in only two studies: Cote, Cartier, Robichaud et al. (1997) reported an SD of 0.30 for this variable, and an SD of 0.35 was calculated from the confidence intervals reported in GRASSIC (Drummond, Abdalla, Beattie et al., 1994). For the calculations, the more conservative SD estimate of 0.35 was used.
The number of patients per study arm was estimated for 80 percent power at the 5 percent significance level using control arm means of 0.10, 0.20, and 0.30 hospitalizations/patient/year. The expected reduction in this variable was tested along a spectrum from 25 to 75 percent.
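The sample sizes in Table 2 can be approximated with a standard two-sample calculation. The sketch below assumes a two-sided test on the difference between two means using the normal approximation, with a common SD of 0.35; the report does not state which formula was used, so the formula and the function name `n_per_arm` are assumptions made here for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(control_mean, pct_decrease, sd=0.35, alpha=0.05, power=0.80):
    """Patients needed per arm to detect a given percent decrease.

    Assumes a two-sided comparison of two means with equal SD in both arms
    (normal approximation): n = 2 * (z_alpha + z_beta)^2 * sd^2 / delta^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    delta = control_mean * pct_decrease / 100      # absolute difference in means
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Same grid of control means and percent decreases as Table 2.
for control in (0.10, 0.20, 0.30):
    for pct in (25, 50, 75):
        treatment = control * (1 - pct / 100)
        print(f"{control:.2f}  {treatment:.3f}  {pct}%  {n_per_arm(control, pct)}")
```

Under these assumptions, the rounded-up results coincide with the N column of Table 2, which suggests that the table was derived from a calculation of this general form.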
Written action plan versus no written action plan
All 5 studies used a peak flow meter-based written action plan. All reported utilization outcomes, but the types and units of measurement were not consistent across studies (Table 3). Additionally, 4 studies reported on symptoms,13-16 and 3 reported lung function outcomes.13-15
With one notable exception, there were no statistically significant differences in outcomes among groups. Cowie et al16 reported an 11-fold decrease in total emergency room visits for the group using a peak flow action plan (5 vs 55, P = .02), and also reported a reduction in hospitalizations of a similar magnitude (2 vs 12) that did not reach statistical significance. However, this study suffers from notable flaws that diminish confidence in the results. It is a post-intervention comparison among groups that neither compares change from baseline nor incorporates baseline values as covariates in the analysis. Moreover, baseline utilization data were provided by patient recall and not corroborated by medical records. Variability in baseline utilization rates was substantially larger in the peak flow group than in the control group. This suggests that a subset of very high-frequency users may have been over-represented in the peak flow group, and the reduction in emergency room visits may be concentrated in this subset.
Peak-flow meter-based written action plan versus peak flow meter with no written action plan
Two studies18,19 addressed the independent effect of a written action plan when added to peak flow self-monitoring (Table 3). Charlton19 reported no significant group differences for main outcomes, while Ignacio-Garcia18 reported large and statistically significant differences in most of the outcomes, favoring the group that used the written action plan.
The Ignacio-Garcia study, however, suffers from notable flaws suggesting the results may be attributable to bias. The sole participating physician, not blinded to treatment assignment, was highly involved in all phases of patient assessment, monitoring, and treatment. There was evidence of baseline differences between the two groups. A total of 25% of patients were withdrawn after randomization, and an unexplained decline in lung function occurred in the control group. Thus, the potential for selection bias, withdrawal bias, and ascertainment bias limits confidence in the results of this study.
Symptom-based written action plan compared with peak flow-based written action plan
In 4 studies,16,17,20,21 reported outcomes were generally equivalent between groups and comparisons were not statistically significant, with one exception (Table 3). The 3-arm study by Cowie et al16 reported a striking reduction in the total number of emergency room visits with a peak flow meter-based written action plan compared with a symptom-based written action plan (5 versus 45, P
Discussion
The objective of this systematic review was to assess the independent effects of 2 specific components commonly included in asthma self-management plans—a written action plan and a peak flow meter. Few studies, however, are designed to permit reviewers to isolate the effects of these components. Moreover, the studies we reviewed did not clearly identify the population expected to benefit from interventions or specify the primary outcomes of interest; nor was the level of clinically meaningful improvement prospectively defined.
Most of the trials we reviewed, including the largest community study of 569 patients, did not demonstrate improved outcomes. The 2 trials that reported statistically significant results favoring a peak flow-based written action plan suffer from notable flaws suggesting the results may be attributable to bias. In the other 7 trials, there was little difference in outcomes between groups. However, these studies had insufficient power either to detect group differences or to conclude with confidence that the groups were equivalent.
Thus, available evidence is insufficient to demonstrate that asthma outcomes are improved by use of a written asthma action plan, with or without peak flow monitoring. While this body of literature does not establish that these interventions are ineffective, it suggests they will not have a large effect on outcomes when applied to the general asthmatic population. The application of written action plans to all asthmatics indiscriminately may be a wasteful use of resources. This systematic review also questions the validity of written action plans as an indicator of asthma quality of care, or as a means to achieve quality improvement.
This analysis also highlights several obstacles to assessing the effects of disease management interventions. First, while the impact of whole intervention programs can be evaluated in controlled trials, it may be unfeasible to isolate each component of such programs and subject it to a rigorous analysis. Furthermore, as a behavioral intervention, the general principle of engaging patients in self-management may be more important than the specific components of these programs. Finally, relative to the optimization of medications (most obviously initiation of inhaled steroids), the impact of written action plans is likely to be small, particularly on lung function or symptom control.
Future clinical trials should be done selectively, aimed at producing rigorous results that can improve the effectiveness of self-management interventions. Further study is warranted for specific subpopulations, such as those with higher baseline severity of illness or those with high baseline utilization rates. Available data suggest that, if there is benefit to be gained from self-management interventions, it will most likely be seen among these patients. Specific components of self-management that might be tested individually are those that are relatively high-cost, resource intensive, or risky for the patient.
Existing trials have tended to overestimate the effects of action plan-based interventions, and have thus invested resources in studies whose results are inadequate for optimizing self-management strategies. Future trials need to estimate the expected impact of each intervention realistically, and to specify the primary outcomes of interest and their baseline frequencies. They should be large enough to detect a difference if one exists, or to conclude with confidence that the intervention is ineffective.
Attention to these principles will help to advance our knowledge in this area most efficiently and to ultimately improve the quality of care for the entire population of patients with asthma.
Acknowledgments
We acknowledge Kathleen Ziegler, Pharm.D, and Claudia Bonnell, RN, MSL, for their assistance in the research and preparation of this manuscript.
- Most studies of asthma self-management do not permit retrospective isolation of the independent effects of a written action plan or peak flow meter use.
- Studies designed to isolate the effect of these self-care activities are generally underpowered or prone to systematic bias.
- Available evidence suggests that peak flow meters and written action plans do not have a large impact on outcomes when applied to the general population of asthmatics.
- These interventions are most likely to have beneficial effect when applied to selected populations, particularly patients with high baseline utilization.
Self-management skills are widely promoted by health plans and specialty societies with the expectation that they will improve care. The 1997 National Heart, Lung, and Blood Institute guidelines on treating asthma emphasize self-management,1 although they do not recommend specific programs. To maximize therapeutic effectiveness, it would be useful to know which components of patient self-management improve outcomes. Written action plans and peak flow meters are commonly used in asthma self-management programs. While these are simple, low-cost interventions for an individual, the aggregate cost for the entire population of asthmatics may be high.2
Much literature has accumulated on the effectiveness of providing asthma education alone and on programs that actively engage patients in their own care.Several systematic reviews have found that providing educational information alone has had little effect on asthma outcomes.3-5 There is evidence, though, that self-management activities are more effective than educational information alone. A recent Cochrane review of 24 trials found that self-management with regular practitioner review reduces hospi-talizations and emergency room visits.6 This review did not identify specific components contributing to improved outcomes. In contrast to the aforementioned studies on patient education, a large case-control study of children in the Kaiser Permanente System,7 found that written action plans were associated with lower rates of hospitalization and emergency room use. However, such observational studies often include confounding factors and are not sufficient to establish a cause-effect relationship between written action plans and improved outcomes.
We report on a systematic review that attempts to isolate the independent effect of a written action plan on asthma outcomes. We address two key questions:
- Compared with medical management alone, does the addition of a written asthma action plan (with or without peak flow meter use) improve outcomes?
- Compared with a written action plan based on symptoms, does a written action plan based on peak flow monitoring improve outcomes?
Methods
This study is part of a broader evidence report on the management of chronic asthma prepared for the Agency of Health Care Research and Quality8. Complete details of the methodology are available in the full report8 (http://www.ahcpr.gov/clinic/epcix.htm).
Literature search and study selection
We performed a comprehensive literature search from 1980 to August 2000 using MEDLINE, Embase, the Cochrane Library, and a hand search of recent bibliographies. The search was limited to full-length, peer-reviewed articles with an English abstract. Two independent reviewers carried out each step of study selection and data abstraction. Disagreements were resolved by consensus of the two reviewers or, if necessary, by the decision of a third reviewer.
Initial study selection was limited to comparative full-length reports or abstracts in peer-reviewed medical journals, with at least 25 evaluable children or adults per arm, treated for at least 12 weeks. Relevant comparisons included a written action plan and no written action plan; a written action plan based on peak flow readings and a written action plan based on symptoms. Study designs varied: clinical trials, cohort comparisons, case-control analyses, cross-sectional evaluations, and before-after comparisons. Specific components of the management plan had to be described.
Relevant outcomes included measures of inpatient and outpatient utilization, lung function, symptoms, rescue medication or oral steroid use, and quality of life. Outcomes of greatest interest were utilization parameters, as the goals of self-management usually focus on improving these outcomes.
These initial selection criteria yielded many studies that were confounded by multiple asthma management interventions and thus did not isolate the comparisons of interest. Therefore, the research team collectively determined the study design features that would best isolate the effects of written action plans and used them as new criteria in a second round of study selection. The studies thus selected satisfied 4 criteria: 1) randomization of patients; (2) delivery of the same interventions to experimental and control groups, except that the experimental group also received a written action plan; (3) delivery of the same interventions to experimental and control groups, except that one group received a written action plan based on peak flow meter readings, and the comparison group received a written action plan based on symptom monitoring; and 4) inclusion of a written action plan that met our specified definition.
A written action plan, by our definition, had two components: an algorithm that identified specific clinical indicators signaling the need for adjustments in medication; and specific instructions on how to adjust medications in response to such indicators. Many publications lacked sufficient detail on the written plan, so a brief survey was sent to the primary author of each of the 36 studies. If no response was obtained (36%), the article was excluded only when it was clear from the publication that our definition was not met.
Assessment of study quality
High-quality studies were randomized controlled trials that met the 3 domains of study quality that have been demonstrated empirically to impact effect size: concealment of treatment allocation; double-blinding; and minimization of exclusion bias.9,10 However, we doubted the feasibility of double-blinding a written asthma plan intervention, and so relaxed this requirement. We considered exclusion bias to be minimized when a study either reported intent-to-treat analysis or excluded fewer than 10% of subjects from analysis, with the ratio of subjects excluded from each arm being less than 2:1.
To more fully evaluate study design issues that may be particularly important in asthma research,11,12 we constructed asthma-specific quality indicators in consultation with an expert panel. Controls for potential confounders of treatment effect included establishing reversibility of airway obstruction, controlling for other medication use, reporting compliance, and addressing seasonality. In addition, a priori reporting of power calculations and accounting for exclusions and withdrawals were judged to be study quality characteristics pertinent to this body of evidence.
Data analysis
We constructed evidence tables for the outcomes of interest, and performed a qualitative synthesis of the data. Meta-analysis was not appropriate due to wide discrepancies in the patient populations studied, the interventions employed, and measurement and reporting of outcomes.
Results
Our literature search yielded a total of 4578 citations. Of these, 36 studies met the initial selection criteria. Many of these qualifying studies, however, were confounded by multiple asthma management interventions applied inconsistently across treatment arms. For example, a common confounder was review of and change in long-term medication use in the treatment group, but not in the control group. This necessitated a refinement in our selection criteria to focus on studies that largely isolated the effect of written action plans.13-21 This step yielded a final evidence base of 9 randomized controlled trials with a total enrollment of 1501 patients.
Table 1 summarizes the characteristics, interventions, and outcomes of the 9 studies. Two studies were 3-arm trials,16,17 which raised the total number of comparisons among the 9 studies to 11. The largest study was the Grampian Asthma Study of Integrated Care (n=569),14 a community study conducted in the UK. Enrollment in the other 8 studies ranged from 43 to 64 patients per arm. Treatment duration ranged from 24 to 52 weeks.
None of the studies met our definition of high quality. In fact, no study met any of the generic quality criteria—none was blinded, none described concealment of allocation, and all excluded more than 10% of subjects. Furthermore, none reported an intention-to-treat analysis. Thus these trials were prone to withdrawal bias as well as overestimation of treatment effect due to lack of allocation concealment.
No study met the majority of asthma-specific indicators (Table 1). Of the 9 studies, only 5 met any asthma-specific indicator. Three reported prospective power calculations,13-15 but 2 of these substantially overestimated the expected effect.13,15 Two studies established reversibility;14,17 2 controlled for other medication use;13,15 and 2 reported compliance.17,21 Thus, the studies were also prone to a type II error (failing to detect a true effect) and to potential confounding of outcomes.
We performed sample power calculations for hospitalizations (Table 2), derived from baseline rates reported in 4 studies14,16-18 and standard deviations reported in 2.14,17 A study with 250 patients per arm could detect a reduction of 50% or more in hospitalization, given a control rate of at least 0.2 hospitalizations/patient/year. In actuality, GRASSIC,14 which is the largest available trial (N=569), had baseline hospitalization rates of 0.12 and 0.13. With this baseline rate, over 700 patients per arm are required, higher than the actual enrollment in GRASSIC. The other studies in this review would be adequately powered to detect a 50% difference only in the setting of even higher baseline utilization (eg, 0.30 hospitalizations/patient/year).
Table 3 displays utilization outcomes for the 11 comparisons in the 9 trials. In 5 studies (N=1019), medical management with a written action plan was compared with medical management without a written action plan.13-17 Two trials (N=185) compared a peak flow meter plus a written action plan with a peak flow meter and no written action plan18,19 In 4 studies (N=393), a written action plan based on peak flow monitoring was compared with a written action plan based on symptoms.
TABLE 1
Study characteristics
Study | Patient popultation | Study Arms | Intervention components | Outcomes reported | Asthma quality indicators met |
---|---|---|---|---|---|
Optimal medical management vs. optimal medical management + PFM action plan | |||||
Jones 199514 | Inclusions: patients using ICS | Usual care | SxD, FU | Ut, LF, Sx | Pow, Med |
Exclusions: patients on oral steroids or using peak flow meters at home | PFM action plan | AP, PF, SxD, FU | |||
Mean age: 29.5 years | |||||
Severity level: Mild–moderate | |||||
Drummond 1994 (GRASSIC)14 | Inclusion: FEV1 reversibility 20% or greater | Usual care | FU | Ut, LF, Med, Ex | Pow, Rev |
Exclusions: patients who already owned a PFM | PFM action plan | AP, PF, FU | |||
Mean age: 50.8 years | |||||
Severity level: Mild–severe | |||||
Ayres 199515 | Inclusions: maximum PEF variability, 0.15%; minimum nights/week with symptoms, 3; minimum use of ICS or sodium cromoglycate, 3 months | Usual care | SxD, FU | LF, Sx, Ex | Pow, Med |
Mean age: 45 years | PFM action plan | AP, PF, SxD, FU | |||
Severity level: Moderate–severe | |||||
Cowie 199716 | Inclusions: treatment for an exacerbation of asthma in an ER asthma clinic; history of receiving urgent treatment for asthma in the previous 12 months | Usual care | Ed, SxD, FU | Ut, PF, Med, Ex | None |
Mean age: 37.8 years | PFM action plan | AP, PF, Ed, SxD, FU | |||
Severity level: Mild–severe | |||||
Cote 199717 | Inclusions: FEV1 postbronchodilator 85-100% of predicted; PEF, at minimum, 85% of predicted; minimum PEF variability, 0%; Methacholine | Usual care | Ed | Ut, LF, Med | Exc, Rev, Com |
Exclusions: patients having previously taken an asthma educational program | PFM action plan | Ed, Cn, AP, PF | |||
Mean age: 36.5 years | |||||
Severity level: Mild | |||||
Usual care + PFM use alone vs. usual care + PFM action plan | |||||
Ignacio-Garcia 199518 | Inclusions: patients from outpatient asthma clinic with asthma for 2 years | Usual care + PFM | PF, SxD, FU | Ut, LF, Med | None |
Mean age: 41.9 years | Usual care + PFM action plan | PF, AP, Ed, SxD, FU | |||
Severity level: Mild–severe | |||||
Charlton 199419 | Inclusion: patients with inpatient or outpatient visit for asthma | Usual care + PFM | PF, Ed, SxD, FU | Ut, Sx, Med, Ex | None |
Mean age: 6.5 years | Usual care + PFM action plan | PF, AP, Ed, SxD, FU | |||
Severity level: Mild–moderate | |||||
PFM action plan vs. Symptom action plan | |||||
Turner 199820 | Inclusions: Maximum methacholine PC20, 7.9; using ICS | Symptom action plan | AP, Ed, SxD, Cn BM, EM | Ut, LF, Sx, Med | Exc, Com |
Exclusions: previous PFM use; significant comorbid conditions | PFM action plan | PF, AP, Ed, SxD, Cn BM, EM | |||
Mean age: 34.1 years | |||||
Severity level: Mild–severe | |||||
Charlton 199021 | Inclusions: patients on repeat prescribing register | Symptom action plan | AP, Ed, FU | Ut, Med | None |
Mean age: NR | PFM action plan | PF, AP, Ed, FU, Cn | |||
Severity level: Mild–severe (?) | |||||
Cowie 199716 | Inclusions: treatment for an exacerbation of asthma in an ER, or asthma clinic; history of receiving urgent treatment for asthma in the previous 12 months | Symptom action plan | AP, Ed, SxD, FU | Ut, PF, Med, Ex | None |
PFM action plan | AP, PF, Ed, SxD, FU | ||||
Cote 199717 | Inclusions: FEV1 postbronchodilator, 85-100% of predicted; PEF, at minimum, 85% of predicted; minimum PEF variability, 0%; Methacholine | Symptom action plan | Ed, AP | Ut, LF, Med | Exc, Rev, Com |
Exclusions: previous enrollment in an asthma educational program | PFM action plan | Ed, Cn, AP, PF | |||
Eligibility criteria: ICS = inhaled corticosteroid; FEV1 = forced expiratory volume in 1 second; PEF = peak expiratory flow; PFM = peak flow meter; ER = emergency room; PC20 = 20% fall in FEV1. Intervention components: PF = Peak flow meter; AP = Written action plan; Ed = Education; SxD = Symptom diary; FU = Follow-up visits; Cn = Counseling; BM = Behavior modification; EM = Environmental modification. | |||||
Outcomes: Ut = Utilization measures; LF = Lung function measurements; Sx = Symptom-based measurements; Med = Medication use; Ex = Exacerbations of asthma. Asthma quality indicators: Exc = Accounted for excluded patients; Pow = Reported power calculations; Rev = Established reversibility of airway obstruction; Med = Controlled for other medication use; Com = Reported compliance; Sea = Addressed seasonality. |
TABLE 2
Power calculations for hospitalizations per patient per year
Assumed control mean | Possible treatment mean | % decrease | N needed per study arm |
---|---|---|---|
0.10 | 0.075 | 25 | 3077 |
0.10 | 0.05 | 50 | 770 |
0.10 | 0.025 | 75 | 342 |
0.20 | 0.15 | 25 | 770 |
0.20 | 0.10 | 50 | 193 |
0.20 | 0.05 | 75 | 86 |
0.30 | 0.225 | 25 | 342 |
0.30 | 0.15 | 50 | 86 |
0.30 | 0.075 | 75 | 38 |
Studies were identified that contained baseline rates on hospitalizations/patient/year, or information that allowed calculation of this parameter (Drummond, Abdalla, Beattie et al., 1994; Cote, Cartier, Robichaud et al. 1997; Cowie, Revitt, Underwood et al., 1997; Ignacio-Garcia and Gonzalez-Santos, 1995). Baseline rates of hospitalization varied in these studies from 0.04-0.29/patient/year. Standard deviations for this outcome were available only in two studies; Cote, Cartier, Robichaud et al. (1997) reported an SD of 0.30 for this variable, and an SD of 0.35 was calculated from the confidence intervals reported in GRASSIC (Drummond, Abdalla, Beattie et al., 1994). For the calculations, the more conservative 0.35 estimate for SD was used. | |||
The number of patients per study arm was estimated for 80 percent power at the 5 percent significance level using control arm means of 0.10, 0.20, and 0.30 hospitalizations/patient/year. The expected reduction in this variable was tested along a spectrum from 25-75 percent. |
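The sample sizes in Table 2 can be approximated with a short calculation. The sketch below is illustrative only: the source does not state which formula was used, so the standard normal-approximation formula for a two-sided comparison of two means, n = 2(z_alpha/2 + z_beta)^2 SD^2 / delta^2, is an assumption, as is the function name `n_per_arm`. With the stated SD of 0.35, 80 percent power, and a 5 percent significance level, it appears to give the same per-arm numbers as the table.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(control_mean, pct_decrease, sd=0.35, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided, two-sample comparison of means.

    Assumed formula (not stated in the source):
        n = 2 * (z_{alpha/2} + z_{beta})**2 * sd**2 / delta**2
    where delta is the absolute reduction in hospitalizations/patient/year.
    """
    delta = control_mean * pct_decrease / 100.0
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Same grid of control means and percentage reductions as Table 2.
for control in (0.10, 0.20, 0.30):
    for pct in (25, 50, 75):
        print(f"{control:.2f}  {pct}%  {n_per_arm(control, pct)}")
```

Because the required sample size scales with 1/delta^2, halving the baseline rate at a fixed percentage reduction roughly quadruples the number of patients needed, which is why a trial of the size of GRASSIC falls short at baseline rates near 0.12.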
Written action plan versus no written action plan
All 5 studies used a peak flow meter-based written action plan. All reported utilization outcomes, but the types and units of measurement were not consistent across studies (Table 3). Additionally, 4 studies reported on symptoms,13-16 and 3 reported lung function outcomes.13-15
With one notable exception, there were no statistically significant differences in outcomes among groups. Cowie et al16 reported an 11-fold decrease in total emergency room visits for the group using a peak-flow action plan (5 vs 55, P = .02), and also reported a reduction in hospitalizations of a similar magnitude (2 vs 12) that did not reach statistical significance. However, this study suffers from notable flaws that diminish confidence in the results. It is a post-intervention comparison among groups, which neither compares change from baseline nor incorporates baseline values as covariates in the analysis. Moreover, baseline utilization data were provided by patient recall and not corroborated by medical records. There was substantially larger variability in the baseline utilization rates for the peak flow group compared with the control group. This suggests that a subset of very high frequency users may have been over-represented in the peak flow group, and the reduction in emergency room visits may be concentrated in this subset.
Peak-flow meter-based written action plan versus peak flow meter with no written action plan
Two studies18,19 addressed the independent effect of a written action plan when added to peak flow self-monitoring (Table 3). Charlton19 reported no significant group differences for main outcomes, while Ignacio-Garcia18 reported large and statistically significant differences in most of the outcomes, favoring the group that used the written action plan.
The Ignacio-Garcia study, however, suffers from notable flaws suggesting the results may be attributable to bias. The sole participating physician, not blinded to treatment assignment, was highly involved in all phases of patient assessment, monitoring, and treatment. There was evidence of baseline differences between the two groups. A total of 25% of patients were withdrawn after randomization, and an unexplained decline in lung function occurred in the control group. Thus, the potential for selection bias, withdrawal bias, and ascertainment bias limits confidence in the results of this study.
Symptom-based written action plan compared with peak flow-based written action plan
In 4 studies,16,17,20,21 reported outcomes were generally equivalent between groups and comparisons were not statistically significant, with one exception (Table 3). The 3-arm study by Cowie et al16 reported a striking reduction in the total number of emergency room visits with a peak flow meter-based written action plan compared with a symptom-based written action plan (5 versus 45, P
Discussion
The objective of this systematic review was to assess the independent effects of 2 specific components commonly included in asthma self-management plans—a written action plan and a peak flow meter. Few studies, however, are designed to permit reviewers to isolate the effects of these components. Moreover, the studies we reviewed did not clearly identify the population expected to benefit from interventions or specify the primary outcomes of interest; nor was the level of clinically meaningful improvement prospectively defined.
Most of the trials we reviewed, including the largest community study of 569 patients, did not demonstrate improved outcomes. The 2 trials that reported statistically significant results favoring a peak flow-based written action plan suffer from notable flaws suggesting the results may be attributable to bias. In the other 7 trials, there was little difference in outcomes between groups. However, these studies had insufficient power to detect group differences or confidently conclude equivalence between groups.
Thus, available evidence is insufficient to demonstrate that asthma outcomes are improved by use of a written asthma action plan, with or without peak flow monitoring. While this body of literature does not establish that these interventions are ineffective, it suggests they will not have a large effect on outcomes when applied to the general asthmatic population. The application of written action plans to all asthmatics indiscriminately may be a wasteful use of resources. This systematic review also questions the validity of written action plans as an indicator of asthma quality of care, or as a means to achieve quality improvement.
This analysis also highlights several obstacles to assessing the effects of disease management interventions. First, while the impact of whole intervention programs can be evaluated in controlled trials, it may be infeasible to isolate each component of such programs and subject it to a rigorous analysis. Furthermore, as a behavioral intervention, the general principle of engaging patients in self-management may be more important than the specific components of these programs. Finally, relative to the optimization of medications (most obviously initiation of inhaled steroids), the impact of written action plans is likely to be small, particularly on lung function or symptom control.
Future clinical trials should be done selectively, aimed at producing rigorous results that can improve the effectiveness of self-management interventions. Further study is warranted for specific subpopulations, such as those with higher baseline severity of illness or those with high baseline utilization rates. Available data suggest that, if there is benefit to be gained from self-management interventions, it will most likely be seen among these patients. Specific components of self-management that might be tested individually are those that are relatively high-cost, resource intensive, or risky for the patient.
Existing trials have tended to overestimate the effects of action plan-based interventions, and so have invested resources in studies whose results are inadequate for optimizing self-management strategies. Future trials need to estimate realistically the expected impact of each intervention and to specify the primary outcomes of interest and their baseline frequencies. Future trials should be large enough to detect a difference if one exists, or to confidently conclude that the intervention is ineffective.
Attention to these principles will help to advance our knowledge in this area most efficiently and to ultimately improve the quality of care for the entire population of patients with asthma.
Acknowledgments
We acknowledge Kathleen Ziegler, Pharm.D, and Claudia Bonnell, RN, MSL, for their assistance in the research and preparation of this manuscript.
1. National Heart, Lung and Blood Institute. Expert panel report 2: guidelines for the diagnosis and management of asthma. Bethesda, MD: National Institutes of Health; 1997. NIH publication 97-4051.
2. Ruffin RE, Pierce RJ. Peak flow monitoring—which asthmatics, when, and how? Aust N Z J Med 1994;24:519-20.
3. Devine EC. Meta-analysis of the effects of psychoeducational care in adults with asthma. Res Nursing Health 1996;19:367-76.
4. Bernard-Bonnin AC, Stachenko S, Bonin D, et al. Self-management teaching programs and morbidity of pediatric asthma: a meta-analysis. J Allergy Clin Immunol 1995;95(1 Pt 1):34-41.
5. Gibson PG, Coughlan J, Wilson AJ, et al. Limited (information only) patient education programs for adults with asthma. Cochrane Database Syst Rev 2000a;(2):CD001005.
6. Gibson PG, Coughlan J, Wilson AJ, et al. Self-management education and regular practitioner review for adults with asthma. Cochrane Database Syst Rev 2000b;(2):CD001117.
7. Lieu TA, Quesenberry CP, Jr, Capra AM, et al. Outpatient management practices associated with reduced risk of pediatric asthma hospitalization and emergency department visits. Pediatrics 1997;100(3 Pt 1):334-41.
8. Lefevre F, Piper M, Mark D, et al. Management of Chronic Asthma. AHRQ evidence report, contract number 290-97-001-5, 2001, http://www.ahcpr.gov/clinic/epcix.htm.
9. Mulrow CD, Oxman AD, editors. Cochrane Collaboration Handbook. Available in the Cochrane Library [database on disk and CD-ROM]. The Cochrane Collaboration; Issue 1. Oxford: Update Software; 1997.
10. Schulz KF, Chalmers I, Hayes RJ, et al. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273(5):408-12.
11. Berlin JA, Rennie D. Measuring the quality of trials: the quality of the quality scales. JAMA 1999;282(11):1083-5.
12. Juni P, Witschi A, Bloch R, et al. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA 1999;282(11):1054-60.
13. Jones KP, Mullee MA, Middleton M, et al. Peak flow based asthma self-management: a randomised controlled study in general practice. British Thoracic Society Research Committee. Thorax 1995;50(8):851-7.
14. Drummond N, Abdalla M, Beattie JAG, et al. Effectiveness of routine self monitoring of peak flow in patients with asthma. Grampian Asthma Study of Integrated Care (GRASSIC). BMJ 1994 Feb 26;308(6928):564-7.
15. Ayres JG, Campbell LM. A controlled assessment of an asthma self-management plan involving a budesonide dose regimen. OPTIONS Research Group. Eur Respir J 1996;886-92.
16. Cowie RL, Revitt SG, Underwood MF, et al. The effect of a peak flow-based action plan in the prevention of exacerbations of asthma. Chest 1997;112(6):1534-8.
17. Cote J, Cartier A, Robichaud P, et al. Influence on asthma morbidity of asthma education programs based on self-management plans following treatment optimization. Am J Respir Crit Care Med 1997;155(5):1509-14.
18. Ignacio-Garcia JM, Gonzalez-Santos P. Asthma self-management education program by home monitoring of peak expiratory flow. Am J Respir Crit Care Med 1995;151(2 Pt 1):353-9.
19. Charlton I, Antoniou AG, Atkinson J, et al. Asthma at the interface: bridging the gap between general practice and a district general hospital. Arch Dis Child 1994;70(4):313-8.
20. Turner MO, Taylor D, Bennett R, et al. A randomized trial comparing peak expiratory flow and symptom self-management plans for patients with asthma attending a primary care clinic. Am J Respir Crit Care Med 1998;157(2):540-6.
21. Charlton I, Charlton G, Broomfield J, et al. Evaluation of peak flow and symptoms only self-management plans for control of asthma in general practice. BMJ 1990;301(6765):1355-9.
Relationships between physician practice style, patient satisfaction, and attributes of primary care
- Different physician-patient interaction styles are actively used in community practice.
- A person-focused style is being used by almost half of the physicians observed, and this style is associated with greater patient-reported quality of primary care and greater patient satisfaction.
- This study provides further evidence to support the widespread implementation of this approach to the physician-patient interaction.
Over the past half century, changing medical technology, law, education, ethics, and research have influenced the current shape of physician-patient interactions.9 In 1956, the traditional model of Activity-Passivity (physician does something to the patient) was challenged with the revolutionary concept of active patient participation.10 The models of Guidance and Cooperation (physician tells patient what to do, patient cooperates) and Mutual Participation (physician enables patient to help him/herself, patient is a partner) were proposed10 and are reflected in modern theoretically-based interaction models. Numerous models have been proposed as variants of the Guidance/Cooperation model (eg, paternalistic model,11 priestly model,12 contractual model13) and the Mutual Participation model (eg, ethnographic model,14 consumerist model,11,15 family systems model16). Few of these models, though, have been empirically evaluated. The best-developed and most-studied mutual participation model is the patient-centered method.5,17-20
When data have been collected using quantitative or qualitative approaches, significant strides have been made in understanding physician-patient interaction3,21-23 and the effect of such interactions on patient outcomes,5,24,25 primarily patient satisfaction.1,26-29 However, many studies have been limited by a focus on a narrow aspect of physician-patient communication, by small numbers of physicians or patients, and by the use of medical students, residents, and hospital faculty as study subjects.
The purpose of this study was not to develop a new model of physician-patient interaction. Rather, variables characterizing physician style, grounded in the direct observation of thousands of encounters for 138 community practicing family physicians, were used to empirically cluster physicians into groups that represent distinct interaction styles. Because interaction style may be manifested in all phases of a patient encounter, we used as a guiding framework the 3 primary functions of an interview:30,31 gathering information, enhancing a healing relationship, and making and implementing decisions. The importance of each of these functions varies depending on the nature of the encounter, but our overall approach provides a practical way of conceptualizing physician-patient interaction style. The associations of the empirically derived and theoretically based physician styles with 3 outcomes are tested: 1) patient report of delivery of attributes of primary care measured using the Components of Primary Care Instrument (CPCI), 2) patient satisfaction with the visit, and 3) the duration of the visit.
Methods
This study was part of the larger Direct Observation of Primary Care (DOPC) study, a cross-sectional observational study that examined the content of 4454 outpatient visits to family physicians in northeast Ohio. Details of the methods of the DOPC study have been described extensively elsewhere.32,34 Briefly, 4 teams of 2 research nurses directly observed consecutive patient visits to 138 participating physicians in 84 practices between October 1994 and August 1995. The research nurses collected data on the content and context of consecutive office visits using the following methods: direct observation of the patient visit, patient exit questionnaire, medical record review, and collection of ethnographic field notes.33,34
Measures
Patients’ perception of the delivery of 5 attributes of primary care was measured by the Components of Primary Care Instrument (CPCI). Interpersonal communication was an evaluation of the ease of exchange of information between patient and physician. The physician’s accumulated knowledge about the patient refers to the physician’s understanding of the patient’s medical history, health care needs, and values. Coordination of care refers to the information received from referrals to specialists and previous health care visits, and its incorporation into the current and future care of the patient. Preference to see usual physician refers to the degree to which patients believed and valued that they could go to their regular physician for almost all problems. Scale scores demonstrate good internal consistency reliability (Cronbach’s alpha: .68–.79).35 Continuity of care is measured by the Usual Provider Continuity index (UPC), which is the proportion of visits to the patient’s regular doctor in the past year out of the total number of physician visits in the past year.
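As a minimal sketch of the UPC definition just given, the index is a simple ratio of two visit counts. The function name and the example counts are hypothetical and are not taken from the study data.

```python
def usual_provider_continuity(visits_to_regular, total_visits):
    """UPC = visits to the regular doctor / all physician visits (past year)."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    if not 0 <= visits_to_regular <= total_visits:
        raise ValueError("visits_to_regular must be between 0 and total_visits")
    return visits_to_regular / total_visits


# Hypothetical patient: 4 of 6 physician visits in the past year
# were to the regular doctor.
upc = usual_provider_continuity(4, 6)
print(round(upc, 2))
```

The index ranges from 0 (no visits to the regular doctor) to 1 (all visits to the regular doctor) and is undefined for a patient with no physician visits in the period, which is why the sketch rejects a zero denominator.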
Patient satisfaction was measured using the 4 physician-specific items from the MOS 9 Item Visit Rating Form36 (Cronbach’s alpha = .89).33 Also included on the patient survey was a single item assessing the degree to which patients’ expectations with the visit were met. Duration of the visit was the total face-to-face time the physician spent with the patient and was measured by direct observation.
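For readers unfamiliar with the reliability statistic quoted above, Cronbach's alpha for a k-item scale is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the summed score). The sketch below computes it for a made-up set of responses; the scores and the function name are hypothetical and do not come from the CPCI or the MOS form.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of per-item score lists, one entry per respondent,
           all lists the same length.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least 2 items")
    # Summed scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses from 5 patients to a 4-item scale (1-5 ratings).
responses = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
    [4, 5, 3, 4, 1],
]
print(round(cronbach_alpha(responses), 2))
```

Alpha approaches 1 when the items rise and fall together across respondents and drops toward 0 when they vary independently.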
Each physician’s interaction style was determined through a 2-step process. In the first step, ethnographic field notes were used to gather information that helps define core features of physician style. The field notes from 4 days of observation of 138 family physicians in 84 practices were transcribed and imported into FolioVIEWS37 for data management and coding. Analysis was conducted with an immersion-crystallization approach38 involving repetitive reading and summarization of the text data. Case summaries were constructed from a sample of practices selected to maximize variation among practice characteristics such as size, physician sex, and practice location. The case summaries were independently reviewed, and important features were identified. These features were cross-checked against the original data. This process, and the resulting 30 features, are described in detail elsewhere.32
Six of the features that emerged from the qualitative analyses pertain to physician style and are listed in Table 1. Each of the 3 primary interview functions30 is represented by at least 1 feature, ensuring good coverage of the core aspects of the interaction. Gathering information is shaped by physician orientation and the clinical information allowed or elicited in the visits. Enhancing healing relationships is realized in part through affective connection with patients. The final function, making and implementing decisions, is influenced by the level of control or shared power with patients, the physician’s openness to patients’ agendas, and the physician’s willingness to negotiate options with patients.
The second step involved a cluster analysis of the 6 variables. First a hierarchical approach was used to estimate the number of clusters. Then a non-hierarchical clustering approach was used to determine physician classification among the clusters and the features that distinguish the clusters.39 Analysis of variance was used to confirm that variables included in the cluster analysis significantly differed between at least 2 of the identified clusters, and thus were contributing to defining interaction style.
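The non-hierarchical step can be illustrated with a small example. The source does not name the algorithm, so k-means (Lloyd's algorithm) is used here purely as an assumption, as is coding each of the 6 style features as 0 or 1. The physician profiles are hypothetical, the hierarchical step that estimates the number of clusters is omitted, and 2 clusters are used only to keep the example small.

```python
def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids).

    Assumption: squared Euclidean distance on 0/1-coded features.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda j: sum(
                    (p - c) ** 2 for p, c in zip(point, centroids[j])
                ),
            )
            for point in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical physicians, one row each; columns are the 6 style features
# (1 = patient focused, biopsychosocial, personable, open to agenda,
#  shares control, negotiates options; 0 = the opposite pole).
physicians = [
    (1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 1, 0),
    (0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 1),
    (1, 1, 1, 0, 1, 1),
    (0, 1, 0, 0, 0, 0),
]
labels, centroids = kmeans(physicians, [physicians[0], physicians[2]])
print(labels)
```

Each centroid can be read as the proportion of physicians in that cluster showing each feature, which is one way the features that distinguish the clusters could be inspected.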
TABLE 1
Physician style variables
Physician orientation: |
Problem focused—physician focuses on the patient’s presenting complaint |
Patient-focused—physician is open to a broader health care agenda with the patient and explores other possible issues |
Scope of clinical information: |
Biomedical—talk focuses on the biological information, diagnoses and treatments |
Biopsychosocial—explores both the biological and social and psychological issues |
Affective connection with patients: |
Physician personable and friendly, connects with person on a personal level |
Physician not personable and friendly, maintains professional distance |
Openness to patient agenda: |
Physician open to patient’s agenda |
Physician sets and maintains the agenda |
Sharing of control in interaction: |
Physician shares control of the interaction |
Physician controls the interaction |
Negotiation of options with patient: |
Physician negotiates options with patients |
Physician does not negotiate options with patients |
Analyses
The association of physician and patient characteristics with interaction style was assessed by chi square for categorical variables and by analysis of variance for continuous variables. The associations of physician style with each of the 5 attributes of primary care measured by the CPCI, the indicators of patient satisfaction, and duration of the visit were tested using multilevel modeling,40 to account for the hierarchical nature of the data (ie, patients nested within physicians).
Results
Of the 4994 patients presenting for care by their family physicians, 4454 (89%) agreed to participate in the DOPC study. Physicians participating in the DOPC study were similar in age to national samples of family physicians, but over-represented female and residency-trained physicians.34 Patient age, sex, and race were similar to the population of patients seeing family physicians and general practitioners nationally as reported in the National Ambulatory Medical Care Survey.34 Patient questionnaires were returned by 3283 (74%) of the patients. Of those respondents, 2881 satisfactorily completed the CPCI, representing 88% of those returning a patient questionnaire and 65% of the total sample. The patients who completed the CPCI were more likely to be white, have private health care insurance, and be somewhat older than patients who did not complete the CPCI.35
The cluster analysis identified 4 distinct groups of physicians. Each of the 138 physicians was classified into 1 group. Each of the 6 variables in the analysis contributed to defining the 4 groups by significantly (P
Forty-nine percent of physicians were classified as person focused. These physicians were more focused on the person than the disease, were perceived as personable and friendly, were open to the patient’s agenda, and frequently negotiated options with the patient. Physicians classified as biopsychosocial (16%) were more focused on the patient’s disease, but elicited psychosocial clinical information. Physicians classified as biomedical (20%) were also more focused on the patient’s disease and were unlikely to elicit psychosocial information. These physicians also demonstrated a low level of friendliness and were unlikely to negotiate options with the patient. The high physician control group’s major characteristics were domination of the encounter and disregard of the patient’s agenda (14%).
Association of physician characteristics with the interaction styles is presented in Table 2. The percent of male and female physicians differed greatly among the 4 style groups. The proportion of female physicians in the person-focused group was almost 4 times that of the biopsychosocial group and the high physician control group (P
As reported in Table 3, physician style is significantly associated with 3 of the 5 patient reports of the attributes of primary care. Physicians classified as having a person-focused approach have the highest mean score of communication; the other 3 styles score lower, with the high-physician-control style scoring the lowest. Person-focused and biopsychosocial physicians scored highest on patient reports of accumulated knowledge; those in the biomedical group scored the lowest. Coordination of care was highest among the person-focused group and lowest among the high-control group. Across the different types of physician style, there was no difference in patient report of preference for his or her regular physician or the measure of continuity of care.
The associations of physician style with 2 indicators of patient satisfaction are displayed in Table 4. The highest group mean of patient satisfaction is for the person-focused style, and the lowest is for the high-physician-control group. The indicator of the degree to which patient expectations were met also follows this pattern. Also displayed in Table 4, the person-focused style demonstrated the longest average duration of visit, at 11.5 minutes; the high-physician-control group visits were the shortest in duration, at about 9.5 minutes.
TABLE 2
Physician and patient characteristics associated with interaction style
Characteristic | Total | Biopsychosocial | Biomedical | Person focused | High physician control | P |
---|---|---|---|---|---|---|
Physician | ||||||
Number | 138 | 22 | 28 | 68 | 20 | |
Age (mean years) | 43 | 45 | 43 | 42 | 46 | .06 |
Female | 26% | 9% | 21% | 38% | 10% | |
Residency trained | 90% | 86% | 86% | 94% | 85% | .44 |
Patient | ||||||
Number | 2881 | 504 | 578 | 1258 | 541 | |
Age (mean years) | 42 | 44 | 41 | 42 | 43 | .11 |
Female | 62% | 57% | 61% | 65% | 58% | |
TABLE 3
Association of physician style with attributes of primary care1
Attribute of primary care | Biopsychosocial | Biomedical | Person focused | High physician control | P |
---|---|---|---|---|---|
Communication | 4.27 | 4.26 | 4.43 | 4.21 | |
Accumulated knowledge | 3.54 | 3.33 | 3.56 | 3.51 | |
Coordination of care | 3.85 | 3.78 | 3.99 | 3.74 | |
Preference for regular doctor | 4.46 | 4.45 | 4.46 | 4.39 | ns |
Usual provider continuity2 | 0.67 | 0.66 | 0.64 | 0.65 | ns |
1Each row represents a separate multilevel regression model wherein each attribute of primary care is the outcome variable and the number in each column is the group mean of that attribute, adjusted for patient and physician age and sex, as well as the effect of the patients being nested within physicians.
2Usual provider continuity = total number of visits to regular physician in past year, divided by the total number of physician visits in the past year.
TABLE 4
Association of physician interaction style with patient satisfaction and duration of visit1
Outcome measures | Biopsychosocial | Biomedical | Person focused | High physician control | P |
---|---|---|---|---|---|
Patient satisfaction with physician | 4.38 | 4.39 | 4.49 | 4.30 | .002 |
Patient expectations met | 4.36 | 4.33 | 4.45 | 4.31 | .02 |
Length of visit (mean minutes) | 9.97 | 10.02 | 11.56 | 9.51 | .005 |
1Results from multilevel regression models; analyses include patient and physician age and gender as covariates and control for the nested nature of the data.
Discussion
These data indicate that a person-focused approach is actively used in community practice, and is the style most congruent with patient-reported quality of primary care and satisfaction with care. Our data, in concert with data reported by others,5,24 indicate strong support for the feasibility and value of the person-focused model. We found that, of the 4 distinct interaction styles, physicians with the person-focused style scored highest across all measures of the attributes of primary care and on the indicators of patient satisfaction, with the exception of continuity of care. In contrast, physicians with the high-control style were generally lowest on the primary care and satisfaction indicators.
It is important to emphasize that, even though the vast majority of patients in this sample are likely to have self-selected their primary care physician, patient rating of some attributes of primary care differed across the 4 physician styles. Patients of physicians with different styles equally valued seeing their regular physician, as reported by the preference-for-their-regular-doctor score; they exhibited similar proportions of continuity visits in the past year; and their satisfaction scores were all generally high. Patients appear to want to see their regular physician, regardless of interaction approach, even though some approaches—particularly the high-physician-control style—were rated poorer for communication, coordination of care, and accumulated knowledge.
There may be several explanations as to why a particular physician style is associated with specific patient reports of communication, accumulated knowledge, and coordination of care. Openness to the patient’s agenda and willingness to negotiate options—as was characteristic of the person-focused physicians—may facilitate good communication and convey an understanding of patient preferences and values regarding health. It is interesting to note that different groups scored lowest on some of the attributes of primary care. The high-physician-control group was the lowest on interpersonal communication and coordination of care. High-control physicians were more likely to dominate the agenda and the verbal exchanges. Patients may have felt they could not ask questions or that the physician did not listen to what they tried to say. The biomedical group of physicians were given the lowest scores by patients on accumulated knowledge, suggesting that patients thought these physicians were less likely to know their preferences and values regarding health care, know less about them as persons, and know less about their family and medical histories.
As others have proposed, we concur that interaction style is not a dichotomy or even a continuum of patient versus physician control, but is multidimensional, cutting across the main functions of the patient encounter (ie, information gathering, relationship building, and making and implementing decisions). These data provide some confirmation for the original scheme proposed by Szasz and Hollender,10 with the Mutual Participation model most represented by the person-focused approach and the Activity-Passivity model most represented by the high-physician-control group. The biopsychosocial and biomedical approaches represent different versions of the Guidance and Cooperation model.
The 4 types of physician style empirically derived from our data are similar to communication pattern types found by Roter et al,27 in a study with similar aims but different methods. Of the 5 types reported, narrowly biomedical and expanded biomedical accounted for 65% of visits, and biopsychosocial accounted for 20%. Psychosocial and consumerist (distinguished by a high degree of patient questions) accounted for only 8% each. It is interesting that in our data, we found the person-focused style was by far the most common approach (49%) among this group of family physicians. These differences in use of particular interaction styles may have several explanations. First, these data were collected more recently.27 Thus our data may reflect trends in a movement away from a paternalistic style and toward an increased patient participatory style. Second, our sample consisted entirely of family physicians practicing in the community, where the model of person-focused care may have a longer history of support and endorsement or be of greater importance to community family physicians, whose emphasis is on a breadth of care based on patient needs.6,7,18
Physicians with a person-focused style granted the longest visits, while high-control physicians granted the shortest, a difference of more than 2 minutes per visit on average. The associations were not explained away by accounting for patient or physician characteristics, suggesting that a person-focused style may require more time. However, others have found that physicians engaging in a patient participatory style had office visits similar in duration to those found with other approaches,23,27 although the average duration of visit in both of these studies was considerably longer than that of the office visits in our sample.
This study has several strengths. The use of community-practicing physicians in real-world conditions, for whom visits were similar in content to the visits reported by NAMCS34, adds to the generalizability of the findings. We have used an integration of qualitative and quantitative approaches to empirically derive categories of physician interaction style. Our data are based on nurse observation of an average of 32 encounters per physician and are documented in rich and comprehensive qualitative field notes. Finally, by using multilevel modeling, we have reported an honest estimate of the association of physician style and patient report of primary care by appropriately modeling the nested data structure.
The findings must be interpreted in light of potential study limitations. First, the patients who did not complete the patient questionnaire differ somewhat demographically from those who did complete it. However, noncompletion of the questionnaire was not associated with physician style; therefore, it is unlikely that the associations would have changed had these individuals been included. Second, because the study was cross-sectional, we cannot control for patient self-selection of physicians. Nonetheless, since patients dissatisfied with the quality of care are likely to seek another physician, we would expect patient self-selection of physicians to bias the study toward the null, thus making our results even more remarkable.
These findings, in combination with the literature on the person-focused,24 patient-centered5,17,19,20,41 and relationship-centered approaches,42 provide strong evidence to support the widespread implementation of this physician-patient interaction approach. Further investigation in community practice may lead to identification of ways to support and encourage person-focused care and the time needed to provide such care.
Acknowledgments
The authors are indebted to the physicians, office staff members, and patients without whose participation this study would not have been possible. This paper was improved by helpful suggestions on an earlier draft by Kurt C. Stange, MD, PhD. This study was supported by a grant from the National Cancer Institute (1R01 CA60862) and in part by the Center for Research in Family Practice and Primary Care and the American Academy of Family Practice.
References
1. Bertakis KD, Roter D, Putnam SM. The relationship of physician medical interview style to patient satisfaction. J Fam Pract. 1991;32:175-181.
2. Bertakis KD, Callahan EJ, Helms LJ, Azari R, Robbins JA, Miller J. Physician practice styles and patient outcomes. Med Care. 1998;36:879-891.
3. Stewart MA. What is a successful doctor-patient interview? A study of interactions and outcomes. Soc Sci Med. 1984;19:167-175.
4. Levinson W, Roter DL, Mullooly JP, Dull VT, Frankel RM. Physician-patient communication: the relationship with malpractice claims among primary care physicians and surgeons. JAMA. 1997;277:553-559.
5. Stewart M, Brown JB, Donner A, McWhinney IR, Oates J, Weston WW, Jordan J. The impact of patient-centered care on outcomes. J Fam Pract. 2000;49:796-804.
6. McWhinney IR. Through clinical method to a more humane medicine. In: White KL, ed. The task of medicine. Menlo Park, CA: The Henry J. Kaiser Family Foundation; 1988.
7. Stange KC, Jaén CR, Flocke SA, Miller WL, Crabtree BF, Zyzanski SJ. The value of a family physician. J Fam Pract. 1998;46:363-368.
8. Institute of Medicine. Primary Care: America’s Health in a New Era. Donaldson MS, Yordy KD, Lohr KN, Vanselow NA, eds. Washington, DC: National Academy Press; 1996.
9. Laine C, Davidoff F. Patient-centered medicine: A professional evolution. JAMA. 1996;275:152-156.
10. Szasz TS, Hollender MH. The basic models of the doctor-patient relationship. Arch Int Med. 1956;97:585-592.
11. Emanuel EJ, Emanuel LL. Four models of the physician-patient relationship. JAMA. 1992;267:2221-2226.
12. Veatch RM. Models for ethical medicine in a revolutionary age. What physician-patient roles foster the most ethical relationship? Hastings Center Report. 1972;2:5-7.
13. Quill TE. Partnerships in patient care: a contractual approach. Ann Int Med. 1983;98:228-234.
14. Kleinman AM, Eisenberg L, Good B. Culture, illness, and care: Clinical lessons from anthropologic and cross-cultural research. Ann Int Med. 1978;88:251-258.
15. Lazare A, Eisenthal S, Wasserman L. The customer approach to patienthood: Attending to patient requests in a walk-in clinic. Archives of General Psychiatry. 1975;32:553-558.
16. McDaniel S, Campbell T, Seaburn D. Family-oriented primary care: a manual for medical providers. Berlin: Springer-Verlag; 1990.
17. Stewart M, Weston WW, Brown JB, McWhinney IR, McWilliam CL, Freeman TR. Patient-centered medicine: Transforming the clinical method. Thousand Oaks, CA: Sage Publications; 1995.
18. Levenstein JH, McCracken EC, McWhinney IR, Stewart MA, Brown JB. The patient-centred clinical method. 1. A model for the doctor-patient interaction in family medicine. Fam Pract. 1986;3:24-30.
19. Epstein RM. The science of patient-centered care. J Fam Pract. 2000;49:805-807.
20. Stewart M, Roter D. Communicating With Medical Patients. Knapp ML, ed. Second printing (1990). Sage Publications; 1989.
21. Hall JA, Roter DL, Katz NR. Meta-analysis of correlates of provider behavior in medical encounters. Med Care. 1988;26:657-675.
22. Byrne PS, Long BEL. Doctors talking to patients. London: H.M.S.O.; 1976.
23. Marvel MK, Doherty WJ, Weiner E. Medical interviewing by exemplary family physicians. J Fam Pract. 1998;47:343-348.
24. Roter D. The enduring and evolving nature of the patient-physician relationship. Patient Educ Couns. 2000;39:5-15.
25. Kaplan SH, Greenfield S, Ware JE. Assessing the effects of physician-patient interactions on the outcomes of chronic disease. Med Care. 1989;27:S110-S127.
26. Buller MK, Buller DB. Physicians’ communication style and patient satisfaction. J Health Soc Behav. 1987;28:375-388.
27. Roter DL, Stewart M, Putnam SM, Lipkin M, Stiles W, Inui TS. Communication patterns of primary care physicians. JAMA. 1997;277:350-356.
28. Williams S, Weinman J, Dale J. Doctor-patient communication and patient satisfaction: A review. Fam Pract. 1998;15:480-492.
29. Greene MG, Adelman RD, Friedman E, Charon R. Older patient satisfaction with communication during an initial medical encounter. Soc Sci Med. 1994;38:1279-1288.
30. Cohen-Cole S. The medical interview: The three-function approach. St. Louis: Mosby Year Book; 1991.
31. Lazare A, Putnam SM, Lipkin M. Three functions of the medical interview. In: Lipkin M, Putnam S, Lazare A, eds. The medical interview: Clinical care, education and research. New York: Springer; 1995;3-19.
32. Crabtree BF, Miller WL, Aita V, Flocke SA, Stange KC. Primary care practice organization: A qualitative analysis. J Fam Pract. 1998;46:403-409.
33. Stange KC, Zyzanski SJ, Jaén CR, Callahan EJ, Kelly RB, Gillanders WR, Shank JC, Chao J, Medalie JH, Miller WL, Crabtree BF, Flocke SA, Gilchrist VJ, Langa DM, Goodwin MA. Illuminating the black box: a description of 4454 patient visits to 138 family physicians. J Fam Pract. 1998;46:377-389.
34. Stange KC, Zyzanski SJ, Smith TF, Kelly R, Langa DM, Flocke SA, Jaén CR. How valid are medical records and patient questionnaires for physician profiling and health services research? A comparison with direct observation of patient visits. Med Care. 1998;36:851-867.
35. Flocke SA. Measuring attributes of primary care: Development of a new instrument. J Fam Pract. 1997;45:64-74.
36. Rubin H, Gandek B, Rogers WH, Kosinski M, McHorney C, Ware J. Patients’ ratings of outpatient visits in different practice settings. JAMA. 1993;270:835-840.
37. FolioVIEWS. 3.1 ed. Provo, Utah: Folio Corporation; 1998.
38. Crabtree BF, Miller WL. Doing Qualitative Research. Newbury Park, California: Sage Publications; 1992.
39. Aldenderfer MS, Blashfield RK. Cluster Analysis. Lewis-Beck MS, ed. Newbury Park: Sage; 1984.
40. Bryk AS, Raudenbush SW. Hierarchical linear models: applications and data analysis methods. Newbury Park: Sage Publications; 1992.
41. Stewart M, Brown JB, Boon H, Galajda J, Meredith L, Sangster M. Evidence on patient-doctor communication. Cancer Prevention and Control. 1999;3:25-30.
42. Carol P. Tresolini and the Pew-Fetzer Task Force on Advancing Psychosocial Health Education. Health profession education and relationship-centered care. San Francisco, CA: Pew Health Professions Commission; 1994.
- Different physician-patient interaction styles are actively used in community practice.
- A person-focused style is being used by almost half of the physicians observed, and this style is associated with greater patient-reported quality of primary care and greater patient satisfaction.
- This study provides further evidence to support the widespread implementation of this approach to the physician-patient interaction.
Over the past half century, changing medical technology, law, education, ethics, and research have influenced the current shape of physician-patient interactions.9 In 1956, the traditional model of Activity-Passivity (physician does something to the patient) was challenged with the revolutionary concept of active patient participation.10 The models of Guidance and Cooperation (physician tells patient what to do, patient cooperates) and Mutual Participation (physician enables patient to help him/herself, patient is a partner) were proposed10 and are reflected in modern theoretically-based interaction models. Numerous models have been proposed as variants of the Guidance/Cooperation model (eg, paternalistic model,11 priestly model,12 contractual model13) and the Mutual Participation model (eg, ethnographic model,14 consumerist model,11,15 family systems model16). Few of these models, though, have been empirically evaluated. The best-developed and most-studied mutual participation model is the patient-centered method.5,17-20
When data have been collected using quantitative or qualitative approaches, significant strides have been made in understanding physician-patient interaction3,21-23 and the effect of such interactions on patient outcomes,5,24,25 primarily patient satisfaction.1,26-29 However, many studies have been limited by their focus on a narrow aspect of physician-patient communication, their small numbers of physicians or patients, and their use of medical students, residents, and hospital faculty as study subjects.
The purpose of this study was not to develop a new model of physician-patient interaction. Rather, variables characterizing physician style, grounded in the direct observation of thousands of encounters for 138 community-practicing family physicians, were used to empirically cluster physicians into groups that represent distinct interaction styles. Because interaction style may be manifested in all phases of a patient encounter, we used as a guiding framework the 3 primary functions of an interview:30,31 gathering information, enhancing a healing relationship, and making and implementing decisions. The importance of each of these functions varies depending on the nature of the encounter, but our overall approach provides a practical way of conceptualizing physician-patient interaction style. The associations of the empirically derived and theoretically based physician styles with 3 outcomes are tested: 1) patient report of delivery of attributes of primary care measured using the Components of Primary Care Instrument (CPCI), 2) patient satisfaction with the visit, and 3) the duration of the visit.
Methods
This study was part of the larger Direct Observation of Primary Care (DOPC) study, a cross-sectional observational study that examined the content of 4454 outpatient visits to family physicians in northeast Ohio. Details of the methods of the DOPC study have been described extensively elsewhere.32,34 Briefly, 4 teams of 2 research nurses directly observed consecutive patient visits to 138 participating physicians in 84 practices between October 1994 and August 1995. The research nurses collected data on the content and context of consecutive office visits using the following methods: direct observation of the patient visit, patient exit questionnaire, medical record review, and collection of ethnographic field notes.33,34
Measures
Patients’ perception of the delivery of 5 attributes of primary care was measured by the Components of Primary Care Instrument (CPCI). Interpersonal communication was an evaluation of the ease of exchange of information between patient and physician. The physician’s accumulated knowledge about the patient refers to the physician’s understanding of the patient’s medical history, health care needs, and values. Coordination of care refers to the information received from referrals to specialists and previous health care visits, and its incorporation into the current and future care of the patient. Preference to see usual physician refers to the degree to which patients believed and valued that they could go to their regular physician for almost all problems. Scale scores demonstrate good internal consistency reliability (Cronbach’s alpha: .68–.79).35 Continuity of care is measured by the Usual Provider Continuity index (UPC), which is the proportion of visits to the patient’s regular doctor in the past year out of the total number of physician visits in the past year.
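The UPC definition above reduces to a single ratio. The following is a minimal sketch of that calculation; the function name and the example visit counts are illustrative assumptions and are not taken from the study data.

```python
def usual_provider_continuity(visits_to_regular: int, total_visits: int) -> float:
    """Proportion of past-year physician visits that were to the regular physician."""
    if total_visits <= 0:
        raise ValueError("UPC is undefined for a patient with no physician visits")
    if not 0 <= visits_to_regular <= total_visits:
        raise ValueError("visits to the regular physician must lie between 0 and total visits")
    return visits_to_regular / total_visits


# Hypothetical patient: 6 of 8 physician visits in the past year were to the regular physician.
example_upc = usual_provider_continuity(6, 8)
print(f"UPC = {example_upc:.2f}")
```

A value near 1 indicates that almost all visits were to the regular physician; a value near 0 indicates that care was spread across other physicians.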
Patient satisfaction was measured using the 4 physician-specific items from the MOS 9 Item Visit Rating Form36 (Cronbach’s alpha = .89).33 Also included on the patient survey was a single item assessing the degree to which patients’ expectations with the visit were met. Duration of the visit was the total face-to-face time the physician spent with the patient and was measured by direct observation.
Each physician’s interaction style was determined through a 2-step process. In the first step, ethnographic field notes were used to gather information that helps define core features of physician style. The field notes from 4 days of observation of 138 family physicians in 84 practices were transcribed and imported into FolioVIEWS37 for data management and coding. Analysis was conducted with an immersion-crystallization approach38 involving repetitive reading and summarization of the text data. Case summaries were constructed from a sample of practices selected to maximize variation among practice characteristics such as size, physician sex, and practice location. The case summaries were independently reviewed, and important features were identified. These features were cross-checked against the original data. This process, and the resulting 30 features, are described in detail elsewhere.32
Six of the features that emerged from the qualitative analyses pertain to physician style and are listed in Table 1. Each of the 3 primary interview functions30 is represented by at least 1 feature, ensuring good coverage of the core aspects of the interaction. Gathering information is shaped by physician orientation and the clinical information allowed or elicited in the visits. Enhancing healing relationships is realized in part through affective connection with patients. The final function, making and implementing decisions, is influenced by the level of control or shared power with patients, the physician’s openness to patients’ agendas, and the physician’s willingness to negotiate options with patients.
The second step involved a cluster analysis of the 6 variables. First a hierarchical approach was used to estimate the number of clusters. Then a non-hierarchical clustering approach was used to determine physician classification among the clusters and the features that distinguish the clusters.39 Analysis of variance was used to confirm that variables included in the cluster analysis significantly differed between at least 2 of the identified clusters, and thus were contributing to defining interaction style.
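To illustrate the non-hierarchical step, the sketch below runs a plain k-means assignment on a toy data set. Several things here are assumptions rather than details reported in the text: the source does not name the specific non-hierarchical algorithm (k-means is used only as a common example), the 0/1 coding of the 6 style variables is hypothetical, the 6 physicians are invented, and the number of clusters is fixed at 2 for brevity, whereas in the study it was estimated from the hierarchical step.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points: List[Point]) -> List[float]:
    """Column-wise mean of a non-empty list of feature vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def k_means(points: List[Point], start: List[Point],
            max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Assign each point to its nearest centroid until assignments stop changing."""
    centroids = [list(c) for c in start]
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no physician changed cluster
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = centroid(members)
    return labels, centroids


# Hypothetical coding: 1 = patient-focused, biopsychosocial, friendly, open to the
# patient's agenda, shares control, negotiates options; 0 = the opposite pole.
physicians = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0],
]
labels, centroids = k_means(physicians, start=[physicians[0], physicians[3]])
print(labels)
```

In this toy example the first 3 physicians fall into one cluster and the last 3 into the other. The resulting cluster centroids show which variables distinguish the groups, which is the role the analysis of variance plays in the paragraph above.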
TABLE 1
Physician style variables
Physician orientation: |
Problem focused—physician focuses on the patient’s presenting complaint |
Patient-focused—physician is open to a broader health care agenda with the patient and explores other possible issues |
Scope of clinical information: |
Biomedical—talk focuses on the biological information, diagnoses and treatments |
Biopsychosocial—explores both the biological and social and psychological issues |
Affective connection with patients: |
Physician personable and friendly, connects with person on a personal level |
Physician not personable and friendly, maintains professional distance |
Openness to patient agenda: |
Physician open to patient’s agenda |
Physician sets and maintains the agenda |
Sharing of control in interaction: |
Physician shares control of the interaction |
Physician controls the interaction |
Negotiation of options with patient: |
Physician negotiates options with patients |
Physician does not negotiate options with patients |
Analyses
The association of physician and patient characteristics with interaction style was assessed by chi-square for categorical variables and by analysis of variance for continuous variables. The associations of physician style with each of the 5 attributes of primary care measured by the CPCI, the indicators of patient satisfaction, and duration of the visit were tested using multilevel modeling40 to account for the hierarchical nature of the data (ie, patients nested within physicians).
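As a sketch of the chi-square comparison for a categorical variable, the example below computes the Pearson chi-square statistic for physician sex across the 4 style groups. The counts are an assumption: they are back-calculated from the group sizes and rounded percentages reported in Table 2 (for example, 9% of 22 physicians is taken as 2), so they are approximate and the result is a plausibility check rather than a reproduction of the published analysis.

```python
from typing import List


def chi_square_statistic(table: List[List[float]]) -> float:
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Columns: biopsychosocial, biomedical, person focused, high physician control.
group_sizes = [22, 28, 68, 20]  # physicians per style group (Table 2)
female = [2, 6, 26, 2]          # assumed counts, reconstructed from rounded percentages
male = [n - f for n, f in zip(group_sizes, female)]

chi2 = chi_square_statistic([female, male])
dof = (2 - 1) * (len(group_sizes) - 1)
print(f"chi-square = {chi2:.2f} on {dof} degrees of freedom")
```

With these reconstructed counts the statistic is about 11.5 on 3 degrees of freedom, which exceeds the critical value of 7.81 for P = .05.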
Results
Of the 4994 patients presenting for care by their family physicians, 4454 (89%) agreed to participate in the DOPC study. Physicians participating in the DOPC study were similar in age to national samples of family physicians, but over-represented female and residency-trained physicians.34 Patient age, sex, and race were similar to the population of patients seeing family physicians and general practitioners nationally as reported in the National Ambulatory Medical Care Survey.34 Patient questionnaires were returned by 3283 (74%) of the patients. Of those respondents, 2881 satisfactorily completed the CPCI, representing 88% of those returning a patient questionnaire and 65% of the total sample. The patients who completed the CPCI were more likely to be white, have private health care insurance, and be somewhat older than patients who did not complete the CPCI.35
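The participation and completion percentages in this paragraph can be checked directly from the reported counts. The short sketch below does that arithmetic; the variable names are illustrative, and the counts are those given in the text.

```python
approached = 4994               # patients presenting for care
enrolled = 4454                 # agreed to participate in the DOPC study
questionnaires_returned = 3283  # returned a patient questionnaire
cpci_completed = 2881           # satisfactorily completed the CPCI


def percent(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to the nearest integer."""
    return round(100 * part / whole)


rates = {
    "participation": percent(enrolled, approached),
    "questionnaire return": percent(questionnaires_returned, enrolled),
    "CPCI completion among returned": percent(cpci_completed, questionnaires_returned),
    "CPCI completion among enrolled": percent(cpci_completed, enrolled),
}
for name, value in rates.items():
    print(f"{name}: {value}%")
```

The 4 results (89%, 74%, 88%, and 65%) match the percentages reported in the text.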
The cluster analysis identified 4 distinct groups of physicians. Each of the 138 physicians was classified into 1 group. Each of the 6 variables in the analysis contributed to defining the 4 groups by significantly (P
This study has several strengths. The use of community practicing physicians in real world conditions for whom visits were similar in content to the visits reported by NAMCS34 adds to the generalizability of the findings. We have used an integration of qualitative and quantitative approaches to empirically derive categories of physician interaction style. Our data are based on nurse observation of an average of 32 encounters per physician and documented in rich and comprehensive qualitative fieldnotes. And finally, by using multilevel modeling, we have reported an honest estimate of the association of physician style and patient report of primary care by appropriately modeling the nested data structure.
The findings must be interpreted in light of potential study limitations. First, the patients who did not complete the patient questionnaire are somewhat different demographically than those patients who did complete it. However, non-completion of the questionnaire was not associated with physician style; therefore, it is unlikely that the associations would change, had these individuals been included. Second, because the study was cross-sectional we cannot control for patient self-selection of physicians. Nonetheless, since patients dissatisfied with the quality of care are likely to seek another physician, we would expect patient self-selection of physicians to bias the study toward the null, thus making our results even more remarkable.
These findings, in combination with the literature on the person-focused,24 patient-centered5,17,19,20,41 and relationship-centered approaches,42 provide strong evidence to support the widespread implementation of this physician-patient interaction approach. Further investigation in community practice may lead to identification of ways to support and encourage person-focused care and the time needed to provide such care.
· Acknowledgments ·
The authors are indebted to the physicians, office staff members, and patients without whose participation this study would not have been possible. This paper was improved by helpful suggestions on an earlier draft by Kurt C. Stange, MD, PhD. This study was supported by a grant from the National Cancer Institute (1R01 CA60862) and in part by the Center for Research in Family Practice and Primary Care and the American Academy of Family Practice.
- Different physician-patient interaction styles are actively used in community practice.
- A person-focused style is being used by almost half of the physicians observed, and this style is associated with greater patient-reported quality of primary care and greater patient satisfaction.
- This study provides further evidence to support the widespread implementation of this approach to the physician-patient interaction.
Over the past half century, changing medical technology, law, education, ethics, and research have influenced the current shape of physician-patient interactions.9 In 1956, the traditional model of Activity-Passivity (physician does something to the patient) was challenged with the revolutionary concept of active patient participation.10 The models of Guidance and Cooperation (physician tells patient what to do, patient cooperates) and Mutual Participation (physician enables patient to help him/herself, patient is a partner) were proposed10 and are reflected in modern theoretically-based interaction models. Numerous models have been proposed as variants of the Guidance/Cooperation model (eg, paternalistic model,11 priestly model,12 contractual model13) and the Mutual Participation model (eg, ethnographic model,14 consumerist model,11,15 family systems model16). Few of these models, though, have been empirically evaluated. The best-developed and most-studied mutual participation model is the patient-centered method.5,17-20
When data have been collected using quantitative or qualitative approaches, significant strides have been made in understanding physician-patient interaction3, 21-23 and the effect of such interactions on patient outcomes,5,24,25 primarily patient satisfaction.1,26-29 However, many studies have been limited by their focus on a narrow aspect of physician-patient communication, studying a small number of physicians or patients, and using medical students, residents, and hospital faculty as study subjects.
The purpose of this study was not to develop a new model of physician-patient interaction. Rather, variables characterizing physician style grounded by the direct observation of thousands of encounters for 138 community practicing family physicians were used to empirically cluster physicians into groups that represent distinct interaction styles. Because interaction style may be manifested in all phases of a patient encounter, we used as a guiding framework the 3 primary functions of an interview:30,31 gathering information, enhancing a healing relationship, and making and implementing decisions. The importance of each of these functions varies depending on the nature of the encounter, but our overall approach provides a practical way of conceptualizing physician-patient interaction style. The association of the empirically derived, theoretically based physician styles was tested with 3 outcomes: 1) patient report of delivery of attributes of primary care measured using the Components of Primary Care Instrument (CPCI), 2) patient satisfaction with the visit, and 3) the duration of the visit.
Methods
This study was part of the larger Direct Observation of Primary Care (DOPC) study, a cross-sectional observational study that examined the content of 4454 outpatient visits to family physicians in northeast Ohio. Details of the methods of the DOPC study have been described extensively elsewhere.32,34 Briefly, 4 teams of 2 research nurses directly observed consecutive patient visits to 138 participating physicians in 84 practices between October 1994 and August 1995. The research nurses collected data on the content and context of consecutive office visits using the following methods: direct observation of the patient visit, patient exit questionnaire, medical record review, and collection of ethnographic field notes.33,34
Measures
Patients’ perception of the delivery of 5 attributes of primary care was measured by the Components of Primary Care Instrument (CPCI). Interpersonal communication was an evaluation of the ease of exchange of information between patient and physician. The physician’s accumulated knowledge about the patient refers to the physician’s understanding of the patient’s medical history, health care needs, and values. Coordination of care refers to the information received from referrals to specialists and previous health care visits, and its incorporation into the current and future care of the patient. Preference to see usual physician refers to the degree to which patients believed and valued that they could go to their regular physician for almost all problems. Scale scores demonstrate good internal consistency reliability (Cronbach’s alpha: .68–.79).35 Continuity of care was measured by the Usual Provider Continuity index (UPC), which is the proportion of visits to the patient’s regular doctor in the past year out of the total number of physician visits in the past year.
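The UPC definition above reduces to a simple ratio. The following is a minimal sketch of that calculation; the function name and the example patient are hypothetical and are not drawn from the study data.

```python
def usual_provider_continuity(visits_to_regular: int, total_visits: int) -> float:
    """UPC = visits to the regular physician / all physician visits in the past year."""
    if total_visits <= 0:
        raise ValueError("UPC is undefined without at least one physician visit")
    if not 0 <= visits_to_regular <= total_visits:
        raise ValueError("visits to the regular physician must lie between 0 and the total")
    return visits_to_regular / total_visits


# Hypothetical patient: 4 of 6 visits in the past year were to the regular physician.
upc = usual_provider_continuity(4, 6)
print(round(upc, 2))  # 0.67
```

A value of 1.0 would mean every visit in the past year was to the regular physician; values near 0 indicate care spread across other physicians.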
Patient satisfaction was measured using the 4 physician-specific items from the MOS 9 Item Visit Rating Form36 (Cronbach’s alpha = .89).33 Also included on the patient survey was a single item assessing the degree to which patients’ expectations with the visit were met. Duration of the visit was the total face-to-face time the physician spent with the patient and was measured by direct observation.
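For readers unfamiliar with the reliability statistic quoted for these scales, the sketch below shows the standard Cronbach's alpha formula, k/(k-1) × (1 − sum of item variances / variance of total scores). The ratings are invented for illustration; they are not the study's questionnaire data, and the sketch makes no claim about how the authors computed their values.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    items = list(zip(*item_scores))                      # one tuple of ratings per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings: 5 respondents answering 4 satisfaction items on a 1-5 scale.
ratings = [
    [5, 4, 5, 5],
    [4, 4, 4, 3],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 2],
]
print(round(cronbach_alpha(ratings), 2))
```

Alpha approaches 1 as the items move together across respondents, which is why it is read as a measure of internal consistency.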
Each physician’s interaction style was determined through a 2-step process. In the first step, ethnographic field notes were used to gather information that helps define core features of physician style. The field notes from 4 days of observation of 138 family physicians in 84 practices were transcribed and imported into FolioVIEWS37 for data management and coding. Analysis was conducted with an immersion-crystallization approach38 involving repetitive reading and summarization of the text data. Case summaries were constructed from a sample of practices selected to maximize variation among practice characteristics such as size, physician sex, and practice location. The case summaries were independently reviewed, and important features were identified. These features were cross-checked against the original data. This process, and the resulting 30 features, are described in detail elsewhere.32
Six of the features that emerged from the qualitative analyses pertain to physician style and are listed in Table 1. Each of the 3 primary interview functions30 is represented by at least 1 feature, ensuring good coverage of the core aspects of the interaction. Gathering information is shaped by physician orientation and the clinical information allowed or elicited in the visits. Enhancing healing relationships is realized in part through affective connection with patients. The final function, making and implementing decisions, is influenced by the level of control or shared power with patients, the physician’s openness to patients’ agendas, and the physician’s willingness to negotiate options with patients.
The second step involved a cluster analysis of the 6 variables. First a hierarchical approach was used to estimate the number of clusters. Then a non-hierarchical clustering approach was used to determine physician classification among the clusters and the features that distinguish the clusters.39 Analysis of variance was used to confirm that variables included in the cluster analysis significantly differed between at least 2 of the identified clusters, and thus were contributing to defining interaction style.
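The non-hierarchical step can be illustrated with a small k-means routine. This is a sketch under stated assumptions: the text does not name the specific algorithm used, k-means is substituted here as one common non-hierarchical method, the 0/1 coding of the 6 style variables is hypothetical, and the number of clusters is passed in directly rather than estimated by a hierarchical pass.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Assign each point to one of k clusters; returns (labels, centers)."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j])) for p in points
        ]
        if new_labels == labels:          # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                   # keep the old center if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical physicians scored 0/1 on the 6 style variables (1 = patient-oriented pole).
physicians = [
    (1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 0, 1),
    (1, 1, 1, 1, 1, 0),
    (0, 0, 0, 0, 0, 0),
    (0, 0, 1, 0, 0, 0),
    (0, 0, 0, 0, 0, 1),
]
labels, centers = kmeans(physicians, k=2)
print(labels)
```

The cluster centers are the per-variable means within each group, which is what makes it possible to describe each cluster by the features that distinguish it.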
TABLE 1
Physician style variables
Physician orientation: |
Problem focused—physician focuses on the patient’s presenting complaint |
Patient-focused—physician is open to a broader health care agenda with the patient and explores other possible issues |
Scope of clinical information: |
Biomedical—talk focuses on the biological information, diagnoses and treatments |
Biopsychosocial—explores both the biological and social and psychological issues |
Affective connection with patients: |
Physician personable and friendly, connects with person on a personal level |
Physician not personable and friendly, maintains professional distance |
Openness to patient agenda: |
Physician open to patient’s agenda |
Physician sets and maintains the agenda |
Sharing of control in interaction: |
Physician shares control of the interaction |
Physician controls the interaction |
Negotiation of options with patient: |
Physician negotiates options with patients |
Physician does not negotiate options with patients |
Analyses
The association of physician and patient characteristics with interaction style was assessed by chi square for categorical variables and by analysis of variance for continuous variables. The association of physician style with each of the 5 attributes of primary care measured by the CPCI, the indicators of patient satisfaction, and duration of the visit were tested using multilevel modeling,40 to account for the hierarchical nature of data (ie, patients nested within physicians).
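One way to write a two-level model of this kind is as a random-intercept regression. The specification below is a sketch, not the authors' stated model: the notation is introduced here for illustration, and the covariates follow the adjustment for patient and physician age and sex described in the table footnotes.

```latex
\[
Y_{ij} = \beta_0 + \sum_{s=1}^{3} \beta_s\,\mathrm{Style}_{sj}
       + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij}
       + \boldsymbol{\delta}^{\top}\mathbf{z}_{j}
       + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \tau^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

Here $Y_{ij}$ is the outcome for patient $i$ of physician $j$, $\mathrm{Style}_{sj}$ are indicator variables for 3 of the 4 styles (the fourth serving as the reference), $\mathbf{x}_{ij}$ holds patient age and sex, $\mathbf{z}_{j}$ holds physician age and sex, and $u_j$ is a physician-level random intercept that absorbs the correlation among patients of the same physician.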
Results
Of the 4994 patients presenting for care by their family physicians, 4454 (89%) agreed to participate in the DOPC study. Physicians participating in the DOPC study were similar in age to national samples of family physicians, but over-represented female and residency-trained physicians.34 Patient age, sex, and race were similar to the population of patients seeing family physicians and general practitioners nationally as reported in the National Ambulatory Medical Care Survey.34 Patient questionnaires were returned by 3283 (74%) of the patients. Of those respondents, 2881 satisfactorily completed the CPCI, representing 88% of those returning a patient questionnaire and 65% of the total sample. The patients who completed the CPCI were more likely to be white, have private health care insurance, and be somewhat older than patients who did not complete the CPCI.35
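The participation percentages follow directly from the counts reported above; a short check, using only those counts:

```python
def percent(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to the nearest integer."""
    return round(100 * part / whole)


approached, enrolled, returned, completed_cpci = 4994, 4454, 3283, 2881

print(percent(enrolled, approached))       # 89: agreed to participate
print(percent(returned, enrolled))         # 74: returned a questionnaire
print(percent(completed_cpci, returned))   # 88: completed the CPCI, of those returning
print(percent(completed_cpci, enrolled))   # 65: completed the CPCI, of the total sample
```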
The cluster analysis identified 4 distinct groups of physicians. Each of the 138 physicians was classified into 1 group. Each of the 6 variables in the analysis contributed to defining the 4 groups by significantly (P
Forty-nine percent of physicians were classified as person focused. These physicians were more focused on the person than the disease, were perceived as personable and friendly, were open to the patient’s agenda, and frequently negotiated options with the patient. Physicians classified as biopsychosocial (16%) were more focused on the patient’s disease, but elicited psychosocial clinical information. Physicians classified as biomedical (20%) were also more focused on the patient’s disease and were unlikely to elicit psychosocial information. These physicians also demonstrated a low level of friendliness and were unlikely to negotiate options with the patient. The high physician control group’s major characteristics were domination of the encounter and disregard of the patient’s agenda (14%).
Association of physician characteristics with the interaction styles is presented in Table 2. The percent of male and female physicians differed greatly among the 4 style groups. The proportion of female physicians in the person-focused group was almost 4 times that of the biopsychosocial group and the high physician control group (P
As reported in Table 3, physician style is significantly associated with 3 of the 5 patient reports of the attributes of primary care. Physicians classified as having a person-focused approach have the highest mean score of communication; the other 3 styles score lower, with the high-physician-control style scoring the lowest. Person-focused and biopsychosocial physicians scored highest on patient reports of accumulated knowledge; those in the biomedical group scored the lowest. Coordination of care was highest among the person-focused group and lowest among the high-control group. Across the different types of physician style, there was no difference in patient report of preference for his or her regular physician or the measure of continuity of care.
The associations of physician style with 2 indicators of patient satisfaction are displayed in Table 4. The highest group mean of patient satisfaction is for the person-focused style, and the lowest is for the high-physician-control group. The indicator of the degree to which patient expectations were met also follows this pattern. Also displayed in Table 4, the person-focused style demonstrated the longest average duration of visit, at 11.5 minutes; the high-physician-control group visits were the shortest in duration, at about 9.5 minutes.
TABLE 2
Physician and patient characteristics associated with interaction style
Characteristic | Total | Biopsychosocial | Biomedical | Person focused | High physician control | P |
---|---|---|---|---|---|---|
Physician | ||||||
Number | 138 | 22 | 28 | 68 | 20 | |
Age (mean years) | 43 | 45 | 43 | 42 | 46 | .06 |
Female | 26% | 9% | 21% | 38% | 10% | |
Residency trained | 90% | 86% | 86% | 94% | 85% | .44 |
Patient | ||||||
Number | 2881 | 504 | 578 | 1258 | 541 | |
Age (mean years) | 42 | 44 | 41 | 42 | 43 | .11 |
Female | 62% | 57% | 61% | 65% | 58% |
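As a consistency check, the physician counts in Table 2 can be converted to the style percentages quoted in the text. The sketch below uses only the counts from the table's "Number" row.

```python
physicians_by_style = {
    "person focused": 68,
    "biomedical": 28,
    "biopsychosocial": 22,
    "high physician control": 20,
}

total = sum(physicians_by_style.values())
shares = {style: round(100 * n / total) for style, n in physicians_by_style.items()}

print(total)   # 138
print(shares)  # {'person focused': 49, 'biomedical': 20, 'biopsychosocial': 16, 'high physician control': 14}
```

The rounded shares sum to 99 rather than 100 because each is rounded independently.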
TABLE 3
Association of physician style with attributes of primary care1
Attribute of primary care | Biopsychosocial | Biomedical | Person focused | High physician control | P |
---|---|---|---|---|---|
Communication | 4.27 | 4.26 | 4.43 | 4.21 | |
Accumulated knowledge | 3.54 | 3.33 | 3.56 | 3.51 | |
Coordination of care | 3.85 | 3.78 | 3.99 | 3.74 | |
Preference for regular doctor | 4.46 | 4.45 | 4.46 | 4.39 | ns |
Usual provider continuity2 | 0.67 | 0.66 | 0.64 | 0.65 | ns |
1Each row represents a separate multilevel regression model wherein each attribute of primary care is the outcome variable and the number in each column is the group mean of that attribute, adjusted for patient and physician age and sex, as well as the effect of the patients being nested within physicians. | |||||
2Usual provider continuity = total number of visits to regular physician in past year, divided by the total number of physician visits in the past year. |
TABLE 4
Association of physician interaction style with patient satisfaction and duration of visit1
Outcome measures | Biopsychosocial | Biomedical | Person focused | High physician control | P |
---|---|---|---|---|---|
Patient satisfaction with physician | 4.38 | 4.39 | 4.49 | 4.30 | .002 |
Patient expectations met | 4.36 | 4.33 | 4.45 | 4.31 | .02 |
Length of visit (mean minutes) | 9.97 | 10.02 | 11.56 | 9.51 | .005 |
1Results from multilevel regression models; analyses include patient and physician age and gender as covariates and control for the nested nature of the data. |
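The gap in visit length between styles can be read off the adjusted means in Table 4. A brief sketch, using only the values in the "Length of visit" row:

```python
mean_visit_minutes = {
    "biopsychosocial": 9.97,
    "biomedical": 10.02,
    "person focused": 11.56,
    "high physician control": 9.51,
}

longest = max(mean_visit_minutes, key=mean_visit_minutes.get)
shortest = min(mean_visit_minutes, key=mean_visit_minutes.get)
difference = mean_visit_minutes[longest] - mean_visit_minutes[shortest]

print(longest, shortest, round(difference, 2))  # person focused high physician control 2.05
```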
Discussion
These data indicate that a person-focused approach is actively used in community practice, and is the style most congruent with patient-reported quality of primary care and satisfaction with care. Our data, in concert with data reported by others,5,24 indicate strong support for the feasibility and value of the person-focused model. We found that, of the 4 distinct interaction styles, physicians with the person-focused style scored highest across all measures of the attributes of primary care and on the indicators of patient satisfaction, with the exception of continuity of care. In contrast, physicians with the high-control style were generally lowest on the primary care and satisfaction indicators.
It is important to emphasize that, even though the vast majority of patients in this sample are likely to have self-selected their primary care physician, patient rating of some attributes of primary care differed across the 4 physician styles. Patients of physicians with different styles equally valued seeing their regular physician, as reported by the preference-for-their-regular-doctor score; they exhibited similar proportions of continuity visits in the past year; and their satisfaction scores were all generally high. Patients appear to want to see their regular physician, regardless of interaction approach, even though some approaches—particularly the high-physician-control style—were rated poorer for communication, coordination of care, and accumulated knowledge.
There may be several explanations as to why a particular physician style is associated with specific patient reports of communication, accumulated knowledge, and coordination of care. Openness to the patient’s agenda and willingness to negotiate options—as was characteristic of the person-focused physicians—may facilitate good communication and convey an understanding of patient preferences and values regarding health. It is interesting to note that different groups scored lowest on some of the attributes of primary care. The high-physician-control group was the lowest on interpersonal communication and coordination of care. High-control physicians were more likely to dominate the agenda and the verbal exchanges. Patients may have felt they could not ask questions or that the physician did not listen to what they tried to say. The biomedical group of physicians were given the lowest scores by patients on accumulated knowledge, suggesting that patients thought these physicians were less likely to know their preferences and values regarding health care, know less about them as persons, and know less about their family and medical histories.
As others have proposed, we concur that interaction style is not a dichotomy or even a continuum of patient versus physician control, but is multidimensional, cutting across the main functions of the patient encounter (ie, information gathering, relationship building, and making and implementing decisions). These data provide some confirmation for the original scheme proposed by Szasz and Hollender,10 with the Mutual Participation model most represented by the person-focused approach and the Activity-Passivity model most represented by the high-physician-control group. The biopsychosocial and biomedical approaches represent different versions of the Guidance and Cooperation model.
The 4 types of physician style empirically derived from our data are similar to communication pattern types found by Roter et al,27 in a study with similar aims but different methods. Of the 5 types reported, narrowly biomedical and expanded biomedical accounted for 65% of visits, and biopsychosocial accounted for 20%. Psychosocial and consumerist (distinguished by a high degree of patient questions) accounted for only 8% each. It is interesting that in our data, we found the person-focused style was by far the most common approach (49%) among this group of family physicians. These differences in use of particular interaction styles may have several explanations. First, these data were collected more recently.27 Thus our data may reflect trends in a movement away from a paternalistic style and toward an increased patient participatory style. Second, our sample consisted entirely of family physicians practicing in the community, where the model of person-focused care may have a longer history of support and endorsement or be of greater importance to community family physicians, whose emphasis is on a breadth of care based on patient needs.6,7,18
Physicians with a person-focused style granted the longest visits, while high-control physicians granted the shortest, a difference of more than 2 minutes per visit on average. The associations were not explained away by accounting for patient or physician characteristics, suggesting that a person-focused style may require more time. However, others have found that physicians engaging in a patient participatory style had office visits of similar duration to those found with other approaches,23,27 although the average duration of visit in both of these studies was considerably longer than that of the office visits in our sample.
This study has several strengths. The use of community practicing physicians in real-world conditions, for whom visits were similar in content to the visits reported by NAMCS,34 adds to the generalizability of the findings. We have used an integration of qualitative and quantitative approaches to empirically derive categories of physician interaction style. Our data are based on nurse observation of an average of 32 encounters per physician, documented in rich and comprehensive qualitative field notes. Finally, by using multilevel modeling, we have reported an honest estimate of the association of physician style and patient report of primary care by appropriately modeling the nested data structure.
The findings must be interpreted in light of potential study limitations. First, the patients who did not complete the patient questionnaire differ somewhat demographically from those who did complete it. However, non-completion of the questionnaire was not associated with physician style; therefore, it is unlikely that the associations would change had these individuals been included. Second, because the study was cross-sectional, we cannot control for patient self-selection of physicians. Nonetheless, since patients dissatisfied with the quality of care are likely to seek another physician, we would expect patient self-selection of physicians to bias the study toward the null, thus making our results even more remarkable.
These findings, in combination with the literature on the person-focused,24 patient-centered5,17,19,20,41 and relationship-centered approaches,42 provide strong evidence to support the widespread implementation of this physician-patient interaction approach. Further investigation in community practice may lead to identification of ways to support and encourage person-focused care and the time needed to provide such care.
Acknowledgments
The authors are indebted to the physicians, office staff members, and patients without whose participation this study would not have been possible. This paper was improved by helpful suggestions on an earlier draft by Kurt C. Stange, MD, PhD. This study was supported by a grant from the National Cancer Institute (1R01 CA60862) and in part by the Center for Research in Family Practice and Primary Care and the American Academy of Family Practice.
1. Bertakis KD, Roter D, Putnam SM. The relationship of physician medical interview style to patient satisfaction. J Fam Pract. 1991;32:175-181.
2. Bertakis KD, Callahan EJ, Helms LJ, Azari R, Robbins JA, Miller J. Physician practice styles and patient outcomes. Med Care. 1998;36:879-891.
3. Stewart MA. What is a successful doctor-patient interview? A study of interactions and outcomes. Soc Sci Med. 1984;19:167-175.
4. Levinson W, Roter DL, Mullooly JP, Dull VT, Frankel RM. Physician-patient communication: the relationship with malpractice claims among primary care physicians and surgeons. JAMA. 1997;277:553-559.
5. Stewart M, Brown JB, Donner A, McWhinney IR, Oates J, Weston WW, Jordan J. The impact of patient-centered care on outcomes. J Fam Pract. 2000;49:796-804.
6. McWhinney IR. Through clinical method to a more humane medicine. In: White KL, ed. The task of medicine. Menlo Park, CA: The Henry J. Kaiser Family Foundation; 1988.
7. Stange KC, Jaén CR, Flocke SA, Miller WL, Crabtree BF, Zyzanski SJ. The value of a family physician. J Fam Pract. 1998;46:363-368.
8. Institute of Medicine. Primary Care: America’s Health in a New Era. Donaldson MS, Yordy KD, Lohr KN, Vanselow NA, eds. Washington, DC: National Academy Press; 1996.
9. Laine C, Davidoff F. Patient-centered medicine: A professional evolution. JAMA. 1996;275:152-156.
10. Szasz TS, Hollender MH. The basic models of the doctor-patient relationship. Arch Int Med. 1956;97:585-592.
11. Emanuel EJ, Emanuel LL. Four models of the physician-patient relationship. JAMA. 1992;267:2221-2226.
12. Veatch RM. Models for ethical medicine in a revolutionary age. What physician-patient roles foster the most ethical relationship? Hastings Center Report. 1972;2:5-7.
13. Quill TE. Partnerships in patient care: a contractual approach. Ann Int Med. 1983;98:228-234.
14. Kleinman AM, Eisenberg L, Good B. Culture, illness, and care: Clinical lessons from anthropologic and cross-cultural research. Ann Int Med. 1978;88:251-258.
15. Lazare A, Eisenthal S, Wasserman L. The customer approach to patienthood: Attending to patient requests in a walk-in clinic. Archives of General Psychiatry. 1975;32:553-558.
16. McDaniel S, Campbell T, Seaburn D. Family-oriented primary care: a manual for medical providers. Berlin: Springer-Verlag; 1990.
17. Stewart M, Weston WW, Brown JB, McWhinney IR, McWilliam CL, Freeman TR. Patient-centered medicine: Transforming the clinical method. Thousand Oaks, CA: Sage Publications; 1995.
18. Levenstein JH, McCracken EC, McWhinney IR, Stewart MA, Brown JB. The patient-centred clinical method. 1. A model for the doctor-patient interaction in family medicine. Fam Pract. 1986;3:24-30.
19. Epstein RM. The science of patient-centered care. J Fam Pract. 2000;49:805-807.
20. Stewart M, Roter D. Communicating With Medical Patients. Knapp ML, ed second printing (1990) ed: Sage Publications; 1989.
21. Hall JA, Roter DL, Katz NR. Meta-analysis of correlates of provider behavior in medical encounters. Med Care. 1988;26:657-675.
22. Byrne PS, Long BEL. Doctors talking to patients. London: H.M.S.O.; 1976.
23. Marvel MK, Doherty WJ, Weiner E. Medical interviewing by exemplary family physicians. J Fam Pract. 1998;47:343-348.
24. Roter D. The enduring and evolving nature of the patient-physician relationship. Patient Educ and Counseling. 2000;39:5-15.
25. Kaplan SH, Greenfield S, Ware JE. Assessing the effects of physician-patient interactions on the outcomes of chronic disease. Med Care. 1989;27:S110-S127.
26. Buller MK, Buller DB. Physicians’ communication style and patient satisfaction. J Health Soc Behav. 1987;28:375-388.
27. Roter DL, Stewart M, Putnam SM, Lipkin M, Stiles W, Inui TS. Communication patterns of primary care physicians. JAMA. 1997;277:350-356.
28. Williams S, Weinman J, Dale J. Doctor-patient communication and patient satisfaction: A review. Fam Pract. 1998;15:480-492.
29. Greene MG, Adelman RD, Friedman E, Charon R. Older patient satisfaction with communication during an initial medical encounter. Soc Sci Med. 1994;38:1279-1288.
30. Cohen-Cole S. The medical interview: The three-function approach. St. Louis: Mosby Year Book; 1991.
31. Lazare A, Putnam SM, Lipkin M. Three functions of the medical interview. In: Lipkin M, Putnam S, Lazare A, eds. The medical interview: Clinical care, education and research. New York: Springer; 1995;3-19.
32. Crabtree BF, Miller WL, Aita V, Flocke SA, Stange KC. Primary care practice organization: A qualitative analysis. J Fam Pract. 1998;46:403-409.
33. Stange KC, Zyzanski SJ, Jaén CR, Callahan EJ, Kelly RB, Gillanders WR, Shank JC, Chao J, Medalie JH, Miller WL, Crabtree BF, Flocke SA, Gilchrist VJ, Langa DM, Goodwin MA. Illuminating the black box: a description of 4454 patient visits to 138 family physicians. J Fam Pract. 1998;46:377-389.
34. Stange KC, Zyzanski SJ, Smith TF, Kelly R, Langa DM, Flocke SA, Jaén CR. How valid are medical records and patient questionnaires for physician profiling and health services research? A comparison with direct observation of patient visits. Med Care. 1998;36:851-867.
35. Flocke SA. Measuring attributes of primary care: Development of a new instrument. J Fam Pract. 1997;45:64-74.
36. Rubin H, Gandek B, Roger WH, Kisinski M, McHorney C, Ware J. Patients’ ratings of outpatient visits in different practice settings. JAMA. 1993;270:835-840.
37. FolioVIEWS.. 3.1 ed. Provo, Utah: Folio Corporation; 1998.
38. Crabtree BF, Miller WL. Doing Qualitative Research. Newbury Park, California: Sage Publications; 1992.
39. Aldenderfer MS, Blashfield RK. Cluster Analysis. Lewis-Beck MS, ed Newbury Park: Sage; 1984.
40. Bryk AS, Raudenbush SW. Hierarchical linear models: applications and data analysis methods. Newbury Park: Sage Publications; 1992.
41. Stewart M, Brown JB, Boon H, Galajda J, Meredith L, Sangster M. Evidence on patient-doctor communication. Cancer Prevention and Control. 1999;3:25-30.
42. Carol P. Tresolini and the Pew-Fetzer Task Force on Advancing Psychosocial Health Education. Health profession education and relationship-centered care. San Francisco, CA: Pew Health Professions Commission; 1994.
Is a history of trauma associated with a reduced likelihood of cervical cancer screening?
- Women who had not had recommended cervical cancer screening were more likely to have been sexually abused in childhood.
- Women who were sexually abused in childhood may be at higher risk than other women for HPV and cervical cancer; therefore, screening is particularly important for these women.
- Not having cervical cancer screening may be a marker for childhood sexual abuse. Therefore, health care providers should consider investigating these issues with women who do not adhere to guidelines for routine Pap smears.
Unfortunately, 15% to 24% of US women do not receive recommended cervical cancer screening.1-3 Barriers to Pap screening include low income, low education, and minority status;4 lack of cancer knowledge, attitudes and beliefs, low perceived cancer susceptibility, pain, and embarrassment;5-7 and language and certain cultural beliefs.7-9 Sexual trauma has received little research attention as a factor contributing to lowered rates of Pap screening. Sexual trauma is reliably associated with subsequent poor health, which may be partially accounted for by poor preventive care.10-16 Childhood sexual abuse is strongly associated with negative health behaviors such as physical inactivity and smoking.13,17 Sexual violence is associated with lower rates of breast cancer screening18 and increased risk of posttraumatic stress disorder (PTSD).19-21 Avoidant coping styles (an aspect of PTSD) are associated with decreased health promotion behaviors such as screening.22-25
Gynecologic procedures may feel threatening to women with a history of sexual assault, and may be experienced as re-traumatizing.14,26-29 Women who had suffered childhood sexual abuse reported more anxiety, shame, and fear during a gynecologic examination than other women.28 Springs and Friedrich16 found a lower frequency of screening for cervical cancer among adult survivors of childhood sexual abuse, but did not assess the impact of other traumatic events in childhood or adulthood on Pap screening. Because previous research on correlates of sexual trauma has been criticized on the grounds that third variables could account for the observed associations,30 we evaluated associations of any traumatic event with low rates of Pap screening.
We hypothesized that having experienced traumatic events, in particular childhood sexual trauma, would function as a barrier to Pap screening. We predicted that, in this ethnically diverse random sample, women who had not had medically appropriate Pap screening would report a greater number of traumatic events, especially sexual abuse trauma. We also expected that sexually traumatized women would express more negative attitudes toward Pap screening and would be more likely to meet criteria for PTSD, both of which might contribute to lower levels of Pap screening.
Methods
Kaiser Permanente (KP), a pre-paid health maintenance organization, offers cervical cancer screening at no cost to patients. KP’s clinical guidelines recommend Pap screening every 2 years for women over age 20 with average risk for cervical cancer. Self-report questionnaires were mailed to an age-stratified random sample of women 21–64 years old who were KP members at 3 locations. Women who had had a total hysterectomy were excluded. We compared women who had and who had not obtained Pap screening in the previous 2 years. In previous research18 we found that women who had not obtained mammography had a lower response rate to mailed questionnaires than women who had been screened. We therefore oversampled women who had not had Pap screening. We mailed questionnaires to 1314 women who had obtained Pap screening and 2897 who had not. The final sample included 364 women who had received screening in the past 2 years (28% response rate) and 372 who had not (13% response rate). Repeated sampling or telephoning of non-respondents was not allowed by KP policy.
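The response rates quoted above follow directly from the mailed and returned counts. As a minimal sketch of that arithmetic (the variable and function names are illustrative assumptions; the pooled rate is derived here and is not reported in the study):

```python
# Counts reported in the Methods paragraph above.
mailed = {"screened": 1314, "unscreened": 2897}
returned = {"screened": 364, "unscreened": 372}


def response_rate(n_returned: int, n_mailed: int) -> float:
    """Return the response rate as a percentage."""
    return 100.0 * n_returned / n_mailed


# Per-group rates, plus a pooled rate derived from the same counts.
rates = {group: response_rate(returned[group], mailed[group]) for group in mailed}
overall = response_rate(sum(returned.values()), sum(mailed.values()))

for group, rate in rates.items():
    print(f"{group}: {rate:.1f}%")
print(f"overall: {overall:.1f}%")
```

Rounded to whole percentages, the two group rates match the 28% and 13% given in the text.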
Trauma history was measured in 2 ways. The Trauma History Questionnaire31,32 assesses a range of lifetime traumatic events. The Childhood Trauma Questionnaire33 assesses childhood physical abuse, physical neglect, sexual abuse, emotional abuse, and emotional neglect. PTSD was assessed with the Posttraumatic Stress Disorder Checklist.34 We inquired about attitudes toward Pap screening based on previous findings.
Data were analyzed using SAS.35 Contingency tables were analyzed to estimate the prevalence of traumatic events and their bivariate associations with Pap screening. Chi square analysis was used to evaluate the statistical significance of these associations. Hierarchical logistic regression was used to evaluate associations of traumatic events with screening, independent of clinic location, demographic characteristics, attitudes about screening, and PTSD.
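The bivariate step described above can be illustrated with a Pearson chi-square test on a 2 x 2 contingency table. The study used SAS; the Python sketch below is only an illustration of the technique, it omits any continuity correction, and the cell counts are hypothetical because the article does not report them.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# HYPOTHETICAL counts, not the study's data.
# Rows: exposed / not exposed to an event; columns: screened / not screened.
stat, p = chi_square_2x2(35, 65, 300, 300)
print(f"chi-square = {stat:.2f}, P = {p:.4f}")
```

A small P value from such a table indicates only a bivariate association; the adjusted associations come from the logistic regression step.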
Results
Sample demographics
Women who had been screened for cervical cancer and unscreened women were similar in age and education (Table 1). Unscreened women were more likely to be Asian American, to have incomes of $20,000 per year or less, and to have never been married.
TABLE 1
Demographic characteristics of women with and without Pap screening
Characteristic | No Pap (%), n = 372a | Pap (%), n = 364a | P |
---|---|---|---|
Ethnicity | | | .001 |
African American | 10.1 | 11.6 | |
Asian American | 20.1 | 8.0 | |
European American | 60.6 | 71.8 | |
Other | 9.2 | 8.6 | |
Age | | | .076 |
Mean (standard deviation) | 43.8 (12.8) | 45.5 (12.4) | |
Education | | | .187 |
Elementary school | 2.7 | 1.4 | |
High school | 39.6 | 34.8 | |
College | 41.5 | 43.2 | |
Post-college | 16.9 | 20.6 | |
Family income | | | .002 |
$20,000/year or less | 12.4 | 5.8 | |
$20,001–$50,000 | 41.3 | 38.2 | |
More than $50,000 | 46.2 | 56.0 | |
Marital status | | | .012 |
Never married | 34.3 | 24.8 | |
Married | 47.0 | 59.1 | |
Separated | 1.6 | 0.6 | |
Divorced | 13.8 | 13.1 | |
Widowed | 3.2 | 2.5 | |
a Sample sizes vary slightly because of missing data on individual demographic items.
Prevalence of trauma
Commonly reported events during childhood included natural disaster (reported by 13% of the women), sexual assault other than rape (11%), and news of a death or injury (10%). Childhood sexual abuse or sexual assault was reported by 18.4% of the respondents. The most common traumas in adulthood were receiving news of a death or serious injury (46%), natural disasters (33%), actual or attempted robbery (27%), and serious accidents (14%). Of the respondents, 8.3% reported sexual abuse or sexual assault in adulthood. The respondents’ overall rate of childhood and adult sexual assault was 26.7%.
Associations of trauma history with Pap screening
We investigated the association of trauma with screening using chi square analyses. Women who had been raped before age 18 (36% vs. 50%, n = 713, P = .050) and women who had been subjected to other sexual assaults before age 18 (35% vs. 51%, n = 694, P = .009) were less likely to have been screened. Nonsexual childhood abuse and neglect were not related to screening. Women who experienced a natural disaster during childhood (36% vs. 52%, n = 571, P = .009) and those who experienced terrorist acts during adulthood (20% vs. 49%, n = 715, P = .024) were less likely to have been screened. (Although the association with a terrorist act was significant, exposures were reported by only 3% of unscreened women and 0.9% of screened women.) Women who reported a household break-in during adulthood were slightly more likely to have been screened (53% vs. 47%, n = 656, P = .032).
In a hierarchical logistic regression model (Table 2), childhood sexual abuse, but not other traumatic events, was associated with lower odds of screening when clinic location, demographic characteristics, attitudes, and PTSD were controlled. The logistic regression model was repeated using CTQ subscales to assess trauma, with similar results. Unmarried women were less likely than currently married women to have been screened, and Latina, Native American, Asian/Pacific, or multicultural women were less likely than European American women to have been screened. Women who endorsed the statement, “I have no symptoms so I do not need a Pap test” and those who anticipated embarrassment during screening were less likely than others to have been screened; women who believed that testing would ease their mind were more likely to have been screened.
TABLE 2
Hierarchical logistic regression model of sexual trauma and attitudes as predictors of Pap screening
Predictor | Adjusted odds ratio (95% CI) |
---|---|
Traumatic events | |
Break-in (adult) | 1.14 (0.77, 1.70) |
Natural disaster (child) | 0.78 (0.45, 1.38) |
Terrorist act (adult) | 0.28 (0.07, 1.07) |
Childhood sexual trauma | 0.56 (0.34, 0.91) * |
Site | |
Santa Rosa | 0.68 (0.44, 1.04) |
San Francisco | 1.0 (referent) |
Oakland | 1.27 (0.80, 2.02) |
Education | |
Less than college | 1.09 (0.71, 1.69) |
College | 1.0 (referent) |
More than college | 1.01 (0.65, 1.57) |
Ethnicity | |
European-American | 1.0 (referent) |
African American | 0.59 (0.33, 1.06) |
Other than African American or European American | 0.46 (0.29, 0.71) ** |
Unmarried (compared with married) | 0.67 (0.48, 0.94) * |
Attitudes toward Pap screening | |
“I have no symptoms so I do not need a Pap test” | 0.66 (0.51, 0.85) ** |
“I’ve had negative experiences with my health care provider” | 0.90 (0.73, 1.10) |
“Getting a Pap test would ease my mind” | 1.54 (1.25, 1.89) *** |
“There is danger of infection from a Pap test” | 1.09 (0.83, 1.43) |
“I do not trust the health care system” | 1.06 (0.81, 1.39) |
“I would be embarrassed to have a Pap test” | 0.67 (0.52, 0.84) *** |
“Women who have many sexual partners are more likely to have cervical cancer” | 0.88 (0.73, 1.06) |
“Pap would cause sexual assault flashbacks, or health care provider looks at me in a sexual way” | 1.05 (0.77, 1.45) |
PTSD diagnosis | 1.62 (0.91, 2.90) |
Missing data | 0.96 (0.80, 1.13) |
*P |
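As a reading aid for Table 2, an approximate two-sided P value can be recovered from an adjusted odds ratio and its 95% confidence interval. The sketch below assumes the intervals are Wald intervals, symmetric on the log-odds scale, which the article does not state; the function name is illustrative, and because the table values are rounded the results are approximate.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_ci(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate two-sided P value from an odds ratio and its 95% CI.

    Assumes a Wald interval, symmetric on the log-odds scale.
    """
    std_error = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(odds_ratio) / std_error
    return math.erfc(abs(z) / math.sqrt(2))


# (odds ratio, lower bound, upper bound) copied from Table 2.
table2 = {
    "Childhood sexual trauma": (0.56, 0.34, 0.91),
    "Terrorist act (adult)": (0.28, 0.07, 1.07),
    "Unmarried": (0.67, 0.48, 0.94),
}
approx_p = {name: wald_p_from_ci(*row) for name, row in table2.items()}

for name, p_value in approx_p.items():
    print(f"{name}: approximate P = {p_value:.3f}")
```

An interval that excludes 1.0 corresponds to an approximate P value below .05, which is why childhood sexual trauma and unmarried status are flagged in the table while the terrorist-act row is not.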
Discussion
Childhood sexual abuse is reliably associated with a decreased likelihood of cervical cancer screening. This association persisted despite controlling for demographic characteristics, attitudes about Pap screening, and PTSD symptoms. These findings are strengthened by the consistency with which childhood sexual abuse is associated with low rates of Pap screening using 2 measures of trauma in 3 clinics. Although cost has been a major barrier to access in previous studies of cervical cancer screening, it is not a barrier for women who are members of a pre-paid health plan. It was therefore possible for us to investigate known and suspected barriers to cervical cancer screening with fewer confounding co-variables.
This study clarifies the role of childhood sexual assault in Pap screening. Sexual assault, but not other traumatic events or other types of childhood abuse, is associated with lower rates of cervical cancer screening. Furthermore, sexual assault during childhood, but not during adulthood, is strongly associated with decreased Pap screening.
The relationship between childhood sexual abuse and Pap screening is particularly disturbing because women who were sexually assaulted as children are more likely to develop cervical dysplasia.36 Women who were sexually assaulted in childhood also tend to begin sexual activity at a young age and have more sexual partners.15,16,36 These are among the primary risk factors for human papillomavirus (HPV),37 an important cause of cervical cancer,38,39 and for cervical cancer.7 Women who were sexually abused in childhood are at increased risk of sexually transmitted disease,15,40 and HPV is the most common sexually transmitted viral disease.38 Therefore, women at higher risk for cervical cancer may be the same women who are least likely to be screened. Childhood sexual abuse may increase cervical cancer morbidity by reducing the probability of Pap screening, and by increasing the probability of disease. It may also decrease the likelihood that these women visit their physician for other routine health maintenance needs.
The low response rate in this study may have resulted from the questionnaire’s being sent to KP members only once, without follow-up. Our response rate was comparable to that of a similar study of HMO members.16 Use of a mailed questionnaire probably resulted in underestimation of childhood sexual abuse prevalence.41 The relationship of sexual abuse to preventive health behaviors is comparable to that reported in studies with higher response rates.13,17
There is some evidence that the interpersonal climate between patient and clinician affects health outcomes,42 and we suspect it is a critical factor in increasing women’s comfort with Pap screening. One of our respondents commented: “I’ve always been treated professionally by my gynecologist and yet I still feel the need for the reassuring presence of a nurse during this procedure. I have asked the nurse to hold my hand during the test to calm me down. I find the hand holding or even her hand on my arm comforting.”
The most consistent predictor of cancer screening among women aged 40 and over was a health maintenance visit or regular source of care.43,44 Not having cervical cancer screening may be a marker for childhood sexual abuse. Therefore, health care providers should consider inquiring about a history of sexual abuse with women who do not follow guidelines for routine Pap screening. It is crucial to develop interventions that will lead to routine medical visits for women who have experienced sexual violence. As part of this process, we recommend education for physicians and other health care providers regarding sexual violence against women.
· Acknowledgments ·
Larry Walter, MA, and Sujaya Parthasarathy, PhD, of the Kaiser Permanente Division of Research in Oakland, California, contributed to our obtaining the random sample of women health plan members in this study. Howard Barkan, DrPH, helped design this project and participated in the data collection. We thank him for his insight and expertise.
1. American Cancer Society. Statistics: Table 3C.Pap Test, Women 18 and Older, by State, 1997 [website]. In; http://www3.cancer.org/cancerinfo/sitecenter.asp?ct=1&ctid=8&scp=8.3.8.42080&scs=4&scss=16&scdoc=42096&pnt=2&language=english [accessed 2001, 2/27], 2000.
2. American Cancer Society. Statistics: Cervical cancer [website]. In http://www3.cancer.org/cancerinfo/sitecenter.asp?ct=1&ctid=8&scp=8.3.4.4071&scs=4&scss=2&scdoc=42073&pnt=2&language=english [accessed 2001, 2/27]; 2000.
3. Hayward RA, Shapiro MF, Freeman HE, Corey CR. Who gets screened for cervical and breast cancer? Results from a new national survey. Arch Intern Med 1988;148:1177-81.
4. Breen N, Figueroa JB. Stage of breast and cervical cancer diagnosis in disadvantaged neighborhoods: A prevention policy perspective. Am J Prev Med 1996;12(5):319-26.
5. Calle EE, Flanders WD, Thun MJ, Martin LM. Demographic predictors of mammography and Pap smear screening in US women. Am J Public Health 1993;83:53-60.
6. Peters RK, Bear MB, Thomas D. Barriers to screening for cancer of the cervix. Prev Med 1989;18:133-46.
7. Womeodu RJ, Bailey JE. Barriers to cancer screening. Med Clin North Am 1996;80(1):115-33.
8. Suarez L. Pap smear and mammogram screening in Mexican-American women: the effects of acculturation. Am J Public Health 1994;84:742-6.
9. Tang TW, Solomon LJ, Yeh CJ, Worden JK. The role of cultural variables in breast self-examination and cervical cancer screening behavior in young Asian women living in the United States. J Behav Med 1999;22(5):419-36.
10. Golding JM. Sexual assault history and physical health in randomly selected Los Angeles women. Health Psychol 1994;13:130-8.
11. Golding JM. Sexual assault history and women’s reproductive and sexual health. Psychol of Women Quarterly 1996;20:101-21.
12. Golding JM. Sexual assault history and long-term physical health: Evidence from clinical and population epidemiology. Curr Directions in Psychol Sci 1999;8:191-4.
13. Koss MP, Koss PG, Woodruff WJ. Deleterious effects of criminal victimization on women’s health and medical utilization. Arch Intern Med 1991;151:342-7.
14. Laws A. Sexual abuse history and women’s medical problems. J Gen Intern Med 1993;8:441-44.
15. Lechner ME, Vogel ME, Garcia-Shelton LM, Leichter JL, Steibel KR. Self-reported medical problems of adult female survivors of childhood sexual abuse. J Fam Pract 1993;36:633-8.
16. Springs FE, Friedrich WN. Health risk behaviors and medical sequelae of childhood sexual abuse. Mayo Clin Proc 1992;67:527-32.
17. Felitti V, Anda F, Nordenberg D, Williamson, Spitz A, Edwards V, et al. Relationship of childhood abuse and household dysfunction to many of the leading causes of death in adults. Am J Prev Med 1998;14:245-58.
18. Farley M, Minkoff J, Barkan H. Breast cancer screening and trauma history. Women Health in press.
19. Kessler RC, Sonnega A, Bromet E, Hughes M, Nelson CB. Posttraumatic stress disorder in the National Comorbidity Survey. Arch Gen Psychiatry 1995;52:1048-60.
20. Polusny MA, Follette VM. Long-term correlates of child sexual abuse: Theory and review of the empirical literature. Applied and Preventive Psychology 1995;4:143-66.
21. Resnick HS, Kilpatrick DG, Dansky BS, Saunders BE, Best CL. Prevalence of civilian trauma and posttraumatic stress disorder in a representative national sample of women. J Consulting Clin Psychol 1993;61:984-91.
22. Blake DD, Cook JD, Keane TM. Posttraumatic stress disorder and coping in veterans who are seeking medical treatment. J Clin Psychol 1992;48:695-704.
23. Fama LD, Blake DD, Gusman F. Coping and health behaviors in combat-related PTSD inpatients. In: Annual Meeting of the International Society for Traumatic Stress Studies; San Antonio; 1993.
24. Farley M, Barkan H. Somatization, dissociation, and tension-reducing behaviors in psychiatric outpatients. Psychother Psychosom 1997;66:133-40.
25. Wolfe J, Proctor SP, Brown P, Kimerling RD, J., Sullivan M, Chrestman K, et al. Relationship of physical health and posttraumatic stress disorder in young adult women. In: Annual Meeting of the International Society for Traumatic Stress Studies; 1994; Los Angeles; 1994.
26. Kitzinger J. Recalling the pain. Nursing Times 1990 January;17:38-40.
27. Menage J. Women’s perception of obstetric and gynaecological examinations. Br Med J 1993;306:1127-8.
28. Robohm JS, Buttenheim M. The gynecological care experiences of adult survivors of childhood sexual abuse: A preliminary investigation. Women Health 1996;24:59-75.
29. Wahlen SD. Adult survivors of childhood sexual abuse. In: Hendricks-Matthews M, editor. Violence education: Toward a solution. Kansas City, MO: Society of Teachers of Family Medicine; 1992. p. 89-102.
30. Briere J. Methodological issues in the study of sexual abuse effects. J Consulting Clin Psychol 1992;60:196-203.
31. Stamm BH, Varra ME. Instrumentation in the Field of Traumatic Stress. Oswego, NY: Research and Methodology Interest Group of the International Society for Traumatic Stress Studies; 1993.
32. Carlson EB, Briere J. Screening for traumatic experiences and trauma responses in mental health treatment settings. In: International Society for Traumatic Stress Studies; 1999 November 14; Miami, FL; 1999.
33. Bernstein DP, Fink L. Childhood Trauma Questionnaire: A Retrospective Self-Report (Manual). San Antonio, TX: Psychological Corporation; 1998.
34. Weathers FW, Litz BT, Herman DS, Huska JA, Keane TM. The PTSD Checklist (PCL): Reliability, Validity, and Diagnostic Utility. In: 9th Annual Meeting of the International Society for Traumatic Stress Studies; 1993; San Antonio, TX; 1993.
35. The SAS System for Windows. In. 8.02 ed. Cary, NC: SAS Institute; 2001.
36. Coker AL, Patel NJ, Krishnaswami W, Schmidt W, Richter DL. Childhood forced sex and cervical dysplasia among women prison inmates. Violence Against Women 1998;4(5):595-608.
37. Becker TM, Wheeler CM, McGough NS, Parmenter CA, Jordan SW, Stidley CA, et al. Sexually transmitted diseases and other risk factors for cervical dysplasia among southwestern Hispanic and non-Hispanic white women. JAMA 1994;271(15):1181-8.
38. Melnikow J, Nuovo J. Cancer prevention and screening in women. Women’s Health 1997;24(1):15-26.
39. Daling JR, Madeleine MM, McKnight B, Carter JJ, Wipf GC, Ashley R, et al. The relationship of human papillomavirus-related cervical tumors to cigarette smoking, oral contraceptive use, and prior herpes simplex virus type 2 infection. Cancer Epidemiol Biomarkers Prev 1996;5(7):541-8.
40. Plichta SB. Violence and abuse: Implications for women’s health. In: Falk, Collins, editors. Women’s health: The Commonwealth Fund survey. Baltimore, MD: Johns Hopkins University Press; 1996.
41. Peters SD, Wyatt GE, Finkelhor D. Prevalence. In: Finkelhor D, editor. A sourcebook on child sexual abuse. Beverly Hills, CA: Sage; 1986. p. 15-59.
42. DeBlasi Z, Harkness E, Ernst E, Georgiou A, Kleijnen J. Influence of context effects on health outcomes: A systematic review. Lancet 2001;357:757-62.
43. Mandelblatt JS, Gold K, O’Malley AS, Taylor K, Cagney K, Hopkins J, et al. Breast and cervical cancer screening among multiethnic women: Role of age, health and source of care. J Preventive Med 1999;28:418-25.
44. Ruffin MT, Gorenflo DW, Woodman B. Predictors of screening for breast, cervical, colorectal, and prostatic cancer among community-based primary care practices. J Am Board Fam Pract 2000;13:1-10.
- Women who had not had recommended cervical cancer screening were more likely to have been sexually abused in childhood.
- Women who were sexually abused in childhood may be at higher risk than other women for HPV and cervical cancer; therefore, screening is particularly important for these women.
- Not having cervical cancer screening may be a marker for childhood sexual abuse. Therefore, health care providers should consider investigating these issues with women who do not adhere to guidelines for routine Pap smears.
Unfortunately, 15% to 24% of US women do not receive recommended cervical cancer screening.1-3 Barriers to Pap screening include low income, low education, minority status;4 lack of cancer knowledge, attitudes, beliefs, low perceived cancer susceptibility, pain, embarrassment;5-7 language, and certain cultural beliefs.7-9 Sexual trauma has received little research attention as a factor contributing to lowered rates of Pap screening. Sexual trauma is reliably associated with subsequent poor health, which may be partially accounted for by poor preventive care.10-16 Childhood sexual abuse is strongly associated with negative health behaviors such as physical inactivity and smoking.13,17 Sexual violence is associated with lower rates of breast cancer screening18 and increased risk of posttraumatic stress disorder (PTSD).19-21 Avoidant coping styles (an aspect of PTSD) are associated with decreased health promotion behaviors such as screening.22-25
Gynecologic procedures may feel threatening to women with a history of sexual assault, and may be experienced as re-traumatizing.14,26-29 Women who had suffered childhood sexual abuse reported more anxiety, shame, and fear during a gynecologic examination than other women.28 Springs and Friedrich16 found a lower frequency of screening for cervical cancer among adult survivors of childhood sexual abuse, but did not assess the impact of other traumatic events in childhood or adulthood on Pap screening. Because previous research on correlates of sexual trauma has been criticized on the grounds that third variables could account for the observed associations,30 we evaluated associations of any traumatic event with low rates of Pap screening.
We hypothesized that having experienced traumatic events, in particular childhood sexual trauma, would function as barriers to Pap screening. We predicted that women who had not had medically appropriate Pap screening would report a greater number of traumatic events, especially sexual abuse trauma in this ethnically diverse random sample of women. We also expected that sexually traumatized women would express more negative attitudes toward Pap screening, and would be more likely to meet criteria for PTSD, both of which might contribute to lower levels of Pap screening.
Methods
Kaiser Permanente (KP), a pre-paid maintenance organization, offers cervical cancer screening at no cost to patients. KP’s clinical guidelines recommend Pap screening every 2 years for women over age 20 with average risk for cervical cancer. Self-report questionnaires were mailed to an age-stratified random sample of women 21–64 years old who were KP members at 3 locations. Women who had had a total hysterectomy were excluded. We compared women who had and who had not obtained Pap screening in the previous 2 years. In previous research18 we found that women who had not obtained mammography had a lower response rate to mailed questionnaires than women who had been screened. We therefore oversampled women who had not had Pap screening. We mailed questionnaires to 1314 women who had obtained Pap screening and 2897 who had not. The final sample included 364 women who had received screening in the past two years (28% response rate) and 372 who had not (13% response rate). Repeated sampling or telephoning of non-respondents was not allowed by KP policy.
Trauma history was measured in 2 ways. The Trauma History Questionnaire31,32 assesses a range of lifetime traumatic events. The Childhood Trauma Questionnaire33 assesses childhood physical abuse, physical neglect, sexual abuse, emotional abuse, and emotional neglect. PTSD was assessed with the Posttraumatic Stress Disorder Checklist.34 We inquired about attitudes toward Pap screening based on previous findings.
Data were analyzed using SAS.35 Contingency tables were analyzed to estimate the prevalence of traumatic events and their bivariate associations with Pap screening. Chi square analysis was used to evaluate the statistical significance of these associations. Hierarchical logistic regression was used to evaluate associations of traumatic events with screening, independent of clinic location, demographic characteristics, attitudes about screening, and PTSD.
Results
Sample demographics
Women who had been screened for cervical cancer and unscreened women were similar in age and education (Table 1). Unscreened women were more likely to be Asian American, to have incomes of $20,000 per year or less, and to have never been married.
TABLE 1
Demographic characteristics of women with and without Pap screening
Characteristic | No Pap (%) n = 372a | Pap (%) n = 364a | P |
---|---|---|---|
Ethnicity | | | .001 |
African American | 10.1 | 11.6 | |
Asian American | 20.1 | 8.0 | |
European American | 60.6 | 71.8 | |
Other | 9.2 | 8.6 | |
Age | | | .076 |
Mean (standard deviation) | 43.8 (12.8) | 45.5 (12.4) | |
Education | | | .187 |
Elementary school | 2.7 | 1.4 | |
High school | 39.6 | 34.8 | |
College | 41.5 | 43.2 | |
Post-college | 16.9 | 20.6 | |
Family income | | | .002 |
$20,000/year or less | 12.4 | 5.8 | |
$20,001–$50,000 | 41.3 | 38.2 | |
More than $50,000 | 46.2 | 56.0 | |
Marital status | | | .012 |
Never married | 34.3 | 24.8 | |
Married | 47.0 | 59.1 | |
Separated | 1.6 | 0.6 | |
Divorced | 13.8 | 13.1 | |
Widowed | 3.2 | 2.5 | |
a Sample sizes vary slightly because of missing data on individual demographic items.
Prevalence of trauma
Commonly reported events during childhood included natural disaster (reported by 13% of the women), sexual assault other than rape (11%), and news of a death or injury (10%). Childhood sexual abuse or sexual assault was reported by 18.4% of the respondents. The most common traumas in adulthood were receiving news of a death or serious injury (46%), natural disasters (33%), actual or attempted robbery (27%), and serious accidents (14%). Of the respondents, 8.3% reported sexual abuse or sexual assault in adulthood. The overall rate of childhood and adult sexual assault among respondents was 26.7%.
Associations of trauma history with Pap screening
We investigated the association of trauma with screening using chi square analyses. Women who had been raped before age 18 (36% vs. 50%, n = 713, P = .050) and women who had been subjected to other sexual assaults before age 18 (35% vs. 51%, n = 694, P = .009) were less likely to have been screened. Nonsexual childhood abuse and neglect were not related to screening. Women who experienced a natural disaster during childhood (36% vs. 52%, n = 571, P = .009) and those who experienced terrorist acts during adulthood (20% vs. 49%, n = 715, P = .024) were less likely to have been screened. (Although the association with a terrorist act was significant, exposures were reported by only 3% of unscreened women and 0.9% of screened women.) Women who reported a household break-in during adulthood were slightly more likely to have been screened (53% vs. 47%, n = 656, P = .032).
In a hierarchical logistic regression model (Table 2), childhood sexual abuse, but not other traumatic events, was associated with lower odds of screening when clinic location, demographic characteristics, attitudes, and PTSD were controlled. The logistic regression model was repeated using CTQ subscales to assess trauma, with similar results. Unmarried women were less likely than currently married women to have been screened, and Latina, Native American, Asian/Pacific, or multicultural women were less likely than European American women to have been screened. Women who endorsed the statement, “I have no symptoms so I do not need a Pap test” and those who anticipated embarrassment during screening were less likely than others to have been screened; women who believed that testing would ease their mind were more likely to have been screened.
TABLE 2
Hierarchical logistic regression model of sexual trauma and attitudes as predictors of Pap screening
Predictor | Adjusted odds ratio (95% CI) |
---|---|
Traumatic events | |
Break-in (adult) | 1.14 (0.77, 1.70) |
Natural disaster (child) | 0.78 (0.45, 1.38) |
Terrorist act (adult) | 0.28 (0.07, 1.07) |
Childhood sexual trauma | 0.56 (0.34, 0.91) * |
Site | |
Santa Rosa | 0.68 (0.44, 1.04) |
San Francisco | 1.0 (referent) |
Oakland | 1.27 (0.80, 2.02) |
Education | |
Less than college | 1.09 (0.71, 1.69) |
College | 1.0 (referent) |
More than college | 1.01 (0.65, 1.57) |
Ethnicity | |
European-American | 1.0 (referent) |
African American | 0.59 (0.33, 1.06) |
Other than African American or European American | 0.46 (0.29, 0.71) ** |
Unmarried (compared with married) | 0.67 (0.48, 0.94) * |
Attitudes toward Pap screening | |
“I have no symptoms so I do not need a Pap test” | 0.66 (0.51, 0.85) ** |
“I’ve had negative experiences with my health care provider” | 0.90 (0.73, 1.10) |
“Getting a Pap test would ease my mind” | 1.54 (1.25, 1.89) *** |
“There is danger of infection from a Pap test” | 1.09 (0.83, 1.43) |
“I do not trust the health care system” | 1.06 (0.81, 1.39) |
“I would be embarrassed to have a Pap test” | 0.67 (0.52, 0.84) *** |
“Women who have many sexual partners are more likely to have cervical cancer” | 0.88 (0.73, 1.06) |
“Pap would cause sexual assault flashbacks, or health care provider looks at me in a sexual way” | 1.05 (0.77, 1.45) |
PTSD diagnosis | 1.62 (0.91, 2.90) |
Missing data | 0.96 (0.80, 1.13) |
*P |
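One way to read Table 2 is to check whether each 95% confidence interval excludes 1.0, which corresponds to significance at the .05 level. The sketch below applies that check to a few rows copied from the table. It also backs out an approximate standard error on the log-odds scale, under the assumption (not stated in the article) that the intervals are Wald-type intervals, symmetric on the log scale. The names used are illustrative.

```python
import math

# (odds ratio, lower 95% limit, upper 95% limit), copied from Table 2.
table2 = {
    "Childhood sexual trauma": (0.56, 0.34, 0.91),
    "Terrorist act (adult)": (0.28, 0.07, 1.07),
    "Unmarried (compared with married)": (0.67, 0.48, 0.94),
    "Getting a Pap test would ease my mind": (1.54, 1.25, 1.89),
    "PTSD diagnosis": (1.62, 0.91, 2.90),
}


def ci_excludes_one(lower, upper):
    """True when the interval lies entirely below or entirely above 1.0."""
    return lower > 1.0 or upper < 1.0


def approx_log_se(lower, upper, z=1.96):
    """Approximate SE of log(OR), assuming a Wald interval (an assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


summary = {
    name: (ci_excludes_one(lo, hi), approx_log_se(lo, hi))
    for name, (_, lo, hi) in table2.items()
}

for name, (significant, se) in summary.items():
    flag = "excludes 1.0" if significant else "includes 1.0"
    print(f"{name}: CI {flag}; approx. SE(log OR) = {se:.2f}")
```

The rows flagged as excluding 1.0 are the same rows that carry asterisks in the table, while the terrorist act and PTSD rows do not, which is consistent with the text's statement that only childhood sexual abuse among the traumatic events remained associated with screening.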
Discussion
Childhood sexual abuse is reliably associated with a decreased likelihood of cervical cancer screening. This association persisted despite controlling for demographic characteristics, attitudes about Pap screening, and PTSD symptoms. These findings are strengthened by the consistency with which childhood sexual abuse is associated with low rates of Pap screening using 2 measures of trauma in 3 clinics. Although cost has been a major barrier to access in previous studies of cervical cancer screening, it is not a barrier for women who are members of a pre-paid health plan. It was therefore possible for us to investigate known and suspected barriers to cervical cancer screening with fewer confounding co-variables.
This study clarifies the role of childhood sexual assault in Pap screening. Sexual assault, but not other traumatic events or other types of childhood abuse, is associated with lower rates of cervical cancer screening. Furthermore, sexual assault during childhood, but not during adulthood, is strongly associated with decreased Pap screening.
The relationship between childhood sexual abuse and Pap screening is particularly disturbing because women who were sexually assaulted as children are more likely to develop cervical dysplasia.36 Women who were sexually assaulted in childhood also tend to begin sexual activity at a young age and have more sexual partners.15,16,36 These are among the primary risk factors for human papillomavirus (HPV),37 an important cause of cervical cancer,38,39 and for cervical cancer.7 Women who were sexually abused in childhood are at increased risk of sexually transmitted disease,15,40 and HPV is the most common sexually transmitted viral disease.38 Therefore, women at higher risk for cervical cancer may be the same women who are least likely to be screened. Childhood sexual abuse may increase cervical cancer morbidity by reducing the probability of Pap screening, and by increasing the probability of disease. It may also decrease the likelihood that these women visit their physician for other routine health maintenance needs.
The low response rate in this study may have resulted from the questionnaire’s being sent to KP members only once, without follow-up. Our response rate was comparable to that of a similar study of HMO members.16 Use of a mailed questionnaire probably resulted in underestimation of the prevalence of childhood sexual abuse.41 The relationship of sexual abuse to preventive health behaviors that we observed is comparable to that reported in studies with higher response rates.13,17
There is some evidence that the interpersonal climate between patient and clinician affects health outcomes,42 and we suspect it is a critical factor in increasing women’s comfort with Pap screening. One of our respondents commented: “I’ve always been treated professionally by my gynecologist and yet I still feel the need for the reassuring presence of a nurse during this procedure. I have asked the nurse to hold my hand during the test to calm me down. I find the hand holding or even her hand on my arm comforting.”
The most consistent predictor of cancer screening among women aged 40 and over was a health maintenance visit or regular source of care.43,44 Not having cervical cancer screening may be a marker for childhood sexual abuse. Therefore, health care providers should consider inquiring about a history of sexual abuse with women who do not follow guidelines for routine Pap screening. It is crucial to develop interventions that will lead to routine medical visits for women who have experienced sexual violence. As part of this process, we recommend education for physicians and other health care providers regarding sexual violence against women.
· Acknowledgments ·
Larry Walter, MA, and Sujaya Parthasarathy, PhD, of the Kaiser Permanente Division of Research in Oakland, California, contributed to our obtaining the random sample of women health plan members in this study. Howard Barkan, DrPH, helped design this project and participated in the data collection. We thank him for his insight and expertise.
- Women who had not had recommended cervical cancer screening were more likely to have been sexually abused in childhood.
- Women who were sexually abused in childhood may be at higher risk than other women for HPV and cervical cancer; therefore, screening is particularly important for these women.
- Not having cervical cancer screening may be a marker for childhood sexual abuse. Therefore, health care providers should consider investigating these issues with women who do not adhere to guidelines for routine Pap smears.
Unfortunately, 15% to 24% of US women do not receive recommended cervical cancer screening.1-3 Barriers to Pap screening include low income, low education, minority status;4 lack of cancer knowledge, attitudes, beliefs, low perceived cancer susceptibility, pain, embarrassment;5-7 language, and certain cultural beliefs.7-9 Sexual trauma has received little research attention as a factor contributing to lowered rates of Pap screening. Sexual trauma is reliably associated with subsequent poor health, which may be partially accounted for by poor preventive care.10-16 Childhood sexual abuse is strongly associated with negative health behaviors such as physical inactivity and smoking.13,17 Sexual violence is associated with lower rates of breast cancer screening18 and increased risk of posttraumatic stress disorder (PTSD).19-21 Avoidant coping styles (an aspect of PTSD) are associated with decreased health promotion behaviors such as screening.22-25
1. American Cancer Society. Statistics: Table 3C. Pap Test, Women 18 and Older, by State, 1997 [website]. http://www3.cancer.org/cancerinfo/sitecenter.asp?ct=1&ctid=8&scp=8.3.8.42080&scs=4&scss=16&scdoc=42096&pnt=2&language=english [accessed 2001, 2/27]; 2000.
2. American Cancer Society. Statistics: Cervical cancer [website]. http://www3.cancer.org/cancerinfo/sitecenter.asp?ct=1&ctid=8&scp=8.3.4.4071&scs=4&scss=2&scdoc=42073&pnt=2&language=english [accessed 2001, 2/27]; 2000.
3. Hayward RA, Shapiro MF, Freeman HE, Corey CR. Who gets screened for cervical and breast cancer? Results from a new national survey. Arch Intern Med 1988;148:1177-81.
4. Breen N, Figueroa JB. Stage of breast and cervical cancer diagnosis in disadvantaged neighborhoods: A prevention policy perspective. Am J Prev Med 1996;12(5):319-26.
5. Calle EE, Flanders WD, Thun MJ, Martin LM. Demographic predictors of mammography and Pap smear screening in US women. Am J Public Health 1993;83:53-60.
6. Peters RK, Bear MB, Thomas D. Barriers to screening for cancer of the cervix. Prev Med 1989;18:133-46.
7. Womeodu RJ, Bailey JE. Barriers to cancer screening. Med Clin North Am 1996;80(1):115-33.
8. Suarez L. Pap smear and mammogram screening in Mexican-American women: the effects of acculturation. Am J Public Health 1994;84:742-6.
9. Tang TW, Solomon LJ, Yeh CJ, Worden JK. The role of cultural variables in breast self-examination and cervical cancer screening behavior in young Asian women living in the United States. J Behav Med 1999;22(5):419-36.
10. Golding JM. Sexual assault history and physical health in randomly selected Los Angeles women. Health Psychol 1994;13:130-8.
11. Golding JM. Sexual assault history and women’s reproductive and sexual health. Psychol of Women Quarterly 1996;20:101-21.
12. Golding JM. Sexual assault history and long-term physical health: Evidence from clinical and population epidemiology. Curr Directions in Psychol Sci 1999;8:191-4.
13. Koss MP, Koss PG, Woodruff WJ. Deleterious effects of criminal victimization on women’s health and medical utilization. Arch Intern Med 1991;151:342-7.
14. Laws A. Sexual abuse history and women’s medical problems. J Gen Intern Med 1993;8:441-44.
15. Lechner ME, Vogel ME, Garcia-Shelton LM, Leichter JL, Steibel KR. Self-reported medical problems of adult female survivors of childhood sexual abuse. J Fam Pract 1993;36:633-8.
16. Springs FE, Friedrich WN. Health risk behaviors and medical sequelae of childhood sexual abuse. Mayo Clin Proc 1992;67:527-32.
17. Felitti VJ, Anda RF, Nordenberg D, Williamson DF, Spitz AM, Edwards V, et al. Relationship of childhood abuse and household dysfunction to many of the leading causes of death in adults. Am J Prev Med 1998;14:245-58.
18. Farley M, Minkoff J, Barkan H. Breast cancer screening and trauma history. Women Health in press.
19. Kessler RC, Sonnega A, Bromet E, Hughes M, Nelson CB. Posttraumatic stress disorder in the National Comorbidity Survey. Arch Gen Psychiatry 1995;52:1048-60.
20. Polusny MA, Follette VM. Long-term correlates of child sexual abuse: Theory and review of the empirical literature. Applied and Preventive Psychology 1995;4:143-66.
21. Resnick HS, Kilpatrick DG, Dansky BS, Saunders BE, Best CL. Prevalence of civilian trauma and posttraumatic stress disorder in a representative national sample of women. J Consulting Clin Psychol 1993;61:984-91.
22. Blake DD, Cook JD, Keane TM. Posttraumatic stress disorder and coping in veterans who are seeking medical treatment. J Clin Psychol 1992;48:695-704.
23. Fama LD, Blake DD, Gusman F. Coping and health behaviors in combat-related PTSD inpatients. In: Annual Meeting of the International Society for Traumatic Stress Studies; San Antonio; 1993.
24. Farley M, Barkan H. Somatization, dissociation, and tension-reducing behaviors in psychiatric outpatients. Psychother Psychosom 1997;66:133-40.
25. Wolfe J, Proctor SP, Brown P, Kimerling RD, J., Sullivan M, Chrestman K, et al. Relationship of physical health and posttraumatic stress disorder in young adult women. In: Annual Meeting of the International Society for Traumatic Stress Studies; 1994; Los Angeles; 1994.
26. Kitzinger J. Recalling the pain. Nursing Times 1990 January;17:38-40.
27. Menage J. Women’s perception of obstetric and gynaecological examinations. Br Med J 1993;306:1127-8.
28. Robohm JS, Buttenheim M. The gynecological care experiences of adult survivors of childhood sexual abuse: A preliminary investigation. Women Health 1996;24:59-75.
29. Wahlen SD. Adult survivors of childhood sexual abuse. In: Hendricks-Matthews M, editor. Violence education: Toward a solution. Kansas City, MO: Society of Teachers of Family Medicine; 1992. p. 89-102.
30. Briere J. Methodological issues in the study of sexual abuse effects. J Consulting Clin Psychol 1992;60:196-203.
31. Stamm BH, Varra ME. Instrumentation in the Field of Traumatic Stress. Oswego, NY: Research and Methodology Interest Group of the International Society for Traumatic Stress Studies; 1993.
32. Carlson EB, Briere J. Screening for traumatic experiences and trauma responses in mental health treatment settings. In: International Society for Traumatic Stress Studies; 1999 November 14; Miami, FL; 1999.
33. Bernstein DP, Fink L. Childhood Trauma Questionnaire: A Retrospective Self-Report (Manual). San Antonio, TX: Psychological Corporation; 1998.
34. Weathers FW, Litz BT, Herman DS, Huska JA, Keane TM. The PTSD Checklist (PCL): Reliability, Validity, and Diagnostic Utility. In: 9th Annual Meeting of the International Society for Traumatic Stress Studies; 1993; San Antonio, TX; 1993.
35. The SAS System for Windows. 8.02 ed. Cary, NC: SAS Institute; 2001.
36. Coker AL, Patel NJ, Krishnaswami W, Schmidt W, Richter DL. Childhood forced sex and cervical dysplasia among women prison inmates. Violence Against Women 1998;4(5):595-608.
37. Becker TM, Wheeler CM, McGough NS, Parmenter CA, Jordan SW, Stidley CA, et al. Sexually transmitted diseases and other risk factors for cervical dysplasia among southwestern Hispanic and non-Hispanic white women. JAMA 1994;271(15):1181-8.
38. Melnikow J, Nuovo J. Cancer prevention and screening in women. Women’s Health 1997;24(1):15-26.
39. Daling JR, Madeleine MM, McKnight B, Carter JJ, Wipf GC, Ashley R, et al. The relationship of human papillomavirus-related cervical tumors to cigarette smoking, oral contraceptive use, and prior herpes simplex virus type 2 infection. Cancer Epidemiol Biomarkers Prev 1996;5(7):541-8.
40. Plichta SB. Violence and abuse: Implications for women’s health. In: Falk, Collins, editors. Women’s health: The Commonwealth Fund survey. Baltimore, MD: Johns Hopkins University Press; 1996.
41. Peters SD, Wyatt GE, Finkelhor D. Prevalence. In: Finkelhor D, editor. A sourcebook on child sexual abuse. Beverly Hills, CA: Sage; 1986. p. 15-59.
42. DeBlasi Z, Harkness E, Ernst E, Georgiou A, Kleijnen J. Influence of context effects on health outcomes: A systematic review. Lancet 2001;357:757-62.
43. Mandelblatt JS, Gold K, O’Malley AS, Taylor K, Cagney K, Hopkins J, et al. Breast and cervical cancer screening among multiethnic women: Role of age, health and source of care. J Preventive Med 1999;28:418-25.
44. Ruffin MT, Gorenflo DW, Woodman B. Predictors of screening for breast, cervical, colorectal, and prostatic cancer among community-based primary care practices. J Am Board Fam Pract 2000;13:1-10.
18. Farley M, Minkoff J, Barkan H. Breast cancer screening and trauma history. Women Health in press.
19. Kessler RC, Sonnega A, Bromet E, Hughes M, Nelson CB. Posttraumatic stress disorder in the National Comorbidity Survey. Arch Gen Psychiatry 1995;52:1048-60.
20. Polusny MA, Follette VM. Long-term correlates of child sexual abuse: Theory and review of the empirical literature. Applied and Preventive Psychology 1995;4:143-66.
21. Resnick HS, Kilpatrick DG, Dansky BS, Saunders BE, Best CL. Prevalence of civilian trauma and posttraumatic stress disorder in a representative national sample of women. J Consulting Clin Psychol 1993;61:984-91.
22. Blake DD, Cook JD, Keane TM. Posttraumatic stress disorder and coping in veterans who are seeking medical treatment. J Clin Psychol 1992;48:695-704.
23. Fama LD, Blake DD, Gusman F. Coping and health behaviors in combat-related PTSD inpatients. In: Annual Meeting of the International Society for Traumatic Stress Studies; San Antonio; 1993.
24. Farley M, Barkan H. Somatization, dissociation, and tension-reducing behaviors in psychiatric outpatients. Psychother Psychosom 1997;66:133-40.
25. Wolfe J, Proctor SP, Brown P, Kimerling RD, J., Sullivan M, Chrestman K, et al. Relationship of physical health and posttraumatic stress disorder in young adult women. In: Annual Meeting of the International Society for Traumatic Stress Studies; 1994; Los Angeles; 1994.
26. Kitzinger J. Recalling the pain. Nursing Times 1990 January;17:38-40.
27. Menage J. Women’s perception of obstetric and gynaecological examinations. Br Med J 1993;306:1127-8.
28. Robohm JS, Buttenheim M. The gynecological care experiences of adult survivors of childhood sexual abuse: A preliminary investigation. Women Health 1996;24:59-75.
29. Wahlen SD. Adult survivors of childhood sexual abuse. In: Hendricks-Matthews M, editor. Violence education: Toward a solution. Kansas City, MO: Society of Teachers of Family Medicine; 1992. p. 89-102.
30. Briere J. Methodological issues in the study of sexual abuse effects. J Consulting Clin Psychol 1992;60:196-203.
31. Stamm BH, Varra ME. Instrumentation in the Field of Traumatic Stress. Oswego, NY: Research and Methodology Interest Group of the International Society for Traumatic Stress Studies; 1993.
32. Carlson EB, Briere J. Screening for traumatic experiences and trauma responses in mental health treatment settings. In: International Society for Traumatic Stress Studies; 1999 November 14; Miami, FL; 1999.
33. Bernstein DP, Fink L. Childhood Trauma Questionnaire: A Retrospective Self-Report (Manual). San Antonio, TX: Psychological Corporation; 1998.
34. Weathers FW, Litz BT, Herman DS, Huska JA, Keane TM. The PTSD Checklist (PCL): Reliability, Validity, and Diagnostic Utility. In: 9th Annual Meeting of the International Society for Traumatic Stress Studies; 1993; San Antonio, TX; 1993.
35. The SAS System for Windows. In. 8.02 ed. Cary, NC: SAS Institute; 2001.
36. Coker AL, Patel NJ, Krishnaswami W, Schmidt W, Richter DL. Childhood forced sex and cervical dysplasia among women prison inmates. Violence Against Women 1998;4(5):595-608.
37. Becker TM, Wheeler CM, McGough NS, Parmenter CA, Jordan SW, Stidley CA, et al. Sexually transmitted diseases and other risk factors for cervical dysplasia among southwestern Hispanic and non-Hispanic white women. JAMA 1994;271(15):1181-8.
38. Melnikow J, Nuovo J. Cancer prevention and screening in women. Women’s Health 1997;24(1):15-26.
39. Daling JR, Madeleine MM, McKnight B, Carter JJ, Wipf GC, Ashley R, et al. The relationship of human papillomavirus-related cervical tumors to cigarette smoking, oral contraceptive use, and prior herpes simplex virus type 2 infection. Cancer Epidemiol Biomarkers Prev 1996;5(7):541-8.
40. Plichta SB. Violence and abuse: Implications for women’s health. In: Falk, Collins, editors. Women’s health: The Commonwealth Fund survey. Baltimore, MD: Johns Hopkins University Press; 1996.
41. Peters SD, Wyatt GE, Finkelhor D. Prevalence. In: Finkelhor D, editor. A sourcebook on child sexual abuse. Beverly Hills, CA: Sage; 1986. p. 15-59.
42. DeBlasi Z, Harkness E, Ernst E, Georgiou A, Kleijnen J. Influence of context effects on health outcomes: A systematic review. Lancet 2001;357:757-62.
43. Mandelblatt JS, Gold K, O’Malley AS, Taylor K, Cagney K, Hopkins J, et al. Breast and cervical cancer screening among multiethnic women: Role of age, health and source of care. J Preventive Med 1999;28:418-25.
44. Ruffin MT, Gorenflo DW, Woodman B. Predictors of screening for breast, cervical, colorectal, and prostatic cancer among community-based primary care practices. J Am Board Fam Pract 2000;13:1-10.