A Concise Tool for Measuring Care Coordination from the Provider’s Perspective in the Hospital Setting
Care Coordination has been defined as “…the deliberate organization of patient care activities between two or more participants (including the patient) involved in a patient’s care to facilitate the appropriate delivery of healthcare services.”1 The Institute of Medicine identified care coordination as a key strategy to improve the American healthcare system,2 and evidence has been building that well-coordinated care improves patient outcomes and reduces healthcare costs associated with chronic conditions.3-5 In 2012, Johns Hopkins Medicine was awarded a Healthcare Innovation Award by the Centers for Medicare & Medicaid Services to improve coordination of care across the continuum of care for adult patients admitted to Johns Hopkins Hospital (JHH) and Johns Hopkins Bayview Medical Center (JHBMC), and for high-risk low-income Medicare and Medicaid beneficiaries receiving ambulatory care in targeted zip codes. The purpose of this project, known as the Johns Hopkins Community Health Partnership (J-CHiP), was to improve health and healthcare and to reduce healthcare costs. The acute care component of the program consisted of a bundle of interventions focused on improving coordination of care for all patients, including a “bridge to home” discharge process, as they transitioned back to the community from inpatient admission. 
The bundle included the following: early screening for discharge planning to predict needed postdischarge services; discussion in daily multidisciplinary rounds about goals and priorities of the hospitalization and potential postdischarge needs; patient and family self-care management education; enhanced medication management, including the option of “medications in hand” at the time of discharge; postdischarge telephone follow-up by nurses; and, for patients identified as high-risk, a “transition guide” (a nurse who works with the patient via home visits and by phone to optimize compliance with care for 30 days postdischarge).6 While the primary endpoints of the J-CHiP program were to improve clinical outcomes and reduce healthcare costs, we were also interested in the impact of the program on care coordination processes in the acute care setting. This created the need for an instrument to measure healthcare professionals’ views of care coordination in their immediate work environments.
We began our search for existing measures by reviewing the Coordination Measures Atlas published in 2014.7 Although this report evaluates over 80 different measures of care coordination, most of them focus on the perspective of the patient and/or family members, on specific conditions, and on primary care or outpatient settings.7,8 We were unable to identify an existing measure from the provider perspective, designed for the inpatient setting, that was both brief and comprehensive enough to cover a range of care coordination domains.8
Consequently, our first aim was to develop a brief, comprehensive tool to measure care coordination from the perspective of hospital inpatient staff that could be used to compare different units or types of providers, or to conduct longitudinal assessment. The second aim was to conduct a preliminary evaluation of the tool in our healthcare setting, including to assess its psychometric properties, to describe provider perceptions of care coordination after the implementation of J-CHiP, and to explore potential differences among departments, types of professionals, and between the 2 hospitals.
METHODS
Development of the Care Coordination Questionnaire
The survey was developed in collaboration with leaders of the J-CHiP Acute Care Team. We met at the outset and on multiple subsequent occasions to align survey domains with the main components of the J-CHiP acute care intervention and to ensure that the survey would be relevant and understandable to professionals from a variety of disciplines, including physicians, nurses, social workers, physical therapists, and other health professionals. Care was taken to avoid redundancy with existing evaluation efforts and to minimize respondent burden. This process helped to ensure the content validity of the items, the usefulness of the results, and the future usability of the tool.
We modeled the Care Coordination Questionnaire (CCQ) after the Safety Attitudes Questionnaire (SAQ),9 a widely used survey that is deployed approximately annually at JHH and JHBMC. While the SAQ focuses on healthcare provider attitudes about issues relevant to patient safety (often referred to as safety climate or safety culture), this new tool was designed to focus on healthcare professionals’ attitudes about care coordination. Similar to the way that the SAQ “elicits a snapshot of the safety climate through surveys of frontline worker perceptions,” we sought to elicit a picture of our care coordination climate through a survey of frontline hospital staff.
The CCQ was built upon the domains and approaches to care coordination described in the Agency for Healthcare Research and Quality Care Coordination Atlas.7 This report identifies 9 mechanisms for achieving care coordination: Establish Accountability or Negotiate Responsibility; Communicate; Facilitate Transitions; Assess Needs and Goals; Create a Proactive Plan of Care; Monitor, Follow Up, and Respond to Change; Support Self-Management Goals; Link to Community Resources; and Align Resources with Patient and Population Needs. It also identifies 5 broad approaches commonly used to improve the delivery of healthcare: Teamwork Focused on Coordination, Healthcare Home, Care Management, Medication Management, and Health IT-Enabled Coordination.7 We generated at least 1 item to represent 8 of the 9 domains, as well as the broad approach described as Teamwork Focused on Coordination. After developing an initial set of items, we sought input from 3 senior leaders of the J-CHiP Acute Care Team to determine whether the items covered the care coordination domains of interest, and to obtain feedback on content validity. To test the interpretability of survey items and consistency across professional groups, we sent an initial version of the survey questions to at least 1 person from each of the following professional groups: hospitalist, social worker, case manager, clinical pharmacist, and nurse. We asked them to review all of our survey questions and to provide us with feedback on all aspects of the questions, such as whether they believed the questions were relevant and understandable to the members of their professional discipline, the appropriateness of the wording of the questions, and other comments. Modifications were made to the content and wording of the questions based on the feedback received.
The final draft of the questionnaire was reviewed by the leadership team of the J-CHiP Acute Care Team to ensure its usefulness in providing actionable information.
The resulting 12-item questionnaire used a 5-point Likert response scale ranging from 1 = “disagree strongly” to 5 = “agree strongly,” and an additional option of “not applicable (N/A).” To help assess construct validity, a global question was added at the end of the questionnaire asking, “Overall, how would you rate the care coordination at the hospital of your primary work setting?” The response was measured on a 10-point Likert-type scale ranging from 1 = “totally uncoordinated care” to 10 = “perfectly coordinated care” (see Appendix). In addition, the questionnaire requested information about the respondents’ gender, position, and their primary unit, department, and hospital affiliation.
Data Collection Procedures
An invitation to complete an anonymous questionnaire was sent to the following inpatient care professionals: all nursing staff working on care coordination units in the departments of medicine, surgery, and neurology/neurosurgery, as well as physicians, pharmacists, acute care therapists (eg, occupational and physical therapists), and other frontline staff. All healthcare staff fitting these criteria were sent an e-mail with a request to fill out the survey online using Qualtrics™ (Qualtrics Labs Inc., Provo, UT), followed by multiple reminders. The participants worked either at the JHH (a 1194-bed tertiary academic medical center in Baltimore, MD) or the JHBMC (a 440-bed academic community hospital located nearby). Data were collected from October 2015 through January 2016.
Analysis
Means and standard deviations were calculated by treating the responses as continuous variables. We tried 3 different methods to handle missing data: (1) without imputation, (2) imputing the mean value of each item, and (3) substituting a neutral score. Because all 3 methods produced very similar results, we treated the N/A responses as missing values without imputation for simplicity of analysis. We used STATA 13.1 (Stata Corporation, College Station, Texas) to analyze the data.
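The three ways of handling N/A responses described above can be illustrated with a minimal Python sketch. This is not the authors' Stata code. The function name, the example responses, and the choice of 3 (the scale midpoint) as the "neutral score" are assumptions made for illustration only.

```python
from statistics import mean

NEUTRAL = 3  # assumed: midpoint of the 1-5 scale stands in for the "neutral score"

def item_means(responses, strategy):
    """Mean of each item under one N/A-handling strategy.

    responses: list of respondent rows; each row holds item scores (1-5)
               or None where the respondent chose N/A.
    strategy:  "none" (no imputation), "item_mean", or "neutral".
    """
    n_items = len(responses[0])
    result = []
    for j in range(n_items):
        observed = [row[j] for row in responses if row[j] is not None]
        n_missing = len(responses) - len(observed)
        if strategy == "none":
            values = observed
        elif strategy == "item_mean":
            values = observed + [mean(observed)] * n_missing
        elif strategy == "neutral":
            values = observed + [NEUTRAL] * n_missing
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        result.append(mean(values))
    return result

# Hypothetical responses from 3 people to 2 items; None marks an N/A answer.
example = [[5, 5], [4, None], [3, 4]]
for strategy in ("none", "item_mean", "neutral"):
    print(strategy, item_means(example, strategy))
```

Imputing the item mean leaves the item mean unchanged by construction, so only the neutral-score option can shift it. With few N/A responses, all three approaches would be expected to give similar results.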
To identify subscales, we performed exploratory factor analysis on responses to the 12 specific items. Promax rotation was selected based on the simple structure. Subscale scores for each respondent were generated by computing the mean of responses to the items in the subscale. Internal consistency reliability of the subscales was estimated using Cronbach’s alpha. We calculated Pearson correlation coefficients for the items in each subscale, and examined Cronbach’s alpha deleting each item in turn. For each of the subscales identified and the global scale, we calculated the mean, standard deviation, median and interquartile range. Although distributions of scores tended to be non-normal, this was done to increase interpretability. We also calculated percent scoring at the ceiling (highest possible score).
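As an illustration of the subscale statistics described above, the following Python sketch computes a respondent's subscale score, Cronbach's alpha, alpha with each item deleted in turn, and the percent scoring at the ceiling. It uses the standard formula alpha = k/(k-1) × (1 − Σ item variances / variance of the total score). The function names and the restriction to complete cases are assumptions, not a description of the authors' Stata procedure.

```python
from statistics import mean, variance

def subscale_score(row):
    """Mean of the answered items in a subscale (None marks N/A)."""
    answered = [x for x in row if x is not None]
    return mean(answered) if answered else None

def cronbach_alpha(rows):
    """Cronbach's alpha for complete-case rows (one list of item scores each)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_if_deleted(rows):
    """Alpha recomputed with each item removed in turn (needs 3 or more items)."""
    k = len(rows[0])
    return [
        cronbach_alpha([[x for i, x in enumerate(r) if i != j] for r in rows])
        for j in range(k)
    ]

def percent_at_ceiling(scores, ceiling=5):
    """Percent of non-missing scores equal to the highest possible score."""
    valid = [s for s in scores if s is not None]
    return 100 * sum(1 for s in valid if s == ceiling) / len(valid)
```

An item whose removal raises alpha noticeably may fit poorly with the rest of its subscale, which is the usual reason for examining alpha with each item deleted.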
We analyzed the data with 3 research questions in mind: Was there a difference in perceptions of care coordination between (1) staff affiliated with the 2 different hospitals, (2) staff affiliated with different clinical departments, or (3) staff with different professional roles? For comparisons based on hospital, department, and type of professional, nonparametric tests (Wilcoxon rank-sum and Kruskal-Wallis tests) were used, with the level of statistical significance set at 0.05. The comparisons between hospitals and between departments were made only among nurses to minimize the confounding effect of differing distributions of professional types. We tested the distribution of “years in specialty” between hospitals and departments for this comparison using Pearson’s χ² test. The difference was not statistically significant (P = 0.167 for hospitals, and P = 0.518 for departments), so we assumed that the potential confounding effect of this variable was negligible in this analysis. The comparison of scores within each professional group used the Friedman test. Pearson’s χ² test was used to compare the baseline characteristics between the 2 hospitals.
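To make the rank-based comparison concrete, the sketch below computes the Kruskal-Wallis H statistic in plain Python, using midranks for tied scores and the usual tie correction. It is an illustrative reimplementation under stated assumptions (hypothetical function names and example data), not the authors' Stata analysis. In practice a statistics package would also supply the P value.

```python
from collections import Counter

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of scores.

    Assumes the pooled scores are not all identical (otherwise the
    tie correction divides by zero).
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted_rank_sums += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))

# Hypothetical subscale scores for three departments.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

H is compared against a chi-square distribution with (number of groups − 1) degrees of freedom. With 2 groups, the test is equivalent to the Wilcoxon rank-sum test.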
RESULTS
Among the 1486 acute care professionals asked to participate in the survey, 841 completed the questionnaire (response rate 56.6%). Table 1 shows the characteristics of the participants from each hospital. Table 2 summarizes the item response rates, proportion scoring at the ceiling, and weighting from the factor analysis. All items had completion rates of 99.2% or higher, with N/A responses ranging from 0% (item 2) to 3.1% (item 7). The percent scoring at the ceiling was 1.7% for the global item and ranged from 18.3% up to 63.3% for other individual items.
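The reported response rate follows directly from the two counts above, as this short Python check shows (variable names are illustrative):

```python
invited = 1486    # acute care professionals asked to participate
completed = 841   # completed questionnaires

response_rate = 100 * completed / invited
print(f"{response_rate:.1f}%")  # prints 56.6%
```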
We also examined differences in perceptions of care coordination among nursing units to illustrate the tool’s ability to detect variation in Patient Engagement subscale scores for JHH nurses (see Appendix).
DISCUSSION
This study resulted in one of the first measurement tools to succinctly measure multiple aspects of care coordination in the hospital from the perspective of healthcare professionals. Given the hectic work environment of healthcare professionals, and the increasing emphasis on collecting data for evaluation and improvement, it is important to minimize respondent burden. This effort was catalyzed by a multifaceted initiative to redesign acute care delivery and promote seamless transitions of care, supported by the Center for Medicare & Medicaid Innovation. In initial testing, this questionnaire has evidence for reliability and validity. It was encouraging to find that the preliminary psychometric performance of the measure was very similar in 2 different settings of a tertiary academic hospital and a community hospital.
Our analysis of the survey data explored potential differences between the 2 hospitals, among different types of healthcare professionals and across different departments. Although we expected differences, we had no specific hypotheses about what those differences might be, and, in fact, did not observe any substantial differences. This could be taken to indicate that the intervention was uniformly and successfully implemented in both hospitals, and engaged various professionals in different departments. The ability to detect differences in care coordination at the nursing unit level could also prove to be beneficial for more precisely targeting where process improvement is needed. Further data collection and analyses should be conducted to more systematically compare units and to help identify those where practice is most advanced and those where improvements may be needed. It would also be informative to link differences in care coordination scores with patient outcomes. In addition, differences identified on specific domains between professional groups could be helpful to identify where greater efforts are needed to improve interdisciplinary practice. Sampling strategies stratified by provider type would need to be targeted to make this kind of analysis informative.
The consistently lower scores observed for patient engagement, from the perspective of care professionals in all groups, suggest that this is an area where improvement is needed. These findings are consistent with published reports on the common failure by hospitals to include patients as a member of their own care team. In addition to measuring care processes from the perspective of frontline healthcare workers, future evaluations within the healthcare system would also benefit from including data collected from the perspective of the patient and family.
This study had some limitations. First, there may be more than 4 domains of care coordination that are important and can be measured in the acute care setting from the provider perspective. However, the addition of more domains should be balanced against practicality and respondent burden. It may be possible to further clarify priority domains in hospital settings as opposed to the primary care setting. Future research should be directed to find these areas and to develop a more comprehensive, yet still concise, measurement instrument. Second, the tool was developed to measure the impact of a large-scale intervention, and to fit into the specific context of 2 hospitals. Therefore, it should be tested in different settings of hospital care to see how it performs. However, virtually all hospitals in the United States today are adapting to changes in both financing and healthcare delivery. A tool such as the one described in this paper could be helpful to many organizations. Third, the scoring system for the overall scale score is not weighted and therefore reflects teamwork more than other components of care coordination, which are represented by fewer items. In general, we believe that use of the subscale scores may be more informative. Alternative scoring systems might also be proposed, including item weighting based on factor scores.
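The third limitation can be illustrated with a small Python sketch comparing an unweighted overall score (the mean of all items) with a score that gives each subscale equal weight (the mean of the subscale means). The subscale names other than teamwork, the number of items per subscale, and the scores themselves are hypothetical and chosen only to show the effect. This is not the scoring system used in the study.

```python
from statistics import mean

def overall_unweighted(subscales):
    """Mean of all items pooled together; larger subscales dominate."""
    return mean(x for items in subscales.values() for x in items)

def overall_equal_domains(subscales):
    """Mean of the subscale means; every subscale counts equally."""
    return mean(mean(items) for items in subscales.values())

# Hypothetical respondent: high teamwork scores, lower scores elsewhere.
# Item counts per subscale are assumptions made for illustration.
respondent = {
    "teamwork": [5, 5, 5, 5, 5, 5],
    "domain_b": [3, 3],
    "domain_c": [3, 3],
    "domain_d": [3, 3],
}
print(overall_unweighted(respondent))     # pulled toward the teamwork items
print(overall_equal_domains(respondent))  # each subscale weighted equally
```

When one subscale contributes more items than the others, the pooled mean drifts toward that subscale's score, which is why reporting subscale scores separately may be more informative.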
For the purposes of evaluation in this specific instance, we only collected data at a single point in time, after the intervention had been deployed. Thus, we were not able to evaluate the effectiveness of the J-CHiP intervention. We also did not intend to focus too much on the differences between units, given the limited number of respondents from individual units. It would be useful to collect more data at future time points, both to test the responsiveness of the scales and to evaluate the impact of future interventions at both the hospital and unit level.
The preliminary data from this study have generated insights about gaps in current practice, such as in engaging patients in the inpatient care process. They have also increased awareness among hospital leaders of the need to achieve high reliability in the adoption of new procedures and interdisciplinary practice. This tool might be used to find areas in need of improvement, to evaluate the effect of initiatives to improve care coordination, to monitor the change over time in the perception of care coordination among healthcare professionals, and to develop better intervention strategies for coordination activities in acute care settings. Additional research is needed to provide further evidence for the reliability and validity of this measure in diverse settings.
Disclosure
The project described was supported by Grant Number 1C1CMS331053-01-00 from the US Department of Health and Human Services, Centers for Medicare & Medicaid Services. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the US Department of Health and Human Services or any of its agencies. The research presented was conducted by the awardee. Results may or may not be consistent with or confirmed by the findings of the independent evaluation contractor.
The authors have no other disclosures.
1. McDonald KM, Sundaram V, Bravata DM, et al. Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies (Vol. 7: Care Coordination). Technical Reviews, No. 9.7. Rockville (MD): Agency for Healthcare Research and Quality (US); 2007.
2. Adams K, Corrigan J. Priority areas for national action: transforming health care quality. Washington, DC: National Academies Press; 2003.
3. Renders CM, Valk GD, Griffin S, Wagner EH, Eijk JT, Assendelft WJ. Interventions to improve the management of diabetes mellitus in primary care, outpatient and community settings. Cochrane Database Syst Rev. 2001(1):CD001481.
4. McAlister FA, Lawson FM, Teo KK, Armstrong PW. A systematic review of randomized trials of disease management programs in heart failure. Am J Med. 2001;110(5):378-384.
5. Bruce ML, Raue PJ, Reilly CF, et al. Clinical effectiveness of integrating depression care management into medicare home health: the Depression CAREPATH Randomized trial. JAMA Intern Med. 2015;175(1):55-64.
6. Berkowitz SA, Brown P, Brotman DJ, et al. Case Study: Johns Hopkins Community Health Partnership: A model for transformation. Healthc (Amst). 2016;4(4):264-270.
7. McDonald KM, Schultz E, Albin L, et al. Care Coordination Measures Atlas Version 4. Rockville, MD: Agency for Healthcare Research and Quality; 2014.
8. Schultz EM, Pineda N, Lonhart J, Davies SM, McDonald KM. A systematic review of the care coordination measurement landscape. BMC Health Serv Res. 2013;13:119.
9. Sexton JB, Helmreich RL, Neilands TB, et al. The Safety Attitudes Questionnaire: psychometric properties, benchmarking data, and emerging research. BMC Health Serv Res. 2006;6:44.
The preliminary data from this study have generated insights about gaps in current practice, such as in engaging patients in the inpatient care process. It has also increased awareness by hospital leaders about the need to achieve high reliability in the adoption of new procedures and interdisciplinary practice. This tool might be used to find areas in need of improvement, to evaluate the effect of initiatives to improve care coordination, to monitor the change over time in the perception of care coordination among healthcare professionals, and to develop better intervention strategies for coordination activities in acute care settings. Additional research is needed to provide further evidence for the reliability and validity of this measure in diverse settings.
Disclosure
The project described was supported by Grant Number 1C1CMS331053-01-00 from the US Department of Health and Human Services, Centers for Medicare & Medicaid Services. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the US Department of Health and Human Services or any of its agencies. The research presented was conducted by the awardee. Results may or may not be consistent with or confirmed by the findings of the independent evaluation contractor.
The authors have no other disclosures.
Care Coordination has been defined as “…the deliberate organization of patient care activities between two or more participants (including the patient) involved in a patient’s care to facilitate the appropriate delivery of healthcare services.”1 The Institute of Medicine identified care coordination as a key strategy to improve the American healthcare system,2 and evidence has been building that well-coordinated care improves patient outcomes and reduces healthcare costs associated with chronic conditions.3-5 In 2012, Johns Hopkins Medicine was awarded a Healthcare Innovation Award by the Centers for Medicare & Medicaid Services to improve coordination of care across the continuum of care for adult patients admitted to Johns Hopkins Hospital (JHH) and Johns Hopkins Bayview Medical Center (JHBMC), and for high-risk low-income Medicare and Medicaid beneficiaries receiving ambulatory care in targeted zip codes. The purpose of this project, known as the Johns Hopkins Community Health Partnership (J-CHiP), was to improve health and healthcare and to reduce healthcare costs. The acute care component of the program consisted of a bundle of interventions focused on improving coordination of care for all patients, including a “bridge to home” discharge process, as they transitioned back to the community from inpatient admission. 
The bundle included the following: early screening for discharge planning to predict needed postdischarge services; discussion in daily multidisciplinary rounds about goals and priorities of the hospitalization and potential postdischarge needs; patient and family self-care management education; enhanced medication management, including the option of “medications in hand” at the time of discharge; postdischarge telephone follow-up by nurses; and, for patients identified as high-risk, a “transition guide” (a nurse who works with the patient via home visits and by phone to optimize compliance with care for 30 days postdischarge).6 While the primary endpoints of the J-CHiP program were to improve clinical outcomes and reduce healthcare costs, we were also interested in the impact of the program on care coordination processes in the acute care setting. This created the need for an instrument to measure healthcare professionals’ views of care coordination in their immediate work environments.
We began our search for existing measures by reviewing the Care Coordination Measures Atlas published in 2014.7 Although this report evaluates over 80 different measures of care coordination, most of them focus on the perspective of the patient and/or family members, on specific conditions, and on primary care or outpatient settings.7,8 We were unable to identify an existing measure from the provider perspective, designed for the inpatient setting, that was both brief and comprehensive enough to cover a range of care coordination domains.8
Consequently, our first aim was to develop a brief, comprehensive tool to measure care coordination from the perspective of hospital inpatient staff that could be used to compare different units or types of providers, or to conduct longitudinal assessment. The second aim was to conduct a preliminary evaluation of the tool in our healthcare setting: to assess its psychometric properties, to describe provider perceptions of care coordination after the implementation of J-CHiP, and to explore potential differences among departments, among types of professionals, and between the 2 hospitals.
METHODS
Development of the Care Coordination Questionnaire
The survey was developed in collaboration with leaders of the J-CHiP Acute Care Team. We met at the outset and on multiple subsequent occasions to align survey domains with the main components of the J-CHiP acute care intervention and to assure that the survey would be relevant and understandable to a variety of multidisciplinary professionals, including physicians, nurses, social workers, physical therapists, and other health professionals. Care was taken to avoid redundancy with existing evaluation efforts and to minimize respondent burden. This process helped to ensure the content validity of the items, the usefulness of the results, and the future usability of the tool.
We modeled the Care Coordination Questionnaire (CCQ) after the Safety Attitudes Questionnaire (SAQ),9 a widely used survey that is deployed approximately annually at JHH and JHBMC. While the SAQ focuses on healthcare provider attitudes about issues relevant to patient safety (often referred to as safety climate or safety culture), this new tool was designed to focus on healthcare professionals’ attitudes about care coordination. Similar to the way that the SAQ “elicits a snapshot of the safety climate through surveys of frontline worker perceptions,” we sought to elicit a picture of our care coordination climate through a survey of frontline hospital staff.
The CCQ was built upon the domains and approaches to care coordination described in the Agency for Healthcare Research and Quality Care Coordination Atlas.3 This report identifies 9 mechanisms for achieving care coordination, including the following: Establish Accountability or Negotiate Responsibility; Communicate; Facilitate Transitions; Assess Needs and Goals; Create a Proactive Plan of Care; Monitor, Follow Up, and Respond to Change; Support Self-Management Goals; Link to Community Resources; and Align Resources with Patient and Population Needs; as well as 5 broad approaches commonly used to improve the delivery of healthcare, including Teamwork Focused on Coordination, Healthcare Home, Care Management, Medication Management, and Health IT-Enabled Coordination.7 We generated at least 1 item to represent 8 of the 9 domains, as well as the broad approach described as Teamwork Focused on Coordination. After developing an initial set of items, we sought input from 3 senior leaders of the J-CHiP Acute Care Team to determine if the items covered the care coordination domains of interest, and to provide feedback on content validity. To test the interpretability of survey items and consistency across professional groups, we sent an initial version of the survey questions to at least 1 person from each of the following professional groups: hospitalist, social worker, case manager, clinical pharmacist, and nurse. We asked them to review all of our survey questions and to provide us with feedback on all aspects of the questions, such as whether they believed the questions were relevant and understandable to the members of their professional discipline, the appropriateness of the wording of the questions, and other comments. Modifications were made to the content and wording of the questions based on the feedback received. 
The final draft of the questionnaire was reviewed by the leadership team of the J-CHiP Acute Care Team to ensure its usefulness in providing actionable information.
The resulting 12-item questionnaire used a 5-point Likert response scale ranging from 1 = “disagree strongly” to 5 = “agree strongly,” and an additional option of “not applicable (N/A).” To help assess construct validity, a global question was added at the end of the questionnaire asking, “Overall, how would you rate the care coordination at the hospital of your primary work setting?” The response was measured on a 10-point Likert-type scale ranging from 1 = “totally uncoordinated care” to 10 = “perfectly coordinated care” (see Appendix). In addition, the questionnaire requested information about the respondents’ gender, position, and their primary unit, department, and hospital affiliation.
Data Collection Procedures
An invitation to complete an anonymous questionnaire was sent to the following inpatient care professionals: all nursing staff working on care coordination units in the departments of medicine, surgery, and neurology/neurosurgery, as well as physicians, pharmacists, acute care therapists (eg, occupational and physical therapists), and other frontline staff. All healthcare staff fitting these criteria were sent an e-mail with a request to fill out the survey online using Qualtrics (Qualtrics Labs Inc., Provo, UT), as well as multiple follow-up reminders. The participants worked either at the JHH (a 1194-bed tertiary academic medical center in Baltimore, MD) or the JHBMC (a 440-bed academic community hospital located nearby). Data were collected from October 2015 through January 2016.
Analysis
Means and standard deviations were calculated by treating the responses as continuous variables. We tried 3 different methods to handle missing data: (1) without imputation, (2) imputing the mean value of each item, and (3) substituting a neutral score. Because all 3 methods produced very similar results, we treated the N/A responses as missing values without imputation for simplicity of analysis. We used STATA 13.1 (Stata Corporation, College Station, Texas) to analyze the data.
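The three ways of handling N/A responses described above can be illustrated with a minimal Python sketch. This is not the authors' Stata code. The function names, the example responses, and the choice of 3 as the "neutral" score (the midpoint of the 5-point scale) are assumptions made for illustration only.

```python
from statistics import mean

# Assumption: the neutral score is the midpoint of the 5-point Likert scale.
NEUTRAL = 3

def item_mean_no_imputation(responses):
    """Method 1: average only the observed responses (None marks N/A or missing)."""
    observed = [r for r in responses if r is not None]
    return mean(observed)

def impute_item_mean(responses):
    """Method 2: replace each missing response with the mean of the observed ones."""
    item_mean = item_mean_no_imputation(responses)
    return [item_mean if r is None else r for r in responses]

def impute_neutral(responses, neutral=NEUTRAL):
    """Method 3: replace each missing response with a neutral score."""
    return [neutral if r is None else r for r in responses]

# Hypothetical responses to one item from five respondents; None is an N/A answer.
responses = [5, 4, None, 3, 4]

print(item_mean_no_imputation(responses))   # method 1
print(mean(impute_item_mean(responses)))    # method 2
print(mean(impute_neutral(responses)))      # method 3
```

Mean imputation leaves the item mean unchanged by construction, so methods 1 and 2 can differ only in variance-based statistics; neutral substitution pulls the mean toward the scale midpoint in proportion to the share of N/A answers.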
To identify subscales, we performed exploratory factor analysis on responses to the 12 specific items. Promax rotation was selected based on the simple structure. Subscale scores for each respondent were generated by computing the mean of responses to the items in the subscale. Internal consistency reliability of the subscales was estimated using Cronbach’s alpha. We calculated Pearson correlation coefficients for the items in each subscale, and examined Cronbach’s alpha deleting each item in turn. For each of the subscales identified and the global scale, we calculated the mean, standard deviation, median and interquartile range. Although distributions of scores tended to be non-normal, this was done to increase interpretability. We also calculated percent scoring at the ceiling (highest possible score).
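As an illustration of the subscale computations described above, the following Python sketch computes a respondent's subscale score, Cronbach's alpha, and the percent scoring at the ceiling. It is a sketch under stated assumptions rather than the authors' analysis: the function names and the example data are hypothetical, alpha is computed on complete cases only, and sample variances are used throughout.

```python
from statistics import mean, variance

def subscale_score(row):
    """Mean of the observed item responses for one respondent (None = N/A)."""
    observed = [x for x in row if x is not None]
    return mean(observed) if observed else None

def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).

    Assumption: only respondents who answered every item are included.
    """
    complete = [r for r in rows if all(x is not None for x in r)]
    k = len(complete[0])
    item_vars = [variance([r[i] for r in complete]) for i in range(k)]
    total_var = variance([sum(r) for r in complete])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def percent_at_ceiling(scores, ceiling=5):
    """Percentage of observed scores equal to the highest possible score."""
    observed = [s for s in scores if s is not None]
    return 100 * sum(s == ceiling for s in observed) / len(observed)

# Hypothetical responses of five respondents to a 3-item subscale.
rows = [[5, 5, 4], [4, 4, 4], [3, 4, 3], [5, 5, 5], [2, 3, 2]]

alpha = cronbach_alpha(rows)
scores = [subscale_score(r) for r in rows]
print(round(alpha, 3), scores, percent_at_ceiling(scores))
```

Recomputing `cronbach_alpha` after dropping one item column at a time gives the "alpha if item deleted" values mentioned in the text.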
We analyzed the data with 3 research questions in mind: Was there a difference in perceptions of care coordination between (1) staff affiliated with the 2 different hospitals, (2) staff affiliated with different clinical departments, or (3) staff with different professional roles? For comparisons based on hospital and department, and type of professional, nonparametric tests (Wilcoxon rank-sum and Kruskal-Wallis test) were used with a level of statistical significance set at 0.05. The comparison between hospitals and departments was made only among nurses to minimize the confounding effect of different distribution of professionals. We tested the distribution of “years in specialty” between hospitals and departments for this comparison using Pearson’s χ2 test. The difference was not statistically significant (P = 0.167 for hospitals, and P = 0.518 for departments), so we assumed that the potential confounding effect of this variable was negligible in this analysis. The comparison of scores within each professional group used the Friedman test. Pearson’s χ2 test was used to compare the baseline characteristics between 2 hospitals.
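For readers unfamiliar with the rank-based tests named above, the Kruskal-Wallis H statistic can be sketched in plain Python as follows. This is an illustrative implementation with hypothetical function names and data, not the Stata routine used in the study; it assigns midranks to ties, applies the standard tie correction, and stops short of computing a P value (H is compared with a chi-square distribution with one fewer degrees of freedom than the number of groups).

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of scores."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction

# Hypothetical subscale scores for nurses in three departments.
medicine = [4.0, 4.5, 3.5, 5.0]
surgery = [3.0, 3.5, 4.0, 2.5]
neurology = [4.5, 5.0, 4.0, 3.0]
print(kruskal_wallis_h([medicine, surgery, neurology]))
```

With exactly 2 groups, the same statistic is equivalent to the large-sample form of the Wilcoxon rank-sum test without continuity correction.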
RESULTS
Among the 1486 acute care professionals asked to participate in the survey, 841 completed the questionnaire (response rate 56.6%). Table 1 shows the characteristics of the participants from each hospital. Table 2 summarizes the item response rates, proportion scoring at the ceiling, and weighting from the factor analysis. All items had completion rates of 99.2% or higher, with N/A responses ranging from 0% (item 2) to 3.1% (item 7). The percent scoring at the ceiling was 1.7% for the global item and ranged from 18.3% up to 63.3% for other individual items.
We also examined differences in perceptions of care coordination among nursing units to illustrate the tool’s ability to detect variation in Patient Engagement subscale scores for JHH nurses (see Appendix).
DISCUSSION
This study resulted in one of the first measurement tools to succinctly measure multiple aspects of care coordination in the hospital from the perspective of healthcare professionals. Given the hectic work environment of healthcare professionals, and the increasing emphasis on collecting data for evaluation and improvement, it is important to minimize respondent burden. This effort was catalyzed by a multifaceted initiative to redesign acute care delivery and promote seamless transitions of care, supported by the Center for Medicare & Medicaid Innovation. In initial testing, this questionnaire has evidence for reliability and validity. It was encouraging to find that the preliminary psychometric performance of the measure was very similar in 2 different settings of a tertiary academic hospital and a community hospital.
Our analysis of the survey data explored potential differences between the 2 hospitals, among different types of healthcare professionals and across different departments. Although we expected differences, we had no specific hypotheses about what those differences might be, and, in fact, did not observe any substantial differences. This could be taken to indicate that the intervention was uniformly and successfully implemented in both hospitals, and engaged various professionals in different departments. The ability to detect differences in care coordination at the nursing unit level could also prove to be beneficial for more precisely targeting where process improvement is needed. Further data collection and analyses should be conducted to more systematically compare units and to help identify those where practice is most advanced and those where improvements may be needed. It would also be informative to link differences in care coordination scores with patient outcomes. In addition, differences identified on specific domains between professional groups could be helpful to identify where greater efforts are needed to improve interdisciplinary practice. Sampling strategies stratified by provider type would need to be targeted to make this kind of analysis informative.
The consistently lower scores observed for patient engagement, from the perspective of care professionals in all groups, suggest that this is an area where improvement is needed. These findings are consistent with published reports on the common failure by hospitals to include patients as members of their own care teams. In addition to measuring care processes from the perspective of frontline healthcare workers, future evaluations within the healthcare system would also benefit from including data collected from the perspective of the patient and family.
This study had some limitations. First, there may be more than 4 domains of care coordination that are important and can be measured in the acute care setting from the provider perspective. However, the addition of more domains should be balanced against practicality and respondent burden. It may be possible to further clarify priority domains in hospital settings as opposed to the primary care setting. Future research should be directed to find these areas and to develop a more comprehensive, yet still concise, measurement instrument. Second, the tool was developed to measure the impact of a large-scale intervention, and to fit into the specific context of 2 hospitals. Therefore, it should be tested in different settings of hospital care to see how it performs. However, virtually all hospitals in the United States today are adapting to changes in both financing and healthcare delivery. A tool such as the one described in this paper could be helpful to many organizations. Third, the scoring system for the overall scale score is not weighted and therefore reflects teamwork more than other components of care coordination, which are represented by fewer items. In general, we believe that use of the subscale scores may be more informative. Alternative scoring systems might also be proposed, including item weighting based on factor scores.
For the purposes of evaluation in this specific instance, we only collected data at a single point in time, after the intervention had been deployed. Thus, we were not able to evaluate the effectiveness of the J-CHiP intervention. We also did not intend to focus too much on the differences between units, given the limited number of respondents from individual units. It would be useful to collect more data at future time points, both to test the responsiveness of the scales and to evaluate the impact of future interventions at both the hospital and unit level.
The preliminary data from this study have generated insights about gaps in current practice, such as in engaging patients in the inpatient care process. They have also increased awareness among hospital leaders of the need to achieve high reliability in the adoption of new procedures and interdisciplinary practice. This tool might be used to find areas in need of improvement, to evaluate the effect of initiatives to improve care coordination, to monitor the change over time in the perception of care coordination among healthcare professionals, and to develop better intervention strategies for coordination activities in acute care settings. Additional research is needed to provide further evidence for the reliability and validity of this measure in diverse settings.
Disclosure
The project described was supported by Grant Number 1C1CMS331053-01-00 from the US Department of Health and Human Services, Centers for Medicare & Medicaid Services. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the US Department of Health and Human Services or any of its agencies. The research presented was conducted by the awardee. Results may or may not be consistent with or confirmed by the findings of the independent evaluation contractor.
The authors have no other disclosures.
1. McDonald KM, Sundaram V, Bravata DM, et al. Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies (Vol. 7: Care Coordination). Technical Reviews, No. 9.7. Rockville (MD): Agency for Healthcare Research and Quality (US); 2007.
2. Adams K, Corrigan J. Priority areas for national action: transforming health care quality. Washington, DC: National Academies Press; 2003.
3. Renders CM, Valk GD, Griffin S, Wagner EH, Eijk JT, Assendelft WJ. Interventions to improve the management of diabetes mellitus in primary care, outpatient and community settings. Cochrane Database Syst Rev. 2001(1):CD001481.
4. McAlister FA, Lawson FM, Teo KK, Armstrong PW. A systematic review of randomized trials of disease management programs in heart failure. Am J Med. 2001;110(5):378-384.
5. Bruce ML, Raue PJ, Reilly CF, et al. Clinical effectiveness of integrating depression care management into medicare home health: the Depression CAREPATH Randomized trial. JAMA Intern Med. 2015;175(1):55-64.
6. Berkowitz SA, Brown P, Brotman DJ, et al. Case Study: Johns Hopkins Community Health Partnership: A model for transformation. Healthc (Amst). 2016;4(4):264-270.
7. McDonald KM, Schultz E, Albin L, et al. Care Coordination Measures Atlas Version 4. Rockville, MD: Agency for Healthcare Research and Quality; 2014.
8. Schultz EM, Pineda N, Lonhart J, Davies SM, McDonald KM. A systematic review of the care coordination measurement landscape. BMC Health Serv Res. 2013;13:119.
9. Sexton JB, Helmreich RL, Neilands TB, et al. The Safety Attitudes Questionnaire: psychometric properties, benchmarking data, and emerging research. BMC Health Serv Res. 2006;6:44.
1. McDonald KM, Sundaram V, Bravata DM, et al. Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies (Vol. 7: Care Coordination). Technical Reviews, No. 9.7. Rockville (MD): Agency for Healthcare Research and Quality (US); 2007. PubMed
2. Adams K, Corrigan J. Priority areas for national action: transforming health care quality. Washington, DC: National Academies Press; 2003. PubMed
3. Renders CM, Valk GD, Griffin S, Wagner EH, Eijk JT, Assendelft WJ. Interventions to improve the management of diabetes mellitus in primary care, outpatient and community settings. Cochrane Database Syst Rev. 2001(1):CD001481. PubMed
4. McAlister FA, Lawson FM, Teo KK, Armstrong PW. A systematic review of randomized trials of disease management programs in heart failure. Am J Med. 2001;110(5):378-384. PubMed
5. Bruce ML, Raue PJ, Reilly CF, et al. Clinical effectiveness of integrating depression care management into medicare home health: the Depression CAREPATH Randomized trial. JAMA Intern Med. 2015;175(1):55-64. PubMed
6. Berkowitz SA, Brown P, Brotman DJ, et al. Case Study: Johns Hopkins Community Health Partnership: A model for transformation. Healthc (Amst). 2016;4(4):264-270. PubMed
7. McDonald. KM, Schultz. E, Albin. L, et al. Care Coordination Measures Atlas Version 4. Rockville, MD: Agency for Healthcare Research and Quality; 2014.
8 Schultz EM, Pineda N, Lonhart J, Davies SM, McDonald KM. A systematic review of the care coordination measurement landscape. BMC Health Serv Res. 2013;13:119. PubMed
9. Sexton JB, Helmreich RL, Neilands TB, et al. The Safety Attitudes Questionnaire: psychometric properties, benchmarking data, and emerging research. BMC Health Serv Res. 2006;6:44. PubMed
© 2017 Society of Hospital Medicine
Associations of Physician Empathy with Patient Anxiety and Ratings of Communication in Hospital Admission Encounters
Admission to a hospital can be a stressful event,1,2 and patients report having many concerns at the time of hospital admission.3 Over the last 20 years, the United States has widely adopted the hospitalist model of inpatient care. Although this model has clear benefits, it also has the potential to contribute to patient stress, as hospitalized patients generally lack preexisting relationships with their inpatient physicians.4,5 In this changing hospital environment, defining and promoting effective medical communication has become an essential goal of both individual practitioners and medical centers.
Successful communication and strong therapeutic relationships with physicians support patients’ coping with illness-associated stress6,7 as well as promote adherence to medical treatment plans.8 Empathy serves as an important building block of patient-centered communication and encourages a strong therapeutic alliance.9 Studies from primary care, oncology, and intensive care unit (ICU) settings indicate that physician empathy is associated with decreased emotional distress,10,11 improved ratings of communication,12 and even better medical outcomes.13
Prior work has shown that hospitalists, like other clinicians, underutilize empathy as a tool in their daily interactions with patients.14-16 Our prior qualitative analysis of audio-recorded hospitalist-patient admission encounters indicated that how hospitalists respond to patient expressions of negative emotion influences relationships with patients and alignment around care plans.17 To determine whether empathic communication is associated with patient-reported outcomes in the hospitalist model, we quantitatively analyzed coded admission encounters and survey data to examine the association between hospitalists’ responses to patient expressions of negative emotion (anxiety, sadness, and anger) and patient anxiety and ratings of communication. Given the often-limited time hospitalists have to complete admission encounters, we also examined the association between response to emotion and encounter length.
METHODS
We analyzed data collected as part of an observational study of hospitalist-patient communication during hospital admission encounters14 to assess the association between the way physicians responded to patient expressions of negative emotion and patient anxiety, ratings of communication in the encounter, and encounter length. We collected data between August 2008 and March 2009 on the general medical service at 2 urban hospitals that are part of an academic medical center. Participants were attending hospitalists (not physician trainees), and patients admitted under participating hospitalists’ care who were able to communicate verbally in English and provide informed consent for the study. The institutional review board at the University of California, San Francisco approved the study; physician and patient participants provided written informed consent.
Enrollment and data collection have been described previously.17 Our cohort for this analysis included 76 patients of 27 physicians who completed encounter audio recordings and pre- and postencounter surveys. Following enrollment, patients completed a preencounter survey to collect demographic information and to measure their baseline anxiety via the State Anxiety Scale (STAI-S), which assesses transient anxious mood using 20 items answered on a 4-point scale for a final score range of 20 to 80.10,18,19 We timed and audio-recorded admission encounters. Encounter recordings were obtained solely from patient interactions with attending hospitalists and did not take into account the time patients may have spent with other physicians, including trainees. After the encounter, patients completed postencounter surveys, which included the STAI-S and patients’ ratings of communication during the encounter. To rate communication, patients responded to 7 items on a 0- to 10-point scale that were derived from previous work (Table 1)12,20,21; the anchors were “not at all” and “completely.” To identify patients with serious illness, which we used as a covariate in regression models, we asked physicians on a postencounter survey whether or not they “would be surprised by this patient’s death or admission to the ICU in the next year.”22
We considered physician as a clustering variable in the calculation of robust standard errors for all models. In addition, we included in each model covariates that were associated with the outcome at P ≤ 0.10, including patient gender, patient age, serious illness,22 preencounter anxiety, encounter length, and hospital. We considered P values < 0.05 to be statistically significant. We used Stata SE 13 (StataCorp LLC, College Station, TX) for all statistical analyses.
RESULTS
We analyzed data from admission encounters with 76 patients (consent rate 63%) and 27 hospitalists (consent rate 91%). Their characteristics are shown in Table 3. Median encounter length was 19 minutes (mean 21 minutes, range 3-68). Patients expressed negative emotion in 190 instances across all encounters; median number of expressions per encounter was 1 (range 0-14). Hospitalists responded empathically to 32% (n = 61) of the patient expressions, neutrally to 43% (n = 81), and nonempathically to 25% (n = 48).
The STAI-S was normally distributed. The mean preencounter STAI-S score was 39 (standard deviation [SD] 8.9). Mean postencounter STAI-S score was 38 (SD 10.7). Mean change in anxiety over the course of the encounter, calculated as the postencounter score minus the preencounter score, was −1.2 (SD 7.6). Table 1 shows summary statistics for the patient ratings of communication items. All items were rated highly. Across the items, between 51% and 78% of patients gave the highest rating of 10.
Across the range of frequencies of emotional expressions per encounter in our data set (0-14 expressions), each additional empathic hospitalist response was associated with a 1.65-point decrease in the STAI-S (95% confidence interval [CI], 0.48-2.82). We did not find significant associations between changes in the STAI-S and the number of neutral hospitalist responses (−0.65 per response; 95% CI, −1.67-0.37) or nonempathic hospitalist responses (0.61 per response; 95% CI, −0.88-2.10).
In addition, nonempathic responses were associated with more negative ratings of communication for 5 of the 7 items: ease of understanding information, covering points of interest, the doctor listening, the doctor caring, and trusting the doctor. For example, for the item “I felt this doctor cared about me,” each nonempathic hospitalist response was associated with a more than doubling of negative patient ratings (aRE: 2.3; 95% CI, 1.32-4.16). Neutral physician responses to patient expressions of negative emotion were associated with less negative patient ratings for 2 of the items: covering points of interest (aRE 0.68; 95% CI, 0.51-0.90) and trusting the doctor (aRE: 0.86; 95% CI, 0.75-0.99).
We did not find a statistical association between encounter length and the number of empathic hospitalist responses in the encounter (percent change in encounter length per response [PC]: 1%; 95% CI, −8%-10%) or the number of nonempathic responses (PC: 18%; 95% CI, −2%-42%). We did find a statistically significant association between the number of neutral responses and encounter length (PC: 13%; 95% CI, 3%-24%), corresponding to 2.5 minutes of additional encounter time per neutral response for the median encounter length of 19 minutes.
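The conversion from percent change to minutes can be reproduced with simple arithmetic. The sketch below applies the reported percent change for neutral responses to the median encounter length. Applying the CI bounds in the same way is an illustrative extension and is not a figure reported in the study.

```python
MEDIAN_ENCOUNTER_MIN = 19  # median encounter length reported in Results


def added_minutes(percent_change: float, baseline_min: float) -> float:
    """Minutes added per response for a given percent change in length."""
    return round(percent_change / 100 * baseline_min, 1)


# Point estimate for neutral responses: 13% per response.
neutral_point = added_minutes(13, MEDIAN_ENCOUNTER_MIN)

# Illustrative only: the same conversion applied to the 95% CI bounds
# (3% and 24%); these minute values are not stated in the source.
neutral_low = added_minutes(3, MEDIAN_ENCOUNTER_MIN)
neutral_high = added_minutes(24, MEDIAN_ENCOUNTER_MIN)

print(neutral_point, neutral_low, neutral_high)
```

The point estimate reproduces the 2.5 minutes per neutral response given in the text. The conversion holds only at the median encounter length, since a percent change implies a different number of minutes for shorter or longer encounters.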
DISCUSSION
Our study set out to measure how hospitalists responded to expressions of negative emotion during admission encounters with patients and how those responses correlated with patient anxiety, ratings of communication, and encounter length. We found that empathic responses were associated with diminishing patient anxiety after the visit, as well as with better ratings of several domains of hospitalist communication. Moreover, nonempathic responses to negative emotion were associated with more strongly negative ratings of hospitalist communication. Finally, while clinicians may worry that encouraging patients to speak further about emotion will result in excessive visit lengths, we did not find a statistical association between empathic responses and encounter duration. To our knowledge, this is the first study to indicate an association between empathy and patient anxiety and communication ratings within the hospitalist model, which is rapidly becoming the predominant model for providing inpatient care in the United States.4,5
As in oncologic care, anxiety is an emotion commonly confronted by clinicians meeting admitted medical patients for the first time. Studies show that not only do patient anxiety levels remain high throughout a hospital course, patients who experience higher levels of anxiety tend to stay longer in the hospital.1,2,27-30 But unlike oncologic care or other therapy provided in an outpatient setting, the hospitalist model does not facilitate “continuity” of care, or the ability to care for the same patients over a long period of time. This reality of inpatient care makes rapid, effective rapport-building critical to establishing strong physician-patient relationships. In this setting, a simple communication tool that is potentially able to reduce inpatients’ anxiety could have a meaningful impact on hospitalist-provided care and patient outcomes.
In terms of the magnitude of the effect of empathic responses, the clinical significance of a 1.65-point decrease in the STAI-S anxiety score is not precisely clear. A prior study that examined the effect of music therapy on anxiety levels in patients with cancer found an average anxiety reduction of approximately 9.5 units on the STAI-S scale after sensitivity analysis, suggesting a rather large meaningful effect size.31 Given that we found a reduction of 1.65 points for each empathic response, however, with a range of 0-14 negative emotions expressed over a median 19-minute encounter, there is opportunity for hospitalists to achieve a clinically significant decrease in patient anxiety during an admission encounter. The potential to reduce anxiety is extended further when we consider that the impact of an empathic response may apply not just to the admission encounter alone but also to numerous other patient-clinician interactions over the course of a hospitalization.
A healthy body of communication research supports the associations we found in our study between empathy and patient ratings of communication and physicians. Families in ICU conferences rate communication more positively when physicians express empathy,12 and a number of studies indicate an association between empathy and patient satisfaction in outpatient settings.8 Given the associations we found with negative ratings on the items in our study, promoting empathic responses to expressions of emotion and, more importantly, stressing avoidance of nonempathic responses may be relevant efforts in working to improve patient satisfaction scores on surveys reporting “top box” percentages, such as Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS). More notably, evidence indicates that empathy has positive impacts beyond satisfaction surveys, such as adherence, better diagnostic and clinical outcomes, and strengthening of patient enablement.8
Not all hospitalist responses to emotion were associated with patient ratings across the 7 communication items we assessed. For example, we did not find an association between how physicians responded to patient expressions of negative emotion and patient perception that enough time was spent in the visit or the degree to which talking with the doctor met a patient’s overall needs. It follows logically, and other research supports, that empathy would influence patient ratings of physician caring and trust,32 whereas other communication factors we were unable to measure (eg, physician body language, tone, and use of jargon and patient health literacy and primary language) may have a more significant association with patient ratings of the other items we assessed.
In considering the clinical application of our results, it is important to note that communication skills, including responding empathically to patient expressions of negative emotion, can be imparted through training in the same way as abdominal examination or electrocardiogram interpretation skills.33-35 However, training of hospitalists in communication skills requires time and some financial investment on the part of the physician, their hospital or group, or, ideally, both. Effective training methods, like those for other skill acquisition, involve learner-centered teaching and practicing skills with role-play and feedback.36 Given the importance of a learner-centered approach, learning would likely be better received and more effective if it was tailored to the specific needs and patient scenarios commonly encountered by hospitalist physicians. As these programs are developed, it will be important to assess the impact of any training on the patient-reported outcomes we assessed in this observational study, along with clinical outcomes.
Our study has several limitations. First, we were only able to evaluate whether hospitalists verbally responded to patient emotion and were thus not able to account for nonverbal empathy such as facial expressions, body language, or voice tone. Second, given our patient consent rate of 63%, patients who agreed to participate in the study may have had different opinions than those who declined to participate. Also, hospitalists and patients may have behaved differently as a result of being audio recorded. We included only patients who spoke English, and our patient population was predominantly non-Hispanic white. Patients who spoke other languages or came from other cultural backgrounds may have had different responses. Third, we did not use a single validated scale for patient ratings of communication, and multiple analyses increase our risk of finding statistically significant associations by chance. The skewing of the communication rating items toward high scores may also have led to our results being driven by outliers, although the model we chose for analysis does penalize for this. Furthermore, our sample size was small, leading to wide CIs and potential for lack of statistical associations due to insufficient power. Our findings warrant replication in larger studies. Fourth, the setting of our study in an academic center may affect generalizability. Finally, the age of our data (collected between 2008 and 2009) is also a limitation. Given a recent focus on communication and patient experience since the initiation of HCAHPS feedback, a similar analysis of empathy and communication methods now may result in different outcomes.
In conclusion, our results suggest that enhancing hospitalists’ empathic responses to patient expressions of negative emotion could decrease patient anxiety and improve patients’ perceptions of (and thus possibly their relationships with) hospitalists, without sacrificing efficiency. Future work should focus on tailoring and implementing communication skills training programs for hospitalists and evaluating the impact of training on patient outcomes.
Acknowledgments
The authors extend their sincere thanks to the patients and physicians who participated in this study. Dr. Anderson was funded by the National Palliative Care Research Center and the University of California, San Francisco Clinical and Translational Science Institute Career Development Program, National Institutes of Health (NIH) grant number 5 KL2 RR024130-04. Project costs were funded by a grant from the University of California, San Francisco Academic Senate.
Disclosure
All coauthors have seen and agree with the contents of this manuscript. This submission is not under review by any other publication. Wendy Anderson received funding for this project from the National Palliative Care Research Center, the University of California, San Francisco Clinical and Translational Science Institute (NIH grant number 5 KL2 RR024130-04), and the University of California, San Francisco Academic Senate. Andy Auerbach has a Patient-Centered Outcomes Research Institute research grant in development.
References
1. Walker FB, Novack DH, Kaiser DL, Knight A, Oblinger P. Anxiety and depression among medical and surgical patients nearing hospital discharge. J Gen Intern Med. 1987;2(2):99-101.
2. Castillo MI, Cooke M, Macfarlane B, Aitken LM. Factors associated with anxiety in critically ill patients: A prospective observational cohort study. Int J Nurs Stud. 2016;60:225-233.
3. Anderson WG, Winters K, Auerbach AD. Patient concerns at hospital admission. Arch Intern Med. 2011;171(15):1399-1400.
4. Kuo Y-F, Sharma G, Freeman JL, Goodwin JS. Growth in the care of older patients by hospitalists in the United States. N Engl J Med. 2009;360(11):1102-1112.
5. Wachter RM, Goldman L. Zero to 50,000 - The 20th Anniversary of the Hospitalist. N Engl J Med. 2016;375(11):1009-1011.
6. Mack JW, Block SD, Nilsson M, et al. Measuring therapeutic alliance between oncologists and patients with advanced cancer: the Human Connection Scale. Cancer. 2009;115(14):3302-3311.
7. Huff NG, Nadig N, Ford DW, Cox CE. Therapeutic Alliance between the Caregivers of Critical Illness Survivors and Intensive Care Unit Clinicians. [published correction appears in Ann Am Thorac Soc. 2016;13(4):576]. Ann Am Thorac Soc. 2015;12(11):1646-1653.
8. Derksen F, Bensing J, Lagro-Janssen A. Effectiveness of empathy in general practice: a systematic review. Br J Gen Pract. 2013;63(606):e76-e84.
9. Dwamena F, Holmes-Rovner M, Gaulden CM, et al. Interventions for providers to promote a patient-centred approach in clinical consultations. Cochrane Database Syst Rev. 2012;12:CD003267.
10. Fogarty LA, Curbow BA, Wingard JR, McDonnell K, Somerfield MR. Can 40 seconds of compassion reduce patient anxiety? J Clin Oncol. 1999;17(1):371-379.
11. Roter DL, Hall JA, Kern DE, Barker LR, Cole KA, Roca RP. Improving physicians’ interviewing skills and reducing patients’ emotional distress. A randomized clinical trial. Arch Intern Med. 1995;155(17):1877-1884.
12. Stapleton RD, Engelberg RA, Wenrich MD, Goss CH, Curtis JR. Clinician statements and family satisfaction with family conferences in the intensive care unit. Crit Care Med. 2006;34(6):1679-1685.
13. Hojat M, Louis DZ, Markham FW, Wender R, Rabinowitz C, Gonnella JS. Physicians’ empathy and clinical outcomes for diabetic patients. Acad Med. 2011;86(3):359-364.
14. Anderson WG, Winters K, Arnold RM, Puntillo KA, White DB, Auerbach AD. Studying physician-patient communication in the acute care setting: the hospitalist rapport study. Patient Educ Couns. 2011;82(2):275-279.
15. Pollak KI, Arnold RM, Jeffreys AS, et al. Oncologist communication about emotion during visits with patients with advanced cancer. J Clin Oncol. 2007;25(36):5748-5752.
16. Suchman AL, Markakis K, Beckman HB, Frankel R. A model of empathic communication in the medical interview. JAMA. 1997;277(8):678-682.
17. Adams K, Cimino JEW, Arnold RM, Anderson WG. Why should I talk about emotion? Communication patterns associated with physician discussion of patient expressions of negative emotion in hospital admission encounters. Patient Educ Couns. 2012;89(1):44-50.
18. Julian LJ. Measures of anxiety: State-Trait Anxiety Inventory (STAI), Beck Anxiety Inventory (BAI), and Hospital Anxiety and Depression Scale-Anxiety (HADS-A). Arthritis Care Res (Hoboken). 2011;63 Suppl 11:S467-S472.
19. Spielberger C, Ritterband L, Sydeman S, Reheiser E, Unger K. Assessment of emotional states and personality traits: measuring psychological vital signs. In: Butcher J, editor. Clinical personality assessment: practical approaches. New York: Oxford University Press; 1995.
20. Safran DG, Kosinski M, Tarlov AR, et al. The Primary Care Assessment Survey: tests of data quality and measurement performance. Med Care. 1998;36(5):728-739.
21. Azoulay E, Pochard F, Kentish-Barnes N, et al. Risk of post-traumatic stress symptoms in family members of intensive care unit patients. Am J Respir Crit Care Med. 2005;171(9):987-994.
22. Lynn J. Perspectives on care at the close of life. Serving patients who may die soon and their families: the role of hospice and other services. JAMA. 2001;285(7):925-932.
23. Kennifer SL, Alexander SC, Pollak KI, et al. Negative emotions in cancer care: do oncologists’ responses depend on severity and type of emotion? Patient Educ Couns. 2009;76(1):51-56.
24. Butow PN, Brown RF, Cogar S, Tattersall MHN, Dunn SM. Oncologists’ reactions to cancer patients’ verbal cues. Psychooncology. 2002;11(1):47-58.
25. Levinson W, Gorawara-Bhat R, Lamb J. A study of patient clues and physician responses in primary care and surgical settings. JAMA. 2000;284(8):1021-1027.
26. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20(1):37-46.
27. Fulop G. Anxiety disorders in the general hospital setting. Psychiatr Med. 1990;8(3):187-195.
28. Gerson S, Mistry R, Bastani R, et al. Symptoms of depression and anxiety (MHI) following acute medical/surgical hospitalization and post-discharge psychiatric diagnoses (DSM) in 839 geriatric US veterans. Int J Geriatr Psychiatry. 2004;19(12):1155-1167.
29. Kathol RG, Wenzel RP. Natural history of symptoms of depression and anxiety during inpatient treatment on general medicine wards. J Gen Intern Med. 1992;7(3):287-293.
30. Unsal A, Unaldi C, Baytemir C. Anxiety and depression levels of inpatients in the city centre of Kirşehir in Turkey. Int J Nurs Pract. 2011;17(4):411-418.
31. Bradt J, Dileo C, Grocke D, Magill L. Music interventions for improving psychological and physical outcomes in cancer patients. [Update appears in Cochrane Database Syst Rev. 2016;(8):CD006911] Cochrane Database Syst Rev. 2011;(8):CD006911.
32. Kim SS, Kaplowitz S, Johnston MV. The effects of physician empathy on patient satisfaction and compliance. Eval Health Prof. 2004;27(3):237-251.
33. Tulsky JA, Arnold RM, Alexander SC, et al. Enhancing communication between oncologists and patients with a computer-based training program: a randomized trial. Ann Intern Med. 2011;155(9):593-601.
34. Bays AM, Engelberg RA, Back AL, et al. Interprofessional communication skills training for serious illness: evaluation of a small-group, simulated patient intervention. J Palliat Med. 2014;17(2):159-166.
35. Epstein RM, Duberstein PR, Fenton JJ, et al. Effect of a Patient-Centered Communication Intervention on Oncologist-Patient Communication, Quality of Life, and Health Care Utilization in Advanced Cancer: The VOICE Randomized Clinical Trial. JAMA Oncol. 2017;3(1):92-100.
36. Berkhof M, van Rijssen HJ, Schellart AJM, Anema JR, van der Beek AJ. Effective training strategies for teaching communication skills to physicians: an overview of systematic reviews. Patient Educ Couns. 2011;84(2):152-162.
Admission to a hospital can be a stressful event,1,2 and patients report having many concerns at the time of hospital admission.3 Over the last 20 years, the United States has widely adopted the hospitalist model of inpatient care. Although this model has clear benefits, it also has the potential to contribute to patient stress, as hospitalized patients generally lack preexisting relationships with their inpatient physicians.4,5 In this changing hospital environment, defining and promoting effective medical communication has become an essential goal of both individual practitioners and medical centers.
Successful communication and strong therapeutic relationships with physicians support patients’ coping with illness-associated stress6,7 as well as promote adherence to medical treatment plans.8 Empathy serves as an important building block of patient-centered communication and encourages a strong therapeutic alliance.9 Studies from primary care, oncology, and intensive care unit (ICU) settings indicate that physician empathy is associated with decreased emotional distress,10,11 improved ratings of communication,12 and even better medical outcomes.13
Prior work has shown that hospitalists, like other clinicians, underutilize empathy as a tool in their daily interactions with patients.14-16 Our prior qualitative analysis of audio-recorded hospitalist-patient admission encounters indicated that how hospitalists respond to patient expressions of negative emotion influences relationships with patients and alignment around care plans.17 To determine whether empathic communication is associated with patient-reported outcomes in the hospitalist model, we quantitatively analyzed coded admission encounters and survey data to examine the association between hospitalists’ responses to patient expressions of negative emotion (anxiety, sadness, and anger) and patient anxiety and ratings of communication. Given the often-limited time hospitalists have to complete admission encounters, we also examined the association between response to emotion and encounter length.
METHODS
We analyzed data collected as part of an observational study of hospitalist-patient communication during hospital admission encounters14 to assess the association between the way physicians responded to patient expressions of negative emotion and patient anxiety, ratings of communication in the encounter, and encounter length. We collected data between August 2008 and March 2009 on the general medical service at 2 urban hospitals that are part of an academic medical center. Participants were attending hospitalists (not physician trainees), and patients admitted under participating hospitalists’ care who were able to communicate verbally in English and provide informed consent for the study. The institutional review board at the University of California, San Francisco approved the study; physician and patient participants provided written informed consent.
Enrollment and data collection have been described previously.17 Our cohort for this analysis included 76 patients of 27 physicians who completed encounter audio recordings and pre- and postencounter surveys. Following enrollment, patients completed a preencounter survey to collect demographic information and to measure their baseline anxiety via the State Anxiety Scale (STAI-S), which assesses transient anxious mood using 20 items answered on a 4-point scale for a final score range of 20 to 80.10,18,19 We timed and audio-recorded admission encounters. Encounter recordings were obtained solely from patient interactions with attending hospitalists and did not take into account the time patients may have spent with other physicians, including trainees. After the encounter, patients completed postencounter surveys, which included the STAI-S and patients’ ratings of communication during the encounter. To rate communication, patients responded to 7 items on a 0- to 10-point scale that were derived from previous work (Table 1)12,20,21; the anchors were “not at all” and “completely.” To identify patients with serious illness, which we used as a covariate in regression models, we asked physicians on a postencounter survey whether or not they “would be surprised by this patient’s death or admission to the ICU in the next year.”22
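To make the STAI-S score range concrete, the following Python sketch sums 20 item responses on a 1-to-4 scale, which yields totals between 20 and 80. It is a minimal illustration under stated assumptions, not the instrument's official scoring procedure: the function name and the `reverse_items` parameter are hypothetical, and the set of reverse-scored items must be supplied from the instrument's manual.

```python
from typing import Iterable, Sequence


def stai_s_score(responses: Sequence[int],
                 reverse_items: Iterable[int] = ()) -> int:
    """Sum 20 responses (each 1-4) into a total between 20 and 80.

    reverse_items holds 1-based item numbers to reverse-score (5 - r).
    Which items are reversed is NOT specified here; that set is an
    assumption the caller must take from the instrument manual.
    """
    if len(responses) != 20:
        raise ValueError("STAI-S requires exactly 20 item responses")
    reversed_set = set(reverse_items)
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if response not in (1, 2, 3, 4):
            raise ValueError(f"item {item_number}: response must be 1-4")
        # Reverse-scored items map 1->4, 2->3, 3->2, 4->1.
        total += (5 - response) if item_number in reversed_set else response
    return total


lowest = stai_s_score([1] * 20)   # every item at the minimum
highest = stai_s_score([4] * 20)  # every item at the maximum
print(lowest, highest)
```

With no items reversed, the minimum and maximum totals reproduce the 20-to-80 range described above. A change score for one patient is then the postencounter total minus the preencounter total.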
We considered physician as a clustering variable in the calculation of robust standard errors for all models. In addition, we included in each model covariates that were associated with the outcome at P ≤ 0.10, including patient gender, patient age, serious illness,22 preencounter anxiety, encounter length, and hospital. We considered P values < 0.05 to be statistically significant. We used Stata SE 13 (StataCorp LLC, College Station, TX) for all statistical analyses.
METHODS
We analyzed data collected as part of an observational study of hospitalist-patient communication during hospital admission encounters14 to assess the association between the way physicians responded to patient expressions of negative emotion and patient anxiety, ratings of communication in the encounter, and encounter length. We collected data between August 2008 and March 2009 on the general medical service at 2 urban hospitals that are part of an academic medical center. Participants were attending hospitalists (not physician trainees), and patients admitted under participating hospitalists’ care who were able to communicate verbally in English and provide informed consent for the study. The institutional review board at the University of California, San Francisco approved the study; physician and patient participants provided written informed consent.
Enrollment and data collection have been described previously.17 Our cohort for this analysis included 76 patients of 27 physicians who completed encounter audio recordings and pre- and postencounter surveys. Following enrollment, patients completed a preencounter survey to collect demographic information and to measure their baseline anxiety via the State Anxiety Scale (STAI-S), which assesses transient anxious mood using 20 items answered on a 4-point scale for a final score range of 20 to 80.10,18,19 We timed and audio-recorded admission encounters. Encounter recordings were obtained solely from patient interactions with attending hospitalists and did not take into account the time patients may have spent with other physicians, including trainees. After the encounter, patients completed postencounter surveys, which included the STAI-S and patients’ ratings of communication during the encounter. To rate communication, patients responded to 7 items on a 0- to 10-point scale that were derived from previous work (Table 1)12,20,21; the anchors were “not at all” and “completely.” To identify patients with serious illness, which we used as a covariate in regression models, we asked physicians on a postencounter survey whether or not they “would be surprised by this patient’s death or admission to the ICU in the next year.”22
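The STAI-S scoring described above (20 items, each answered on a 4-point scale, for a total between 20 and 80) can be sketched as follows. This is a minimal illustration, not the study's scoring code: the function names are hypothetical, and the sketch assumes each item has already been keyed so that a higher value means more anxiety (any reverse scoring is assumed to have been applied beforehand). The change score follows the postencounter-minus-preencounter convention used in the Results.

```python
def stai_s_total(item_scores: list[int]) -> int:
    """Sum 20 item scores (each 1-4) into a total between 20 and 80.

    Assumption: items are already keyed in the same direction,
    so no reverse scoring is performed here.
    """
    if len(item_scores) != 20:
        raise ValueError("expected 20 item scores")
    if any(score not in (1, 2, 3, 4) for score in item_scores):
        raise ValueError("each item score must be 1, 2, 3, or 4")
    return sum(item_scores)


def anxiety_change(pre_total: int, post_total: int) -> int:
    """Change in anxiety: postencounter minus preencounter total.

    A negative value means anxiety decreased over the encounter.
    """
    return post_total - pre_total


# Hypothetical patient: slightly less anxious after the encounter.
pre = stai_s_total([2] * 20)             # 40
post = stai_s_total([2] * 18 + [1] * 2)  # 38
print(anxiety_change(pre, post))         # -2
```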
We considered physician as a clustering variable in the calculation of robust standard errors for all models. In addition, we included in each model covariates that were associated with the outcome at P ≤ 0.10, including patient gender, patient age, serious illness,22 preencounter anxiety, encounter length, and hospital. We considered P values < 0.05 to be statistically significant. We used Stata SE 13 (StataCorp LLC, College Station, TX) for all statistical analyses.
RESULTS
We analyzed data from admission encounters with 76 patients (consent rate 63%) and 27 hospitalists (consent rate 91%). Their characteristics are shown in Table 3. Median encounter length was 19 minutes (mean 21 minutes, range 3-68). Patients expressed negative emotion in 190 instances across all encounters; median number of expressions per encounter was 1 (range 0-14). Hospitalists responded empathically to 32% (n = 61) of the patient expressions, neutrally to 43% (n = 81), and nonempathically to 25% (n = 48).
The STAI-S was normally distributed. The mean preencounter STAI-S score was 39 (standard deviation [SD] 8.9). Mean postencounter STAI-S score was 38 (SD 10.7). Mean change in anxiety over the course of the encounter, calculated as the postencounter minus preencounter score, was −1.2 (SD 7.6). Table 1 shows summary statistics for the patient ratings of communication items. All items were rated highly. Across the items, between 51% and 78% of patients gave the highest possible rating of 10.
Across the range of frequencies of emotional expressions per encounter in our data set (0-14 expressions), each additional empathic hospitalist response was associated with a 1.65-point decrease in the STAI-S (95% confidence interval [CI], 0.48-2.82). We did not find significant associations between changes in the STAI-S and the number of neutral hospitalist responses (−0.65 per response; 95% CI, −1.67-0.37) or nonempathic hospitalist responses (0.61 per response; 95% CI, −0.88-2.10).
In addition, nonempathic responses were associated with more negative ratings of communication for 5 of the 7 items: ease of understanding information, covering points of interest, the doctor listening, the doctor caring, and trusting the doctor. For example, for the item “I felt this doctor cared about me,” each nonempathic hospitalist response was associated with a more than doubling of negative patient ratings (aRE: 2.3; 95% CI, 1.32-4.16). Neutral physician responses to patient expressions of negative emotion were associated with less negative patient ratings for 2 of the items: covering points of interest (aRE 0.68; 95% CI, 0.51-0.90) and trusting the doctor (aRE: 0.86; 95% CI, 0.75-0.99).
We did not find a statistical association between encounter length and the number of empathic hospitalist responses in the encounter (percent change in encounter length per response [PC]: 1%; 95% CI, −8%-10%) or the number of nonempathic responses (PC: 18%; 95% CI, −2%-42%). We did find a statistically significant association between the number of neutral responses and encounter length (PC: 13%; 95% CI, 3%-24%), corresponding to 2.5 minutes of additional encounter time per neutral response for the median encounter length of 19 minutes.
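The conversion from a percent change in encounter length to minutes is simple arithmetic, sketched below for the neutral-response estimate. The function name is hypothetical, and the sketch assumes, as the text does, that the percent change is applied to the median encounter length of 19 minutes.

```python
MEDIAN_ENCOUNTER_MIN = 19  # median encounter length reported in the Results


def added_minutes(percent_change: float, baseline_minutes: float) -> float:
    """Convert a percent change in encounter length into minutes at a baseline."""
    return baseline_minutes * percent_change / 100.0


# Point estimate and 95% CI bounds for neutral responses (13%; 3%-24%).
for label, pc in [("estimate", 13), ("lower bound", 3), ("upper bound", 24)]:
    print(f"{label}: {added_minutes(pc, MEDIAN_ENCOUNTER_MIN):.1f} min")
# estimate: 2.5 min
# lower bound: 0.6 min
# upper bound: 4.6 min
```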
DISCUSSION
Our study set out to measure how hospitalists responded to expressions of negative emotion during admission encounters with patients and how those responses correlated with patient anxiety, ratings of communication, and encounter length. We found that empathic responses were associated with diminishing patient anxiety after the visit, as well as with better ratings of several domains of hospitalist communication. Moreover, nonempathic responses to negative emotion were associated with more strongly negative ratings of hospitalist communication. Finally, while clinicians may worry that encouraging patients to speak further about emotion will result in excessive visit lengths, we did not find a statistical association between empathic responses and encounter duration. To our knowledge, this is the first study to indicate an association between empathy and patient anxiety and communication ratings within the hospitalist model, which is rapidly becoming the predominant model for providing inpatient care in the United States.4,5
As in oncologic care, anxiety is an emotion commonly confronted by clinicians meeting admitted medical patients for the first time. Studies show not only that patient anxiety levels remain high throughout a hospital course but also that patients who experience higher levels of anxiety tend to stay longer in the hospital.1,2,27-30 But unlike oncologic care or other therapy provided in an outpatient setting, the hospitalist model does not facilitate “continuity” of care, or the ability to care for the same patients over a long period of time. This reality of inpatient care makes rapid, effective rapport-building critical to establishing strong physician-patient relationships. In this setting, a simple communication tool that is potentially able to reduce inpatients’ anxiety could have a meaningful impact on hospitalist-provided care and patient outcomes.
In terms of the magnitude of the effect of empathic responses, the clinical significance of a 1.65-point decrease in the STAI-S anxiety score is not precisely clear. A prior study that examined the effect of music therapy on anxiety levels in patients with cancer found an average anxiety reduction of approximately 9.5 units on the STAI-S scale after sensitivity analysis, suggesting a rather large meaningful effect size.31 Given that we found a reduction of 1.65 points for each empathic response, however, with a range of 0-14 negative emotions expressed over a median 19-minute encounter, there is opportunity for hospitalists to achieve a clinically significant decrease in patient anxiety during an admission encounter. The potential to reduce anxiety is extended further when we consider that the impact of an empathic response may apply not just to the admission encounter alone but also to numerous other patient-clinician interactions over the course of a hospitalization.
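To make the arithmetic behind this argument explicit, the sketch below extrapolates the per-response estimate linearly. It is a back-of-the-envelope illustration only: it assumes the 1.65-point association is additive across responses and reflects a causal effect, neither of which an observational study can establish, and the function names are hypothetical.

```python
import math

POINTS_PER_EMPATHIC_RESPONSE = 1.65  # adjusted estimate reported in the Results
MUSIC_THERAPY_REDUCTION = 9.5        # approximate reduction cited from reference 31


def projected_reduction(n_empathic_responses: int) -> float:
    """Linear extrapolation of the STAI-S decrease (assumption: additive effect)."""
    return POINTS_PER_EMPATHIC_RESPONSE * n_empathic_responses


def responses_needed(target_reduction: float) -> int:
    """Smallest number of empathic responses whose projection meets the target."""
    return math.ceil(target_reduction / POINTS_PER_EMPATHIC_RESPONSE)


print(responses_needed(MUSIC_THERAPY_REDUCTION))  # 6
print(round(projected_reduction(6), 2))           # 9.9
```

Under these assumptions, about 6 empathic responses would project to a reduction comparable to the 9.5-unit benchmark, which falls within the observed range of 0-14 emotional expressions per encounter.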
A healthy body of communication research supports the associations we found in our study between empathy and patient ratings of communication and physicians. Families in ICU conferences rate communication more positively when physicians express empathy,12 and a number of studies indicate an association between empathy and patient satisfaction in outpatient settings.8 Given the associations we found with negative ratings on the items in our study, promoting empathic responses to expressions of emotion and, more importantly, stressing avoidance of nonempathic responses may be relevant efforts in working to improve patient satisfaction scores on surveys reporting “top box” percentages, such as Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS). More notably, evidence indicates that empathy has positive impacts beyond satisfaction surveys, such as adherence, better diagnostic and clinical outcomes, and strengthening of patient enablement.8
Not all hospitalist responses to emotion were associated with patient ratings across the 7 communication items we assessed. For example, we did not find an association between how physicians responded to patient expressions of negative emotion and patient perception that enough time was spent in the visit or the degree to which talking with the doctor met a patient’s overall needs. It follows logically, and other research supports, that empathy would influence patient ratings of physician caring and trust,32 whereas other communication factors we were unable to measure (eg, physician body language, tone, and use of jargon and patient health literacy and primary language) may have a more significant association with patient ratings of the other items we assessed.
In considering the clinical application of our results, it is important to note that communication skills, including responding empathically to patient expressions of negative emotion, can be imparted through training in the same way as abdominal examination or electrocardiogram interpretation skills.33-35 However, training of hospitalists in communication skills requires time and some financial investment on the part of the physician, their hospital or group, or, ideally, both. Effective training methods, like those for other skill acquisition, involve learner-centered teaching and practicing skills with role-play and feedback.36 Given the importance of a learner-centered approach, learning would likely be better received and more effective if it was tailored to the specific needs and patient scenarios commonly encountered by hospitalist physicians. As these programs are developed, it will be important to assess the impact of any training on the patient-reported outcomes we assessed in this observational study, along with clinical outcomes.
Our study has several limitations. First, we were only able to evaluate whether hospitalists verbally responded to patient emotion and were thus not able to account for nonverbal empathy such as facial expressions, body language, or voice tone. Second, given our patient consent rate of 63%, patients who agreed to participate in the study may have had different opinions than those who declined to participate. Also, hospitalists and patients may have behaved differently as a result of being audio recorded. We only included patients who spoke English, and our patient population was predominately non-Hispanic white. Patients who spoke other languages or came from other cultural backgrounds may have had different responses. Third, we did not use a single validated scale for patient ratings of communication, and multiple analyses increase our risk of finding statistically significant associations by chance. The skewing of the communication rating items toward high scores may also have led to our results being driven by outliers, although the model we chose for analysis does penalize for this. Furthermore, our sample size was small, leading to wide CIs and potential for lack of statistical associations due to insufficient power. Our findings warrant replication in larger studies. Fourth, the setting of our study in an academic center may affect generalizability. Finally, the age of our data (collected between 2008 and 2009) is also a limitation. Given a recent focus on communication and patient experience since the initiation of HCAHPS feedback, a similar analysis of empathy and communication methods now may result in different outcomes.
In conclusion, our results suggest that enhancing hospitalists’ empathic responses to patient expressions of negative emotion could decrease patient anxiety and improve patients’ perceptions of (and thus possibly their relationships with) hospitalists, without sacrificing efficiency. Future work should focus on tailoring and implementing communication skills training programs for hospitalists and evaluating the impact of training on patient outcomes.
Acknowledgments
The authors extend their sincere thanks to the patients and physicians who participated in this study. Dr. Anderson was funded by the National Palliative Care Research Center and the University of California, San Francisco Clinical and Translational Science Institute Career Development Program, National Institutes of Health (NIH) grant number 5 KL2 RR024130-04. Project costs were funded by a grant from the University of California, San Francisco Academic Senate.
Disclosure
All coauthors have seen and agree with the contents of this manuscript. This submission is not under review by any other publication. Wendy Anderson received funding for this project from the National Palliative Care Research Center, University of California, San Francisco Clinical and Translational Science Institute (NIH grant number 5KL2RR024130-04), and the University of California, San Francisco Academic Senate [From Section 2 of Author Disclosure Form]. Andy Auerbach has a Patient-Centered Outcomes Research Institute research grant in development [From Section 3 of the Author Disclosure Form].
1. Walker FB, Novack DH, Kaiser DL, Knight A, Oblinger P. Anxiety and depression among medical and surgical patients nearing hospital discharge. J Gen Intern Med. 1987;2(2):99-101.
2. Castillo MI, Cooke M, Macfarlane B, Aitken LM. Factors associated with anxiety in critically ill patients: A prospective observational cohort study. Int J Nurs Stud. 2016;60:225-233.
3. Anderson WG, Winters K, Auerbach AD. Patient concerns at hospital admission. Arch Intern Med. 2011;171(15):1399-1400.
4. Kuo Y-F, Sharma G, Freeman JL, Goodwin JS. Growth in the care of older patients by hospitalists in the United States. N Engl J Med. 2009;360(11):1102-1112.
5. Wachter RM, Goldman L. Zero to 50,000 - The 20th Anniversary of the Hospitalist. N Engl J Med. 2016;375(11):1009-1011.
6. Mack JW, Block SD, Nilsson M, et al. Measuring therapeutic alliance between oncologists and patients with advanced cancer: the Human Connection Scale. Cancer. 2009;115(14):3302-3311.
7. Huff NG, Nadig N, Ford DW, Cox CE. Therapeutic Alliance between the Caregivers of Critical Illness Survivors and Intensive Care Unit Clinicians. [published correction appears in Ann Am Thorac Soc. 2016;13(4):576]. Ann Am Thorac Soc. 2015;12(11):1646-1653.
8. Derksen F, Bensing J, Lagro-Janssen A. Effectiveness of empathy in general practice: a systematic review. Br J Gen Pract. 2013;63(606):e76-e84.
9. Dwamena F, Holmes-Rovner M, Gaulden CM, et al. Interventions for providers to promote a patient-centred approach in clinical consultations. Cochrane Database Syst Rev. 2012;12:CD003267.
10. Fogarty LA, Curbow BA, Wingard JR, McDonnell K, Somerfield MR. Can 40 seconds of compassion reduce patient anxiety? J Clin Oncol. 1999;17(1):371-379.
11. Roter DL, Hall JA, Kern DE, Barker LR, Cole KA, Roca RP. Improving physicians’ interviewing skills and reducing patients’ emotional distress. A randomized clinical trial. Arch Intern Med. 1995;155(17):1877-1884.
12. Stapleton RD, Engelberg RA, Wenrich MD, Goss CH, Curtis JR. Clinician statements and family satisfaction with family conferences in the intensive care unit. Crit Care Med. 2006;34(6):1679-1685.
13. Hojat M, Louis DZ, Markham FW, Wender R, Rabinowitz C, Gonnella JS. Physicians’ empathy and clinical outcomes for diabetic patients. Acad Med. 2011;86(3):359-364.
14. Anderson WG, Winters K, Arnold RM, Puntillo KA, White DB, Auerbach AD. Studying physician-patient communication in the acute care setting: the hospitalist rapport study. Patient Educ Couns. 2011;82(2):275-279.
15. Pollak KI, Arnold RM, Jeffreys AS, et al. Oncologist communication about emotion during visits with patients with advanced cancer. J Clin Oncol. 2007;25(36):5748-5752.
16. Suchman AL, Markakis K, Beckman HB, Frankel R. A model of empathic communication in the medical interview. JAMA. 1997;277(8):678-682.
17. Adams K, Cimino JEW, Arnold RM, Anderson WG. Why should I talk about emotion? Communication patterns associated with physician discussion of patient expressions of negative emotion in hospital admission encounters. Patient Educ Couns. 2012;89(1):44-50.
18. Julian LJ. Measures of anxiety: State-Trait Anxiety Inventory (STAI), Beck Anxiety Inventory (BAI), and Hospital Anxiety and Depression Scale-Anxiety (HADS-A). Arthritis Care Res (Hoboken). 2011;63 Suppl 11:S467-S472.
19. Spielberger C, Ritterband L, Sydeman S, Reheiser E, Unger K. Assessment of emotional states and personality traits: measuring psychological vital signs. In: Butcher J, editor. Clinical personality assessment: practical approaches. New York: Oxford University Press; 1995.
20. Safran DG, Kosinski M, Tarlov AR, et al. The Primary Care Assessment Survey: tests of data quality and measurement performance. Med Care. 1998;36(5):728-739.
21. Azoulay E, Pochard F, Kentish-Barnes N, et al. Risk of post-traumatic stress symptoms in family members of intensive care unit patients. Am J Respir Crit Care Med. 2005;171(9):987-994.
22. Lynn J. Perspectives on care at the close of life. Serving patients who may die soon and their families: the role of hospice and other services. JAMA. 2001;285(7):925-932.
23. Kennifer SL, Alexander SC, Pollak KI, et al. Negative emotions in cancer care: do oncologists’ responses depend on severity and type of emotion? Patient Educ Couns. 2009;76(1):51-56.
24. Butow PN, Brown RF, Cogar S, Tattersall MHN, Dunn SM. Oncologists’ reactions to cancer patients’ verbal cues. Psychooncology. 2002;11(1):47-58.
25. Levinson W, Gorawara-Bhat R, Lamb J. A study of patient clues and physician responses in primary care and surgical settings. JAMA. 2000;284(8):1021-1027.
26. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20(1):37-46.
27. Fulop G. Anxiety disorders in the general hospital setting. Psychiatr Med. 1990;8(3):187-195.
28. Gerson S, Mistry R, Bastani R, et al. Symptoms of depression and anxiety (MHI) following acute medical/surgical hospitalization and post-discharge psychiatric diagnoses (DSM) in 839 geriatric US veterans. Int J Geriatr Psychiatry. 2004;19(12):1155-1167.
29. Kathol RG, Wenzel RP. Natural history of symptoms of depression and anxiety during inpatient treatment on general medicine wards. J Gen Intern Med. 1992;7(3):287-293.
30. Unsal A, Unaldi C, Baytemir C. Anxiety and depression levels of inpatients in the city centre of Kirşehir in Turkey. Int J Nurs Pract. 2011;17(4):411-418.
31. Bradt J, Dileo C, Grocke D, Magill L. Music interventions for improving psychological and physical outcomes in cancer patients. [Update appears in Cochrane Database Syst Rev. 2016;(8):CD006911] Cochrane Database Syst Rev. 2011;(8):CD006911.
32. Kim SS, Kaplowitz S, Johnston MV. The effects of physician empathy on patient satisfaction and compliance. Eval Health Prof. 2004;27(3):237-251.
33. Tulsky JA, Arnold RM, Alexander SC, et al. Enhancing communication between oncologists and patients with a computer-based training program: a randomized trial. Ann Intern Med. 2011;155(9):593-601.
34. Bays AM, Engelberg RA, Back AL, et al. Interprofessional communication skills training for serious illness: evaluation of a small-group, simulated patient intervention. J Palliat Med. 2014;17(2):159-166.
35. Epstein RM, Duberstein PR, Fenton JJ, et al. Effect of a Patient-Centered Communication Intervention on Oncologist-Patient Communication, Quality of Life, and Health Care Utilization in Advanced Cancer: The VOICE Randomized Clinical Trial. JAMA Oncol. 2017;3(1):92-100.
36. Berkhof M, van Rijssen HJ, Schellart AJM, Anema JR, van der Beek AJ. Effective training strategies for teaching communication skills to physicians: an overview of systematic reviews. Patient Educ Couns. 2011;84(2):152-162.
1. Walker FB, Novack DH, Kaiser DL, Knight A, Oblinger P. Anxiety and depression among medical and surgical patients nearing hospital discharge. J Gen Intern Med. 1987;2(2):99-101. PubMed
2. Castillo MI, Cooke M, Macfarlane B, Aitken LM. Factors associated with anxiety in critically ill patients: A prospective observational cohort study. Int J Nurs Stud. 2016;60:225-233. PubMed
3. Anderson WG, Winters K, Auerbach AD. Patient concerns at hospital admission. Arch Intern Med. 2011;171(15):1399-1400. PubMed
4. Kuo Y-F, Sharma G, Freeman JL, Goodwin JS. Growth in the care of older patients by hospitalists in the United States. N Engl J Med. 2009;360(11):1102-1112. PubMed
5. Wachter RM, Goldman L. Zero to 50,000 - The 20th Anniversary of the Hospitalist. N Engl J Med. 2016;375(11):1009-1011. PubMed
6. Mack JW, Block SD, Nilsson M, et al. Measuring therapeutic alliance between oncologists and patients with advanced cancer: the Human Connection Scale. Cancer. 2009;115(14):3302-3311. PubMed
7. Huff NG, Nadig N, Ford DW, Cox CE. Therapeutic Alliance between the Caregivers of Critical Illness Survivors and Intensive Care Unit Clinicians. [published correction appears in Ann Am Thorac Soc. 2016;13(4):576]. Ann Am Thorac Soc. 2015;12(11):1646-1653. PubMed
8. Derksen F, Bensing J, Lagro-Janssen A. Effectiveness of empathy in general practice: a systematic review. Br J Gen Pract. 2013;63(606):e76-e84. PubMed
9. Dwamena F, Holmes-Rovner M, Gaulden CM, et al. Interventions for providers to promote a patient-centred approach in clinical consultations. Cochrane Database Syst Rev. 2012;12:CD003267. PubMed
10. Fogarty LA, Curbow BA, Wingard JR, McDonnell K, Somerfield MR. Can 40 seconds of compassion reduce patient anxiety? J Clin Oncol. 1999;17(1):371-379. PubMed
11. Roter DL, Hall JA, Kern DE, Barker LR, Cole KA, Roca RP. Improving physicians’ interviewing skills and reducing patients’ emotional distress. A randomized clinical trial. Arch Intern Med. 1995;155(17):1877-1884. PubMed
12. Stapleton RD, Engelberg RA, Wenrich MD, Goss CH, Curtis JR. Clinician statements and family satisfaction with family conferences in the intensive care unit. Crit Care Med. 2006;34(6):1679-1685. PubMed
13. Hojat M, Louis DZ, Markham FW, Wender R, Rabinowitz C, Gonnella JS. Physicians’ empathy and clinical outcomes for diabetic patients. Acad Med. 2011;86(3):359-364. PubMed
14. Anderson WG, Winters K, Arnold RM, Puntillo KA, White DB, Auerbach AD. Studying physician-patient communication in the acute care setting: the hospitalist rapport study. Patient Educ Couns. 2011;82(2):275-279. PubMed
15. Pollak KI, Arnold RM, Jeffreys AS, et al. Oncologist communication about emotion during visits with patients with advanced cancer. J Clin Oncol. 2007;25(36):5748-5752. PubMed
16. Suchman AL, Markakis K, Beckman HB, Frankel R. A model of empathic communication in the medical interview. JAMA. 1997;277(8):678-682. PubMed
17. Adams K, Cimino JEW, Arnold RM, Anderson WG. Why should I talk about emotion? Communication patterns associated with physician discussion of patient expressions of negative emotion in hospital admission encounters. Patient Educ Couns. 2012;89(1):44-50. PubMed
18. Julian LJ. Measures of anxiety: State-Trait Anxiety Inventory (STAI), Beck Anxiety Inventory (BAI), and Hospital Anxiety and Depression Scale-Anxiety (HADS-A). Arthritis Care Res (Hoboken). 2011;63 Suppl 11:S467-S472. PubMed
19. Speilberger C, Ritterband L, Sydeman S, Reheiser E, Unger K. Assessment of emotional states and personality traits: measuring psychological vital signs. In: Butcher J, editor. Clinical personality assessment: practical approaches. New York: Oxford University Press; 1995.
20. Safran DG, Kosinski M, Tarlov AR, et al. The Primary Care Assessment Survey: tests of data quality and measurement performance. Med Care. 1998;36(5):728-739. PubMed
21. Azoulay E, Pochard F, Kentish-Barnes N, et al. Risk of post-traumatic stress symptoms in family members of intensive care unit patients. Am J Respir Crit Care Med. 2005;171(9):987-994. PubMed
22. Lynn J. Perspectives on care at the close of life. Serving patients who may die soon and their families: the role of hospice and other services. JAMA. 2001;285(7):925-932. PubMed
23. Kennifer SL, Alexander SC, Pollak KI, et al. Negative emotions in cancer care: do oncologists’ responses depend on severity and type of emotion? Patient Educ Couns. 2009;76(1):51-56. PubMed
24. Butow PN, Brown RF, Cogar S, Tattersall MHN, Dunn SM. Oncologists’ reactions to cancer patients’ verbal cues. Psychooncology. 2002;11(1):47-58. PubMed
25. Levinson W, Gorawara-Bhat R, Lamb J. A study of patient clues and physician responses in primary care and surgical settings. JAMA. 2000;284(8):1021-1027. PubMed
26. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20(1):37-46.
27. Fulop G. Anxiety disorders in the general hospital setting. Psychiatr Med. 1990;8(3):187-195. PubMed
28. Gerson S, Mistry R, Bastani R, et al. Symptoms of depression and anxiety (MHI) following acute medical/surgical hospitalization and post-discharge psychiatric diagnoses (DSM) in 839 geriatric US veterans. Int J Geriatr Psychiatry. 2004;19(12):1155-1167. PubMed
29. Kathol RG, Wenzel RP. Natural history of symptoms of depression and anxiety during inpatient treatment on general medicine wards. J Gen Intern Med. 1992;7(3):287-293. PubMed
30. Unsal A, Unaldi C, Baytemir C. Anxiety and depression levels of inpatients in the city centre of Kirşehir in Turkey. Int J Nurs Pract. 2011;17(4):411-418. PubMed
31. Bradt J, Dileo C, Grocke D, Magill L. Music interventions for improving psychological and physical outcomes in cancer patients. [Update appears in Cochrane Database Syst Rev. 2016;(8):CD006911] Cochrane Database Syst Rev. 2011;(8):CD006911. PubMed
32. Kim SS, Kaplowitz S, Johnston MV. The effects of physician empathy on patient satisfaction and compliance. Eval Health Prof. 2004;27(3):237-251. PubMed
33. Tulsky JA, Arnold RM, Alexander SC, et al. Enhancing communication between oncologists and patients with a computer-based training program: a randomized trial. Ann Intern Med. 2011;155(9):593-601. PubMed
34. Bays AM, Engelberg RA, Back AL, et al. Interprofessional communication skills training for serious illness: evaluation of a small-group, simulated patient intervention. J Palliat Med. 2014;17(2):159-166. PubMed
35. Epstein RM, Duberstein PR, Fenton JJ, et al. Effect of a Patient-Centered Communication Intervention on Oncologist-Patient Communication, Quality of Life, and Health Care Utilization in Advanced Cancer: The VOICE Randomized Clinical Trial. JAMA Oncol. 2017;3(1):92-100. PubMed
36. Berkhof M, van Rijssen HJ, Schellart AJM, Anema JR, van der Beek AJ. Effective training strategies for teaching communication skills to physicians: an overview of systematic reviews. Patient Educ Couns. 2011;84(2):152-162. PubMed
© 2017 Society of Hospital Medicine
Sound and Light Levels Are Similarly Disruptive in ICU and non-ICU Wards
The hospital environment fails to promote adequate sleep for acutely or critically ill patients. Intensive care units (ICUs) have received the most scrutiny, because critically ill patients suffer from severely fragmented sleep as well as a lack of deeper, more restorative sleep.1-4 ICU survivors frequently cite sleep deprivation, contributed to by ambient noise, as a major stressor while receiving care.5,6 Importantly, efforts to modify the ICU environment to promote sleep have been associated with reductions in delirium.7,8 However, sleep deprivation and delirium in the hospital are not limited to ICU patients.
Sleep in the non-ICU setting is also notoriously poor, with 50%-80% of patients reporting sleep as “unsound” or otherwise subjectively poor.9-11 Additionally, patients frequently ask for and/or receive pharmacological sleeping aids12 despite little evidence of efficacy13 and increasing evidence of harm.14 Here too, efforts to improve sleep seem to attenuate risk of delirium,15 which remains a substantial problem on general wards, with incidence reported as high as 20%-30%. The reasons for poor sleep in the hospital are multifactorial, but data suggest that the inpatient environment, including noise and light levels, which are measurable and modifiable entities, contributes significantly to the problem.16
The World Health Organization (WHO) recommends that nighttime baseline noise levels do not exceed 30 decibels (dB) and that nighttime noise peaks (ie, loud noises) do not exceed 40 dB17; most studies suggest that ICU and general ward rooms are above this range on average.10,18 Others have also demonstrated an association between loud noises and patients’ subjective perception of poor sleep.10,19 However, when considering clinically important noise, peak and average noise levels may not be the key factor in causing arousals from sleep. Buxton and colleagues20 found that noise quality affects arousal probability; for example, electronic alarms and conversational noise are more likely to cause awakenings compared with the opening or closing of doors and ice machines. Importantly, peak and average noise levels may also matter less for sleep than do sound level changes (SLCs), which are defined as the difference between background/baseline noise and peak noise. Using healthy subjects exposed to simulated ICU noise, Stanchina et al.21 found that SLCs >17.5 dB were more likely to cause polysomnographic arousals from sleep regardless of peak noise level. This sound pressure change of approximately 20 dB would be perceived as 4 times louder, or, as an example, would be the difference between normal conversation between 2 people (~40 dB) that is then interrupted by the start of a vacuum cleaner (~60 dB). To our knowledge, no other studies have closely examined SLCs in different hospital environments.
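The arithmetic behind the "4 times louder" statement can be sketched as follows. This is an illustration only: it assumes the common psychoacoustic rule of thumb that each 10 dB increase is perceived as roughly a doubling of loudness, and the standard 20·log10 definition of sound pressure level. The function names are hypothetical and are not taken from the cited studies.

```python
def pressure_ratio(delta_db: float) -> float:
    """Sound-pressure ratio for a level change in dB (20*log10 definition)."""
    return 10 ** (delta_db / 20)


def perceived_loudness_ratio(delta_db: float) -> float:
    """ASSUMPTION: rule of thumb that +10 dB sounds about twice as loud."""
    return 2 ** (delta_db / 10)


if __name__ == "__main__":
    # 17.5 dB is the SLC threshold from the text; 20 dB is the rounded example.
    for delta in (17.5, 20.0):
        print(
            f"{delta:.1f} dB: pressure x{pressure_ratio(delta):.1f}, "
            f"perceived loudness ~x{perceived_loudness_ratio(delta):.1f}"
        )
```

Under this rule of thumb, a 20 dB change corresponds to a tenfold change in sound pressure but only about a fourfold change in perceived loudness, which is why SLCs are better judged on the decibel scale than by raw pressure.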
Ambient light also likely affects sleep quality in the hospital. The circadian rhythm system, which controls the human sleep–wake cycle as well as multiple other physiologic functions, depends on ambient light as the primary external factor for regulating the internal clock.22,23 Insufficient and inappropriately timed light exposure can desynchronize the biological clock, thereby negatively affecting sleep quality.24,25 Conversely, patients exposed to early-morning bright light may sleep better while in the hospital.16 In addition to sleep patterns, ambient light affects other aspects of patient care; for example, lower light levels in the hospital have recently been associated with higher levels of fatigue and mood disturbance.26
A growing body of data has investigated the ambient environment in the ICU, but fewer studies have focused on sound and light analysis in other inpatient areas such as the general ward and telemetry floors. We examined sound and light levels in the ICU and non-ICU environment, hypothesizing that average sound levels would be higher in the ICU than on non-ICU floors but that the number of SLCs >17.5 dB would be similar. Additionally, we expected that average light levels would be higher in the ICU than on non-ICU floors.
METHODS
This was an observational study of the sound and light environment in the inpatient setting. Per our Institutional Review Board, no consent was required. Battery-operated sound-level (SDL600, Extech Instruments, Nashua, NH) and light-level (SDL400, Extech Instruments, Nashua, NH) meters were placed in 24 patient rooms in our tertiary-care adult hospital in La Jolla, CA. Recordings were obtained in randomly selected, single-patient occupied rooms that were from 3 different hospital units and included 8 general ward rooms, 8 telemetry floor rooms, and 8 ICU rooms. We recorded for approximately 24-72 hours. Depending on the geographic layout of the room, meters were placed as close to the head of the patient’s bed as possible and were generally not placed farther than 2 meters away from the patient’s head of bed; all rooms contained a window.
Sound Measurements
Sound meters measured ambient noise in dB every 2 seconds and were set for A-weighted frequency measurements. We averaged individual data points to obtain hourly averages for ICU and non-ICU rooms. For hourly sound averages, we further separated the data to compare the general ward and telemetry floors (both non-ICU), the latter of which has more patient monitoring and a lower nurse-to-patient ratio compared with the general ward floor.
Data from ICU versus non-ICU rooms were analyzed for the number of sound peaks, defined as the number of times sound levels exceeded 65 dB, 70 dB, or 80 dB. Peak counts were averaged over the full 24-hour day and over the nighttime (10 PM to 6 AM). We also calculated the average number of SLCs ≥17.5 dB observed over 24 hours and over the nighttime.
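The processing described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study: the text does not state how baseline noise was estimated or how consecutive loud readings were counted, so the sketch assumes a baseline equal to the median of the preceding 60 seconds and treats a run of consecutive above-threshold readings as a single event. All function names and the input format are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

SAMPLE_PERIOD_S = 2      # from the Methods: one reading every 2 seconds
SLC_THRESHOLD_DB = 17.5  # from the Methods
BASELINE_WINDOW = 30     # ASSUMPTION: preceding 30 readings (60 s) define baseline


def hourly_averages(samples):
    """Arithmetic mean dB per clock hour; samples = [(seconds_since_midnight, dB)]."""
    by_hour = defaultdict(list)
    for t, db in samples:
        by_hour[(t // 3600) % 24].append(db)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def is_night(t):
    """True for 10 PM to 6 AM, the nighttime window given in the text."""
    hour = (t // 3600) % 24
    return hour >= 22 or hour < 6


def count_peaks(samples, threshold_db):
    """Count excursions above threshold_db (ASSUMPTION: a loud run counts once)."""
    count, above = 0, False
    for _, db in samples:
        now_above = db > threshold_db
        if now_above and not above:
            count += 1
        above = now_above
    return count


def count_slcs(samples, threshold_db=SLC_THRESHOLD_DB, window=BASELINE_WINDOW):
    """Count rises of at least threshold_db over the rolling-median baseline."""
    levels = [db for _, db in samples]
    count, in_event = 0, False
    for i in range(window, len(levels)):
        baseline = median(levels[i - window:i])
        jump = levels[i] - baseline >= threshold_db
        if jump and not in_event:
            count += 1
        in_event = jump
    return count


if __name__ == "__main__":
    # Synthetic readings, NOT study data: a 45 dB room with one 4-second 70 dB event.
    demo = [(i * SAMPLE_PERIOD_S, 45.0) for i in range(100)]
    demo[50] = (100, 70.0)
    demo[51] = (102, 70.0)
    print(hourly_averages(demo), count_peaks(demo, 65), count_slcs(demo))
```

Nighttime counts can be obtained by first filtering the readings with `is_night`. A different baseline definition (for example, a longer window or a percentile) would change the SLC counts, so the choice should be reported alongside any results.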
Light Measurements
Light meters measured illuminance in lux every 120 seconds. We averaged individual data points to obtain hourly averages for ICU and non-ICU rooms. In addition to hourly averages, light-level data were analyzed for maximum levels throughout the day and night.
Statistical Analysis
Hourly sound-level averages between the 3 floors were evaluated using a 1-way analysis of variance (ANOVA); sound averages from the general ward and telemetry floor were also compared at each hour using a Student t test. Light-level, sound-level peak, and SLC data were also evaluated using a Student t test.
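For readers less familiar with these tests, the two statistics can be sketched using only the Python standard library. This is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the input values are made up, and the t statistic assumes the pooled-variance (equal-variance) form of the Student t test. Converting either statistic to a P value requires the F or t distribution from a statistics package and is omitted here.

```python
from statistics import mean, variance


def one_way_anova_f(groups):
    """F statistic for a 1-way ANOVA; groups is a list of lists of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def student_t(a, b):
    """Pooled-variance Student t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


if __name__ == "__main__":
    # Hypothetical hourly dB averages for one clock hour, NOT study data.
    general_ward = [48.0, 50.0, 49.0, 51.0]
    telemetry = [52.0, 53.0, 51.0, 54.0]
    icu = [58.0, 60.0, 59.0, 61.0]
    print("F =", one_way_anova_f([general_ward, telemetry, icu]))
    print("t =", student_t(general_ward, telemetry))
```

In this design, the ANOVA would be run once per clock hour across the 3 floors, and the t test once per clock hour for each pairwise comparison of interest.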
RESULTS
Sound Measurements
Examples of the raw data distribution for individual sound recordings in an ICU and non-ICU room are shown in Figure 1A and 1B. Sound-level analyses, with specific average values and significance levels between ICU and non-ICU rooms (with non-ICU rooms further divided between telemetry and general ward floors for purposes of hourly averages), are shown in Table 1. The average hourly values in all 3 locations were always above the 30-35 dB level (nighttime and daytime, respectively) recommended by the WHO (Figure 1C). A 1-way ANOVA revealed significant differences between the 3 floors at all time points except for 10 AM. An analysis of the means at each time point between the telemetry floor and the general ward floor showed that the telemetry floor had significantly higher sound averages compared with the general ward floor at 10 PM, 11 PM, and 12 AM. Sound levels dropped during the nighttime on both non-ICU wards but remained fairly constant throughout the day and night in the ICU.
Importantly, despite average and peak sound levels showing that the ICU environment is louder overall, there was an equivalent number of SLCs ≥17.5 dB in the ICU and on non-ICU floors. The number of SLCs ≥17.5 dB was not statistically different between ICU and non-ICU rooms, whether averaged over 24 hours or over the nighttime (Figure 1E).
Light Measurements
Examples of light levels over a 24-hour period in an ICU and non-ICU room are shown in Figure 2A and 2B, respectively. Maximum average light levels (reported here as average value ± standard deviation to demonstrate variability within the data) in the ICU were 169.7 ± 127.1 lux and occurred at 1 PM, while maximum average light levels in the non-ICU rooms were 213.5 ± 341.6 lux and occurred at 5 PM (Figure 2C). Average light levels in the morning hours remained low and ranged from 15.9 ± 12.7 lux to 38.9 ± 43.4 lux in the ICU and from 22.3 ± 17.5 lux to 100.7 ± 92.0 lux on the non-ICU floors. The maximum measured level from any of the recordings was 2530 lux and occurred in a general ward room in the 5 PM hour. Overall, light averages remained low, but this particular room had light levels that were significantly higher than the others. A t test analysis of the hourly averages revealed only 1 time point of significant difference between the 2 floors; at 7 AM, the general ward floor had a higher lux level of 49.9 ± 27.5 versus 19.2 ± 10.7 in the ICU (P = 0.038). Otherwise, there were no differences between light levels in ICU rooms versus non-ICU rooms. Evaluation of the data revealed a substantial amount of variability in light levels throughout the daytime hours. Light levels during the nighttime remained low and were not significantly different between the 2 groups.
DISCUSSION
To our knowledge, this is the first study to directly compare the ICU and non-ICU environments for their potential impact on sleep and circadian alignment. Our study adds to the literature with several novel findings. First, average sound levels on non-ICU wards are lower than in the ICU. Second, although non-ICU wards were quieter on average, SLCs >17.5 dB occurred an equivalent number of times in both the ICU and non-ICU wards. Third, average daytime light levels in both the ICU and non-ICU environment are low. Lastly, peak light levels for both ICU and non-ICU wards occur later in the day instead of in the morning. All of these findings have potential implications for optimizing the ward environment to better support patients' sleep.
Sound-Level Findings
Data on sound levels for non-ICU floors are limited but mostly consistent with our findings.
Average and peak sound levels contribute to the ambient noise experienced by patients but may not be the source of sleep disruptions. Using polysomnography in healthy subjects exposed to recordings of ICU noise, Stanchina et al.21 showed that SLCs from baseline and not peak sound levels determined whether a subject was aroused from sleep by sound. Accordingly, they also found that increasing baseline sound levels by using white noise reduced the number of arousals that subjects experienced. To our knowledge, other studies have not quantified and compared SLCs in the ICU and non-ICU environments. Our data show that patients on non-ICU floors experience at least the same number of SLCs, and thereby the same potential for arousals from sleep, when compared with ICU patients. The higher baseline level of noise in the ICU likely explains the relatively lower number of SLCs when compared with the non-ICU floors. Although decreasing overall noise to promote sleep in the hospital seems like the obvious solution, the treatment for noise pollution in the hospital may actually be more background noise, not less.
Recent studies support the clinical implications of our findings. First, decreasing overall noise levels is difficult to accomplish.29 Second, recent studies utilized white noise in different hospital settings with some success in improving patients’ subjective sleep quality, although more studies using objective data measurements are needed to further understand the impact of white noise on sleep in hospitalized patients.30,31 Third, efforts at reducing interruptions—which likely will decrease the number of SLCs—such as clustering nursing care or reducing intermittent alarms may be more beneficial in improving sleep than efforts at decreasing average sound levels. For example, Bartick et al. reduced the number of patient interruptions at night by eliminating routine vital signs and clustering medication administration. Although they included other interventions as well, we note that this approach likely reduced SLCs and was associated with a reduction in the use of sedative medications.32 Ultimately, our data show that a focus on reducing SLCs will be one necessary component of a multipronged solution to improving inpatient sleep.33
Light-Level Findings
Because of its effect on circadian rhythms, the daily light-dark cycle has a powerful impact on human physiology and behavior, which includes sleep.34 Little is understood about how light affects sleep and other circadian-related functions in general ward patients, as it is not commonly measured. Our findings suggest that patients admitted to the hospital are exposed to light levels and patterns that may not optimally promote wake and sleep. Encouragingly, we did not find excessive average light levels during the nighttime in either ICU or non-ICU environment of our hospital, although others have described intrusive nighttime light in the hospital setting.35,36 Even short bursts of low or moderate light during the nighttime can cause circadian phase delay,37 and efforts to maintain darkness in patient rooms at night should continue.
Our measurements show that average daytime light levels did not exceed 250 lux, which corresponds to low, office-level lighting, while the brightest average light levels occurred in the afternoon for both environments. These levels are consistent with other reports26,35,36 as is the light-level variability noted throughout the day (which is not unexpected given room positioning, patient preference, curtains, etc). The level and amount of daytime light needed to maintain circadian rhythms in humans is still unknown.38 Brighter light is generally more effective at influencing the circadian pacemaker in a dose-dependent manner.39 Although entrainment (synchronization of the body’s biological rhythm with environmental cues such as ambient light) of the human circadian rhythm has been shown with low light levels (eg, <100 lux), these studies included healthy volunteers in a carefully controlled, constant, routine environment.23 How these data apply to acutely ill subjects in the hospital environment is not clear. We note that low to moderate levels of light (50-1000 lux) are less effective for entrainment of the circadian rhythm in older people (age >65 years, the majority of our admissions) compared with younger people. Thus, older, hospitalized patients may require greater light levels for regulation of the sleep-wake cycle.40 These data are important when designing interventions to improve light for and maintain circadian rhythms in hospitalized patients. For example, Simons et al. found that dynamic light-application therapy, which achieved a maximum average lux level of <800 lux, did not reduce rates of delirium in critically ill patients (mean age ~65). One interpretation of these results, though there are many others, is that the light levels achieved were not high enough to influence circadian timing in hospitalized, mostly elderly patients. The physiological impact of light on the circadian rhythm in hospitalized patients still remains to be measured.
LIMITATIONS
Our study has a few limitations. We did not assess sound quality, which is another determinant of arousal potential.20 Also, a shorter measurement interval might be useful in determining sharper sound increases. It may also be important to consider A- versus C-weighted measurements of sound levels, as A-weighted measurements usually reflect higher-frequency sound while C-weighted measurements usually reflect low-frequency noise18; we obtained only A-weighted measurements in our study. However, A-weighted measurements are generally considered more reflective of what the human ear considers noise and are more commonly used than C-weighted measurements.
Regarding light measurements, we recorded from rooms facing different cardinal directions and during different times of the year, which likely contributed to some of the variability in the daytime light levels on both floors. Additionally, light levels were not measured directly at the patient’s eye level. However, given that overhead fluorescent lighting was the primary source of lighting, it is doubtful that we substantially underestimated optic-nerve light levels. In the future, it may also be important to measure the different wavelengths of lights, as blue light may have a greater impact on sleep than other wavelengths.41 Although our findings align with others’, we note that this was a single-center study, which could limit the generalizability of our findings given inter-hospital variations in patient volume, interior layout and structure, and geographic location.
CONCLUSIONS
Overall, our study suggests that the light and sound environment for sleep in the inpatient setting, including both the ICU and non-ICU wards, has multiple areas for improvement. Our data also suggest specific directions for future clinical efforts at improvement. For example, efforts to decrease average sound levels may worsen sleep fragmentation. Similarly, more light during the day may be more helpful than further attempts to limit light during the night.
Disclosure
This research was funded in part by an NIH/NCATS flagship Clinical and Translational Science Award Grant (5KL2TR001112). None of the authors report any conflict of interest, financial or otherwise, in the preparation of this article.
1. Freedman NS, Gazendam J, Levan L, Pack AI, Schwab RJ. Abnormal sleep/wake cycles and the effect of environmental noise on sleep disruption in the intensive care unit. Am J Respir Crit Care Med. 2001;163(2):451-457. PubMed
2. Watson PL, Pandharipande P, Gehlbach BK, et al. Atypical sleep in ventilated patients: empirical electroencephalography findings and the path toward revised ICU sleep scoring criteria. Crit Care Med. 2013;41(8):1958-1967. PubMed
3. Gehlbach BK, Chapotot F, Leproult R, et al. Temporal disorganization of circadian rhythmicity and sleep-wake regulation in mechanically ventilated patients receiving continuous intravenous sedation. Sleep. 2012;35(8):1105-1114. PubMed
4. Elliott R, McKinley S, Cistulli P, Fien M. Characterisation of sleep in intensive care using 24-hour polysomnography: an observational study. Crit Care. 2013;17(2):R46. PubMed
5. Novaes MA, Aronovich A, Ferraz MB, Knobel E. Stressors in ICU: patients’ evaluation. Intensive Care Med. 1997;23(12):1282-1285. PubMed
6. Tembo AC, Parker V, Higgins I. The experience of sleep deprivation in intensive care patients: findings from a larger hermeneutic phenomenological study. Intensive Crit Care Nurs. 2013;29(6):310-316. PubMed
7. Kamdar BB, Yang J, King LM, et al. Developing, implementing, and evaluating a multifaceted quality improvement intervention to promote sleep in an ICU. Am J Med Qual. 2014;29(6):546-554. PubMed
8. Patel J, Baldwin J, Bunting P, Laha S. The effect of a multicomponent multidisciplinary bundle of interventions on sleep and delirium in medical and surgical intensive care patients. Anaesthesia. 2014;69(6):540-549. PubMed
9. Manian FA, Manian CJ. Sleep quality in adult hospitalized patients with infection: an observational study. Am J Med Sci. 2015;349(1):56-60. PubMed
10. Park MJ, Yoo JH, Cho BW, Kim KT, Jeong WC, Ha M. Noise in hospital rooms and sleep disturbance in hospitalized medical patients. Environ Health Toxicol. 2014;29:e2014006. PubMed
11. Dobing S, Frolova N, McAlister F, Ringrose J. Sleep quality and factors influencing self-reported sleep duration and quality in the general internal medicine inpatient population. PLoS One. 2016;11(6):e0156735. PubMed
12. Gillis CM, Poyant JO, Degrado JR, Ye L, Anger KE, Owens RL. Inpatient pharmacological sleep aid utilization is common at a tertiary medical center. J Hosp Med. 2014;9(10):652-657. PubMed
13. Krenk L, Jennum P, Kehlet H. Postoperative sleep disturbances after zolpidem treatment in fast-track hip and knee replacement. J Clin Sleep Med. 2014;10(3):321-326. PubMed
14. Kolla BP, Lovely JK, Mansukhani MP, Morgenthaler TI. Zolpidem is independently associated with increased risk of inpatient falls. J Hosp Med. 2013;8(1):1-6. PubMed
15. Inouye SK, Bogardus ST Jr, Charpentier PA, et al. A multicomponent intervention to prevent delirium in hospitalized older patients. N Engl J Med. 1999;340(9):669-676. PubMed
16. Bano M, Chiaromanni F, Corrias M, et al. The influence of environmental factors on sleep quality in hospitalized medical patients. Front Neurol. 2014;5:267. PubMed
17. Berglund B, Lindvall T, Schwela DH. Guidelines for Community Noise. World Health Organization. 1999.
18. Knauert M, Jeon S, Murphy TE, Yaggi HK, Pisani MA, Redeker NS. Comparing average levels and peak occurrence of overnight sound in the medical intensive care unit on A-weighted and C-weighted decibel scales. J Crit Care. 2016;36:1-7. PubMed
19. Yoder JC, Staisiunas PG, Meltzer DO, Knutson KL, Arora VM. Noise and sleep among adult medical inpatients: far from a quiet night. Arch Intern Med. 2012;172(1):68-70. PubMed
20. Buxton OM, Ellenbogen JM, Wang W, et al. Sleep disruption due to hospital noises: a prospective evaluation. Ann Intern Med. 2012;157(3):170-179. PubMed
21. Stanchina ML, Abu-Hijleh M, Chaudhry BK, Carlisle CC, Millman RP. The influence of white noise on sleep in subjects exposed to ICU noise. Sleep Med. 2005;6(5):423-428. PubMed
22. Czeisler CA, Allan JS, Strogatz SH, et al. Bright light resets the human circadian pacemaker independent of the timing of the sleep-wake cycle. Science. 1986;233(4764):667-671. PubMed
23. Duffy JF, Czeisler CA. Effect of light on human circadian physiology. Sleep Med Clin. 2009;4(2):165-177. PubMed
24. Lewy AJ, Wehr TA, Goodwin FK, Newsome DA, Markey SP. Light suppresses melatonin secretion in humans. Science. 1980;210(4475):1267-1269. PubMed
25. Zeitzer JM, Dijk DJ, Kronauer R, Brown E, Czeisler C. Sensitivity of the human circadian pacemaker to nocturnal light: melatonin phase resetting and suppression. J Physiol. 2000;526:695-702. PubMed
26. Bernhofer EI, Higgins PA, Daly BJ, Burant CJ, Hornick TR. Hospital lighting and its association with sleep, mood and pain in medical inpatients. J Adv Nurs. 2014;70(5):1164-1173. PubMed
27. Darbyshire JL, Young JD. An investigation of sound levels on intensive care units with reference to the WHO guidelines. Crit Care. 2013;17(5):R187. PubMed
28. Gillis S. Pharmacologic treatment of depression during pregnancy. J Midwifery Womens Health. 2000;45(4):357-359. PubMed
29. Tainter CR, Levine AR, Quraishi SA, et al. Noise levels in surgical ICUs are consistently above recommended standards. Crit Care Med. 2016;44(1):147-152. PubMed
30. Farrehi PM, Clore KR, Scott JR, Vanini G, Clauw DJ. Efficacy of Sleep Tool Education During Hospitalization: A Randomized Controlled Trial. Am J Med. 2016;129(12):1329.e9-1329.e17. PubMed
31. Farokhnezhad Afshar P, Bahramnezhad F, Asgari P, Shiri M. Effect of white noise on sleep in patients admitted to a coronary care. J Caring Sci. 2016;5(2):103-109. PubMed
32. Bartick MC, Thai X, Schmidt T, Altaye A, Solet JM. Decrease in as-needed sedative use by limiting nighttime sleep disruptions from hospital staff. J Hosp Med. 2010;5(3):E20-E24. PubMed
33. Tamrat R, Huynh-Le MP, Goyal M. Non-pharmacologic interventions to improve the sleep of hospitalized patients: a systematic review. J Gen Intern Med. 2014;29(5):788-795. PubMed
34. Dijk DJ, Archer SN. Light, sleep, and circadian rhythms: together again. PLoS Biol. 2009;7(6):e1000145. PubMed
35. Verceles AC, Liu X, Terrin ML, et al. Ambient light levels and critical care outcomes. J Crit Care. 2013;28(1):110.e1-110.e8. PubMed
36. Hu RF, Hegadoren KM, Wang XY, Jiang XY. An investigation of light and sound levels on intensive care units in China. Aust Crit Care. 2016;29(2):62-67. PubMed
37. Zeitzer JM, Ruby NF, Fisicaro RA, Heller HC. Response of the human circadian system to millisecond flashes of light. PLoS One. 2011;6(7):e22078. PubMed
38. Duffy JF, Wright KP, Jr. Entrainment of the human circadian system by light. J Biol Rhythms. 2005;20(4):326-338. PubMed
39. Wright KP Jr, Gronfier C, Duffy JF, Czeisler CA. Intrinsic period and light intensity determine the phase relationship between melatonin and sleep in humans. J Biol Rhythms. 2005;20(2):168-177. PubMed
40. Duffy JF, Zeitzer JM, Czeisler CA. Decreased sensitivity to phase-delaying effects of moderate intensity light in older subjects. Neurobiol Aging. 2007;28(5):799-807. PubMed
41. Figueiro MG, Plitnick BA, Lok A, et al. Tailored lighting intervention improves measures of sleep, depression, and agitation in persons with Alzheimer’s disease and related dementia living in long-term care facilities. Clin Interv Aging. 2014;9:1527-1537. PubMed
Light-Level Findings
Because of its effect on circadian rhythms, the daily light-dark cycle has a powerful impact on human physiology and behavior, which includes sleep.34 Little is understood about how light affects sleep and other circadian-related functions in general ward patients, as it is not commonly measured. Our findings suggest that patients admitted to the hospital are exposed to light levels and patterns that may not optimally promote wake and sleep. Encouragingly, we did not find excessive average light levels during the nighttime in either ICU or non-ICU environment of our hospital, although others have described intrusive nighttime light in the hospital setting.35,36 Even short bursts of low or moderate light during the nighttime can cause circadian phase delay,37 and efforts to maintain darkness in patient rooms at night should continue.
Our measurements show that average daytime light levels did not exceed 250 lux, which corresponds to low, office-level lighting, while the brightest average light levels occurred in the afternoon for both environments. These levels are consistent with other reports26,35,36 as is the light-level variability noted throughout the day (which is not unexpected given room positioning, patient preference, curtains, etc). The level and amount of daytime light needed to maintain circadian rhythms in humans is still unknown.38 Brighter light is generally more effective at influencing the circadian pacemaker in a dose-dependent manner.39 Although entrainment (synchronization of the body’s biological rhythm with environmental cues such as ambient light) of the human circadian rhythm has been shown with low light levels (eg, <100 lux), these studies included healthy volunteers in a carefully controlled, constant, routine environment.23 How these data apply to acutely ill subjects in the hospital environment is not clear. We note that low to moderate levels of light (50-1000 lux) are less effective for entrainment of the circadian rhythm in older people (age >65 years, the majority of our admissions) compared with younger people. Thus, older, hospitalized patients may require greater light levels for regulation of the sleep-wake cycle.40 These data are important when designing interventions to improve light for and maintain circadian rhythms in hospitalized patients. For example, Simons et al. found that dynamic light-application therapy, which achieved a maximum average lux level of <800 lux, did not reduce rates of delirium in critically ill patients (mean age ~65). One interpretation of these results, though there are many others, is that the light levels achieved were not high enough to influence circadian timing in hospitalized, mostly elderly patients. The physiological impact of light on the circadian rhythm in hospitalized patients still remains to be measured.
LIMITATIONS
Our study does have a few limitations. We did not assess sound quality, which is another determinant of arousal potential.20 Also, a shorter measurement interval might be useful in determining sharper sound increases. It may also be important to consider A- versus C-weighted measurements of sound levels, as A-weighted measurements usually reflect higher-frequency sound while C-weighted measurements usually reflect low-frequency noise18; we obtained only A-weighted measurements in our study. However, A-weighted measurements are generally considered more reflective of what the human ear considers noise and are used more standardly than C-weighted measurements.
Regarding light measurements, we recorded from rooms facing different cardinal directions and during different times of the year, which likely contributed to some of the variability in the daytime light levels on both floors. Additionally, light levels were not measured directly at the patient’s eye level. However, given that overhead fluorescent lighting was the primary source of lighting, it is doubtful that we substantially underestimated optic-nerve light levels. In the future, it may also be important to measure the different wavelengths of lights, as blue light may have a greater impact on sleep than other wavelengths.41 Although our findings align with others’, we note that this was a single-center study, which could limit the generalizability of our findings given inter-hospital variations in patient volume, interior layout and structure, and geographic location.
CONCLUSIONS
Overall, our study suggests that the light and sound environment for sleep in the inpatient setting, including both the ICU and non-ICU wards, has multiple areas for improvement. Our data also suggest specific directions for future clinical efforts at improvement. For example, efforts to decrease average sound levels may worsen sleep fragmentation. Similarly, more light during the day may be more helpful than further attempts to limit light during the night.
Disclosure
This research was funded in part by a NIH/NCATS flagship Clinical and Translational Science Award Grant (5KL2TR001112). None of the authors report any conflict of interest, financial or otherwise, in the preparation of this article.
The hospital environment fails to promote adequate sleep for acutely or critically ill patients. Intensive care units (ICUs) have received the most scrutiny, because critically ill patients suffer from severely fragmented sleep as well as a lack of deeper, more restorative sleep.1-4 ICU survivors frequently cite sleep deprivation, contributed to by ambient noise, as a major stressor while receiving care.5,6 Importantly, efforts to modify the ICU environment to promote sleep have been associated with reductions in delirium.7,8 However, sleep deprivation and delirium in the hospital are not limited to ICU patients.
Sleep in the non-ICU setting is also notoriously poor, with 50%-80% of patients reporting sleep as “unsound” or otherwise subjectively poor.9-11 Additionally, patients frequently ask for and/or receive pharmacological sleeping aids12 despite little evidence of efficacy13 and increasing evidence of harm.14 Here too, efforts to improve sleep seem to attenuate risk of delirium,15 which remains a substantial problem on general wards, with incidence reported as high as 20%-30%. The reasons for poor sleep in the hospital are multifactorial, but data suggest that the inpatient environment, including noise and light levels, which are measurable and modifiable entities, contributes significantly to the problem.16
The World Health Organization (WHO) recommends that nighttime baseline noise levels do not exceed 30 decibels (dB) and that nighttime noise peaks (ie, loud noises) do not exceed 40 dB17; most studies suggest that ICU and general ward rooms are above this range on average.10,18 Others have also demonstrated an association between loud noises and patients’ subjective perception of poor sleep.10,19 However, when considering clinically important noise, peak and average noise levels may not be the key factor in causing arousals from sleep. Buxton and colleagues20 found that noise quality affects arousal probability; for example, electronic alarms and conversational noise are more likely to cause awakenings compared with the opening or closing of doors and ice machines. Importantly, peak and average noise levels may also matter less for sleep than do sound level changes (SLCs), which are defined as the difference between background/baseline noise and peak noise. Using healthy subjects exposed to simulated ICU noise, Stanchina et al.21 found that SLCs >17.5 dB were more likely to cause polysomnographic arousals from sleep regardless of peak noise level. This sound pressure change of approximately 20 dB would be perceived as 4 times louder, or, as an example, would be the difference between normal conversation between 2 people (~40 dB) that is then interrupted by the start of a vacuum cleaner (~60 dB). To our knowledge, no other studies have closely examined SLCs in different hospital environments.
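The arithmetic behind the "4 times louder" figure can be sketched as follows. This is an illustration only, assuming the common rule of thumb that perceived loudness roughly doubles with every 10 dB increase; the function names are hypothetical and were not part of the study's analysis.

```python
def perceived_loudness_ratio(delta_db: float) -> float:
    """Approximate perceived loudness ratio for a level change in dB.

    Assumption: loudness doubles for every 10 dB increase (rule of thumb).
    """
    return 2.0 ** (delta_db / 10.0)


def sound_pressure_ratio(delta_db: float) -> float:
    """Ratio of sound pressures for a level change in dB (20*log10 definition)."""
    return 10.0 ** (delta_db / 20.0)


# Conversation (~40 dB) interrupted by a vacuum cleaner (~60 dB): a 20 dB change.
change = 60 - 40
print(perceived_loudness_ratio(change))  # 4.0
print(sound_pressure_ratio(change))      # 10.0
```

Under the same assumption, the 17.5 dB threshold itself corresponds to a change perceived as a little more than 3 times louder.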
Ambient light also likely affects sleep quality in the hospital. The circadian rhythm system, which controls the human sleep–wake cycle as well as multiple other physiologic functions, depends on ambient light as the primary external factor for regulating the internal clock.22,23 Insufficient and inappropriately timed light exposure can desynchronize the biological clock, thereby negatively affecting sleep quality.24,25 Conversely, patients exposed to early-morning bright light may sleep better while in the hospital.16 In addition to sleep patterns, ambient light affects other aspects of patient care; for example, lower light levels in the hospital have recently been associated with higher levels of fatigue and mood disturbance.26
A growing body of data has investigated the ambient environment in the ICU, but fewer studies have focused on sound and light analysis in other inpatient areas such as the general ward and telemetry floors. We examined sound and light levels in the ICU and non-ICU environment, hypothesizing that average sound levels would be higher in the ICU than on non-ICU floors but that the number of SLCs >17.5 dB would be similar. Additionally, we expected that average light levels would be higher in the ICU than on non-ICU floors.
METHODS
This was an observational study of the sound and light environment in the inpatient setting. Per our Institutional Review Board, no consent was required. Battery-operated sound-level (SDL600, Extech Instruments, Nashua, NH) and light-level (SDL400, Extech Instruments, Nashua, NH) meters were placed in 24 patient rooms in our tertiary-care adult hospital in La Jolla, CA. Recordings were obtained in randomly selected, single-patient occupied rooms that were from 3 different hospital units and included 8 general ward rooms, 8 telemetry floor rooms, and 8 ICU rooms. We recorded for approximately 24-72 hours. Depending on the geographic layout of the room, meters were placed as close to the head of the patient’s bed as possible and were generally not placed farther than 2 meters away from the patient’s head of bed; all rooms contained a window.
Sound Measurements
Sound meters measured ambient noise in dB every 2 seconds and were set for A-weighted frequency measurements. We averaged individual data points to obtain hourly averages for ICU and non-ICU rooms. For hourly sound averages, we further separated the data to compare the general ward and telemetry floors (both non-ICU), the latter of which has more patient monitoring and a lower nurse-to-patient ratio compared with the general ward floor.
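The hourly averaging step can be sketched as below. This is a minimal illustration, not the study's actual analysis code: it assumes each reading is tagged with a unit label and an hour of day (0-23), and it takes a plain arithmetic mean of the dB readings. The function name and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def hourly_averages(samples):
    """Average dB readings per (unit, hour-of-day).

    `samples` is an iterable of (unit, hour, db) tuples, where `hour` is 0-23.
    Assumption: a plain arithmetic mean of the individual dB readings.
    """
    buckets = defaultdict(list)
    for unit, hour, db in samples:
        buckets[(unit, hour)].append(db)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical readings for the 10 PM hour, not study data.
readings = [
    ("ICU", 22, 58.0), ("ICU", 22, 62.0),
    ("telemetry", 22, 50.0), ("telemetry", 22, 54.0),
    ("general", 22, 44.0), ("general", 22, 48.0),
]
averages = hourly_averages(readings)
print(averages[("ICU", 22)])  # 60.0
```

Keying on the unit label keeps the telemetry and general ward readings separate, so they can also be pooled later into a single non-ICU group.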
Data from ICU versus non-ICU rooms were analyzed for the number of sound peaks, defined as the number of times sound levels exceeded 65 dB, 70 dB, or 80 dB; these counts were averaged over 24 hours and over the nighttime (10 PM to 6 AM). We also calculated the average number of SLCs ≥17.5 dB observed over 24 hours and over the nighttime.
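One way such counts could be obtained from a series of readings is sketched below. The text does not specify how the baseline was estimated or how consecutive readings were grouped, so this sketch makes two assumptions: the baseline is the median of the preceding 30 readings (1 minute at 2-second sampling), and a run of consecutive readings above a cutoff counts as a single event. All names and values are hypothetical.

```python
from statistics import median


def count_exceedances(levels, threshold_db):
    """Count upward crossings of `threshold_db` (one event per excursion)."""
    count = 0
    above = False
    for level in levels:
        if level > threshold_db and not above:
            count += 1
        above = level > threshold_db
    return count


def count_slcs(levels, min_change_db=17.5, baseline_window=30):
    """Count rises of at least `min_change_db` above a rolling baseline.

    Assumption: the baseline is the median of the preceding `baseline_window`
    readings; consecutive readings above the cutoff count as one event.
    """
    count = 0
    in_event = False
    for i in range(baseline_window, len(levels)):
        baseline = median(levels[i - baseline_window:i])
        is_change = levels[i] - baseline >= min_change_db
        if is_change and not in_event:
            count += 1
        in_event = is_change
    return count


# Hypothetical trace: quiet 45 dB baseline with two brief 68 dB bursts.
trace = [45.0] * 40 + [68.0, 68.0] + [45.0] * 40 + [68.0] + [45.0] * 10
print(count_exceedances(trace, 65))  # 2
print(count_slcs(trace))             # 2
```

In this trace the same two bursts register as both peaks above 65 dB and SLCs, but against a louder baseline (for example, 55 dB) they would still count as peaks while falling short of the 17.5 dB change.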
Light Measurements
Light meters measured illuminance in lux every 120 seconds. We averaged individual data points to obtain hourly averages for ICU and non-ICU rooms. In addition to hourly averages, light-level data were analyzed for maximum levels throughout the day and night.
Statistical Analysis
Hourly sound-level averages between the 3 floors were evaluated using a 1-way analysis of variance (ANOVA); sound averages from the general ward and telemetry floor were also compared at each hour using a Student t test. Light-level data, sound-level peak data, as well as SLC data were also evaluated using a Student t test.
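The two test statistics named above can be computed as in the following sketch, which uses only the Python standard library. It is an illustration under stated assumptions, not the study's code: the t test is taken to be the pooled-variance (Student) form for independent samples, the function names are hypothetical, and the sample values are invented. Converting the statistics to P values requires the F and t distributions, for which a statistics package would normally be used.

```python
from math import sqrt
from statistics import mean, variance


def one_way_anova_f(*groups):
    """F statistic for a 1-way ANOVA across two or more groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def student_t(a, b):
    """Pooled-variance (Student) t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical per-room averages (dB) for a single hour, not study data.
general = [44.0, 46.0, 48.0]
telemetry = [50.0, 52.0, 54.0]
icu = [58.0, 60.0, 62.0]

f_stat = one_way_anova_f(general, telemetry, icu)  # compares all 3 floors
t_stat = student_t(telemetry, general)             # compares the 2 non-ICU floors
print(round(f_stat, 3), round(t_stat, 3))
```

Repeating both calculations for each of the 24 hourly time points would mirror the hour-by-hour comparisons described in the text.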
RESULTS
Sound Measurements
Examples of the raw data distribution for individual sound recordings in an ICU and non-ICU room are shown in Figure 1A and 1B. Sound-level analyses with specific average values and significance levels between ICU and non-ICU rooms (with non-ICU rooms further divided between telemetry and general ward floors for purposes of hourly averages) are shown in Table 1. The average hourly values in all 3 locations were always above the 30-35 dB level (nighttime and daytime, respectively) recommended by the WHO (Figure 1C). A 1-way ANOVA revealed significant differences between the 3 floors at all time points except for 10 AM. An analysis of the means at each time point between the telemetry floor and the general ward floor showed that the telemetry floor had significantly higher sound averages compared with the general ward floor at 10 PM, 11 PM, and 12 AM. Sound levels dropped during the nighttime on both non-ICU wards but remained fairly constant throughout the day and night in the ICU.
Importantly, despite average and peak sound levels showing that the ICU environment is louder overall, there was an equivalent number of SLCs ≥17.5 dB in the ICU and on non-ICU floors. The number of SLCs ≥17.5 dB was not statistically different when comparing ICU and non-ICU rooms, whether averaged over 24 hours or over the nighttime (Figure 1E).
Light Measurements
Examples of light levels over a 24-hour period in an ICU and non-ICU room are shown in Figure 2A and 2B, respectively. Maximum average light levels (reported here as average value ± standard deviation to demonstrate variability within the data) in the ICU were 169.7 ± 127.1 lux and occurred at 1 PM, while maximum average light levels in the non-ICU rooms were 213.5 ± 341.6 lux and occurred at 5 PM (Figure 2C). Average light levels in the morning hours remained low and ranged from 15.9 ± 12.7 lux to 38.9 ± 43.4 lux in the ICU and from 22.3 ± 17.5 lux to 100.7 ± 92.0 lux on the non-ICU floors. The maximum measured level from any of the recordings was 2530 lux and occurred in a general ward room in the 5 PM hour. Overall, light averages remained low, but this particular room had light levels that were significantly higher than the others. A t test analysis of the hourly averages revealed only 1 time point of significant difference between the 2 floors; at 7 AM, the general ward floor had a higher lux level of 49.9 ± 27.5 versus 19.2 ± 10.7 in the ICU (P = 0.038). Otherwise, there were no differences between light levels in ICU rooms versus non-ICU rooms. Evaluation of the data revealed a substantial amount of variability in light levels throughout the daytime hours. Light levels during the nighttime remained low and were not significantly different between the 2 groups.
DISCUSSION
To our knowledge, this is the first study to directly compare the ICU and non-ICU environment for its potential impact on sleep and circadian alignment. Our study adds to the literature with several novel findings. First, average sound levels on non-ICU wards are lower than in the ICU. Second, although quieter on average, SLCs >17.5 dB occurred an equivalent number of times for both the ICU and non-ICU wards. Third, average daytime light levels in both the ICU and non-ICU environment are low. Lastly, peak light levels for both ICU and non-ICU wards occur later in the day instead of in the morning. All of the above have potential impact for optimizing the ward environment to better aid in sleep for patients.
Sound-Level Findings
Data on sound levels for non-ICU floors are limited but mostly consistent with our finding
Average and peak sound levels contribute to the ambient noise experienced by patients but may not be the source of sleep disruptions. Using polysomnography in healthy subjects exposed to recordings of ICU noise, Stanchina et al.21 showed that SLCs from baseline and not peak sound levels determined whether a subject was aroused from sleep by sound. Accordingly, they also found that increasing baseline sound levels by using white noise reduced the number of arousals that subjects experienced. To our knowledge, other studies have not quantified and compared SLCs in the ICU and non-ICU environments. Our data show that patients on non-ICU floors experience at least the same number of SLCs, and thereby the same potential for arousals from sleep, when compared with ICU patients. The higher baseline level of noise in the ICU likely explains the relatively lower number of SLCs when compared with the non-ICU floors. Although decreasing overall noise to promote sleep in the hospital seems like the obvious solution, the treatment for noise pollution in the hospital may actually be more background noise, not less.
Recent studies support the clinical implications of our findings. First, decreasing overall noise levels is difficult to accomplish.29 Second, recent studies utilized white noise in different hospital settings with some success in improving patients’ subjective sleep quality, although more studies using objective data measurements are needed to further understand the impact of white noise on sleep in hospitalized patients.30,31 Third, efforts at reducing interruptions—which likely will decrease the number of SLCs—such as clustering nursing care or reducing intermittent alarms may be more beneficial in improving sleep than efforts at decreasing average sound levels. For example, Bartick et al. reduced the number of patient interruptions at night by eliminating routine vital signs and clustering medication administration. Although they included other interventions as well, we note that this approach likely reduced SLCs and was associated with a reduction in the use of sedative medications.32 Ultimately, our data show that a focus on reducing SLCs will be one necessary component of a multipronged solution to improving inpatient sleep.33
Light-Level Findings
Because of its effect on circadian rhythms, the daily light-dark cycle has a powerful impact on human physiology and behavior, including sleep.34 Little is understood about how light affects sleep and other circadian-related functions in general ward patients, as it is not commonly measured. Our findings suggest that patients admitted to the hospital are exposed to light levels and patterns that may not optimally promote wake and sleep. Encouragingly, we did not find excessive average light levels during the nighttime in either the ICU or the non-ICU environment of our hospital, although others have described intrusive nighttime light in the hospital setting.35,36 Even short bursts of low or moderate light during the nighttime can cause circadian phase delay,37 and efforts to maintain darkness in patient rooms at night should continue.
Our measurements show that average daytime light levels did not exceed 250 lux, which corresponds to low, office-level lighting, while the brightest average light levels occurred in the afternoon for both environments. These levels are consistent with other reports26,35,36 as is the light-level variability noted throughout the day (which is not unexpected given room positioning, patient preference, curtains, etc). The level and amount of daytime light needed to maintain circadian rhythms in humans are still unknown.38 Brighter light is generally more effective at influencing the circadian pacemaker in a dose-dependent manner.39 Although entrainment (synchronization of the body’s biological rhythm with environmental cues such as ambient light) of the human circadian rhythm has been shown with low light levels (eg, <100 lux), these studies included healthy volunteers in a carefully controlled, constant, routine environment.23 How these data apply to acutely ill subjects in the hospital environment is not clear. We note that low to moderate levels of light (50-1000 lux) are less effective for entrainment of the circadian rhythm in older people (age >65 years, the majority of our admissions) compared with younger people. Thus, older, hospitalized patients may require greater light levels for regulation of the sleep-wake cycle.40 These data are important when designing interventions to improve light for and maintain circadian rhythms in hospitalized patients. For example, Simons et al. found that dynamic light-application therapy, which achieved a maximum average lux level of <800 lux, did not reduce rates of delirium in critically ill patients (mean age ~65). One interpretation of these results, though there are many others, is that the light levels achieved were not high enough to influence circadian timing in hospitalized, mostly elderly patients. The physiological impact of light on the circadian rhythm in hospitalized patients remains to be measured.
LIMITATIONS
Our study has a few limitations. We did not assess sound quality, which is another determinant of arousal potential.20 Also, a shorter measurement interval might be useful in detecting sharper sound increases. It may also be important to consider A- versus C-weighted measurements of sound levels, as A-weighted measurements usually reflect higher-frequency sound while C-weighted measurements usually reflect low-frequency noise18; we obtained only A-weighted measurements in our study. However, A-weighted measurements are generally considered more reflective of what the human ear considers noise and are more commonly used than C-weighted measurements.
Regarding light measurements, we recorded from rooms facing different cardinal directions and during different times of the year, which likely contributed to some of the variability in the daytime light levels on both floors. Additionally, light levels were not measured directly at the patient’s eye level. However, given that overhead fluorescent lighting was the primary source of lighting, it is doubtful that we substantially underestimated optic-nerve light levels. In the future, it may also be important to measure the different wavelengths of light, as blue light may have a greater impact on sleep than other wavelengths.41 Although our findings align with others’, we note that this was a single-center study, which could limit the generalizability of our findings given inter-hospital variations in patient volume, interior layout and structure, and geographic location.
CONCLUSIONS
Overall, our study suggests that the light and sound environment for sleep in the inpatient setting, including both the ICU and non-ICU wards, has multiple areas for improvement. Our data also suggest specific directions for future clinical efforts at improvement. For example, efforts to decrease average sound levels may worsen sleep fragmentation. Similarly, more light during the day may be more helpful than further attempts to limit light during the night.
Disclosure
This research was funded in part by an NIH/NCATS flagship Clinical and Translational Science Award Grant (5KL2TR001112). None of the authors report any conflict of interest, financial or otherwise, in the preparation of this article.
1. Freedman NS, Gazendam J, Levan L, Pack AI, Schwab RJ. Abnormal sleep/wake cycles and the effect of environmental noise on sleep disruption in the intensive care unit. Am J Respir Crit Care Med. 2001;163(2):451-457.
2. Watson PL, Pandharipande P, Gehlbach BK, et al. Atypical sleep in ventilated patients: empirical electroencephalography findings and the path toward revised ICU sleep scoring criteria. Crit Care Med. 2013;41(8):1958-1967.
3. Gehlbach BK, Chapotot F, Leproult R, et al. Temporal disorganization of circadian rhythmicity and sleep-wake regulation in mechanically ventilated patients receiving continuous intravenous sedation. Sleep. 2012;35(8):1105-1114.
4. Elliott R, McKinley S, Cistulli P, Fien M. Characterisation of sleep in intensive care using 24-hour polysomnography: an observational study. Crit Care. 2013;17(2):R46.
5. Novaes MA, Aronovich A, Ferraz MB, Knobel E. Stressors in ICU: patients’ evaluation. Intensive Care Med. 1997;23(12):1282-1285.
6. Tembo AC, Parker V, Higgins I. The experience of sleep deprivation in intensive care patients: findings from a larger hermeneutic phenomenological study. Intensive Crit Care Nurs. 2013;29(6):310-316.
7. Kamdar BB, Yang J, King LM, et al. Developing, implementing, and evaluating a multifaceted quality improvement intervention to promote sleep in an ICU. Am J Med Qual. 2014;29(6):546-554.
8. Patel J, Baldwin J, Bunting P, Laha S. The effect of a multicomponent multidisciplinary bundle of interventions on sleep and delirium in medical and surgical intensive care patients. Anaesthesia. 2014;69(6):540-549.
9. Manian FA, Manian CJ. Sleep quality in adult hospitalized patients with infection: an observational study. Am J Med Sci. 2015;349(1):56-60.
10. Park MJ, Yoo JH, Cho BW, Kim KT, Jeong WC, Ha M. Noise in hospital rooms and sleep disturbance in hospitalized medical patients. Environ Health Toxicol. 2014;29:e2014006.
11. Dobing S, Frolova N, McAlister F, Ringrose J. Sleep quality and factors influencing self-reported sleep duration and quality in the general internal medicine inpatient population. PLoS One. 2016;11(6):e0156735.
12. Gillis CM, Poyant JO, Degrado JR, Ye L, Anger KE, Owens RL. Inpatient pharmacological sleep aid utilization is common at a tertiary medical center. J Hosp Med. 2014;9(10):652-657.
13. Krenk L, Jennum P, Kehlet H. Postoperative sleep disturbances after zolpidem treatment in fast-track hip and knee replacement. J Clin Sleep Med. 2014;10(3):321-326.
14. Kolla BP, Lovely JK, Mansukhani MP, Morgenthaler TI. Zolpidem is independently associated with increased risk of inpatient falls. J Hosp Med. 2013;8(1):1-6.
15. Inouye SK, Bogardus ST Jr, Charpentier PA, et al. A multicomponent intervention to prevent delirium in hospitalized older patients. N Engl J Med. 1999;340(9):669-676.
16. Bano M, Chiaromanni F, Corrias M, et al. The influence of environmental factors on sleep quality in hospitalized medical patients. Front Neurol. 2014;5:267.
17. Berglund B, Lindvall T, Schwela DH. Guidelines for Community Noise. World Health Organization. 1999.
18. Knauert M, Jeon S, Murphy TE, Yaggi HK, Pisani MA, Redeker NS. Comparing average levels and peak occurrence of overnight sound in the medical intensive care unit on A-weighted and C-weighted decibel scales. J Crit Care. 2016;36:1-7.
19. Yoder JC, Staisiunas PG, Meltzer DO, Knutson KL, Arora VM. Noise and sleep among adult medical inpatients: far from a quiet night. Arch Intern Med. 2012;172(1):68-70.
20. Buxton OM, Ellenbogen JM, Wang W, et al. Sleep disruption due to hospital noises: a prospective evaluation. Ann Intern Med. 2012;157(3):170-179.
21. Stanchina ML, Abu-Hijleh M, Chaudhry BK, Carlisle CC, Millman RP. The influence of white noise on sleep in subjects exposed to ICU noise. Sleep Med. 2005;6(5):423-428.
22. Czeisler CA, Allan JS, Strogatz SH, et al. Bright light resets the human circadian pacemaker independent of the timing of the sleep-wake cycle. Science. 1986;233(4764):667-671.
23. Duffy JF, Czeisler CA. Effect of light on human circadian physiology. Sleep Med Clin. 2009;4(2):165-177.
24. Lewy AJ, Wehr TA, Goodwin FK, Newsome DA, Markey SP. Light suppresses melatonin secretion in humans. Science. 1980;210(4475):1267-1269.
25. Zeitzer JM, Dijk DJ, Kronauer R, Brown E, Czeisler C. Sensitivity of the human circadian pacemaker to nocturnal light: melatonin phase resetting and suppression. J Physiol. 2000;526:695-702.
26. Bernhofer EI, Higgins PA, Daly BJ, Burant CJ, Hornick TR. Hospital lighting and its association with sleep, mood and pain in medical inpatients. J Adv Nurs. 2014;70(5):1164-1173.
27. Darbyshire JL, Young JD. An investigation of sound levels on intensive care units with reference to the WHO guidelines. Crit Care. 2013;17(5):R187.
28. Gillis S. Pharmacologic treatment of depression during pregnancy. J Midwifery Womens Health. 2000;45(4):357-359.
29. Tainter CR, Levine AR, Quraishi SA, et al. Noise levels in surgical ICUs are consistently above recommended standards. Crit Care Med. 2016;44(1):147-152.
30. Farrehi PM, Clore KR, Scott JR, Vanini G, Clauw DJ. Efficacy of Sleep Tool Education During Hospitalization: A Randomized Controlled Trial. Am J Med. 2016;129(12):1329.e9-1329.e17.
31. Farokhnezhad Afshar P, Bahramnezhad F, Asgari P, Shiri M. Effect of white noise on sleep in patients admitted to a coronary care. J Caring Sci. 2016;5(2):103-109.
32. Bartick MC, Thai X, Schmidt T, Altaye A, Solet JM. Decrease in as-needed sedative use by limiting nighttime sleep disruptions from hospital staff. J Hosp Med. 2010;5(3):E20-E24.
33. Tamrat R, Huynh-Le MP, Goyal M. Non-pharmacologic interventions to improve the sleep of hospitalized patients: a systematic review. J Gen Intern Med. 2014;29(5):788-795.
34. Dijk DJ, Archer SN. Light, sleep, and circadian rhythms: together again. PLoS Biol. 2009;7(6):e1000145.
35. Verceles AC, Liu X, Terrin ML, et al. Ambient light levels and critical care outcomes. J Crit Care. 2013;28(1):110.e1-110.e8.
36. Hu RF, Hegadoren KM, Wang XY, Jiang XY. An investigation of light and sound levels on intensive care units in China. Aust Crit Care. 2016;29(2):62-67.
37. Zeitzer JM, Ruby NF, Fisicaro RA, Heller HC. Response of the human circadian system to millisecond flashes of light. PLoS One. 2011;6(7):e22078.
38. Duffy JF, Wright KP Jr. Entrainment of the human circadian system by light. J Biol Rhythms. 2005;20(4):326-338.
39. Wright KP Jr, Gronfier C, Duffy JF, Czeisler CA. Intrinsic period and light intensity determine the phase relationship between melatonin and sleep in humans. J Biol Rhythms. 2005;20(2):168-177.
40. Duffy JF, Zeitzer JM, Czeisler CA. Decreased sensitivity to phase-delaying effects of moderate intensity light in older subjects. Neurobiol Aging. 2007;28(5):799-807.
41. Figueiro MG, Plitnick BA, Lok A, et al. Tailored lighting intervention improves measures of sleep, depression, and agitation in persons with Alzheimer’s disease and related dementia living in long-term care facilities. Clin Interv Aging. 2014;9:1527-1537.
1. Freedman NS, Gazendam J, Levan L, Pack AI, Schwab RJ. Abnormal sleep/wake
cycles and the effect of environmental noise on sleep disruption in the intensive
care unit. Am J Respir Crit Care Med. 2001;163(2):451-457. PubMed
2. Watson PL, Pandharipande P, Gehlbach BK, et al. Atypical sleep in ventilated
patients: empirical electroencephalography findings and the path toward revised ICU sleep scoring criteria. Crit Care Med. 2013;41(8):1958-1967. PubMed
3. Gehlbach BK, Chapotot F, Leproult R, et al. Temporal disorganization of circadian rhythmicity and sleep-wake regulation in mechanically ventilated patients receiving continuous intravenous sedation. Sleep. 2012;35(8):1105-1114. PubMed
4. Elliott R, McKinley S, Cistulli P, Fien M. Characterisation of sleep in intensive care using 24-hour polysomnography: an observational study. Crit Care. 2013;17(2):R46. PubMed
5. Novaes MA, Aronovich A, Ferraz MB, Knobel E. Stressors in ICU: patients’ evaluation. Intensive Care Med. 1997;23(12):1282-1285. PubMed
6. Tembo AC, Parker V, Higgins I. The experience of sleep deprivation in intensive care patients: findings from a larger hermeneutic phenomenological study. Intensive Crit Care Nurs. 2013;29(6):310-316. PubMed
7. Kamdar BB, Yang J, King LM, et al. Developing, implementing, and evaluating a multifaceted quality improvement intervention to promote sleep in an ICU. Am J Med Qual. 2014;29(6):546-554. PubMed
8. Patel J, Baldwin J, Bunting P, Laha S. The effect of a multicomponent multidisciplinary bundle of interventions on sleep and delirium in medical and surgical intensive care patients. Anaesthesia. 2014;69(6):540-549. PubMed
9. Manian FA, Manian CJ. Sleep quality in adult hospitalized patients with infection: an observational study. Am J Med Sci. 2015;349(1):56-60. PubMed
10. Park MJ, Yoo JH, Cho BW, Kim KT, Jeong WC, Ha M. Noise in hospital rooms and sleep disturbance in hospitalized medical patients. Environ Health Toxicol. 2014;29:e2014006. PubMed
11. Dobing S, Frolova N, McAlister F, Ringrose J. Sleep quality and factors influencing self-reported sleep duration and quality in the general internal medicine inpatient population. PLoS One. 2016;11(6):e0156735. PubMed
12. Gillis CM, Poyant JO, Degrado JR, Ye L, Anger KE, Owens RL. Inpatient pharmacological
sleep aid utilization is common at a tertiary medical center. J Hosp Med. 2014;9(10):652-657. PubMed
13. Krenk L, Jennum P, Kehlet H. Postoperative sleep disturbances after zolpidem treatment in fast-track hip and knee replacement. J Clin Sleep Med. 2014;10(3):321-326. PubMed
14. Kolla BP, Lovely JK, Mansukhani MP, Morgenthaler TI. Zolpidem is independently
associated with increased risk of inpatient falls. J Hosp Med. 2013;8(1):1-6. PubMed
15. Inouye SK, Bogardus ST Jr, Charpentier PA, et al. A multicomponent intervention to prevent delirium in hospitalized older patients. N Engl J Med. 1999;340(9):669-676. PubMed
16. Bano M, Chiaromanni F, Corrias M, et al. The influence of environmental factors on sleep quality in hospitalized medical patients. Front Neurol. 2014;5:267. PubMed
17. Berglund BLTSD. Guidelines for Community Noise. World Health Organization. 1999.
18. Knauert M, Jeon S, Murphy TE, Yaggi HK, Pisani MA, Redeker NS. Comparing average levels and peak occurrence of overnight sound in the medical intensive care unit on A-weighted and C-weighted decibel scales. J Crit Care. 2016;36:1-7. PubMed
19. Yoder JC, Staisiunas PG, Meltzer DO, Knutson KL, Arora VM. Noise and sleep among adult medical inpatients: far from a quiet night. Arch Intern Med. 2012;172(1):68-70. PubMed
20. Buxton OM, Ellenbogen JM, Wang W, et al. Sleep disruption due to hospital noises: a prospective evaluation. Ann Intern Med. 2012;157(3):170-179. PubMed
21. Stanchina ML, Abu-Hijleh M, Chaudhry BK, Carlisle CC, Millman RP. The influence of white noise on sleep in subjects exposed to ICU noise. Sleep Med. 2005;6(5):423-428. PubMed
22. Czeisler CA, Allan JS, Strogatz SH, et al. Bright light resets the human circadian pacemaker independent of the timing of the sleep-wake cycle. Science. 1986;233(4764):667-671. PubMed
23. Duffy JF, Czeisler CA. Effect of light on human circadian physiology. Sleep Med Clin. 2009;4(2):165-177. PubMed
24. Lewy AJ, Wehr TA, Goodwin FK, Newsome DA, Markey SP. Light suppresses melatonin secretion in humans. Science. 1980;210(4475):1267-1269. PubMed
25. Zeitzer JM, Dijk DJ, Kronauer R, Brown E, Czeisler C. Sensitivity of the human circadian pacemaker to nocturnal light: melatonin phase resetting and suppression. J Physiol. 2000;526:695-702. PubMed
26. Bernhofer EI, Higgins PA, Daly BJ, Burant CJ, Hornick TR. Hospital lighting and its association with sleep, mood and pain in medical inpatients. J Adv Nurs. 2014;70(5):1164-1173. PubMed
27. Darbyshire JL, Young JD. An investigation of sound levels on intensive care units with reference to the WHO guidelines. Crit Care. 2013;17(5):R187. PubMed
28. Gillis S. Pharmacologic treatment of depression during pregnancy. J Midwifery Womens Health. 2000;45(4):357-359. PubMed
29. Tainter CR, Levine AR, Quraishi SA, et al. Noise levels in surgical ICUs are consistently above recommended standards. Crit Care Med. 2016;44(1):147-152. PubMed
30. Farrehi PM, Clore KR, Scott JR, Vanini G, Clauw DJ. Efficacy of Sleep Tool Education During Hospitalization: A Randomized Controlled Trial. Am J Med. 2016;129(12):1329.e9-1329.e17. PubMed
31. Farokhnezhad Afshar P, Bahramnezhad F, Asgari P, Shiri M. Effect of white noise on sleep in patients admitted to a coronary care. J Caring Sci. 2016;5(2):103-109. PubMed
32. Bartick MC, Thai X, Schmidt T, Altaye A, Solet JM. Decrease in as-needed sedative use by limiting nighttime sleep disruptions from hospital staff. J Hosp Med. 2010;5(3):E20-E24. PubMed
33. Tamrat R, Huynh-Le MP, Goyal M. Non-pharmacologic interventions to improve the sleep of hospitalized patients: a systematic review. J Gen Intern Med. 2014;29(5):788-795. PubMed
34. Dijk DJ, Archer SN. Light, sleep, and circadian rhythms: together again. PLoS Biol. 2009;7(6):e1000145. PubMed
35. Verceles AC, Liu X, Terrin ML, et al. Ambient light levels and critical care outcomes. J Crit Care. 2013;28(1):110.e1-110.e8. PubMed
36. Hu RF, Hegadoren KM, Wang XY, Jiang XY. An investigation of light and sound levels on intensive care units in China. Aust Crit Care. 2016;29(2):62-67. PubMed
37. Zeitzer JM, Ruby NF, Fisicaro RA, Heller HC. Response of the human circadian system to millisecond flashes of light. PLoS One. 2011;6(7):e22078. PubMed
38. Duffy JF, Wright KP, Jr. Entrainment of the human circadian system by light. J Biol Rhythms. 2005;20(4):326-338. PubMed
39. Wright KP Jr, Gronfier C, Duffy JF, Czeisler CA. Intrinsic period and light intensity determine the phase relationship between melatonin and sleep in humans. J Biol Rhythms. 2005;20(2):168-177. PubMed
40. Duffy JF, Zeitzer JM, Czeisler CA. Decreased sensitivity to phase-delaying effects of moderate intensity light in older subjects. Neurobiol Aging. 2007;28(5):799-807. PubMed
41. Figueiro MG, Plitnick BA, Lok A, et al. Tailored lighting intervention improves measures of sleep, depression, and agitation in persons with Alzheimer’s disease and related dementia living in long-term care facilities. Clin Interv Aging. 2014;9:1527-1537. PubMed
© 2017 Society of Hospital Medicine
Reliability of 3-Dimensional Glenoid Component Templating and Correlation to Intraoperative Component Selection
Take-Home Points
- Guidelines regarding glenoid component size selection for primary TSA are lacking.
- Intraoperative in situ glenoid sizing may not be ideal.
- 3-D digital models may be utilized for preoperative templating of glenoid component size in primary TSA.
- 3-D templating that allows for superior-inferior, anterior-posterior, and rotational translation can lead to consistent and reproducible templating of glenoid component size.
- 3-D templating may reduce the risks of implant overhang, peg penetration, and decreased stability ratio.
In 1974, Neer1 introduced the shoulder prosthesis. In 1982, Neer and colleagues2 found significant improvement in shoulder pain and function in patients with glenohumeral osteoarthritis treated with the Neer prosthesis. Since then, use of total shoulder arthroplasty (TSA) has increased. Between 1993 and 2007, TSA use increased 319% in the United States.3 Long-term outcomes studies have found implant survivorship ranging from 87% to 93% at 10 to 15 years.4
Although TSA is a successful procedure, glenoid component failure is the most common complication.5-10 Outcomes of revision surgery for glenoid instability are inferior to those of primary TSA.11 Recent research findings highlight the effect of glenoid size on TSA complications.12 A larger glenoid component increases the stability ratio (peak subluxation force divided by compression load).12 However, insufficient glenoid bone stock, small glenoid diameter, and inability to fit a properly sized reamer owing to soft-tissue constraints may lead surgeons to choose a smaller glenoid component in order to avoid peg penetration, overhang, and soft-tissue damage, respectively. Therefore, preoperative templating of glenoid size is a potential strategy for minimizing complications.
Templating is performed for proximal humeral components, but glenoid sizing typically is deferred to intraoperative in situ sizing with implant-specific targeting guides. This glenoid sizing practice arose out of a lack of standard digital glenoid templates and difficulty in selecting glenoid size based on plain radiographs and/or 2-dimensional (2-D) computed tomography (CT) scans. However, targeting devices are sporadically used during surgery, and intraoperative glenoid vault dimension estimates derived from visualization and palpation are often inaccurate. Often, rather than directly assess glenoid morphology, surgeons infer glenoid size from the size and sex of patients.13
Three-dimensional (3-D) CT can be used to accurately assess glenoid version, bone loss, and implant fit.14-19 We conducted a study to determine if 3-D digital imaging can be consistently and reproducibly used for preoperative templating of glenoid component size and to determine if glenoid sizes derived from templating correlate with the sizes of subsequently implanted glenoids.
Materials and Methods
This retrospective study was conducted at the Center for Shoulder, Elbow, and Sports Medicine at Columbia University Medical Center in New York City and was approved by our Institutional Review Board. Included in the study were all patients who underwent primary TSA for primary glenohumeral osteoarthritis over a 12-month period. Patients were required to have preoperative CT performed according to our study protocol. The CT protocol consisted of 0.5-mm axial cuts of the entire scapula and 3-D reconstruction of the scapula, glenoid, glenohumeral articulation, and proximal humerus. Patients were excluded from the study for primary TSA for a secondary cause of glenohumeral osteoarthritis, inflammatory arthritis, connective tissue disease, prior contralateral TSA, and prior ipsilateral scapula, glenoid, and proximal humerus surgery. Ultimately, 24 patients were included in the study.
CT data were formatted for preoperative templating. The CT images of each patient’s scapula were uploaded into Materialise Interactive Medical Image Control System (Mimics) software. Mimics allows 3-D image rendering and editing from various imaging modalities and formats. The software was used to create the 3-D scapula models for templating. Prior studies have validated the anatomical precision of 3-D models created with Mimics.20
Mimics was also used to digitize in 3-D the glenoid components from the Bigliani-Flatow Shoulder System (Zimmer Biomet). Glenoid components of 3 different sizes (40 mm, 46 mm, 52 mm) were used. (The Bigliani glenoid component was digitized, as this implant system was used for primary TSA in all 24 patients.) Each glenoid component was traced in 3-D with a Gage 2000 coordinate-measuring machine (Brown & Sharpe) and was processed with custom software. The custom software, cited in previous work by our group,17 created the same coordinate system for each scapula based on anatomical reference points. These digitized 3-D images of glenoid components were uploaded with the digitized 3-D scapulae derived from patients’ CT scans to the Magics software. Magics allows for manipulation and interaction of multiple 3-D models by creating electronic stereolithography files that provide 3-D surface geometry.
Three fellowship-trained shoulder surgeons and 4 shoulder fellows templated the most appropriately sized glenoid component for each of the 24 patients. At the time of templating, the surgeon was blinded to the size of the glenoid implant used in the surgery. In Magics, each scapula was positioned in 3-D similar to how it would appear with the patient in the beach-chair position during surgery. In both study arms, surgeons selected the largest component that maximized the area of contact while avoiding peg penetration of the glenoid vault or component overhang. In addition, surgeons were instructed to correct glenoid version to as near neutral as possible with component positioning but were not permitted to remove glenoid bone stock to correct deformity. All surgeons based placement of the glenoid component on the patient’s actual bone stock and not on osteophytes, which are readily appreciable on 3-D CT.
In study arm 1, the 3-D view of the glenoid was restricted to the initial view in the beach-chair position. The surgeon then manipulated the 3-D glenoid component template across a single 2-D plane, either the superior-inferior plane or the anterior-posterior plane, over the surface of the 3-D glenoid (Figure 1).
In study arm 2, surgeons were permitted to rotate the 3-D glenoid template and scapula in any manner (Figure 2).
Interobserver agreement was determined by comparing prosthetic glenoid component size selection among all study surgeons, and intraobserver agreement was determined by comparing glenoid size selection during 2 sessions separated by at least 3 weeks.
After each trial, the order of patients’ scapula images was randomly rearranged to reduce recall bias. Kappa (κ) coefficients were calculated for interobserver and intraobserver agreement. Possible κ values range from −1.0 (least agreement) to +1.0 (complete agreement). A κ of 0 indicates agreement no better than that expected by chance. The level of agreement was categorized according to κ using a system described by Landis and Koch21 (Table 1).
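The agreement statistic described above can be illustrated with a minimal sketch, assuming Cohen's κ for one surgeon's size selections in two templating sessions. The article does not state which κ variant or software was used, so the function names and the sample selections below are hypothetical. The descriptive bands follow the cut points published by Landis and Koch and may differ from those listed in Table 1.

```python
from collections import Counter


def cohen_kappa(session_1, session_2):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(session_1) != len(session_2) or not session_1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(session_1)
    # Observed agreement: share of cases rated identically in both sessions.
    p_observed = sum(a == b for a, b in zip(session_1, session_2)) / n
    # Chance agreement: product of the marginal proportions, summed over sizes.
    counts_1, counts_2 = Counter(session_1), Counter(session_2)
    sizes = counts_1.keys() | counts_2.keys()
    p_chance = sum(counts_1[s] * counts_2[s] for s in sizes) / n ** 2
    if p_chance == 1.0:
        return 1.0  # both sessions used a single identical category
    return (p_observed - p_chance) / (1.0 - p_chance)


def landis_koch_label(kappa):
    """Descriptive band using the Landis and Koch (1977) cut points."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical glenoid sizes (mm) templated by one surgeon in two sessions.
trial_1 = [40, 46, 46, 52, 46, 40]
trial_2 = [40, 46, 52, 52, 46, 46]
kappa = cohen_kappa(trial_1, trial_2)
print(round(kappa, 2), landis_koch_label(kappa))
```

Agreement among more than two raters, as in the interobserver comparison across all study surgeons, would need a multi-rater generalization such as Fleiss' κ, which this sketch does not cover.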
Results
The group of 24 patients consisted of 15 men and 9 women. Mean age was 70.3 years (range, 56-88 years). Primary TSA was performed in 14 right shoulders and 10 left shoulders. Of the 24 patients, 20 (83%) had a 46-mm glenoid component implanted, 3 male patients had a 52-mm glenoid component implanted, and 1 female patient had a 40-mm glenoid component implanted.
Study Arm 1: Glenoid Templating Based on 2 df
In study arm 1, overall intraobserver agreement was substantial, as defined in the statistical literature.21 Among all surgeons who participated, intraobserver agreement was 0.76 (substantial), 0.60 (substantial), and 0.58 (moderate) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.67, substantial agreement). Trial 1 interobserver agreement was 0.56 (moderate) (P < .001), 0.25 (fair) (P < .001), and 0.21 (fair) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.36, fair agreement) (P < .001), and trial 2 interobserver agreement was 0.58 (moderate) (P < .001), 0.18 (poor) (P = .003), and 0.24 (fair) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.32, fair agreement) (P < .001). In study arm 1, therefore, trials 1 and 2 both showed fair interobserver agreement.
Study Arm 2: Glenoid Templating Based on 6 df
In study arm 2, a mean correlation of 0.42 (moderate agreement) was found between glenoid component size in 3-D templating and the glenoid component size ultimately selected during surgery (Table 3).
In study arm 2, overall intraobserver agreement was moderate. Among all surgeons who participated, intraobserver agreement was 0.80 (excellent), 0.43 (moderate), and 0.47 (moderate) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.58, moderate agreement). Trial 1 interobserver agreement was 0.75 (substantial) (P < .001), 0.39 (fair) (P < .001), and 0.50 (moderate) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.54, moderate agreement) (P < .001), and trial 2 interobserver agreement was 0.66 (substantial) (P < .001), 0.28 (fair) (P = .003), and 0.40 (moderate) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.43, moderate agreement) (P < .001).
Discussion
Our results showed that 3-D glenoid templating had reproducible intraobserver and interobserver agreement. Overall intraobserver agreement was substantial (κ = 0.67) for study arm 1 and moderate (κ = 0.58) for study arm 2. Interobserver agreement was fair for trials 1 and 2 (κ = 0.36 and 0.32) in arm 1 and moderate for trials 1 and 2 (κ = 0.54 and 0.43) in arm 2.
Intraobserver and interobserver agreement values, particularly in study arm 2, which incorporated rotation (6 df), are consistent with values in commonly used classification systems, such as the Neer system for proximal humerus fractures, the Frykman system for distal radius fractures, and the King system for adolescent idiopathic scoliosis.22-30 Sidor and colleagues27 found overall interobserver agreement of 0.50 and overall intraobserver agreement of 0.66 for the Neer system, and Illarramendi and colleagues24 found overall interobserver agreement of 0.43 and overall intraobserver agreement of 0.61 for the Frykman system.
In study arm 2, overall interobserver and intraobserver agreement was moderate. A higher level of surgeon agreement is unlikely given the lack of well-defined parameters for determining glenoid component size. Therefore, glenoid size selection is largely a matter of surgeon preference. More research is needed to establish concrete guidelines for glenoid component size selection. Once guidelines are adopted, interobserver agreement in templating may increase.
In both study arms, the component that surgeons selected during templating tended to be smaller than the component they selected during surgery. In study arm 1, 32% of patients had a smaller component selected based on computer modeling, and 7% had a larger component selected. In study arm 2, the difference was narrower: 27% of patients had a smaller component selected during templating, and 16% had a larger component selected. A statistically significant difference (P < .001) in templated and implanted component sizes was found between men and women: Templated glenoid components were smaller than implanted components in 53% of women and larger than implanted components in 33% of men. Differences between templated and implanted components may be attributable to visualization differences. During templating, the entire glenoid can be visualized and the slightest peg penetration or component overhang detected; in contrast, during surgery, anatomical constraints preclude such a comprehensive assessment.
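The comparison of templated and implanted sizes reported above amounts to a simple tally, sketched here under stated assumptions. The function name and the paired sizes are hypothetical and are not the study data; the sketch shows only how the shares of smaller, matching, and larger templated components could be computed.

```python
def compare_templated_to_implanted(templated_mm, implanted_mm):
    """Share of cases in which the templated size was smaller than,
    the same as, or larger than the implanted size."""
    if len(templated_mm) != len(implanted_mm) or not templated_mm:
        raise ValueError("size lists must be non-empty and of equal length")
    n = len(templated_mm)
    pairs = list(zip(templated_mm, implanted_mm))
    return {
        "smaller": sum(t < i for t, i in pairs) / n,
        "same": sum(t == i for t, i in pairs) / n,
        "larger": sum(t > i for t, i in pairs) / n,
    }


# Hypothetical paired sizes (mm) for eight cases.
templated = [40, 46, 46, 40, 52, 46, 46, 52]
implanted = [46, 46, 46, 46, 46, 46, 52, 52]
shares = compare_templated_to_implanted(templated, implanted)
for outcome, share in shares.items():
    print(f"{outcome}: {share:.1%}")
```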
Differences in agreement between templated and implanted glenoid components suggest that the size of implanted components may not be ideal. In this study, the distribution of the templated glenoid sizes was much wider than that of the implanted glenoid sizes. During templating, each glenoid component can be definitively visualized and assessed for possible peg penetration and overhang. Visualization allows surgeons to base glenoid size selection solely on glenoid morphology, as opposed to factors such as patient sex and height. In addition, interobserver and intraobserver agreement values for the 40-mm glenoid component were considerably higher than those for components of other sizes, indicating that the 40-mm component was consistently and reproducibly selected for the same patients. Hence, templating may particularly help prevent peg penetration and component overhang for patients with a smaller diameter glenoid.
More research on 3-D templating is warranted given the results of this study and other studies.12,17,31 Scalise and colleagues31 found that, in TSA planning, surgeons’ use of 2-D (vs 3-D) imaging led them to overestimate glenoid component sizes (P = .006). In our study, the glenoid size selected during 3-D templating was, in many cases, smaller than the size selected during surgery. The aim of avoiding peg penetration and glenoid overhang, an anecdotal guideline commonly used in glenoid size selection, likely was the driving force in selecting smaller glenoid components during templating. Although anterior, superior, and inferior glenoid overhang typically can be assessed during surgery, posterior overhang is more difficult to evaluate. Three-dimensional modeling allows surgeons to determine optimal glenoid component size and position. In addition, intraoperative evaluation of glenoid component peg penetration is challenging, and peg penetration becomes evident only after it has occurred. During templating, however, surgeons were able to easily assess for peg penetration, and smaller glenoid components were selected.
A limitation of this study is that intraoperative glenoid version correction or peg containment was not quantified. More research is needed on the relationship between glenoid size selection and component overhang and peg penetration. Another limitation was use of only 1 TSA system (with 3 glenoid sizes, all with inline pegs); reliability of 3-D templating was not evaluated across different component designs. Last, given the absence of guidelines for glenoid component size selection, there was surgeon bias in preoperative templating and in intraoperative selection of glenoid size. Surgeons had differing opinions on the importance of maximizing the contact area of the component and correcting glenoid deformity and version.
Our study results showed that preoperative 3-D templating that allows for superior-inferior, anterior-posterior, and rotational translation was consistent and reproducible in determining glenoid component size, and use of this templating may reduce the risks of implant overhang, peg penetration, and decreased stability ratio. These results highlight the possibility that glenoid component sizes selected during surgery may not be ideal. More research is needed to determine if intraoperative glenoid size selection leads to adequate version correction and peg containment. The present study supports use of 3-D templating in primary TSA planning.
1. Neer CS 2nd. Replacement arthroplasty for glenohumeral osteoarthritis. J Bone Joint Surg Am. 1974;56(1):1-13.
2. Neer CS 2nd, Watson KC, Stanton FJ. Recent experience in total shoulder replacement. J Bone Joint Surg Am. 1982;64(3):319-337.
3. Day JS, Lau E, Ong KL, Williams GR, Ramsey ML, Kurtz SM. Prevalence and projections of total shoulder and elbow arthroplasty in the United States to 2015. J Shoulder Elbow Surg. 2010;19(8):1115-1120.
4. Torchia ME, Cofield RH, Settergren CR. Total shoulder arthroplasty with the Neer prosthesis: long-term results. J Shoulder Elbow Surg. 1997;6(6):495-505.
5. Barrett WP, Franklin JL, Jackins SE, Wyss CR, Matsen FA 3rd. Total shoulder arthroplasty. J Bone Joint Surg Am. 1987;69(6):865-872.
6. Bohsali KI, Wirth MA, Rockwood CA Jr. Complications of total shoulder arthroplasty. J Bone Joint Surg Am. 2006;88(10):2279-2292.
7. Matsen FA 3rd, Bicknell RT, Lippitt SB. Shoulder arthroplasty: the socket perspective. J Shoulder Elbow Surg. 2007;16(5 suppl):S241-S247.
8. Matsen FA 3rd, Clinton J, Lynch J, Bertelsen A, Richardson ML. Glenoid component failure in total shoulder arthroplasty. J Bone Joint Surg Am. 2008;90(4):885-896.
9. Pearl ML, Romeo AA, Wirth MA, Yamaguchi K, Nicholson GP, Creighton RA. Decision making in contemporary shoulder arthroplasty. Instr Course Lect. 2005;54:69-85.
10. Wirth MA, Rockwood CA Jr. Complications of total shoulder-replacement arthroplasty. J Bone Joint Surg Am. 1996;78(4):603-616.
11. Sanchez-Sotelo J, Sperling JW, Rowland CM, Cofield RH. Instability after shoulder arthroplasty: results of surgical treatment. J Bone Joint Surg Am. 2003;85(4):622-631.
12. Tammachote N, Sperling JW, Berglund LJ, Steinmann SP, Cofield RH, An KN. The effect of glenoid component size on the stability of total shoulder arthroplasty. J Shoulder Elbow Surg. 2007;16(3 suppl):S102-S106.
13. Iannotti JP, Greeson C, Downing D, Sabesan V, Bryan JA. Effect of glenoid deformity on glenoid component placement in primary shoulder arthroplasty. J Shoulder Elbow Surg. 2012;21(1):48-55.
14. Briem D, Ruecker AH, Neumann J, et al. 3D fluoroscopic navigated reaming of the glenoid for total shoulder arthroplasty (TSA). Comput Aided Surg. 2011;16(2):93-99.
15. Budge MD, Lewis GS, Schaefer E, Coquia S, Flemming DJ, Armstrong AD. Comparison of standard two-dimensional and three-dimensional corrected glenoid version measurements. J Shoulder Elbow Surg. 2011;20(4):577-583.
16. Chuang TY, Adams CR, Burkhart SS. Use of preoperative three-dimensional computed tomography to quantify glenoid bone loss in shoulder instability. Arthroscopy. 2008;24(4):376-382.
17. Nowak DD, Bahu MJ, Gardner TR, et al. Simulation of surgical glenoid resurfacing using three-dimensional computed tomography of the arthritic glenohumeral joint: the amount of glenoid retroversion that can be corrected. J Shoulder Elbow Surg. 2009;18(5):680-688.
18. Scalise JJ, Bryan J, Polster J, Brems JJ, Iannotti JP. Quantitative analysis of glenoid bone loss in osteoarthritis using three-dimensional computed tomography scans. J Shoulder Elbow Surg. 2008;17(2):328-335.
19. Scalise JJ, Codsi MJ, Bryan J, Iannotti JP. The three-dimensional glenoid vault model can estimate normal glenoid version in osteoarthritis. J Shoulder Elbow Surg. 2008;17(3):487-491.
20. Bryce CD, Pennypacker JL, Kulkarni N, et al. Validation of three-dimensional models of in situ scapulae. J Shoulder Elbow Surg. 2008;17(5):825-832.
21. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
22. Cummings RJ, Loveless EA, Campbell J, Samelson S, Mazur JM. Interobserver reliability and intraobserver reproducibility of the system of King et al. for the classification of adolescent idiopathic scoliosis. J Bone Joint Surg Am. 1998;80(8):1107-1111.
23. Humphrey CA, Dirschl DR, Ellis TJ. Interobserver reliability of a CT-based fracture classification system. J Orthop Trauma. 2005;19(9):616-622.
24. Illarramendi A, González Della Valle A, Segal E, De Carli P, Maignon G, Gallucci G. Evaluation of simplified Frykman and AO classifications of fractures of the distal radius. Assessment of interobserver and intraobserver agreement. Int Orthop. 1998;22(2):111-115.
25. Lenke LG, Betz RR, Bridwell KH, et al. Intraobserver and interobserver reliability of the classification of thoracic adolescent idiopathic scoliosis. J Bone Joint Surg Am. 1998;80(8):1097-1106.
26. Ploegmakers JJ, Mader K, Pennig D, Verheyen CC. Four distal radial fracture classification systems tested amongst a large panel of Dutch trauma surgeons. Injury. 2007;38(11):1268-1272.
27. Sidor ML, Zuckerman JD, Lyon T, Koval K, Cuomo F, Schoenberg N. The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J Bone Joint Surg Am. 1993;75(12):1745-1750.
28. Siebenrock KA, Gerber C. The reproducibility of classification of fractures of the proximal end of the humerus. J Bone Joint Surg Am. 1993;75(12):1751-1755.
29. Thomsen NO, Overgaard S, Olsen LH, Hansen H, Nielsen ST. Observer variation in the radiographic classification of ankle fractures. J Bone Joint Surg Br. 1991;73(4):676-678.
30. Ward WT, Vogt M, Grudziak JS, Tümer Y, Cook PC, Fitch RD. Severin classification system for evaluation of the results of operative treatment of congenital dislocation of the hip. A study of intraobserver and interobserver reliability. J Bone Joint Surg Am. 1997;79(5):656-663.
31. Scalise JJ, Codsi MJ, Bryan J, Brems JJ, Iannotti JP. The influence of three-dimensional computed tomography images of the shoulder in preoperative planning for total shoulder arthroplasty. J Bone Joint Surg Am. 2008;90(11):2438-2445.
Take-Home Points
- Guidelines regarding glenoid component size selection for primary TSA are lacking.
- Intraoperative in situ glenoid sizing may not be ideal.
- 3-D digital models may be utilized for preoperative templating of glenoid component size in primary TSA.
- 3-D templating that allows for superior-inferior, anterior-posterior, and rotational translation can lead to consistent and reproducible templating of glenoid component size.
- 3-D templating may reduce the risks of implant overhang, peg penetration, and decreased stability ratio.
In 1974, Neer1 introduced the shoulder prosthesis. In 1982, Neer and colleagues2 found significant improvement in shoulder pain and function in patients with glenohumeral osteoarthritis treated with the Neer prosthesis. Since then, use of total shoulder arthroplasty (TSA) has increased. Between 1993 and 2007, TSA use increased 319% in the United States.3 Long-term outcomes studies have found implant survivorship ranging from 87% to 93% at 10 to 15 years.4
Although TSA is a successful procedure, glenoid component failure is the most common complication.5-10 Outcomes of revision surgery for glenoid instability are inferior to those of primary TSA.11 Recent research findings highlight the effect of glenoid size on TSA complications.12 A larger glenoid component increases the stability ratio (peak subluxation force divided by compression load).12 However, insufficient glenoid bone stock, small glenoid diameter, and inability to fit a properly sized reamer owing to soft-tissue constraints may lead surgeons to choose a smaller glenoid component in order to avoid peg penetration, overhang, and soft-tissue damage, respectively. Therefore, preoperative templating of glenoid size is a potential strategy for minimizing complications.
Templating is performed for proximal humeral components, but glenoid sizing typically is deferred to intraoperative in situ sizing with implant-specific targeting guides. This glenoid sizing practice arose out of a lack of standard digital glenoid templates and difficulty in selecting glenoid size based on plain radiographs and/or 2-dimensional (2-D) computed tomography (CT) scans. However, targeting devices are sporadically used during surgery, and intraoperative glenoid vault dimension estimates derived from visualization and palpation are often inaccurate. Often, rather than directly assess glenoid morphology, surgeons infer glenoid size from the size and sex of patients.13
Three-dimensional (3-D) CT can be used to accurately assess glenoid version, bone loss, and implant fit.14-19 We conducted a study to determine if 3-D digital imaging can be consistently and reproducibly used for preoperative templating of glenoid component size and to determine if glenoid sizes derived from templating correlate with the sizes of subsequently implanted glenoids.
Materials and Methods
This retrospective study was conducted at the Center for Shoulder, Elbow, and Sports Medicine at Columbia University Medical Center in New York City and was approved by our Institutional Review Board. Included in the study were all patients who underwent primary TSA for primary glenohumeral osteoarthritis over a 12-month period. Patients were required to have preoperative CT performed according to our study protocol. The CT protocol consisted of 0.5-mm axial cuts of the entire scapula and 3-D reconstruction of the scapula, glenoid, glenohumeral articulation, and proximal humerus. Patients were excluded if they underwent primary TSA for a secondary cause of glenohumeral osteoarthritis or if they had inflammatory arthritis, connective tissue disease, prior contralateral TSA, or prior ipsilateral scapula, glenoid, or proximal humerus surgery. Ultimately, 24 patients were included in the study.
CT data were formatted for preoperative templating. The CT images of each patient’s scapula were uploaded into Materialise Interactive Medical Image Control System (Mimics) software. Mimics allows 3-D image rendering and editing from various imaging modalities and formats. The software was used to create the 3-D scapula models for templating. Prior studies have validated the anatomical precision of 3-D models created with Mimics.20
Mimics was also used to digitize in 3-D the glenoid components from the Bigliani-Flatow Shoulder System (Zimmer Biomet). Glenoid components of 3 different sizes (40 mm, 46 mm, 52 mm) were used. (The Bigliani glenoid component was digitized, as this implant system was used for primary TSA in all 24 patients.) Each glenoid component was traced in 3-D with a Gage 2000 coordinate-measuring machine (Brown & Sharpe) and was processed with custom software. The custom software, cited in previous work by our group,17 created the same coordinate system for each scapula based on anatomical reference points. These digitized 3-D images of glenoid components were uploaded with the digitized 3-D scapulae derived from patients’ CT scans to the Magics software. Magics allows for manipulation and interaction of multiple 3-D models by creating electronic stereolithography files that provide 3-D surface geometry.
Three fellowship-trained shoulder surgeons and 4 shoulder fellows templated the most appropriately sized glenoid component for each of the 24 patients. At the time of templating, the surgeon was blinded to the size of the glenoid implant used in the surgery. In Magics, each scapula was positioned in 3-D similar to how it would appear with the patient in the beach-chair position during surgery. In both study arms, surgeons selected the largest component that maximized the area of contact while avoiding peg penetration of the glenoid vault or component overhang. In addition, surgeons were instructed to correct glenoid version to as near neutral as possible with component positioning but were not permitted to remove glenoid bone stock to correct deformity. All surgeons based placement of the glenoid component on the patient’s actual bone stock and not on osteophytes, which are readily appreciable on 3-D CT.
In study arm 1, the 3-D view of the glenoid was restricted to the initial view in the beach-chair position. The surgeon then manipulated the 3-D glenoid component template across a single 2-D plane, either the superior-inferior plane or the anterior-posterior plane, over the surface of the 3-D glenoid (Figure 1).
In study arm 2, surgeons were permitted to rotate the 3-D glenoid template and scapula in any manner (Figure 2).
Interobserver agreement was determined by comparing prosthetic glenoid component size selection among all study surgeons, and intraobserver agreement was determined by comparing glenoid size selection during 2 sessions separated by at least 3 weeks.
After each trial, the order of patients’ scapula images was randomly rearranged to reduce recall bias. Kappa (κ) coefficients were calculated for interobserver and intraobserver agreement. Kappas ranged from −1.0 (least agreement) to +1.0 (complete agreement). A κ of 0 indicated an observer selection was equivalent to random chance. The level of agreement was categorized according to κ using a system described by Landis and Koch21 (Table 1).
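As an illustration of how such a coefficient is obtained, the sketch below computes Cohen's κ for one surgeon's size selections in two sessions (the intraobserver case). The article does not state which κ variant or software was used, so the choice of Cohen's κ is an assumption; the function name and the ratings are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Observed agreement: fraction of cases given the same size both times.
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal proportions, summed per size.
    counts_first = Counter(first)
    counts_second = Counter(second)
    p_chance = sum(counts_first[c] * counts_second[c] for c in counts_first) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both sessions used one identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical glenoid sizes (mm) chosen by one surgeon in two sessions.
session_1 = [40, 46, 46, 52, 46, 40, 46, 52]
session_2 = [40, 46, 52, 52, 46, 40, 46, 46]
kappa = cohen_kappa(session_1, session_2)
print(f"kappa = {kappa:.2f}")
```

With 7 observers, interobserver agreement would need a multi-rater statistic (Fleiss' κ is one common choice) rather than this two-rater form.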
Results
The group of 24 patients consisted of 15 men and 9 women. Mean age was 70.3 years (range, 56-88 years). Primary TSA was performed in 14 right shoulders and 10 left shoulders. Of the 24 patients, 20 (83%) had a 46-mm glenoid component implanted, 3 male patients had a 52-mm glenoid component implanted, and 1 female patient had a 40-mm glenoid component implanted.
Study Arm 1: Glenoid Templating Based on 2 Degrees of Freedom (df)
In study arm 1, overall intraobserver agreement was substantial, as defined in the statistical literature.21 Among all surgeons who participated, intraobserver agreement was 0.76 (substantial), 0.60 (substantial), and 0.58 (moderate) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.67, substantial agreement). Trial 1 interobserver agreement was 0.56 (moderate) (P < .001), 0.25 (fair) (P < .001), and 0.21 (fair) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.36, fair agreement) (P < .001). Trial 2 interobserver agreement was 0.58 (moderate) (P < .001), 0.18 (poor) (P = .003), and 0.24 (fair) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.32, fair agreement) (P < .001). In study arm 1, therefore, trials 1 and 2 both showed fair interobserver agreement.
Study Arm 2: Glenoid Templating Based on 6 Degrees of Freedom (df)
In study arm 2, a mean correlation of 0.42 (moderate agreement) was found between glenoid component size in 3-D templating and the glenoid component size ultimately selected during surgery (Table 3).
In study arm 2, overall intraobserver agreement was moderate. Among all surgeons who participated, intraobserver agreement was 0.80 (excellent), 0.43 (moderate), and 0.47 (moderate) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.58, moderate agreement). Trial 1 interobserver agreement was 0.75 (substantial) (P < .001), 0.39 (fair) (P < .001), and 0.50 (moderate) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.54, moderate agreement) (P < .001), and trial 2 interobserver agreement was 0.66 (substantial) (P < .001), 0.28 (fair) (P = .003), and 0.40 (moderate) (P < .001) for the 40-mm, 46-mm, and 52-mm glenoid components, respectively (overall κ = 0.43, moderate agreement) (P < .001).
Discussion
Our results showed that 3-D glenoid templating had reproducible intraobserver and interobserver agreement. Overall intraobserver agreement was substantial (κ = 0.67) for study arm 1 and moderate (κ = 0.58) for study arm 2. Interobserver agreement was fair for trials 1 and 2 (κ = 0.36 and 0.32) in arm 1 and moderate for trials 1 and 2 (κ = 0.54 and 0.43) in arm 2.
Intraobserver and interobserver agreement values, particularly in study arm 2, which incorporated rotation (6 df), are consistent with values in commonly used classification systems, such as the Neer system for proximal humerus fractures, the Frykman system for distal radius fractures, and the King system for adolescent idiopathic scoliosis.22-30 Sidor and colleagues27 found overall interobserver agreement of 0.50 and overall intraobserver agreement of 0.66 for the Neer system, and Illarramendi and colleagues24 found overall interobserver agreement of 0.43 and overall intraobserver agreement of 0.61 for the Frykman system.
In study arm 2, overall interobserver and intraobserver agreement was moderate. A higher level of surgeon agreement is unlikely given the lack of well-defined parameters for determining glenoid component size. Therefore, glenoid size selection is largely a matter of surgeon preference. More research is needed to establish concrete guidelines for glenoid component size selection. Once guidelines are adopted, interobserver agreement in templating may increase.
In both study arms, the component that surgeons selected during templating tended to be smaller than the component they selected during surgery. In study arm 1, 32% of patients had a smaller component selected based on computer modeling, and 7% had a larger component selected. In study arm 2, the difference was narrower: 27% of patients had a smaller component selected during templating, and 16% had a larger component selected. A statistically significant difference (P < .001) in templated and implanted component sizes was found between men and women: Templated glenoid components were smaller than implanted components in 53% of women and larger than implanted components in 33% of men. Differences between templated and implanted components may be attributable to visualization differences. During templating, the entire glenoid can be visualized and the slightest peg penetration or component overhang detected; in contrast, during surgery, anatomical constraints preclude such a comprehensive assessment.
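The smaller/larger percentages reported above come from a per-case comparison of templated and implanted sizes. A minimal sketch of that tally follows; the function name and the size lists are hypothetical illustrations, not the study's data or the authors' method.

```python
def compare_sizes(templated, implanted):
    """Percentage of cases templated smaller than, equal to, or larger than implanted."""
    if len(templated) != len(implanted) or not templated:
        raise ValueError("size lists must be non-empty and of equal length")
    n = len(templated)
    smaller = sum(t < i for t, i in zip(templated, implanted))
    larger = sum(t > i for t, i in zip(templated, implanted))
    equal = n - smaller - larger
    return {
        "smaller": 100 * smaller / n,
        "equal": 100 * equal / n,
        "larger": 100 * larger / n,
    }


# Hypothetical templated and implanted glenoid sizes (mm) for 10 cases.
templated = [40, 46, 46, 46, 52, 40, 46, 46, 46, 52]
implanted = [46, 46, 46, 46, 52, 46, 46, 52, 46, 46]
summary = compare_sizes(templated, implanted)
print(summary)
```

In the study each patient was templated by several surgeons, so the reported figures presumably pool selections across observers; this sketch handles a single list of paired selections.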
Differences in agreement between templated and implanted glenoid components suggest that the size of implanted components may not be ideal. In this study, the distribution of the templated glenoid sizes was much wider than that of the implanted glenoid sizes. During templating, each glenoid component can be definitively visualized and assessed for possible peg penetration and overhang. Visualization allows surgeons to base glenoid size selection solely on glenoid morphology, as opposed to factors such as patient sex and height. In addition, interobserver and intraobserver agreement values for the 40-mm glenoid component were considerably higher than those for components of other sizes, indicating that the 40-mm component was consistently and reproducibly selected for the same patients. Hence, templating may particularly help prevent peg penetration and component overhang for patients with a smaller diameter glenoid.
More research on 3-D templating is warranted given the results of this study and other studies.12,17,31 Scalise and colleagues31 found that, in TSA planning, surgeons’ use of 2-D (vs 3-D) imaging led them to overestimate glenoid component sizes (P = .006). In our study, the glenoid size selected during 3-D templating was, in many cases, smaller than the size selected during surgery. The goal of avoiding peg penetration and glenoid overhang, an anecdotal guideline commonly used in glenoid size selection, was likely the driving force in selecting smaller glenoid components during templating. Although anterior, superior, and inferior glenoid overhang typically can be assessed during surgery, posterior overhang is more difficult to evaluate. Three-dimensional modeling allows surgeons to determine optimal glenoid component size and position. In addition, intraoperative evaluation of glenoid component peg penetration is challenging, and peg penetration becomes evident only after it has occurred. During templating, however, surgeons were able to easily assess for peg penetration, and smaller glenoid components were selected.
A limitation of this study is that intraoperative glenoid version correction or peg containment was not quantified. More research is needed on the relationship between glenoid size selection and component overhang and peg penetration. Another limitation was use of only 1 TSA system (with 3 glenoid sizes, all with inline pegs); reliability of 3-D templating was not evaluated across different component designs. Last, given the absence of guidelines for glenoid component size selection, there was surgeon bias in preoperative templating and in intraoperative selection of glenoid size. Surgeons had differing opinions on the importance of maximizing the contact area of the component and correcting glenoid deformity and version.
Our study results showed that preoperative 3-D templating that allows for superior-inferior and anterior-posterior translation as well as rotation was consistent and reproducible in determining glenoid component size, and use of this templating may reduce the risks of implant overhang, peg penetration, and decreased stability ratio. These results highlight the possibility that glenoid component sizes selected during surgery may not be ideal. More research is needed to determine if intraoperative glenoid size selection leads to adequate version correction and peg containment. The present study supports use of 3-D templating in primary TSA planning.
Take-Home Points
- Guidelines regarding glenoid component size selection for primary TSA are lacking.
- Intraoperative in situ glenoid sizing may not be ideal.
- 3-D digital models may be utilized for preoperative templating of glenoid component size in primary TSA.
- 3-D templating that allows for superior-inferior and anterior-posterior translation as well as rotation can lead to consistent and reproducible templating of glenoid component size.
- 3-D templating may reduce the risks of implant overhang, peg penetration, and decreased stability ratio.
1. Neer CS 2nd. Replacement arthroplasty for glenohumeral osteoarthritis. J Bone Joint Surg Am. 1974;56(1):1-13.
2. Neer CS 2nd, Watson KC, Stanton FJ. Recent experience in total shoulder replacement. J Bone Joint Surg Am. 1982;64(3):319-337.
3. Day JS, Lau E, Ong KL, Williams GR, Ramsey ML, Kurtz SM. Prevalence and projections of total shoulder and elbow arthroplasty in the United States to 2015. J Shoulder Elbow Surg. 2010;19(8):1115-1120.
4. Torchia ME, Cofield RH, Settergren CR. Total shoulder arthroplasty with the Neer prosthesis: long-term results. J Shoulder Elbow Surg. 1997;6(6):495-505.
5. Barrett WP, Franklin JL, Jackins SE, Wyss CR, Matsen FA 3rd. Total shoulder arthroplasty. J Bone Joint Surg Am. 1987;69(6):865-872.
6. Bohsali KI, Wirth MA, Rockwood CA Jr. Complications of total shoulder arthroplasty. J Bone Joint Surg Am. 2006;88(10):2279-2292.
7. Matsen FA 3rd, Bicknell RT, Lippitt SB. Shoulder arthroplasty: the socket perspective. J Shoulder Elbow Surg. 2007;16(5 suppl):S241-S247.
8. Matsen FA 3rd, Clinton J, Lynch J, Bertelsen A, Richardson ML. Glenoid component failure in total shoulder arthroplasty. J Bone Joint Surg Am. 2008;90(4):885-896.
9. Pearl ML, Romeo AA, Wirth MA, Yamaguchi K, Nicholson GP, Creighton RA. Decision making in contemporary shoulder arthroplasty. Instr Course Lect. 2005;54:69-85.
10. Wirth MA, Rockwood CA Jr. Complications of total shoulder-replacement arthroplasty. J Bone Joint Surg Am. 1996;78(4):603-616.
11. Sanchez-Sotelo J, Sperling JW, Rowland CM, Cofield RH. Instability after shoulder arthroplasty: results of surgical treatment. J Bone Joint Surg Am. 2003;85(4):622-631.
12. Tammachote N, Sperling JW, Berglund LJ, Steinmann SP, Cofield RH, An KN. The effect of glenoid component size on the stability of total shoulder arthroplasty. J Shoulder Elbow Surg. 2007;16(3 suppl):S102-S106.
13. Iannotti JP, Greeson C, Downing D, Sabesan V, Bryan JA. Effect of glenoid deformity on glenoid component placement in primary shoulder arthroplasty. J Shoulder Elbow Surg. 2012;21(1):48-55.
14. Briem D, Ruecker AH, Neumann J, et al. 3D fluoroscopic navigated reaming of the glenoid for total shoulder arthroplasty (TSA). Comput Aided Surg. 2011;16(2):93-99.
15. Budge MD, Lewis GS, Schaefer E, Coquia S, Flemming DJ, Armstrong AD. Comparison of standard two-dimensional and three-dimensional corrected glenoid version measurements. J Shoulder Elbow Surg. 2011;20(4):577-583.
16. Chuang TY, Adams CR, Burkhart SS. Use of preoperative three-dimensional computed tomography to quantify glenoid bone loss in shoulder instability. Arthroscopy. 2008;24(4):376-382.
17. Nowak DD, Bahu MJ, Gardner TR, et al. Simulation of surgical glenoid resurfacing using three-dimensional computed tomography of the arthritic glenohumeral joint: the amount of glenoid retroversion that can be corrected. J Shoulder Elbow Surg. 2009;18(5):680-688.
18. Scalise JJ, Bryan J, Polster J, Brems JJ, Iannotti JP. Quantitative analysis of glenoid bone loss in osteoarthritis using three-dimensional computed tomography scans. J Shoulder Elbow Surg. 2008;17(2):328-335.
19. Scalise JJ, Codsi MJ, Bryan J, Iannotti JP. The three-dimensional glenoid vault model can estimate normal glenoid version in osteoarthritis. J Shoulder Elbow Surg. 2008;17(3):487-491.
20. Bryce CD, Pennypacker JL, Kulkarni N, et al. Validation of three-dimensional models of in situ scapulae. J Shoulder Elbow Surg. 2008;17(5):825-832.
21. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
22. Cummings RJ, Loveless EA, Campbell J, Samelson S, Mazur JM. Interobserver reliability and intraobserver reproducibility of the system of King et al. for the classification of adolescent idiopathic scoliosis. J Bone Joint Surg Am. 1998;80(8):1107-1111.
23. Humphrey CA, Dirschl DR, Ellis TJ. Interobserver reliability of a CT-based fracture classification system. J Orthop Trauma. 2005;19(9):616-622.
24. Illarramendi A, González Della Valle A, Segal E, De Carli P, Maignon G, Gallucci G. Evaluation of simplified Frykman and AO classifications of fractures of the distal radius. Assessment of interobserver and intraobserver agreement. Int Orthop. 1998;22(2):111-115.
25. Lenke LG, Betz RR, Bridwell KH, et al. Intraobserver and interobserver reliability of the classification of thoracic adolescent idiopathic scoliosis. J Bone Joint Surg Am. 1998;80(8):1097-1106.
26. Ploegmakers JJ, Mader K, Pennig D, Verheyen CC. Four distal radial fracture classification systems tested amongst a large panel of Dutch trauma surgeons. Injury. 2007;38(11):1268-1272.
27. Sidor ML, Zuckerman JD, Lyon T, Koval K, Cuomo F, Schoenberg N. The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J Bone Joint Surg Am. 1993;75(12):1745-1750.
28. Siebenrock KA, Gerber C. The reproducibility of classification of fractures of the proximal end of the humerus. J Bone Joint Surg Am. 1993;75(12):1751-1755.
29. Thomsen NO, Overgaard S, Olsen LH, Hansen H, Nielsen ST. Observer variation in the radiographic classification of ankle fractures. J Bone Joint Surg Br. 1991;73(4):676-678.
30. Ward WT, Vogt M, Grudziak JS, Tümer Y, Cook PC, Fitch RD. Severin classification system for evaluation of the results of operative treatment of congenital dislocation of the hip. A study of intraobserver and interobserver reliability. J Bone Joint Surg Am. 1997;79(5):656-663.
31. Scalise JJ, Codsi MJ, Bryan J, Brems JJ, Iannotti JP. The influence of three-dimensional computed tomography images of the shoulder in preoperative planning for total shoulder arthroplasty. J Bone Joint Surg Am. 2008;90(11):2438-2445.
1. Neer CS 2nd. Replacement arthroplasty for glenohumeral osteoarthritis. J Bone Joint Surg Am. 1974;56(1):1-13.
2. Neer CS 2nd, Watson KC, Stanton FJ. Recent experience in total shoulder replacement. J Bone Joint Surg Am. 1982;64(3):319-337.
3. Day JS, Lau E, Ong KL, Williams GR, Ramsey ML, Kurtz SM. Prevalence and projections of total shoulder and elbow arthroplasty in the United States to 2015. J Shoulder Elbow Surg. 2010;19(8):1115-1120.
4. Torchia ME, Cofield RH, Settergren CR. Total shoulder arthroplasty with the Neer prosthesis: long-term results. J Shoulder Elbow Surg. 1997;6(6):495-505.
5. Barrett WP, Franklin JL, Jackins SE, Wyss CR, Matsen FA 3rd. Total shoulder arthroplasty. J Bone Joint Surg Am. 1987;69(6):865-872.
6. Bohsali KI, Wirth MA, Rockwood CA Jr. Complications of total shoulder arthroplasty. J Bone Joint Surg Am. 2006;88(10):2279-2292.
7. Matsen FA 3rd, Bicknell RT, Lippitt SB. Shoulder arthroplasty: the socket perspective. J Shoulder Elbow Surg. 2007;16(5 suppl):S241-S247.
8. Matsen FA 3rd, Clinton J, Lynch J, Bertelsen A, Richardson ML. Glenoid component failure in total shoulder arthroplasty. J Bone Joint Surg Am. 2008;90(4):885-896.
9. Pearl ML, Romeo AA, Wirth MA, Yamaguchi K, Nicholson GP, Creighton RA. Decision making in contemporary shoulder arthroplasty. Instr Course Lect. 2005;54:69-85.
10. Wirth MA, Rockwood CA Jr. Complications of total shoulder-replacement arthroplasty. J Bone Joint Surg Am. 1996;78(4):603-616.
11. Sanchez-Sotelo J, Sperling JW, Rowland CM, Cofield RH. Instability after shoulder arthroplasty: results of surgical treatment. J Bone Joint Surg Am. 2003;85(4):622-631.
12. Tammachote N, Sperling JW, Berglund LJ, Steinmann SP, Cofield RH, An KN. The effect of glenoid component size on the stability of total shoulder arthroplasty. J Shoulder Elbow Surg. 2007;16(3 suppl):S102-S106.
13. Iannotti JP, Greeson C, Downing D, Sabesan V, Bryan JA. Effect of glenoid deformity on glenoid component placement in primary shoulder arthroplasty. J Shoulder Elbow Surg. 2012;21(1):48-55.
14. Briem D, Ruecker AH, Neumann J, et al. 3D fluoroscopic navigated reaming of the glenoid for total shoulder arthroplasty (TSA). Comput Aided Surg. 2011;16(2):93-99.
15. Budge MD, Lewis GS, Schaefer E, Coquia S, Flemming DJ, Armstrong AD. Comparison of standard two-dimensional and three-dimensional corrected glenoid version measurements. J Shoulder Elbow Surg. 2011;20(4):577-583.
16. Chuang TY, Adams CR, Burkhart SS. Use of preoperative three-dimensional computed tomography to quantify glenoid bone loss in shoulder instability. Arthroscopy. 2008;24(4):376-382.
17. Nowak DD, Bahu MJ, Gardner TR, et al. Simulation of surgical glenoid resurfacing using three-dimensional computed tomography of the arthritic glenohumeral joint: the amount of glenoid retroversion that can be corrected. J Shoulder Elbow Surg. 2009;18(5):680-688.
18. Scalise JJ, Bryan J, Polster J, Brems JJ, Iannotti JP. Quantitative analysis of glenoid bone loss in osteoarthritis using three-dimensional computed tomography scans. J Shoulder Elbow Surg. 2008;17(2):328-335.
19. Scalise JJ, Codsi MJ, Bryan J, Iannotti JP. The three-dimensional glenoid vault model can estimate normal glenoid version in osteoarthritis. J Shoulder Elbow Surg. 2008;17(3):487-491.
20. Bryce CD, Pennypacker JL, Kulkarni N, et al. Validation of three-dimensional models of in situ scapulae. J Shoulder Elbow Surg. 2008;17(5):825-832.
21. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
22. Cummings RJ, Loveless EA, Campbell J, Samelson S, Mazur JM. Interobserver reliability and intraobserver reproducibility of the system of King et al. for the classification of adolescent idiopathic scoliosis. J Bone Joint Surg Am. 1998;80(8):1107-1111.
23. Humphrey CA, Dirschl DR, Ellis TJ. Interobserver reliability of a CT-based fracture classification system. J Orthop Trauma. 2005;19(9):616-622.
24. Illarramendi A, González Della Valle A, Segal E, De Carli P, Maignon G, Gallucci G. Evaluation of simplified Frykman and AO classifications of fractures of the distal radius. Assessment of interobserver and intraobserver agreement. Int Orthop. 1998;22(2):111-115.
25. Lenke LG, Betz RR, Bridwell KH, et al. Intraobserver and interobserver reliability of the classification of thoracic adolescent idiopathic scoliosis. J Bone Joint Surg Am. 1998;80(8):1097-1106.
26. Ploegmakers JJ, Mader K, Pennig D, Verheyen CC. Four distal radial fracture classification systems tested amongst a large panel of Dutch trauma surgeons. Injury. 2007;38(11):1268-1272.
27. Sidor ML, Zuckerman JD, Lyon T, Koval K, Cuomo F, Schoenberg N. The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J Bone Joint Surg Am. 1993;75(12):1745-1750.
28. Siebenrock KA, Gerber C. The reproducibility of classification of fractures of the proximal end of the humerus. J Bone Joint Surg Am. 1993;75(12):1751-1755.
29. Thomsen NO, Overgaard S, Olsen LH, Hansen H, Nielsen ST. Observer variation in the radiographic classification of ankle fractures. J Bone Joint Surg Br. 1991;73(4):676-678.
30. Ward WT, Vogt M, Grudziak JS, Tümer Y, Cook PC, Fitch RD. Severin classification system for evaluation of the results of operative treatment of congenital dislocation of the hip. A study of intraobserver and interobserver reliability. J Bone Joint Surg Am. 1997;79(5):656-663.
31. Scalise JJ, Codsi MJ, Bryan J, Brems JJ, Iannotti JP. The influence of three-dimensional computed tomography images of the shoulder in preoperative planning for total shoulder arthroplasty. J Bone Joint Surg Am. 2008;90(11):2438-2445.
Association Between Anemia and Fatigue in Hospitalized Patients: Does the Measure of Anemia Matter?
Fatigue is the most common clinical symptom of anemia and is a significant concern to patients.1,2 In ambulatory patients, lower hemoglobin (Hb) concentration is associated with increased fatigue.2,3 Accordingly, therapies that treat anemia by increasing Hb concentration, such as erythropoiesis-stimulating agents,4-7 often use fatigue as an outcome measure.
In hospitalized patients, transfusion of red blood cells increases Hb concentration and is the primary treatment for anemia. However, the extent to which transfusion and changes in Hb concentration affect hospitalized patients’ fatigue levels is not well established. Guidelines support transfusing patients with symptoms of anemia, such as fatigue, on the assumption that the increased oxygen delivery will improve the symptoms of anemia. While transfusion studies in hospitalized patients have consistently reported that transfusion at lower or “restrictive” Hb concentrations is safe compared with transfusion at higher Hb concentrations,8-10 these studies have mainly used cardiac events and mortality as outcomes rather than patient symptoms, such as fatigue. Nevertheless, they have resulted in hospitals increasingly adopting restrictive transfusion policies that discourage transfusion at higher Hb levels.11,12 Consequently, the rate of transfusion in hospitalized patients has decreased,13 raising questions of whether some patients with lower Hb concentrations may experience increased fatigue as a result of restrictive transfusion policies. Fatigue among hospitalized patients is important not only because it is an adverse symptom but because it may result in decreased activity levels, deconditioning, and losses in functional status.14,15
While the effect of alternative transfusion policies on fatigue in hospitalized patients could be answered by a randomized clinical trial using fatigue and functional status as outcomes, an important first step is to assess whether the Hb concentration of hospitalized patients is associated with their fatigue level during hospitalization. Because hospitalized patients often have acute illnesses that can cause fatigue in and of themselves, it is possible that anemia is not associated with fatigue in hospitalized patients despite anemia’s association with fatigue in ambulatory patients.
Additionally, Hb concentration varies during hospitalization,16 raising the question of what measures of Hb during hospitalization might be most associated with anemia-related fatigue.
The objective of this study is to explore multiple Hb measures in hospitalized medical patients with anemia and test whether any of these Hb measures are associated with patients’ fatigue level.
METHODS
Study Design
We performed a prospective, observational study of hospitalized patients with anemia on the general medicine services at The University of Chicago Medical Center (UCMC). The institutional review board approved the study procedures, and all study subjects provided informed consent.
Study Eligibility
Between April 2014 and June 2015, all general medicine inpatients were approached for written consent for The University of Chicago Hospitalist Project,17 a research infrastructure at UCMC. Among patients consenting to participate in the Hospitalist Project, patients were eligible if they had Hb <9 g/dL at any point during their hospitalization and were age ≥50 years. Hb concentration of <9 g/dL was chosen to include the range of Hb values covered by most restrictive transfusion policies.8-10,18 Age ≥50 years was an inclusion criterion because anemia is more strongly associated with poor outcomes, including functional impairment, among older patients compared with younger patients.14,19-21 If patients were not eligible for inclusion at the time of consent for the Hospitalist Project, their Hb values were reviewed twice daily until hospital discharge to assess if their Hb was <9 g/dL. Proxies were sought to answer questions for patients who failed the Short Portable Mental Status Questionnaire.22
Patient Demographic Data Collection
Research assistants abstracted patient age and sex from the electronic health record (EHR), and asked patients to self-identify their race. The individual comorbidities included as part of the Charlson Comorbidity Index were identified using International Classification of Diseases, 9th Revision codes from hospital administrative data for each encounter and specifically included the following: myocardial infarction, congestive heart failure, peripheral vascular disease, cerebrovascular disease, dementia, chronic pulmonary disease, rheumatic disease, peptic ulcer disease, liver disease, diabetes, hemiplegia and/or paraplegia, renal disease, cancer, and human immunodeficiency virus/acquired immunodeficiency syndrome.23 We also used Healthcare Cost and Utilization Project (www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp) diagnosis categories to identify whether patients had sickle cell (SC) anemia, gastrointestinal bleeding (GIB), or a depressive disorder (DD) because these conditions are associated with anemia (SC and GIB) and fatigue (DD).24
Measuring Anemia
Hb measures were available only when hospital providers ordered them as part of routine practice. The first Hb concentration <9 g/dL during a patient’s hospitalization, which made them eligible for study participation, was obtained through manual review of the EHR. All additional Hb values during the patient’s hospitalization were obtained from the hospital’s administrative data mart. All Hb values collected for each patient during the hospitalization were used to calculate summary measures of Hb during the hospitalization, including the mean Hb, median Hb, minimum Hb, maximum Hb, admission (first recorded) Hb, and discharge (last recorded) Hb. Hb measures were analyzed both as continuous variables and as categorical variables created by dividing each continuous Hb measure into 3 groups of approximately equal size, with cut points at integer Hb values.
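The six summary measures described above can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical and are not taken from the study, which used Stata. The category boundaries follow the groups reported for minimum Hb in the Results (<7, 7-8, and ≥8 g/dL), and treating the middle group as 7 to just under 8 g/dL is an assumption.

```python
from statistics import mean, median


def summarize_hb(values):
    """Summarize one patient's Hb values (g/dL), given in chronological order."""
    if not values:
        raise ValueError("at least one Hb value is required")
    return {
        "mean": mean(values),
        "median": median(values),
        "minimum": min(values),
        "maximum": max(values),
        "admission": values[0],   # first recorded Hb
        "discharge": values[-1],  # last recorded Hb
    }


def categorize_min_hb(min_hb):
    """Assign minimum Hb to one of three groups (assumed boundaries: <7, [7, 8), >=8)."""
    if min_hb < 7:
        return "<7"
    if min_hb < 8:
        return "7-8"
    return ">=8"


# Hypothetical series of Hb values for a single hospitalization.
example = [9.5, 8.2, 6.9, 7.4, 8.0]
summary = summarize_hb(example)
group = categorize_min_hb(summary["minimum"])
```

In this hypothetical series, the minimum Hb (6.9 g/dL) falls in the lowest group even though the admission and discharge values are both at or above 8 g/dL, which is why the choice of summary measure can change how a patient is classified.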
Measuring Fatigue
Our primary outcome was patients’ level of fatigue reported during hospitalization, measured using the Functional Assessment of Chronic Illness Therapy (FACIT)-Anemia questionnaire. Fatigue was measured using a 13-question fatigue subscale,1,2,25 which measures fatigue within the past 7 days. Scores on the fatigue subscale range from 0 to 52, with lower scores reflecting greater levels of fatigue. As soon as patients met the eligibility criteria for study participation during their hospitalization (age ≥50 years and Hb <9 g/dL), they were approached to answer the FACIT questions. Values for missing data in the fatigue subscale for individual subjects were filled in using a prorated score from their answered questions as long as >50% of the items in the fatigue subscale were answered, in accordance with recommendations for addressing missing data in the FACIT.26 Fatigue was analyzed as a continuous variable and as a dichotomous variable created by dividing the sample into high (FACIT <27) and low (FACIT ≥27) levels of fatigue based on the median FACIT score of the population. Previous literature has shown a FACIT fatigue subscale score between 23 and 26 to be associated with an Eastern Cooperative Oncology Group (ECOG)27 performance status rating of 2 to 3, compared with scores ≥27.3
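The prorating and dichotomization rules above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: it assumes each of the 13 items has already been scored from 0 to 4 (consistent with the 0 to 52 range), that missing items are represented as None, and that the prorated score is the sum of answered items multiplied by 13 and divided by the number of items answered. The function names are hypothetical.

```python
N_ITEMS = 13            # items in the fatigue subscale
HIGH_FATIGUE_CUTOFF = 27  # FACIT <27 is classified as high fatigue


def prorated_fatigue_score(responses):
    """Return the prorated subscale score, or None if too few items were answered.

    responses: list of 13 item scores (0-4 each, assumed already scored so that
    lower values mean more fatigue), with None marking a missing item.
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected 13 item responses")
    answered = [r for r in responses if r is not None]
    # Prorate only when more than 50% of the items were answered.
    if len(answered) <= N_ITEMS / 2:
        return None
    return sum(answered) * N_ITEMS / len(answered)


def has_high_fatigue(score):
    """Lower scores mean more fatigue, so high fatigue is a score below the cutoff."""
    return score < HIGH_FATIGUE_CUTOFF


# Hypothetical respondent who skipped 3 of the 13 items.
score = prorated_fatigue_score([2] * 10 + [None] * 3)  # 20 * 13 / 10 = 26.0
```

With 10 of 13 items answered and a raw sum of 20, the prorated score is 26, which falls below the cutoff of 27 and is therefore classified as high fatigue.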
Statistical Analysis
Statistical analysis was performed using Stata statistical software (StataCorp, College Station, TX). Descriptive statistics were used to characterize patient demographics. Analysis of variance was used to test for differences in the mean fatigue levels across Hb measures. χ2 tests were performed to test for associations between high fatigue levels and the Hb measures. Multivariable analyses, including both linear and logistic regression models, were used to test the association of Hb concentration and fatigue. P values <0.05 using a 2-tailed test were deemed statistically significant.
RESULTS
Patient Characteristics
During the study period, 8559 patients were admitted to the general medicine service. Of those, 5073 (59%) consented for participation in the Hospitalist Project, and 3670 (72%) completed the Hospitalist Project inpatient interview. Of these patients, 1292 (35%) had Hb <9 g/dL, and 784 (61%) were 50 years or older and completed the FACIT questionnaire.
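Each percentage in the enrollment flow above uses the preceding step as its denominator. The short sketch below, with hypothetical step labels, recomputes the percentages from the reported counts.

```python
# Counts as reported in the text; the step labels are illustrative.
flow = [
    ("admitted", 8559),
    ("consented", 5073),
    ("completed_interview", 3670),
    ("hb_below_9", 1292),
    ("age_50_plus_with_facit", 784),
]


def step_percentages(steps):
    """Percentage of each step relative to the step immediately before it."""
    return [
        (name, round(100 * n / prev_n))
        for (_, prev_n), (name, n) in zip(steps, steps[1:])
    ]


percentages = step_percentages(flow)
```

The rounded results (59%, 72%, 35%, and 61%) match the figures given in the text.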
Table 1 reports the demographic characteristics and comorbidities for the sample, the mean (standard deviation [SD]) for the 6 Hb measures, and mean (SD) and median FACIT scores.
Bivariate Association of Fatigue and Hb
Categorizing patients into low, middle, or high Hb for each of the 6 Hb measures, minimum Hb was strongly associated with fatigue, with a weaker association for mean Hb and no statistically significant association for the other measures.
Minimum Hb. Patients with a minimum Hb <7 g/dL and patients with Hb 7-8 g/dL had higher fatigue levels (FACIT = 25 for each) than patients with a minimum Hb ≥8 g/dL (FACIT = 29; P < 0.001; Table 2). When excluding patients with SC and/or GIB because their average minimum Hb differed from the average minimum Hb of the full population (P < 0.001), patients with a minimum Hb <7 g/dL or 7-8 g/dL had even higher fatigue levels (FACIT = 23 and FACIT = 24, respectively), with no change in the fatigue level of patients with a minimum Hb ≥8 g/dL (FACIT = 29; P < 0.001; Table 2). Lower minimum Hb continued to be associated with higher fatigue levels when analyzed in 0.5 g/dL increments (Figure).
Mean Hb and Other Measures. Fatigue levels were high for 47% of patients with a mean Hb <8 g/dL and 53% of patients with a mean Hb 8-9 g/dL compared with 43% of patients with a mean Hb ≥9 g/dL (P = 0.05). However, the association between high fatigue and mean Hb was not statistically significant when patients with SC and/or GIB were excluded (Table 2). None of the other 4 Hb measures was significantly associated with fatigue.
Linear Regression of Fatigue on Hb
In linear regression models, minimum Hb consistently predicted patient fatigue, mean Hb had a less robust association with fatigue, and the other Hb measures were not associated with patient fatigue. Increases in minimum Hb (analyzed as a continuous variable) were associated with reduced fatigue (higher FACIT score; β = 1.4; P = 0.005). In models in which minimum Hb was a categorical variable, patients with a minimum Hb of <7 g/dL or 7-8 g/dL had greater fatigue (lower FACIT score) than patients whose minimum Hb was ≥8 g/dL (Hb <7 g/dL: β = −4.2; P ≤ 0.001; Hb 7-8 g/dL: β = −4.1; P < 0.001). These results control for patients’ age, sex, individual comorbidities, and whether their minimum Hb occurred before or after the measurement of fatigue during hospitalization (Model 1), and the results are unchanged when also controlling for the number of Hb laboratory draws patients had during their hospitalization (Model 2; Table 3). In a stratified analysis excluding patients with either SC and/or GIB, changes in minimum Hb were associated with larger changes in patient fatigue levels (Supplemental Table 1). We also stratified our analysis to include only patients whose minimum Hb occurred before the measurement of their fatigue level during hospitalization to avoid a spurious association of fatigue with minimum Hb occurring after fatigue was measured. In both Models 1 and 2, minimum Hb remained a predictor of patients’ fatigue levels with similar effect sizes, although in Model 2, the results did not quite reach a statistically significant level, in part due to larger confidence intervals from the smaller sample size of this stratified analysis (Supplemental Table 2a). We further stratified this analysis to include only patients whose transfusion, if they received one, occurred after their minimum Hb and the measurement of their fatigue level to account for the possibility that a transfusion could affect the fatigue level patients report. 
In this analysis, most of the estimates of the effect of minimum Hb on fatigue were larger than those seen when only analyzing patients whose minimum Hb occurred before the measurement of their fatigue level, although again, the smaller sample size of this additional stratified analysis does produce larger confidence intervals for these estimates (Supplemental Table 2b).
No Hb measure other than minimum or mean had significant association with patient fatigue levels in linear regression models.
Logistic Regression of High Fatigue Level on Hb
Using logistic regression, minimum Hb analyzed as a categorical variable predicted increased odds of a high fatigue level. Compared with patients with a minimum Hb ≥8 g/dL, patients with a minimum Hb <7 g/dL had 50% higher odds of high fatigue (odds ratio [OR] = 1.5; P = 0.03), and patients with a minimum Hb 7-8 g/dL had 90% higher odds (OR = 1.9; P < 0.001) in Model 1. These results were similar in Model 2, although the effect was only statistically significant in the 7-8 g/dL Hb group (Table 3). When excluding SC and/or GIB patients, the odds of having high fatigue as minimum Hb decreased were the same or higher for both models compared to the full population of patients. However, again, in Model 2, the effect was only statistically significant in the 7-8 g/dL Hb group (Supplemental Table 1).
Patients with a mean Hb <8 g/dL had 20% to 30% higher odds of high fatigue, and patients with a mean Hb 8-9 g/dL had 50% higher odds, compared with patients with a mean Hb ≥9 g/dL, but the effects were only statistically significant for patients with a mean Hb 8-9 g/dL in both Models 1 and 2 (Table 3). These results were similar when excluding patients with SC and/or GIB, but they were only significant for patients with a mean Hb 8-9 g/dL in Model 1 and patients with a mean Hb <8 g/dL in Model 2 (Supplemental Table 3).
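The percentages quoted alongside the odds ratios in this section follow from (OR − 1) × 100, which expresses the change in the odds of high fatigue relative to the reference group. A minimal sketch, with a hypothetical function name:

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to the percentage change in odds versus the reference group."""
    return round((odds_ratio - 1) * 100)


# Odds ratios reported for minimum Hb in Model 1 (reference: minimum Hb >= 8 g/dL).
change_below_7 = percent_change_in_odds(1.5)
change_7_to_8 = percent_change_in_odds(1.9)
```

These values describe odds, not probabilities; translating them into a change in the probability of high fatigue would also require the baseline rate in the reference group.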
DISCUSSION
These results demonstrate that minimum Hb during hospitalization is associated with fatigue in hospitalized patients age ≥50 years, and the association is stronger among patients without SC and/or GIB as comorbidities. The analysis of Hb as a continuous and categorical variable and the use of both linear and logistic regression models support the robustness of these associations and illuminate their clinical significance. For example, in linear regression with minimum Hb as a continuous variable, the coefficient of 1.4 suggests that an increase of 2 g/dL in Hb, as might be expected from transfusion of 2 units of red blood cells, would be associated with about a 3-point improvement in fatigue. Additionally, as a categorical variable, a minimum Hb ≥8 g/dL compared with a minimum Hb <7 g/dL or 7-8 g/dL is associated with a 3- to 4-point improvement in fatigue. Previous literature suggests that a difference of 3 in the FACIT score is the minimum clinically important difference in fatigue,3 and changes in minimum Hb in either model predict changes in fatigue that are in the range of potential clinical significance.
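The arithmetic behind the continuous-variable example can be written out explicitly. The sketch below uses the reported coefficient (β = 1.4 FACIT points per g/dL of minimum Hb) and the 3-point minimum clinically important difference cited above. The function and constant names are hypothetical, and the calculation describes an association in observational data, not a predicted treatment effect.

```python
BETA_MIN_HB = 1.4  # FACIT points per 1 g/dL of minimum Hb (reported coefficient)
MCID = 3           # minimum clinically important difference in FACIT score


def associated_facit_difference(delta_hb, beta=BETA_MIN_HB):
    """FACIT score difference associated with a given difference in minimum Hb (g/dL)."""
    return beta * delta_hb


# A 2 g/dL difference, as might be expected from 2 units of red blood cells.
difference = associated_facit_difference(2)  # 1.4 * 2 = 2.8 points
rounds_to_mcid = round(difference) >= MCID
```

The unrounded figure is 2.8 points, slightly below the 3-point threshold, which is why the text describes it as about a 3-point improvement in the range of potential clinical significance.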
The clinical significance of the findings is also reflected in the results of the logistic regressions, which may be mapped to potential effects on functional status. Specifically, the odds of having a high fatigue level (FACIT <27) increase 90% for persons with a minimum Hb 7–8 g/dL compared with persons with a minimum Hb ≥8 g/dL. For persons with a minimum Hb <7 g/dL, point estimates suggest a smaller (50%) increase in the odds of high fatigue, but the 95% confidence interval overlaps heavily with the estimate of patients whose minimum Hb is 7-8 g/dL. While it might be expected that patients with a minimum Hb <7 g/dL have greater levels of fatigue compared with patients with a minimum Hb 7-8 g/dL, we did not observe such a pattern. One reason may be that the confidence intervals of our estimated effects are wide enough that we cannot exclude such a pattern. Another possible explanation is that in both groups, the fatigue levels are sufficiently severe, such that the difference in their fatigue levels may not be clinically meaningful. For example, a FACIT score of 23 to 26 has been shown to be associated with an ECOG performance status of 2 to 3, requiring bed rest for at least part of the day.3 Therefore, patients with a minimum Hb 7-8 g/dL (mean FACIT score = 24; Table 2) or a minimum Hb of <7 g/dL (mean FACIT score = 23; Table 2) are already functionally limited to the point of being partially bed bound, such that further decreases in their Hb may not produce additional fatigue in part because they reduce their activity sufficiently to prevent an increase in fatigue. In such cases, the potential benefits of increased Hb may be better assessed by measuring fatigue in response to a specific and provoked activity level, a concept known as fatigability.20
That minimum Hb is more strongly associated with fatigue than any other measure of Hb during hospitalization may not be surprising. Mean, median, maximum, and discharge Hb may all be affected by transfusion during hospitalization that could affect fatigue. Admission Hb may not reflect true oxygen-carrying capacity because of hemoconcentration.
The association between Hb and fatigue in hospitalized patients is important because increased fatigue could contribute to slower clinical recovery in hospitalized patients. Additionally, increased fatigue during hospitalization and at hospital discharge could exacerbate the known deleterious consequences of fatigue on patients and their health outcomes14,15 after hospital discharge. Although one previous study, the Functional Outcomes in Cardiovascular Patients Undergoing Surgical Hip Fracture Repair (FOCUS)8 trial, did not report differences in patients’ fatigue levels at 30 and 60 days postdischarge when transfused at restrictive (8 g/dL) compared with liberal (10 g/dL) Hb thresholds, confidence in the validity of this finding is reduced by the fact that more than half of the patients were lost to follow-up at the 30- and 60-day time points. Further, patients in the restrictive transfusion arm of FOCUS were transfused to maintain Hb levels at or above 8 g/dL. This transfusion threshold of 8 g/dL may have mitigated the high levels of fatigue that are seen in our study when patients’ Hb drops below 8 g/dL, and maintaining a Hb level of 7 g/dL is now the standard of care in stable hospitalized patients. Lastly, FOCUS was limited to postoperative hip fracture patients, and the generalizability of FOCUS to hospitalized medicine patients with anemia is limited.
Therefore, our results support guideline suggestions that practitioners incorporate the presence of patient symptoms such as fatigue into transfusion decisions, particularly if patients’ Hb is <8 g/dL.18 Though reasonable, the suggestion to incorporate symptoms such as fatigue into transfusion decisions has not been strongly supported by evidence so far, and it may often be neglected in practice. Definitive evidence to support such recommendations would benefit from study through an optimal trial18 that incorporates symptoms into decision making. Our findings add support for a study of transfusion strategies that incorporates patients’ fatigue level in addition to Hb concentration.
This study has several limitations. Although our sample size is large and includes patients with a range of comorbidities that we believe are representative of hospitalized general medicine patients, as a single-center, observational study, our results may not be generalizable to other centers. Additionally, although these data support a reliable association between hospitalized patients’ minimum Hb and fatigue level, the observational design of this study cannot prove that this relationship is causal. Also, patients’ Hb values were measured at the discretion of their clinician, and therefore, the measures of Hb were not uniformly measured for participating patients. In addition, fatigue was only measured at one time point during a patient’s hospitalization, and it is possible that patients’ fatigue levels change during hospitalization in relation to variables we did not consider. Finally, our study was not designed to assess the association of Hb with longer-term functional outcomes, which may be of greater concern than fatigue.
CONCLUSION
In hospitalized patients ≥50 years old, minimum Hb is reliably associated with patients’ fatigue level. Patients whose minimum Hb is <8 g/dL experience higher fatigue levels compared to patients whose minimum Hb is ≥8 g/dL. Additional studies are warranted to understand if patients may benefit from improved fatigue levels by correcting their anemia through transfusion.
1. Yellen SB, Cella DF, Webster K, Blendowski C, Kaplan E. Measuring fatigue and other anemia-related symptoms with the Functional Assessment of Cancer Therapy (FACT) measurement system. J Pain Symptom Manage. 1997;13(2):63-74.
2. Cella D, Lai JS, Chang CH, Peterman A, Slavin M. Fatigue in cancer patients compared with fatigue in the general United States population. Cancer. 2002;94(2):528-538. doi:10.1002/cncr.10245.
3. Cella D, Eton DT, Lai J-S, Peterman AH, Merkel DE. Combining anchor and distribution-based methods to derive minimal clinically important differences on the Functional Assessment of Cancer Therapy (FACT) anemia and fatigue scales. J Pain Symptom Manage. 2002;24(6):547-561.
4. Tonelli M, Hemmelgarn B, Reiman T, et al. Benefits and harms of erythropoiesis-stimulating agents for anemia related to cancer: a meta-analysis. CMAJ. 2009;180(11):E62-E71. doi:10.1503/cmaj.090470.
5. Foley RN, Curtis BM, Parfrey PS. Erythropoietin Therapy, Hemoglobin Targets, and Quality of Life in Healthy Hemodialysis Patients: A Randomized Trial. Clin J Am Soc Nephrol. 2009;4(4):726-733. doi:10.2215/CJN.04950908.
6. Keown PA, Churchill DN, Poulin-Costello M, et al. Dialysis patients treated with Epoetin alfa show improved anemia symptoms: A new analysis of the Canadian Erythropoietin Study Group trial. Hemodial Int. 2010;14(2):168-173. doi:10.1111/j.1542-4758.2009.00422.x.
7. Palmer SC, Saglimbene V, Mavridis D, et al. Erythropoiesis-stimulating agents for anaemia in adults with chronic kidney disease: a network meta-analysis. Cochrane Database Syst Rev. 2014:CD010590.
8. Carson JL, Terrin ML, Noveck H, et al. Liberal or Restrictive Transfusion in high-risk patients after hip surgery. N Engl J Med. 2011;365(26):2453-2462. doi:10.1056/NEJMoa1012452.
9. Holst LB, Haase N, Wetterslev J, et al. Transfusion requirements in septic shock (TRISS) trial – comparing the effects and safety of liberal versus restrictive red blood cell transfusion in septic shock patients in the ICU: protocol for a randomised controlled trial. Trials. 2013;14:150. doi:10.1186/1745-6215-14-150.
10. Hébert PC, Wells G, Blajchman MA, et al. A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. N Engl J Med. 1999;340(6):409-417. doi:10.1056/NEJM199902113400601.
11. Corwin HL, Theus JW, Cargile CS, Lang NP. Red blood cell transfusion: Impact of an education program and a clinical guideline on transfusion practice. J Hosp Med. 2014;9(12):745-749. doi:10.1002/jhm.2237.
12. Saxena S, ed. The Transfusion Committee: Putting Patient Safety First, 2nd Edition. Bethesda (MD): American Association of Blood Banks; 2013.
13. The 2011 National Blood Collection and Utilization Report. http://www.hhs.gov/ash/bloodsafety/2011-nbcus.pdf. Accessed August 16, 2017.
14. Vestergaard S, Nayfield SG, Patel KV, et al. Fatigue in a Representative Population of Older Persons and Its Association With Functional Impairment, Functional Limitation, and Disability. J Gerontol A Biol Sci Med Sci. 2009;64A(1):76-82. doi:10.1093/gerona/gln017.
15. Gill TM, Desai MM, Gahbauer EA, Holford TR, Williams CS. Restricted activity among community-living older persons: incidence, precipitants, and health care utilization. Ann Intern Med. 2001;135(5):313-321.
16. Koch CG, Li L, Sun Z, et al. Hospital-acquired anemia: Prevalence, outcomes, and healthcare implications. J Hosp Med. 2013;8(9):506-512. doi:10.1002/jhm.2061.
17. Meltzer D, Manning WG, Morrison J, et al. Effects of Physician Experience on Costs and Outcomes on an Academic General Medicine Service: Results of a Trial of Hospitalists. Ann Intern Med. 2002;137(11):866-874. doi:10.7326/0003-4819-137-11-200212030-00007.
18. Carson JL, Grossman BJ, Kleinman S, et al. Red Blood Cell Transfusion: A Clinical Practice Guideline From the AABB*. Ann Intern Med. 2012;157(1):49-58. doi:10.7326/0003-4819-157-1-201206190-00429.
19. Moreh E, Jacobs JM, Stessman J. Fatigue, function, and mortality in older adults. J Gerontol A Biol Sci Med Sci. 2010;65(8):887-895. doi:10.1093/gerona/glq064.
20. Eldadah BA. Fatigue and Fatigability in Older Adults. PM&R. 2010;2(5):406-413. doi:10.1016/j.pmrj.2010.03.022.
21. Hardy SE, Studenski SA. Fatigue Predicts Mortality among Older Adults. J Am Geriatr Soc. 2008;56(10):1910-1914. doi:10.1111/j.1532-5415.2008.01957.x.
22. Pfeiffer E. A short portable mental status questionnaire for the assessment of organic brain deficit in elderly patients. J Am Geriatr Soc. 1975;23(10):433-441.
23. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130-1139.
24. HCUP Clinical Classifications Software (CCS) for ICD-9-CM. Healthcare Cost and Utilization Project (HCUP). 2006-2009. Agency for Healthcare Research and Quality, Rockville, MD. https://www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp. Accessed November 22, 2016.
25. Cella DF, Tulsky DS, Gray G, et al. The Functional Assessment of Cancer Therapy scale: development and validation of the general measure. J Clin Oncol Off J Am Soc Clin Oncol. 1993;11(3):570-579.
26. Webster K, Cella D, Yost K. The Functional Assessment of Chronic Illness Therapy (FACIT) Measurement System: properties, applications, and interpretation. Health Qual Life Outcomes. 2003;1:79. doi:10.1186/1477-7525-1-79.
27. Oken MM, Creech RH, Tormey DC, et al. Toxicity and response criteria of the Eastern Cooperative Oncology Group. J Clin Oncol. 1982;5(6):649-656.
Fatigue is the most common clinical symptom of anemia and is a significant concern to patients.1,2 In ambulatory patients, lower hemoglobin (Hb) concentration is associated with increased fatigue.2,3 Accordingly, therapies that treat anemia by increasing Hb concentration, such as erythropoiesis stimulating agents,4-7 often use fatigue as an outcome measure.
In hospitalized patients, transfusion of red blood cells increases Hb concentration and is the primary treatment for anemia. However, the extent to which transfusion and changes in Hb concentration affect hospitalized patients’ fatigue levels is not well established. Guidelines support transfusing patients with symptoms of anemia, such as fatigue, on the assumption that the increased oxygen delivery will improve the symptoms of anemia. While transfusion studies in hospitalized patients have consistently reported that transfusion at lower or “restrictive” Hb concentrations is safe compared with transfusion at higher Hb concentrations,8-10 these studies have mainly used cardiac events and mortality as outcomes rather than patient symptoms, such as fatigue. Nevertheless, they have resulted in hospitals increasingly adopting restrictive transfusion policies that discourage transfusion at higher Hb levels.11,12 Consequently, the rate of transfusion in hospitalized patients has decreased,13 raising questions of whether some patients with lower Hb concentrations may experience increased fatigue as a result of restrictive transfusion policies. Fatigue among hospitalized patients is important not only because it is an adverse symptom but because it may result in decreased activity levels, deconditioning, and losses in functional status.14,15
While the effect of alternative transfusion policies on fatigue in hospitalized patients could be answered by a randomized clinical trial using fatigue and functional status as outcomes, an important first step is to assess whether the Hb concentration of hospitalized patients is associated with their fatigue level during hospitalization. Because hospitalized patients often have acute illnesses that can cause fatigue in and of themselves, it is possible that anemia is not associated with fatigue in hospitalized patients despite anemia’s association with fatigue in ambulatory patients. Additionally, Hb concentration varies during hospitalization,16 raising the question of what measures of Hb during hospitalization might be most associated with anemia-related fatigue.
The objective of this study is to explore multiple Hb measures in hospitalized medical patients with anemia and test whether any of these Hb measures are associated with patients’ fatigue level.
METHODS
Study Design
We performed a prospective, observational study of hospitalized patients with anemia on the general medicine services at The University of Chicago Medical Center (UCMC). The institutional review board approved the study procedures, and all study subjects provided informed consent.
Study Eligibility
Between April 2014 and June 2015, all general medicine inpatients were approached for written consent for The University of Chicago Hospitalist Project,17 a research infrastructure at UCMC. Among patients consenting to participate in the Hospitalist Project, patients were eligible if they had Hb <9 g/dL at any point during their hospitalization and were age ≥50 years. Hb concentration of <9 g/dL was chosen to include the range of Hb values covered by most restrictive transfusion policies.8-10,18 Age ≥50 years was an inclusion criterion because anemia is more strongly associated with poor outcomes, including functional impairment, among older patients compared with younger patients.14,19-21 If patients were not eligible for inclusion at the time of consent for the Hospitalist Project, their Hb values were reviewed twice daily until hospital discharge to assess whether their Hb was <9 g/dL. Proxies were sought to answer questions for patients who failed the Short Portable Mental Status Questionnaire.22
Patient Demographic Data Collection
Research assistants abstracted patient age and sex from the electronic health record (EHR), and asked patients to self-identify their race. The individual comorbidities included as part of the Charlson Comorbidity Index were identified using International Classification of Diseases, 9th Revision codes from hospital administrative data for each encounter and specifically included the following: myocardial infarction, congestive heart failure, peripheral vascular disease, cerebrovascular disease, dementia, chronic pulmonary disease, rheumatic disease, peptic ulcer disease, liver disease, diabetes, hemiplegia and/or paraplegia, renal disease, cancer, and human immunodeficiency virus/acquired immunodeficiency syndrome.23 We also used Healthcare Cost and Utilization Project (www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp) diagnosis categories to identify whether patients had sickle cell (SC) anemia, gastrointestinal bleeding (GIB), or a depressive disorder (DD) because these conditions are associated with anemia (SC and GIB) and fatigue (DD).24
Measuring Anemia
Hb measures were available only when hospital providers ordered them as part of routine practice. The first Hb concentration <9 g/dL during a patient’s hospitalization, which made them eligible for study participation, was obtained through manual review of the EHR. All additional Hb values during the patient’s hospitalization were obtained from the hospital’s administrative data mart. All Hb values collected for each patient during the hospitalization were used to calculate summary measures of Hb during the hospitalization, including the mean Hb, median Hb, minimum Hb, maximum Hb, admission (first recorded) Hb, and discharge (last recorded) Hb. Each Hb measure was analyzed both as a continuous variable and as a categorical variable created by dividing the continuous measure at integer cut points into 3 groups of approximately the same size.
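The six summary measures described above can be sketched in a few lines of Python. This is only an illustration, not the study's actual code (the analysis was run in Stata): the function names and the example Hb values are hypothetical, and the category cut points are the ones reported for minimum Hb in the Results (<7, 7-8, ≥8 g/dL).

```python
from statistics import mean, median

def summarize_hb(values):
    """Summarize one hospitalization's Hb results (g/dL, chronological order)."""
    if not values:
        raise ValueError("at least one Hb value is required")
    return {
        "mean": mean(values),
        "median": median(values),
        "minimum": min(values),
        "maximum": max(values),
        "admission": values[0],   # first recorded Hb
        "discharge": values[-1],  # last recorded Hb
    }

def categorize_minimum_hb(hb):
    """Assign a minimum Hb value to the groups used in the Results."""
    if hb < 7:
        return "<7"
    if hb < 8:
        return "7-8"
    return ">=8"

# Hypothetical patient with five Hb draws during the stay.
example = summarize_hb([8.7, 7.4, 6.9, 8.1, 8.5])
example_group = categorize_minimum_hb(example["minimum"])
```

Because Hb was drawn at clinicians' discretion, the number of values per patient varies, which is why the sketch accepts a list of any length.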
Measuring Fatigue
Our primary outcome was patients’ level of fatigue reported during hospitalization, measured using the Functional Assessment of Chronic Illness Therapy (FACIT)-Anemia questionnaire. Fatigue was measured using a 13-question fatigue subscale,1,2,25 which measures fatigue within the past 7 days. Scores on the fatigue subscale range from 0 to 52, with lower scores reflecting greater levels of fatigue. As soon as patients met the eligibility criteria for study participation during their hospitalization (age ≥50 years and Hb <9 g/dL), they were approached to answer the FACIT questions. Values for missing data in the fatigue subscale for individual subjects were filled in using a prorated score from their answered questions as long as >50% of the items in the fatigue subscale were answered, in accordance with recommendations for addressing missing data in the FACIT.26 Fatigue was analyzed as a continuous variable and as a dichotomous variable created by dividing the sample into high (FACIT <27) and low (FACIT ≥27) levels of fatigue based on the median FACIT score of the population. Previous literature has shown a FACIT fatigue subscale score between 23 and 26 to be associated with an Eastern Cooperative Oncology Group (ECOG)27 Performance Status rating of 2 to 3,3 compared with scores ≥27.
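The proration and dichotomization rules above can be sketched as follows. The text says only that a prorated score was used, so the formula here (sum of answered items × 13 ÷ number answered) is an assumption about how that proration works. The function names are hypothetical, and item responses are assumed to be already scored 0 to 4 in the direction where lower means more fatigue.

```python
def facit_fatigue_score(responses, n_items=13):
    """Return the (possibly prorated) subscale score, or None if too few answers.

    responses: list of n_items entries, each 0-4 or None for a missing answer.
    """
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} item responses")
    answered = [r for r in responses if r is not None]
    # The text requires that more than 50% of the items be answered.
    if len(answered) * 2 <= n_items:
        return None
    # Assumed proration: scale the answered-item sum up to the full item count.
    return sum(answered) * n_items / len(answered)

def has_high_fatigue(score, cutoff=27):
    """High fatigue is FACIT < 27 (the sample median); lower scores mean more fatigue."""
    return score < cutoff

# Hypothetical respondent: 10 of 13 items answered, each scored 2.
partial = [2] * 10 + [None] * 3
partial_score = facit_fatigue_score(partial)
# Hypothetical respondent with only 6 answers: not scorable under the >50% rule.
too_sparse = facit_fatigue_score([3] * 6 + [None] * 7)
```

With 13 items, the >50% rule means at least 7 answered items are needed before a score is produced.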
Statistical Analysis
Statistical analysis was performed using Stata statistical software (StataCorp, College Station, TX). Descriptive statistics were used to characterize patient demographics. Analysis of variance was used to test for differences in the mean fatigue levels across Hb measures. χ2 tests were performed to test for associations between high fatigue levels and the Hb measures. Multivariable analyses, including both linear and logistic regression models, were used to test the association of Hb concentration and fatigue. P values <0.05 using a 2-tailed test were deemed statistically significant.
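To make the χ2 step concrete, here is a minimal sketch of a Pearson χ2 test on a 3 × 2 table (3 Hb groups by high or low fatigue). It is written in plain Python rather than Stata, the function name is hypothetical, and the counts are invented for illustration only; they are not study data. It relies on the fact that a 3 × 2 table has 2 degrees of freedom, for which the P value is exactly exp(−χ2/2).

```python
import math

def chi_square_3x2(table):
    """Pearson chi-square test for 3 groups (rows) x 2 outcomes (columns)."""
    if len(table) != 3 or any(len(row) != 2 for row in table):
        raise ValueError("expected 3 rows of 2 counts")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (3 - 1) * (2 - 1)
    # With df = 2 the chi-square survival function is exactly exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, df, p_value

# Invented counts: [high fatigue, low fatigue] per Hb group (NOT study data).
hypothetical_counts = [[50, 50], [60, 40], [40, 60]]
statistic, df, p_value = chi_square_3x2(hypothetical_counts)
```

In practice a statistical package would handle tables of any shape; the closed-form P value is used here only to keep the sketch dependency-free.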
RESULTS
Patient Characteristics
During the study period, 8559 patients were admitted to the general medicine service. Of those, 5073 (59%) consented for participation in the Hospitalist Project, and 3670 (72%) completed the Hospitalist Project inpatient interview. Of these patients, 1292 (35%) had Hb <9 g/dL, and 784 (61%) were 50 years or older and completed the FACIT questionnaire.
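The percentages in this cohort flow are each taken relative to the preceding step, which can be checked with a short calculation. The counts come from the paragraph above; the step labels and function name are illustrative.

```python
# Cohort counts as reported in the text, in order of selection.
cohort_steps = [
    ("admitted to general medicine", 8559),
    ("consented to Hospitalist Project", 5073),
    ("completed inpatient interview", 3670),
    ("Hb <9 g/dL", 1292),
    ("age >=50 years and completed FACIT", 784),
]

def stepwise_percentages(steps):
    """Percentage of each step relative to the step immediately before it."""
    return [
        (label, count, round(100 * count / previous_count))
        for (_, previous_count), (label, count) in zip(steps, steps[1:])
    ]

flow = stepwise_percentages(cohort_steps)
# Share of all admitted patients who ended up in the analytic sample.
overall_percent = round(100 * cohort_steps[-1][1] / cohort_steps[0][1])
```

Running this reproduces the reported 59%, 72%, 35%, and 61%, and shows that the analytic sample is roughly 9% of all admissions.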
Table 1 reports the demographic characteristics and comorbidities for the sample, the mean (standard deviation [SD]) for the 6 Hb measures, and mean (SD) and median FACIT scores.
Bivariate Association of Fatigue and Hb
When patients were categorized into low, middle, or high Hb for each of the 6 Hb measures, minimum Hb was strongly associated with fatigue, with a weaker association for mean Hb and no statistically significant association for the other measures.
Minimum Hb. Patients with a minimum Hb <7 g/dL and patients with Hb 7-8 g/dL had higher fatigue levels (FACIT = 25 for each) than patients with a minimum Hb ≥8 g/dL (FACIT = 29; P < 0.001; Table 2). When excluding patients with SC and/or GIB because their average minimum Hb differed from the average minimum Hb of the full population (P < 0.001), patients with a minimum Hb <7 g/dL or 7-8 g/dL had even higher fatigue levels (FACIT = 23 and FACIT = 24, respectively), with no change in the fatigue level of patients with a minimum Hb ≥8 g/dL (FACIT = 29; P < 0.001; Table 2). Lower minimum Hb continued to be associated with higher fatigue levels when analyzed in 0.5 g/dL increments (Figure).
Mean Hb and Other Measures. Fatigue levels were high for 47% of patients with a mean Hb <8 g/dL and 53% of patients with a mean Hb 8-9 g/dL compared with 43% of patients with a mean Hb ≥9 g/dL (P = 0.05). However, the association between high fatigue and mean Hb was not statistically significant when patients with SC and/or GIB were excluded (Table 2). None of the other 4 Hb measures was significantly associated with fatigue.
Linear Regression of Fatigue on Hb
In linear regression models, minimum Hb consistently predicted patient fatigue, mean Hb had a less robust association with fatigue, and the other Hb measures were not associated with patient fatigue. Increases in minimum Hb (analyzed as a continuous variable) were associated with reduced fatigue (higher FACIT score; β = 1.4; P = 0.005). In models in which minimum Hb was a categorical variable, patients with a minimum Hb of <7 g/dL or 7-8 g/dL had greater fatigue (lower FACIT score) than patients whose minimum Hb was ≥8 g/dL (Hb <7 g/dL: β = −4.2; P ≤ 0.001; Hb 7-8 g/dL: β = −4.1; P < 0.001). These results control for patients’ age, sex, individual comorbidities, and whether their minimum Hb occurred before or after the measurement of fatigue during hospitalization (Model 1), and the results are unchanged when also controlling for the number of Hb laboratory draws patients had during their hospitalization (Model 2; Table 3). In a stratified analysis excluding patients with SC and/or GIB, changes in minimum Hb were associated with larger changes in patient fatigue levels (Supplemental Table 1).
We also stratified our analysis to include only patients whose minimum Hb occurred before the measurement of their fatigue level during hospitalization to avoid a spurious association of fatigue with minimum Hb occurring after fatigue was measured. In both Models 1 and 2, minimum Hb remained a predictor of patients’ fatigue levels with similar effect sizes, although in Model 2, the results did not quite reach statistical significance, in part because of the wider confidence intervals that result from the smaller sample size of this stratified analysis (Supplemental Table 2a). We further stratified this analysis to include only patients whose transfusion, if they received one, occurred after their minimum Hb and the measurement of their fatigue level to account for the possibility that a transfusion could affect the fatigue level patients report. In this analysis, most of the estimates of the effect of minimum Hb on fatigue were larger than those seen when only analyzing patients whose minimum Hb occurred before the measurement of their fatigue level, although again, the smaller sample size of this additional stratified analysis produces wider confidence intervals for these estimates (Supplemental Table 2b).
No Hb measure other than minimum or mean had significant association with patient fatigue levels in linear regression models.
Logistic Regression of High Fatigue Level on Hb
Using logistic regression, minimum Hb analyzed as a categorical variable predicted increased odds of a high fatigue level. Patients with a minimum Hb <7 g/dL were 50% (odds ratio [OR] = 1.5; P = 0.03) more likely to have high fatigue and patients with a minimum Hb 7-8 g/dL were 90% (OR = 1.9; P < 0.001) more likely to have high fatigue compared with patients with a minimum Hb ≥8 g/dL in Model 1. These results were similar in Model 2, although the effect was only statistically significant in the 7-8 g/dL Hb group (Table 3). When excluding SC and/or GIB patients, the odds of having high fatigue as minimum Hb decreased were the same or higher for both models compared to the full population of patients. However, again, in Model 2, the effect was only statistically significant in the 7-8 g/dL Hb group (Supplemental Table 1).
Patients with a mean Hb <8 g/dL were 20% to 30% more likely to have high fatigue and patients with mean Hb 8-9 g/dL were 50% more likely to have high fatigue compared with patients with a mean Hb ≥9 g/dL, but the effects were only statistically significant for patients with a mean Hb 8-9 g/dL in both Models 1 and 2 (Table 3). These results were similar when excluding patients with SC and/or GIB, but they were only significant for patients with a mean Hb 8-9 g/dL in Model 1 and patients with a mean Hb <8 g/dL in Model 2 (Supplemental Table 3).
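The "50%" and "90%" figures in this section are increases in the odds of high fatigue, read directly from the odds ratios. The sketch below shows that conversion and, as a separate illustration, how an odds ratio translates into a probability for a given baseline. The function names are hypothetical, and the 40% baseline probability is an assumed value chosen for illustration, not a figure from the study.

```python
def percent_change_in_odds(odds_ratio):
    """Percent increase (or decrease, if negative) in odds implied by an OR."""
    return (odds_ratio - 1.0) * 100.0

def probability_after_odds_ratio(baseline_probability, odds_ratio):
    """Probability in the comparison group, given a baseline probability and an OR."""
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline probability must be strictly between 0 and 1")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

# Odds ratios reported for minimum Hb in Model 1.
change_below_7 = round(percent_change_in_odds(1.5))   # Hb <7 g/dL
change_7_to_8 = round(percent_change_in_odds(1.9))    # Hb 7-8 g/dL

# ASSUMED baseline of 40% high fatigue in the reference group (illustrative only).
illustrative_probability = probability_after_odds_ratio(0.40, 1.9)
```

Under that assumed baseline, an odds ratio of 1.9 corresponds to a probability of about 56%, which is a smaller relative increase than 90%; odds and probabilities diverge more as the outcome becomes more common.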
DISCUSSION
These results demonstrate that minimum Hb during hospitalization is associated with fatigue in hospitalized patients age ≥50 years, and the association is stronger among patients without SC and/or GIB as comorbidities. The analysis of Hb as a continuous and categorical variable and the use of both linear and logistic regression models support the robustness of these associations and illuminate their clinical significance. For example, in linear regression with minimum Hb as a continuous variable, the coefficient of 1.4 suggests that an increase of 2 g/dL in Hb, as might be expected from transfusion of 2 units of red blood cells, would be associated with about a 3-point improvement in fatigue. Additionally, as a categorical variable, a minimum Hb ≥8 g/dL compared with a minimum Hb <7 g/dL or 7-8 g/dL is associated with a 3- to 4-point improvement in fatigue. Previous literature suggests that a difference of 3 in the FACIT score is the minimum clinically important difference in fatigue,3 and changes in minimum Hb in either model predict changes in fatigue that are in the range of potential clinical significance.
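The arithmetic behind the "about a 3-point improvement" can be written out explicitly. The coefficient (1.4 FACIT points per g/dL) and the minimum clinically important difference (3 points) are taken from the text; the function names are hypothetical. The calculation describes an association in observational data and should not be read as a predicted treatment effect.

```python
BETA_MIN_HB = 1.4    # FACIT points per 1 g/dL of minimum Hb (linear model, from the text)
MCID_FACIT = 3.0     # minimum clinically important difference cited in the text

def associated_facit_change(delta_hb, beta=BETA_MIN_HB):
    """FACIT difference associated with a given difference in minimum Hb (g/dL)."""
    return beta * delta_hb

def hb_change_for_mcid(beta=BETA_MIN_HB, mcid=MCID_FACIT):
    """Difference in minimum Hb (g/dL) associated with exactly one MCID."""
    return mcid / beta

two_unit_change = associated_facit_change(2.0)   # 2 g/dL difference in Hb
hb_needed = hb_change_for_mcid()
```

A 2 g/dL difference corresponds to 2.8 points, just under the 3-point threshold, and a difference of roughly 2.1 g/dL would correspond to the threshold itself.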
The clinical significance of the findings is also reflected in the results of the logistic regressions, which may be mapped to potential effects on functional status. Specifically, the odds of having a high fatigue level (FACIT <27) increase 90% for persons with a minimum Hb 7–8 g/dL compared with persons with a minimum Hb ≥8 g/dL. For persons with a minimum Hb <7 g/dL, point estimates suggest a smaller (50%) increase in the odds of high fatigue, but the 95% confidence interval overlaps heavily with the estimate of patients whose minimum Hb is 7-8 g/dL. While it might be expected that patients with a minimum Hb <7 g/dL have greater levels of fatigue compared with patients with a minimum Hb 7-8 g/dL, we did not observe such a pattern. One reason may be that the confidence intervals of our estimated effects are wide enough that we cannot exclude such a pattern. Another possible explanation is that in both groups, the fatigue levels are sufficiently severe, such that the difference in their fatigue levels may not be clinically meaningful. For example, a FACIT score of 23 to 26 has been shown to be associated with an ECOG performance status of 2 to 3, requiring bed rest for at least part of the day.3 Therefore, patients with a minimum Hb 7-8 g/dL (mean FACIT score = 24; Table 2) or a minimum Hb of <7 g/dL (mean FACIT score = 23; Table 2) are already functionally limited to the point of being partially bed bound, such that further decreases in their Hb may not produce additional fatigue in part because they reduce their activity sufficiently to prevent an increase in fatigue. In such cases, the potential benefits of increased Hb may be better assessed by measuring fatigue in response to a specific and provoked activity level, a concept known as fatigability.20
That minimum Hb is more strongly associated with fatigue than any other measure of Hb during hospitalization may not be surprising. Mean, median, maximum, and discharge Hb may all be affected by transfusion during hospitalization that could affect fatigue. Admission Hb may not reflect true oxygen-carrying capacity because of hemoconcentration.
The association between Hb and fatigue in hospitalized patients is important because increased fatigue could contribute to slower clinical recovery in hospitalized patients. Additionally, increased fatigue during hospitalization and at hospital discharge could exacerbate the known deleterious consequences of fatigue on patients and their health outcomes14,15 after hospital discharge. Although one previous study, the Functional Outcomes in Cardiovascular Patients Undergoing Surgical Hip Fracture Repair (FOCUS)8 trial, did not report differences in patients’ fatigue levels at 30 and 60 days postdischarge when transfused at restrictive (8 g/dL) compared with liberal (10 g/dL) Hb thresholds, confidence in the validity of this finding is reduced by the fact that more than half of the patients were lost to follow-up at the 30- and 60-day time points. Further, patients in the restrictive transfusion arm of FOCUS were transfused to maintain Hb levels at or above 8 g/dL. This transfusion threshold of 8 g/dL may have mitigated the high levels of fatigue that are seen in our study when patients’ Hb drops below 8 g/dL, and maintaining a Hb level of 7 g/dL is now the standard of care in stable hospitalized patients. Lastly, FOCUS was limited to postoperative hip fracture patients, and the generalizability of FOCUS to hospitalized medicine patients with anemia is limited.
Therefore, our results support guideline suggestions that practitioners incorporate the presence of patient symptoms such as fatigue into transfusion decisions, particularly if patients’ Hb is <8 g/dL.18 Though reasonable, the suggestion to incorporate symptoms such as fatigue into transfusion decisions has not been strongly supported by evidence so far, and it may often be neglected in practice. Definitive evidence to support such recommendations would benefit from study through an optimal trial18 that incorporates symptoms into decision making. Our findings add support for a study of transfusion strategies that incorporates patients’ fatigue level in addition to Hb concentration.
This study has several limitations. Although our sample size is large and includes patients with a range of comorbidities that we believe are representative of hospitalized general medicine patients, as a single-center, observational study, our results may not be generalizable to other centers. Additionally, although these data support a reliable association between hospitalized patients’ minimum Hb and fatigue level, the observational design of this study cannot prove that this relationship is causal. Also, patients’ Hb values were measured at the discretion of their clinician, and therefore, the measures of Hb were not uniformly measured for participating patients. In addition, fatigue was only measured at one time point during a patient’s hospitalization, and it is possible that patients’ fatigue levels change during hospitalization in relation to variables we did not consider. Finally, our study was not designed to assess the association of Hb with longer-term functional outcomes, which may be of greater concern than fatigue.
CONCLUSION
In hospitalized patients ≥50 years old, minimum Hb is reliably associated with patients’ fatigue level. Patients whose minimum Hb is <8 g/dL experience higher fatigue levels compared to patients whose minimum Hb is ≥8 g/dL. Additional studies are warranted to understand if patients may benefit from improved fatigue levels by correcting their anemia through transfusion.
Fatigue is the most common clinical symptom of anemia and is a significant concern to patients.1,2 In ambulatory patients, lower hemoglobin (Hb) concentration is associated with increased fatigue.2,3 Accordingly, therapies that treat anemia by increasing Hb concentration, such as erythropoiesis stimulating agents,4-7 often use fatigue as an outcome measure.
In hospitalized patients, transfusion of red blood cell increases Hb concentration and is the primary treatment for anemia. However, the extent to which transfusion and changes in Hb concentration affect hospitalized patients’ fatigue levels is not well established. Guidelines support transfusing patients with symptoms of anemia, such as fatigue, on the assumption that the increased oxygen delivery will improve the symptoms of anemia. While transfusion studies in hospitalized patients have consistently reported that transfusion at lower or “restrictive” Hb concentrations is safe compared with transfusion at higher Hb concentrations,8-10 these studies have mainly used cardiac events and mortality as outcomes rather than patient symptoms, such as fatigue. Nevertheless, they have resulted in hospitals increasingly adopting restrictive transfusion policies that discourage transfusion at higher Hb levels.11,12 Consequently, the rate of transfusion in hospitalized patients has decreased,13 raising questions of whether some patients with lower Hb concentrations may experience increased fatigue as a result of restrictive transfusion policies. Fatigue among hospitalized patients is important not only because it is an adverse symptom but because it may result in decreased activity levels, deconditioning, and losses in functional status.14,15While the effect of alternative transfusion policies on fatigue in hospitalized patients could be answered by a randomized clinical trial using fatigue and functional status as outcomes, an important first step is to assess whether the Hb concentration of hospitalized patients is associated with their fatigue level during hospitalization. Because hospitalized patients often have acute illnesses that can cause fatigue in and of themselves, it is possible that anemia is not associated with fatigue in hospitalized patients despite anemia’s association with fatigue in ambulatory patients. 
Additionally, Hb concentration varies during hospitalization,16 raising the question of what measures of Hb during hospitalization might be most associated with anemia-related fatigue.
The objective of this study is to explore multiple Hb measures in hospitalized medical patients with anemia and test whether any of these Hb measures are associated with patients’ fatigue level.
METHODS
Study Design
We performed a prospective, observational study of hospitalized patients with anemia on the general medicine services at The University of Chicago Medical Center (UCMC). The institutional review board approved the study procedures, and all study subjects provided informed consent.
Study Eligibility
Between April 2014 and June 2015, all general medicine inpatients were approached for written consent for The University of Chicago Hospitalist Project,17 a research infrastructure at UCMC. Among patients consenting to participate in the Hospitalist Project, patients were eligible if they had Hb <9 g/dL at any point during their hospitalization and were age ≥50 years. Hb concentration of <9 g/dL was chosen to include the range of Hb values covered by most restrictive transfusion policies.8-10,18 Age ≥50 years was an inclusion criteria because anemia is more strongly associated with poor outcomes, including functional impairment, among older patients compared with younger patients.14,19-21 If patients were not eligible for inclusion at the time of consent for the Hospitalist Project, their Hb values were reviewed twice daily until hospital discharge to assess if their Hb was <9 g/dL. Proxies were sought to answer questions for patients who failed the Short Portable Mental Status Questionnaire.22
Patient Demographic Data Collection
Research assistants abstracted patient age and sex from the electronic health record (EHR), and asked patients to self-identify their race. The individual comorbidities included as part of the Charlson Comorbidity Index were identified using International Classification of Diseases, 9th Revision codes from hospital administrative data for each encounter and specifically included the following: myocardial infarction, congestive heart failure, peripheral vascular disease, cerebrovascular disease, dementia, chronic pulmonary disease, rheumatic disease, peptic ulcer disease, liver disease, diabetes, hemiplegia and/or paraplegia, renal disease, cancer, and human immunodeficiency virus/acquired immunodeficiency syndrome.23 We also used Healthcare Cost and Utilization Project (www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp) diagnosis categories to identify whether patients had sickle cell (SC) anemia, gastrointestinal bleeding (GIB), or a depressive disorder (DD) because these conditions are associated with anemia (SC and GIB) and fatigue (DD).24
Measuring Anemia
Hb measures were available only when hospital providers ordered them as part of routine practice. The first Hb concentration <9 g/dL during a patient’s hospitalization, which made them eligible for study participation, was obtained through manual review of the EHR. All additional Hb values during the patient’s hospitalization were obtained from the hospital’s administrative data mart. All Hb values collected for each patient during the hospitalization were used to calculate summary measures of Hb during the hospitalization, including the mean Hb, median Hb, minimum Hb, maximum Hb, admission (first recorded) Hb, and discharge (last recorded) Hb. Hb measures were analyzed both as a continuous variable and as a categorical variable created by dividing the continuous Hb measures into integer ranges of 3 groups of approximately the same size.
Measuring Fatigue
Our primary outcome was patients’ level of fatigue reported during hospitalization, measured using the Functional Assessment of Chronic Illness Therapy (FACIT)-Anemia questionnaire. Fatigue was measured using a 13-question fatigue subscale,1,2,25 which measures fatigue within the past 7 days. Scores on the fatigue subscale range from 0 to 52, with lower scores reflecting greater levels of fatigue. As soon as patients met the eligibility criteria for study participation during their hospitalization (age ≥50 years and Hb <9 g/dL), they were approached to answer the FACIT questions. Values for missing data in the fatigue subscale for individual subjects were filled in using a prorated score from their answered questions as long as >50% of the items in the fatigue subscale were answered, in accordance with recommendations for addressing missing data in the FACIT.26 Fatigue was analyzed as a continuous variable and as a dichotomous variable created by dividing the sample into high (FACIT <27) and low (FACIT ≥27) levels of fatigue based on the median FACIT score of the population. Previous literature has shown a FACIT fatigue subscale score between 23 and 26 to be associated with an Eastern Cooperative Oncology Group (ECOG)27 C Performance Status rating of 2 to 33 compared to scores ≥27.
Statistical Analysis
Statistical analysis was performed using Stata statistical software (StataCorp, College Station, TX). Descriptive statistics were used to characterize patient demographics. Analysis of variance was used to test for differences in the mean fatigue levels across Hb measures. χ2 tests were performed to test for associations between high fatigue levels and the Hb measures. Multivariable analysis, including both linear and logistic regression models, were used to test the association of Hb concentration and fatigue. P values <0.05 using a 2-tailed test were deemed statistically significant.
RESULTS
Patient Characteristics
During the study period, 8559 patients were admitted to the general medicine service. Of those, 5073 (59%) consented for participation in the Hospitalist Project, and 3670 (72%) completed the Hospitalist Project inpatient interview. Of these patients, 1292 (35%) had Hb <9 g/dL, and 784 (61%) were 50 years or older and completed the FACIT questionnaire.
Table 1 reports the demographic characteristics and comorbidities for the sample, the mean (standard deviation [SD]) for the 6 Hb measures, and mean (SD) and median FACIT scores.
Bivariate Association of Fatigue and Hb
Categorizing patients into low, middle, or high Hb for each of the 6 Hb measures, minimum Hb was strongly associated with fatigue, with a weaker association for mean Hb and no statistically significant association for the other measures.
Minimum Hb. Patients with a minimum Hb <7 g/dL and patients with Hb 7-8 g/dL had higher fatigue levels (FACIT = 25 for each) than patients with a minimum Hb ≥8 g/dL (FACIT = 29; P < 0.001; Table 2). When excluding patients with SC and/or GIB because their average minimum Hb differed from the average minimum Hb of the full population (P < 0.001), patients with a minimum Hb <7 g/dL or 7-8 g/dL had even higher fatigue levels (FACIT = 23 and FACIT = 24, respectively), with no change in the fatigue level of patients with a minimum Hb ≥8 g/dL (FACIT = 29; P < 0.001; Table 2). Lower minimum Hb continued to be associated with higher fatigue levels when analyzed in 0.5 g/dL increments (Figure).
Mean Hb and Other Measures. Fatigue levels were high for 47% of patients with a mean Hb <8g /dL and 53% of patients with a mean Hb 8-9 g/dL compared with 43% of patients with a mean Hb ≥9 g/dL (P = 0.05). However, the association between high fatigue and mean Hb was not statistically significant when patients with SC and/or GIB were excluded (Table 2). None of the other 4 Hb measures was significantly associated with fatigue.
Linear Regression of Fatigue on Hb
In linear regression models, minimum Hb consistently predicted patient fatigue, mean Hb had a less robust association with fatigue, and the other Hb measures were not associated with patient fatigue. Increases in minimum Hb (analyzed as a continuous variable) were associated with reduced fatigue (higher FACIT score; β = 1.4; P = 0.005). In models in which minimum Hb was a categorical variable, patients with a minimum Hb of <7 g/dL or 7-8 g/dL had greater fatigue (lower FACIT score) than patients whose minimum Hb was ≥8 g/dL (Hb <7 g/dL: β = −4.2; P ≤ 0.001; Hb 7-8 g/dL: β = −4.1; P < 0.001). These results control for patients’ age, sex, individual comorbidities, and whether their minimum Hb occurred before or after the measurement of fatigue during hospitalization (Model 1), and the results are unchanged when also controlling for the number of Hb laboratory draws patients had during their hospitalization (Model 2; Table 3). In a stratified analysis excluding patients with either SC and/or GIB, changes in minimum Hb were associated with larger changes in patient fatigue levels (Supplemental Table 1). We also stratified our analysis to include only patients whose minimum Hb occurred before the measurement of their fatigue level during hospitalization to avoid a spurious association of fatigue with minimum Hb occurring after fatigue was measured. In both Models 1 and 2, minimum Hb remained a predictor of patients’ fatigue levels with similar effect sizes, although in Model 2, the results did not quite reach a statistically significant level, in part due to larger confidence intervals from the smaller sample size of this stratified analysis (Supplemental Table 2a). We further stratified this analysis to include only patients whose transfusion, if they received one, occurred after their minimum Hb and the measurement of their fatigue level to account for the possibility that a transfusion could affect the fatigue level patients report. 
In this analysis, most of the estimates of the effect of minimum Hb on fatigue were larger than those seen when only analyzing patients whose minimum Hb occurred before the measurement of their fatigue level, although again, the smaller sample size of this additional stratified analysis does produce larger confidence intervals for these estimates (Supplemental Table 2b).
No Hb measure other than minimum or mean had significant association with patient fatigue levels in linear regression models.
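The categorical models above group minimum Hb into three bands (<7, 7-8, and ≥8 g/dL). A minimal Python sketch of that binning follows. The function name `hb_category` and the boundary handling (each band includes its lower cutoff) are assumptions made for illustration; the text does not state how values falling exactly on 7 or 8 g/dL were assigned.

```python
from bisect import bisect_right


def hb_category(hb_g_dl, cutoffs=(7.0, 8.0)):
    """Return a band label for a hemoglobin value in g/dL.

    Assumption: each band includes its lower cutoff, so 7.0 -> "7-8"
    and 8.0 -> ">=8". The source does not specify boundary handling.
    """
    low, high = cutoffs
    labels = (f"<{low:g}", f"{low:g}-{high:g}", f">={high:g}")
    # bisect_right counts how many cutoffs are <= the value,
    # which is exactly the index of the band it belongs to.
    return labels[bisect_right(cutoffs, hb_g_dl)]


# Minimum-Hb bands used in the text
print(hb_category(6.4))   # <7
print(hb_category(7.5))   # 7-8
print(hb_category(9.2))   # >=8
```

Passing `cutoffs=(8.0, 9.0)` would give the mean-Hb bands (<8, 8-9, ≥9 g/dL) under the same boundary assumption.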
Logistic Regression of High Fatigue Level on Hb
Using logistic regression, minimum Hb analyzed as a categorical variable predicted increased odds of a high fatigue level. In Model 1, compared with patients with a minimum Hb ≥8 g/dL, patients with a minimum Hb <7 g/dL had 50% higher odds of high fatigue (odds ratio [OR] = 1.5; P = 0.03), and patients with a minimum Hb 7-8 g/dL had 90% higher odds (OR = 1.9; P < 0.001). These results were similar in Model 2, although the effect was only statistically significant in the 7-8 g/dL Hb group (Table 3). When excluding SC and/or GIB patients, the odds of having high fatigue as minimum Hb decreased were the same or higher for both models compared to the full population of patients. However, again, in Model 2, the effect was only statistically significant in the 7-8 g/dL Hb group (Supplemental Table 1).
Patients with a mean Hb <8 g/dL had 20% to 30% higher odds of high fatigue, and patients with a mean Hb 8-9 g/dL had 50% higher odds of high fatigue, compared with patients with a mean Hb ≥9 g/dL, but the effects were only statistically significant for patients with a mean Hb 8-9 g/dL in both Models 1 and 2 (Table 3). These results were similar when excluding patients with SC and/or GIB, but they were only significant for patients with a mean Hb 8-9 g/dL in Model 1 and patients with a mean Hb <8 g/dL in Model 2 (Supplemental Table 3).
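The percentages quoted with each odds ratio describe a change in odds, not in probability. The short Python sketch below shows the conversion. The function names are illustrative, and the 40% baseline probability is a made-up value chosen only to demonstrate the arithmetic; it is not a figure from this study.

```python
def percent_change_in_odds(odds_ratio):
    """Percent increase (or decrease, if negative) in odds implied by an OR."""
    return (odds_ratio - 1.0) * 100.0


def probability_from_odds_ratio(baseline_prob, odds_ratio):
    """Probability in the comparison group, given a baseline probability and an OR."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios reported for Model 1 (minimum Hb <7 and 7-8 g/dL vs >=8 g/dL)
for label, odds_ratio in (("<7 g/dL", 1.5), ("7-8 g/dL", 1.9)):
    print(label, round(percent_change_in_odds(odds_ratio)), "% higher odds")

# Hypothetical baseline: assume 40% of the reference group has high fatigue.
ASSUMED_BASELINE = 0.40  # illustrative assumption, not reported in the text
print(round(probability_from_odds_ratio(ASSUMED_BASELINE, 1.9), 3))
```

Under that assumed baseline, an OR of 1.9 would correspond to a probability of roughly 56%, which is a smaller relative increase in probability than the 90% increase in odds.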
DISCUSSION
These results demonstrate that minimum Hb during hospitalization is associated with fatigue in hospitalized patients aged ≥50 years, and the association is stronger among patients without SC and/or GIB as comorbidities. The analysis of Hb as a continuous and categorical variable and the use of both linear and logistic regression models support the robustness of these associations and illuminate their clinical significance. For example, in linear regression with minimum Hb as a continuous variable, the coefficient of 1.4 suggests that an increase of 2 g/dL in Hb, as might be expected from transfusion of 2 units of red blood cells, would be associated with about a 3-point improvement in fatigue. Additionally, as a categorical variable, a minimum Hb ≥8 g/dL compared with a minimum Hb <7 g/dL or 7-8 g/dL is associated with a 3- to 4-point improvement in fatigue. Previous literature suggests that a difference of 3 in the FACIT score is the minimum clinically important difference in fatigue,3 and changes in minimum Hb in either model predict changes in fatigue that are in the range of potential clinical significance.
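The arithmetic behind the 3-point example can be written out as a brief Python sketch. It simply multiplies the reported coefficient by a change in Hb and assumes the fitted relationship is linear over that range; the constant and function names are illustrative, and the calculation describes an association, not a demonstrated treatment effect.

```python
BETA_MIN_HB = 1.4  # FACIT points per 1 g/dL of minimum Hb (coefficient reported above)
MCID_FACIT = 3.0   # minimum clinically important difference cited in the text


def predicted_facit_change(delta_hb_g_dl, beta=BETA_MIN_HB):
    """FACIT change associated with a given change in minimum Hb (linear model)."""
    return beta * delta_hb_g_dl


change = predicted_facit_change(2.0)  # a 2 g/dL rise in Hb
print(f"{change:.1f} FACIT points")   # 2.8 FACIT points
print(round(change) >= MCID_FACIT)    # rounds to the 3-point threshold
```

The unrounded product, 2.8 points, sits just under the 3-point threshold, which is why the text describes it as "about" a 3-point improvement.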
The clinical significance of the findings is also reflected in the results of the logistic regressions, which may be mapped to potential effects on functional status. Specifically, the odds of having a high fatigue level (FACIT <27) increase 90% for persons with a minimum Hb 7–8 g/dL compared with persons with a minimum Hb ≥8 g/dL. For persons with a minimum Hb <7 g/dL, point estimates suggest a smaller (50%) increase in the odds of high fatigue, but the 95% confidence interval overlaps heavily with the estimate of patients whose minimum Hb is 7-8 g/dL. While it might be expected that patients with a minimum Hb <7 g/dL have greater levels of fatigue compared with patients with a minimum Hb 7-8 g/dL, we did not observe such a pattern. One reason may be that the confidence intervals of our estimated effects are wide enough that we cannot exclude such a pattern. Another possible explanation is that in both groups, the fatigue levels are sufficiently severe, such that the difference in their fatigue levels may not be clinically meaningful. For example, a FACIT score of 23 to 26 has been shown to be associated with an ECOG performance status of 2 to 3, requiring bed rest for at least part of the day.3 Therefore, patients with a minimum Hb 7-8 g/dL (mean FACIT score = 24; Table 2) or a minimum Hb of <7 g/dL (mean FACIT score = 23; Table 2) are already functionally limited to the point of being partially bed bound, such that further decreases in their Hb may not produce additional fatigue in part because they reduce their activity sufficiently to prevent an increase in fatigue. In such cases, the potential benefits of increased Hb may be better assessed by measuring fatigue in response to a specific and provoked activity level, a concept known as fatigability.20
That minimum Hb is more strongly associated with fatigue than any other measure of Hb during hospitalization may not be surprising. Mean, median, maximum, and discharge Hb may all be affected by transfusion during hospitalization that could affect fatigue. Admission Hb may not reflect true oxygen-carrying capacity because of hemoconcentration.
The association between Hb and fatigue in hospitalized patients is important because increased fatigue could contribute to slower clinical recovery in hospitalized patients. Additionally, increased fatigue during hospitalization and at hospital discharge could exacerbate the known deleterious consequences of fatigue on patients and their health outcomes14,15 after hospital discharge. Although one previous study, the Functional Outcomes in Cardiovascular Patients Undergoing Surgical Hip Fracture Repair (FOCUS)8 trial, did not report differences in patients’ fatigue levels at 30 and 60 days postdischarge when transfused at restrictive (8 g/dL) compared with liberal (10 g/dL) Hb thresholds, confidence in the validity of this finding is reduced by the fact that more than half of the patients were lost to follow-up at the 30- and 60-day time points. Further, patients in the restrictive transfusion arm of FOCUS were transfused to maintain Hb levels at or above 8 g/dL. This transfusion threshold of 8 g/dL may have mitigated the high levels of fatigue that are seen in our study when patients’ Hb drops below 8 g/dL, and maintaining a Hb level of 7 g/dL is now the standard of care in stable hospitalized patients. Lastly, FOCUS was limited to postoperative hip fracture patients, and the generalizability of FOCUS to hospitalized medicine patients with anemia is limited.
Therefore, our results support guideline suggestions that practitioners incorporate the presence of patient symptoms such as fatigue into transfusion decisions, particularly if patients’ Hb is <8 g/dL.18 Though reasonable, the suggestion to incorporate symptoms such as fatigue into transfusion decisions has not been strongly supported by evidence so far, and it may often be neglected in practice. Definitive evidence to support such recommendations would benefit from study through an optimal trial18 that incorporates symptoms into decision making. Our findings add support for a study of transfusion strategies that incorporates patients’ fatigue level in addition to Hb concentration.
This study has several limitations. Although our sample size is large and includes patients with a range of comorbidities that we believe are representative of hospitalized general medicine patients, as a single-center, observational study, our results may not be generalizable to other centers. Additionally, although these data support a reliable association between hospitalized patients’ minimum Hb and fatigue level, the observational design of this study cannot prove that this relationship is causal. Also, patients’ Hb values were measured at the discretion of their clinician, and therefore, the measures of Hb were not uniformly measured for participating patients. In addition, fatigue was only measured at one time point during a patient’s hospitalization, and it is possible that patients’ fatigue levels change during hospitalization in relation to variables we did not consider. Finally, our study was not designed to assess the association of Hb with longer-term functional outcomes, which may be of greater concern than fatigue.
CONCLUSION
In hospitalized patients ≥50 years old, minimum Hb is reliably associated with patients’ fatigue level. Patients whose minimum Hb is <8 g/dL experience higher fatigue levels compared to patients whose minimum Hb is ≥8 g/dL. Additional studies are warranted to understand if patients may benefit from improved fatigue levels by correcting their anemia through transfusion.
1. Yellen SB, Cella DF, Webster K, Blendowski C, Kaplan E. Measuring fatigue and other anemia-related symptoms with the Functional Assessment of Cancer Therapy (FACT) measurement system. J Pain Symptom Manage. 1997;13(2):63-74.
2. Cella D, Lai JS, Chang CH, Peterman A, Slavin M. Fatigue in cancer patients compared with fatigue in the general United States population. Cancer. 2002;94(2):528-538. doi:10.1002/cncr.10245.
3. Cella D, Eton DT, Lai J-S, Peterman AH, Merkel DE. Combining anchor and distribution-based methods to derive minimal clinically important differences on the Functional Assessment of Cancer Therapy (FACT) anemia and fatigue scales. J Pain Symptom Manage. 2002;24(6):547-561.
4. Tonelli M, Hemmelgarn B, Reiman T, et al. Benefits and harms of erythropoiesis-stimulating agents for anemia related to cancer: a meta-analysis. CMAJ Can Med Assoc J J Assoc Medicale Can. 2009;180(11):E62-E71. doi:10.1503/cmaj.090470.
5. Foley RN, Curtis BM, Parfrey PS. Erythropoietin Therapy, Hemoglobin Targets, and Quality of Life in Healthy Hemodialysis Patients: A Randomized Trial. Clin J Am Soc Nephrol. 2009;4(4):726-733. doi:10.2215/CJN.04950908.
6. Keown PA, Churchill DN, Poulin-Costello M, et al. Dialysis patients treated with Epoetin alfa show improved anemia symptoms: A new analysis of the Canadian Erythropoietin Study Group trial. Hemodial Int Int Symp Home Hemodial. 2010;14(2):168-173. doi:10.1111/j.1542-4758.2009.00422.x.
7. Palmer SC, Saglimbene V, Mavridis D, et al. Erythropoiesis-stimulating agents for anaemia in adults with chronic kidney disease: a network meta-analysis. Cochrane Database Syst Rev. 2014:CD010590.
8. Carson JL, Terrin ML, Noveck H, et al. Liberal or Restrictive Transfusion in high-risk patients after hip surgery. N Engl J Med. 2011;365(26):2453-2462. doi:10.1056/NEJMoa1012452.
9. Holst LB, Haase N, Wetterslev J, et al. Transfusion requirements in septic shock (TRISS) trial – comparing the effects and safety of liberal versus restrictive red blood cell transfusion in septic shock patients in the ICU: protocol for a randomised controlled trial. Trials. 2013;14:150. doi:10.1186/1745-6215-14-150.
10. Hébert PC, Wells G, Blajchman MA, et al. A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. N Engl J Med. 1999;340(6):409-417. doi:10.1056/NEJM199902113400601.
11. Corwin HL, Theus JW, Cargile CS, Lang NP. Red blood cell transfusion: Impact of an education program and a clinical guideline on transfusion practice. J Hosp Med. 2014;9(12):745-749. doi:10.1002/jhm.2237.
12. Saxena, S, editor. The Transfusion Committee: Putting Patient Safety First, 2nd Edition. Bethesda (MD): American Association of Blood Banks; 2013.
13. The 2011 National Blood Collection and Utilization Report. http://www.hhs.gov/ash/bloodsafety/2011-nbcus.pdf. Accessed August 16, 2017.
14. Vestergaard S, Nayfield SG, Patel KV, et al. Fatigue in a Representative Population of Older Persons and Its Association With Functional Impairment, Functional Limitation, and Disability. J Gerontol A Biol Sci Med Sci. 2009;64A(1):76-82. doi:10.1093/gerona/gln017.
15. Gill TM, Desai MM, Gahbauer EA, Holford TR, Williams CS. Restricted activity among community-living older persons: incidence, precipitants, and health care utilization. Ann Intern Med. 2001;135(5):313-321.
16. Koch CG, Li L, Sun Z, et al. Hospital-acquired anemia: Prevalence, outcomes, and healthcare implications. J Hosp Med. 2013;8(9):506-512. doi:10.1002/jhm.2061.
17. Meltzer D, Manning WG, Morrison J, et al. Effects of Physician Experience on Costs and Outcomes on an Academic General Medicine Service: Results of a Trial of Hospitalists. Ann Intern Med. 2002;137(11):866-874. doi:10.7326/0003-4819-137-11-200212030-00007.
18. Carson JL, Grossman BJ, Kleinman S, et al. Red Blood Cell Transfusion: A Clinical Practice Guideline From the AABB*. Ann Intern Med. 2012;157(1):49-58. doi:10.7326/0003-4819-157-1-201206190-00429.
19. Moreh E, Jacobs JM, Stessman J. Fatigue, function, and mortality in older adults. J Gerontol A Biol Sci Med Sci. 2010;65(8):887-895. doi:10.1093/gerona/glq064.
20. Eldadah BA. Fatigue and Fatigability in Older Adults. PM&R. 2010;2(5):406-413. doi:10.1016/j.pmrj.2010.03.022.
21. Hardy SE, Studenski SA. Fatigue Predicts Mortality among Older Adults. J Am Geriatr Soc. 2008;56(10):1910-1914. doi:10.1111/j.1532-5415.2008.01957.x.
22. Pfeiffer E. A short portable mental status questionnaire for the assessment of organic brain deficit in elderly patients. J Am Geriatr Soc. 1975;23(10):433-441.
23. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130-1139.
24. HCUP Clinical Classifications Software (CCS) for ICD-9-CM. Healthcare Cost and Utilization Project (HCUP). 2006-2009. Agency for Healthcare Research and Quality, Rockville, MD. https://www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp. Accessed November 22, 2016.
25. Cella DF, Tulsky DS, Gray G, et al. The Functional Assessment of Cancer Therapy scale: development and validation of the general measure. J Clin Oncol Off J Am Soc Clin Oncol. 1993;11(3):570-579.
26. Webster K, Cella D, Yost K. The Functional Assessment of Chronic Illness Therapy (FACIT) Measurement System: properties, applications, and interpretation. Health Qual Life Outcomes. 2003;1:79. doi:10.1186/1477-7525-1-79.
27. Oken MM, Creech RH, Tormey DC, et al. Toxicity and response criteria of the Eastern Cooperative Oncology Group. J Clin Oncol. 1982;5(6):649-656.
1. Yellen SB, Cella DF, Webster K, Blendowski C, Kaplan E. Measuring fatigue and other anemia-related symptoms with the Functional Assessment of Cancer Therapy (FACT) measurement system. J Pain Symptom Manage. 1997;13(2):63-74.
2. Cella D, Lai JS, Chang CH, Peterman A, Slavin M. Fatigue in cancer patients compared with fatigue in the general United States population. Cancer. 2002;94(2):528-538. doi:10.1002/cncr.10245.
3. Cella D, Eton DT, Lai J-S, Peterman AH, Merkel DE. Combining anchor and distribution-based methods to derive minimal clinically important differences on the Functional Assessment of Cancer Therapy (FACT) anemia and fatigue scales. J Pain Symptom Manage. 2002;24(6):547-561.
4. Tonelli M, Hemmelgarn B, Reiman T, et al. Benefits and harms of erythropoiesis-stimulating agents for anemia related to cancer: a meta-analysis. CMAJ Can Med Assoc J J Assoc Medicale Can. 2009;180(11):E62-E71. doi:10.1503/cmaj.090470.
5. Foley RN, Curtis BM, Parfrey PS. Erythropoietin Therapy, Hemoglobin Targets, and Quality of Life in Healthy Hemodialysis Patients: A Randomized Trial. Clin J Am Soc Nephrol. 2009;4(4):726-733. doi:10.2215/CJN.04950908.
6. Keown PA, Churchill DN, Poulin-Costello M, et al. Dialysis patients treated with Epoetin alfa show improved anemia symptoms: A new analysis of the Canadian Erythropoietin Study Group trial. Hemodial Int Int Symp Home Hemodial. 2010;14(2):168-173. doi:10.1111/j.1542-4758.2009.00422.x.
7. Palmer SC, Saglimbene V, Mavridis D, et al. Erythropoiesis-stimulating agents for anaemia in adults with chronic kidney disease: a network meta-analysis. Cochrane Database Syst Rev. 2014:CD010590.
8. Carson JL, Terrin ML, Noveck H, et al. Liberal or Restrictive Transfusion in high-risk patients after hip surgery. N Engl J Med. 2011;365(26):2453-2462. doi:10.1056/NEJMoa1012452.
9. Holst LB, Haase N, Wetterslev J, et al. Transfusion requirements in septic shock (TRISS) trial – comparing the effects and safety of liberal versus restrictive red blood cell transfusion in septic shock patients in the ICU: protocol for a randomised controlled trial. Trials. 2013;14:150. doi:10.1186/1745-6215-14-150.
10. Hébert PC, Wells G, Blajchman MA, et al. A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. N Engl J Med. 1999;340(6):409-417. doi:10.1056/NEJM199902113400601.
11. Corwin HL, Theus JW, Cargile CS, Lang NP. Red blood cell transfusion: Impact of an education program and a clinical guideline on transfusion practice. J Hosp Med. 2014;9(12):745-749. doi:10.1002/jhm.2237.
12. Saxena, S, editor. The Transfusion Committee: Putting Patient Safety First, 2nd Edition. Bethesda (MD): American Association of Blood Banks; 2013.
13. The 2011 National Blood Collection and Utilization Report. http://www.hhs.gov/ash/bloodsafety/2011-nbcus.pdf. Accessed August 16, 2017.
14. Vestergaard S, Nayfield SG, Patel KV, et al. Fatigue in a Representative Population of Older Persons and Its Association With Functional Impairment, Functional Limitation, and Disability. J Gerontol A Biol Sci Med Sci. 2009;64A(1):76-82. doi:10.1093/gerona/gln017.
15. Gill TM, Desai MM, Gahbauer EA, Holford TR, Williams CS. Restricted activity among community-living older persons: incidence, precipitants, and health care utilization. Ann Intern Med. 2001;135(5):313-321.
16. Koch CG, Li L, Sun Z, et al. Hospital-acquired anemia: Prevalence, outcomes, and healthcare implications. J Hosp Med. 2013;8(9):506-512. doi:10.1002/jhm.2061.
17. Meltzer D, Manning WG, Morrison J, et al. Effects of Physician Experience on Costs and Outcomes on an Academic General Medicine Service: Results of a Trial of Hospitalists. Ann Intern Med. 2002;137(11):866-874. doi:10.7326/0003-4819-137-11-200212030-00007.
18. Carson JL, Grossman BJ, Kleinman S, et al. Red Blood Cell Transfusion: A Clinical Practice Guideline From the AABB*. Ann Intern Med. 2012;157(1):49-58. doi:10.7326/0003-4819-157-1-201206190-00429.
19. Moreh E, Jacobs JM, Stessman J. Fatigue, function, and mortality in older adults. J Gerontol A Biol Sci Med Sci. 2010;65(8):887-895. doi:10.1093/gerona/glq064.
20. Eldadah BA. Fatigue and Fatigability in Older Adults. PM&R. 2010;2(5):406-413. doi:10.1016/j.pmrj.2010.03.022.
21. Hardy SE, Studenski SA. Fatigue Predicts Mortality among Older Adults. J Am Geriatr Soc. 2008;56(10):1910-1914. doi:10.1111/j.1532-5415.2008.01957.x.
22. Pfeiffer E. A short portable mental status questionnaire for the assessment of organic brain deficit in elderly patients. J Am Geriatr Soc. 1975;23(10):433-441.
23. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130-1139.
24. HCUP Clinical Classifications Software (CCS) for ICD-9-CM. Healthcare Cost and Utilization Project (HCUP). 2006-2009. Agency for Healthcare Research and Quality, Rockville, MD. https://www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp. Accessed November 22, 2016.
25. Cella DF, Tulsky DS, Gray G, et al. The Functional Assessment of Cancer Therapy scale: development and validation of the general measure. J Clin Oncol Off J Am Soc Clin Oncol. 1993;11(3):570-579.
26. Webster K, Cella D, Yost K. The Functional Assessment of Chronic Illness Therapy (FACIT) Measurement System: properties, applications, and interpretation. Health Qual Life Outcomes. 2003;1:79. doi:10.1186/1477-7525-1-79.
27. Oken MMMD a, Creech RHMD b, Tormey DCMD, et al. Toxicity and response criteria of the Eastern Cooperative Oncology Group. J Clin Oncol. 1982;5(6):649-656.
© 2017 Society of Hospital Medicine
A Longitudinal Study of Transfusion Utilization in Hospitalized Veterans
Abstract
- Background: Although transfusion guidelines have changed considerably over the past 2 decades, the adoption of patient blood management programs has not been fully realized across hospitals in the United States.
- Objective: To evaluate trends in red blood cell (RBC), platelet, and plasma transfusion at 3 Veterans Health Administration (VHA) hospitals from 2000 through 2010.
- Methods: Data from all hospitalizations were collected from January 2000 through December 2010. Blood bank data (including the type and volume of products administered) were available electronically from each hospital. These files were linked to inpatient data, which included ICD-9-CM diagnoses (principal and secondary) and procedures during hospitalization. Statistical analyses were conducted using generalized linear models to evaluate trends over time. The unit of observation was hospitalization, with categorization by type.
- Results: There were 176,521 hospitalizations in 69,621 patients; of these, 13.6% of hospitalizations involved transfusion of blood products (12.7% RBCs, 1.4% platelets, 3.0% plasma). Transfusion occurred in 25.2% of surgical and 5.3% of medical hospitalizations. Transfusion use peaked in 2002 for surgical hospitalizations and declined afterwards (P < 0.001). There was no significant change in transfusion use over time (P = 0.126) for medical hospitalizations. In hospitalizations that involved transfusions, there was a 20.3% reduction in the proportion of hospitalizations in which ≥ 3 units of RBCs were given (from 51.7% to 41.1%; P < 0.001) and a 73.6% increase when 1 RBC unit was given (from 8.0% to 13.8%; P < 0.001) from 2000-2010. Of the hospitalizations with RBC transfusion, 9.6% involved the use of 1 unit over the entire study period. The most common principal diagnoses for medical patients receiving transfusion were anemia, malignancy, heart failure, pneumonia and renal failure. Over time, transfusion utilization increased in patients who were admitted for infection (P = 0.009).
- Conclusion: Blood transfusions in 3 VHA hospitals have decreased over time for surgical patients but remained the same for medical patients. Further study examining appropriateness of blood products in medical patients appears necessary.
Key words: Transfusion; red blood cells; plasma; platelets; veterans.
Transfusion practices during hospitalization have changed considerably over the past 2 decades. Guided by evidence from randomized controlled trials, patient blood management programs have been expanded [1]. Such programs include recommendations regarding minimization of blood loss during surgery, prevention and treatment of anemia, strategies for reducing transfusions in both medical and surgical patients, improved blood utilization, education of health professionals, and standardization of blood management-related metrics [2]. Some of the guidelines have been incorporated into the Choosing Wisely initiative of the American Board of Internal Medicine Foundation, including: (a) don’t transfuse more units of blood than absolutely necessary, (b) don’t transfuse red blood cells for iron deficiency without hemodynamic instability, (c) don’t routinely use blood products to reverse warfarin, and (d) don’t perform serial blood counts on clinically stable patients [3]. Although there has been growing interest in blood management, only 37.8% of the 607 AABB (formerly, American Association of Blood Banks) facilities in the United States reported having a patient blood management program in 2013 [2].
While the importance of blood safety is recognized, data regarding the overall trends in practices are conflicting. A study using the Nationwide Inpatient Sample indicated that there was a 5.6% annual mean increase in the transfusion of blood products from 2002 to 2011 in the United States [4]. This contrasts with the experience of Kaiser Permanente in Northern California, in which the incidence of RBC transfusion decreased by 3.2% from 2009 to 2013 [5]. A decline in rates of intraoperative transfusion was also reported among elderly veterans in the United States from 1997 to 2009 [6].
We conducted a study in hospitalized veterans with 2 main objectives: (a) to evaluate trends in utilization of red blood cells (RBCs), platelets, and plasma over time, and (b) to identify those groups of veterans who received specific blood products. We were particularly interested in transfusion use in medical patients.
Methods
Participants were hospitalized veterans at 3 Department of Veterans Affairs (VA) medical centers. Data from all hospitalizations were collected from January 2000 through December 2010. Blood bank data (including the type and volume of products administered) were available electronically from each hospital. These files were linked to inpatient data, which included ICD-9-CM diagnoses (principal and secondary) and procedures during hospitalization.
Statistical analyses were conducted using generalized linear models to evaluate trends over time. The unit of observation was hospitalization, with categorization by type. Surgical hospitalizations were defined as admissions in which any surgical procedure occurred, whereas medical hospitalizations were defined as admissions without any surgery. Alpha was set at 0.05, 2-tailed. All analyses were conducted in Stata/MP 14.1 (StataCorp, College Station, TX). The study received institutional review board approval from the VA Ann Arbor Healthcare System.
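The classification rule above (surgical if any surgical procedure occurred, otherwise medical) and the per-year transfusion proportions that a trend model would take as input can be sketched in Python as follows. This is not the authors' Stata code. The record layout, the function names, and the placeholder procedure codes are all hypothetical, and the text does not say which ICD-9-CM procedure codes were counted as surgical.

```python
from collections import defaultdict


def hospitalization_type(procedures, surgical_codes):
    """Return 'surgical' if any recorded procedure is surgical, else 'medical'."""
    return "surgical" if any(p in surgical_codes for p in procedures) else "medical"


def transfusion_rate_by_year(hospitalizations, surgical_codes):
    """Proportion of hospitalizations with a transfusion, keyed by (type, year)."""
    counts = defaultdict(lambda: [0, 0])  # (type, year) -> [transfused, total]
    for h in hospitalizations:
        key = (hospitalization_type(h["procedures"], surgical_codes), h["year"])
        counts[key][0] += 1 if h["transfused"] else 0
        counts[key][1] += 1
    return {key: transfused / total for key, (transfused, total) in counts.items()}


# Made-up records with placeholder codes, for illustration only.
SURGICAL_CODES = {"SURG-A", "SURG-B"}  # hypothetical stand-ins for procedure codes
records = [
    {"year": 2000, "procedures": ["SURG-A"], "transfused": True},
    {"year": 2000, "procedures": ["SURG-B", "OTHER"], "transfused": False},
    {"year": 2000, "procedures": [], "transfused": False},
    {"year": 2001, "procedures": ["OTHER"], "transfused": True},
]
print(transfusion_rate_by_year(records, SURGICAL_CODES))
```

Keeping the hospitalization, not the patient, as the unit of observation means a patient admitted several times contributes several records.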
Results
From 2000 through 2010, there were 176,521 hospitalizations in 69,621 patients. Within this cohort, 6% were < 40 years of age, 66% were 40 to 69 years of age, and 28% were 70 years or older at the time of admission. In this cohort, 96% of patients were male. Overall, 13.6% of all hospitalizations involved transfusion of a blood product (12.7% RBCs, 1.4% platelets, 3.0% plasma).
Transfusion occurred in 25.2% of surgical hospitalizations and 5.3% of medical hospitalizations. For surgical hospitalizations, transfusion use peaked in 2002 (when 30.9% of the surgical hospitalizations involved a transfusion) and significantly declined afterwards (P < 0.001). By 2010, 22.5% of the surgical hospitalizations involved a transfusion. Most of the surgeries where blood products were transfused involved cardiovascular procedures. For medical hospitalizations only, there was no significant change in transfusion use over time, either from 2000 to 2010 (P = 0.126) or from 2002 to 2010 (P = 0.072). In 2010, 5.2% of the medical hospitalizations involved a transfusion.
Rates of transfusion varied by principal diagnosis (Figure 1). For patients admitted with a principal diagnosis of infection (n = 20,981 hospitalizations), there was an increase in the percentage of hospitalizations in which transfusions (RBCs, platelet, plasma) were administered over time (P = 0.009) (Figure 1). For patients admitted with a principal diagnosis of malignancy (n = 12,904 hospitalizations), cardiovascular disease (n = 40,324 hospitalizations), and other diagnoses (n = 102,312 hospitalizations), there were no significant linear trends over the entire study period (P = 0.191, P = 0.052, P = 0.314, respectively). Rather, blood utilization peaked in year 2002 and significantly declined afterwards for patients admitted for malignancy (P < 0.001) and for cardiovascular disease (P < 0.001).
The most common principal diagnoses for medical patients receiving any transfusion (RBCs, platelet, plasma) are listed in Table 1. For medical patients with a principal diagnosis of anemia, 88% of hospitalizations involved a transfusion (Table 1). Transfusion occurred in 6% to 11% of medical hospitalizations with malignancies, heart failure, pneumonia or renal failure (Table 1). A considerable proportion (43%) of medical patients with gastrointestinal hemorrhage received a transfusion.
Of the hospitalizations in which RBCs were transfused, 9.6% (2154/22,344) involved the use of only 1 unit, 43.8% (9791/22,344) involved 2 units, and 46.5% (10,399/22,344) involved 3 or more units during the hospitalization. From 2000 through 2010, there was a 20.3% reduction in the proportion of hospitalizations in which 3 or more units of RBCs were given (from 51.7% to 41.1%; P < 0.001). That is, among those hospitalizations in which a RBC transfusion occurred, a smaller proportion of hospitalizations involved the administration of 3 or more units of RBCs from 2000 through 2010 (Figure 2). There was an 11.5% increase in the proportion of hospitalizations in which 2 units of RBCs were used (from 40.4% to 45.0%; P < 0.001). In addition, there was a 73.6% increase in the proportion of hospitalizations in which 1 RBC unit was given (from 8.0% to 13.8%; P = 0.001).
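The figures in this paragraph mix two kinds of percentage: shares of RBC-transfused hospitalizations, and relative changes in those shares between 2000 and 2010. The Python sketch below reproduces both calculations from the numbers printed above; the function names are illustrative. Because the start and end shares are published to one decimal place, the recomputed relative changes can differ slightly from the reported ones, and attributing that gap to rounding is an assumption.

```python
def share_pct(count, total):
    """Share of the total, as a percentage."""
    return count / total * 100.0


def relative_change_pct(start, end):
    """Relative change from start to end, as a percentage of start."""
    return (end - start) / start * 100.0


# Hospitalizations with RBC transfusion, by number of units (counts from the text)
unit_counts = {"1 unit": 2154, "2 units": 9791, "3+ units": 10399}
total = sum(unit_counts.values())  # 22,344
for label, count in unit_counts.items():
    print(label, round(share_pct(count, total), 1))

# Relative change in the share receiving 3 or more units, 2000 to 2010
print(round(relative_change_pct(51.7, 41.1), 1))
```

A drop from 51.7% to 41.1% is a change of 10.6 percentage points, while the relative reduction is about one fifth of the starting share; the text reports the latter.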
16.8 mL/hospitalization in 2010. For plasma, the mean mL/hospitalization was 28.9 in year 2000, increased to 50.1 mL/hospitalization in year 2008, and declined, thereafter, to 35.1 mL/hospitalization in year 2010.
Discussion
We also observed secular trends in the volume of RBCs administered. There was an increase in the percentage of hospitalizations in which 1 or 2 RBC units were used and a decline in transfusion of 3 or more units. The reduction in the use of 3 or more RBC units may reflect the adoption and integration of recommendations in patient blood management by clinicians, which encourage assessment of the patients’ symptoms in determining whether additional units are necessary [7]. Such guidelines also endorse the avoidance of routine administration of 2 units of RBCs if 1 unit is sufficient [8]. We have previously shown that, after coronary artery bypass grafting, 2 RBC units doubled the risk of pneumonia [9]; additional analyses indicated that 1 or 2 units of RBCs were associated with increased postoperative morbidity [10]. In addition, our previous research indicated that the probability of infection increased considerably between 1 and 2 RBC units, with a more gradual increase beyond 2 units [11]. With this evidence in mind, some studies at single sites have reported that there was a dramatic decline from 2 RBC units before initiation of patient blood management programs to 1 unit after the programs were implemented [12,13].
Medical patients who received a transfusion were often admitted for reason of anemia, cancer, organ failure, or pneumonia. Some researchers are now reporting that blood use, at certain sites, is becoming more common in medical rather than surgical patients, which may be due to an expansion of patient blood management procedures in surgery [16]. There are a substantial number of patient blood management programs among surgical specialties and their adoption has expanded [17]. Although there are fewer patient blood management programs in the nonsurgical setting, some have been targeted to internal medicine physicians and specifically, to hospitalists [1,18]. For example, a toolkit from the Society of Hospital Medicine centers on anemia management and includes anemia assessment, treatment, evaluation of RBC transfusion risk, blood conservation, optimization of coagulation, and patient-centered decision-making [19]. Additionally, bundling of patient blood management strategies has been launched to help encourage a wider adoption of such programs [20].
While guidelines regarding use of RBCs are becoming increasingly recognized, recommendations for the use of platelets and plasma are hampered by the paucity of evidence from randomized controlled trials [21,22]. There is moderate-quality evidence for the use of platelets with therapy-induced hypoproliferative thrombocytopenia in hospitalized patients [21], but low quality evidence for other uses. Moreover, a recent review of plasma transfusion in bleeding patients found no randomized controlled trials on plasma use in hospitalized patients, although several trials were currently underway [22].
Our findings need to be considered in the context of the following limitations. The data were from 3 VA hospitals, so the results may not reflect patterns of usage at other hospitals. However, AABB reports that there has been a general decrease in transfusion of allogeneic whole blood and RBC units since 2008 at the AABB-affiliated sites in the United States [2]; this is similar to the pattern that we observed in surgical patients. In addition, we report an overall view of trends without having details regarding which specific factors influenced changes in transfusion during this 11-year period. It is possible that the severity of hospitalized patients may have changed with time which could have influenced decisions regarding the need for transfusion.
In conclusion, the use of blood products decreased in surgical patients since 2002 but remained the same in medical patients in this VA population. Transfusions increased over time for patients who were admitted to the hospital for reason of infection, but decreased since 2002 for those admitted for cardiovascular disease or cancer. The number of RBC units per hospitalization decreased over time. Additional surveillance is needed to determine whether recent evidence regarding blood management has been incorporated into clinical practice for medical patients, as we strive to deliver optimal care to our veterans.
Corresponding author: Mary A.M. Rogers, PhD, MS, Dept. of Internal Medicine, Univ. of Michigan, 016-422W NCRC, Ann Arbor, MI 48109-2800, [email protected].
Funding/support: Department of Veterans Affairs, Clinical Sciences Research & Development Service Merit Review Award (EPID-011-11S). The contents do not represent the views of the U.S. Department of Veterans Affairs or the U.S. Government.
Financial disclosures: None.
Author contributions: conception and design, MAMR, SS; analysis and interpretation of data, MAMR, JDB, DR, LK, SS; drafting of article, MAMR; critical revision of the article, MAMR, MTG, DR, LK, SS, VC; statistical expertise, MAMR, DR; obtaining of funding, MTG, SS, VC; administrative or technical support, MTG, LK, SS, VC; collection and assembly of data, JDB, LK.
1. Hohmuth B, Ozawa S, Ashton M, Melseth RL. Patient-centered blood management. J Hosp Med 2014;9:60–5.
2. Whitaker B, Rajbhandary S, Harris A. The 2013 AABB blood collection, utilization, and patient blood management survey report. United States Department of Health and Human Services, AABB; 2015.
3. Cassel CK, Guest JA. Choosing wisely: helping physicians and patients make smart decisions about their care. JAMA 2012;307:1801–2.
4. Pathak R, Bhatt VR, Karmacharya P, et al. Trends in blood-product transfusion among inpatients in the United States from 2002 to 2011: data from the nationwide inpatient sample. J Hosp Med 2014;9:800–1.
5. Roubinian NH, Escobar GJ, Liu V, et al. Trends in red blood cell transfusion and 30-day mortality among hospitalized patients. Transfusion 2014;54:2678–86.
6. Chen A, Trivedi AN, Jiang L, et al. Hospital blood transfusion patterns during major noncardiac surgery and surgical mortality. Medicine (Baltimore) 2015;94:e1342.
7. Carson JL, Guyatt G, Heddle NM, et al. Clinical practice guidelines from the AABB: Red blood cell transfusion thresholds and storage. JAMA 2016;316:2025–35.
8. Hicks LK, Bering H, Carson KR, et al. The ASH choosing wisely® campaign: five hematologic tests and treatments to question. Blood 2013;122:3879–83.
9. Likosky DS, Paone G, Zhang M, et al. Red blood cell transfusions impact pneumonia rates after coronary artery bypass grafting. Ann Thorac Surg 2015;100:794–801.
10. Paone G, Likosky DS, Brewer R, et al. Transfusion of 1 and 2 units of red blood cells is associated with increased morbidity and mortality. Ann Thorac Surg 2014;97:87–93; discussion 93–4.
11. Rogers MAM, Blumberg N, Heal JM, et al. Role of transfusion in the development of urinary tract–related bloodstream infection. Arch Intern Med 2011;171:1587–9.
12. Oliver JC, Griffin RL, Hannon T, Marques MB. The success of our patient blood management program depended on an institution-wide change in transfusion practices. Transfusion 2014;54:2617–24.
13. Yerrabothala S, Desrosiers KP, Szczepiorkowski ZM, Dunbar NM. Significant reduction in red blood cell transfusions in a general hospital after successful implementation of a restrictive transfusion policy supported by prospective computerized order auditing. Transfusion 2014;54:2640–5.
14. Rehm JP, Otto PS, West WW, et al. Hospital-wide educational program decreases red blood cell transfusions. J Surg Res 1998;75:183–6.
15. Lawler EV, Bradbury BD, Fonda JR, et al. Transfusion burden among patients with chronic kidney disease and anemia. Clin J Am Soc Nephrol 2010;5:667–72.
16. Tinegate H, Pendry K, Murphy M, et al. Where do all the red blood cells (RBCs) go? Results of a survey of RBC use in England and North Wales in 2014. Transfusion 2016;56:139–45.
17. Meybohm P, Herrmann E, Steinbicker AU, et al. Patient blood management is associated with a substantial reduction of red blood cell utilization and safe for patient’s outcome: a prospective, multicenter cohort study with a noninferiority design. Ann Surg 2016;264:203–11.
18. Corwin HL, Theus JW, Cargile CS, Lang NP. Red blood cell transfusion: impact of an education program and a clinical guideline on transfusion practice. J Hosp Med 2014;9:745–9.
19. Society of Hospital Medicine. Anemia prevention and management program implementation toolkit. Accessed at www.hospitalmedicine.org/Web/Quality___Innovation/Implementation_Toolkit/Anemia/anemia_overview.aspx on 9 June 2017.
20. Meybohm P, Richards T, Isbister J, et al. Patient blood management bundles to facilitate implementation. Transfus Med Rev 2017;31:62–71.
21. Kaufman RM, Djulbegovic B, Gernsheimer T, et al. Platelet transfusion: a clinical practice guideline from the AABB. Ann Intern Med 2015;162:205–13.
22. Levy JH, Grottke O, Fries D, Kozek-Langenecker S. Therapeutic plasma transfusion in bleeding patients: A systematic review. Anesth Analg 2017;124:1268–76.
Abstract
- Background: Although transfusion guidelines have changed considerably over the past 2 decades, the adoption of patient blood management programs has not been fully realized across hospitals in the United States.
- Objective: To evaluate trends in red blood cell (RBC), platelet, and plasma transfusion at 3 Veterans Health Administration (VHA) hospitals from 2000 through 2010.
- Methods: Data from all hospitalizations were collected from January 2000 through December 2010. Blood bank data (including the type and volume of products administered) were available electronically from each hospital. These files were linked to inpatient data, which included ICD-9-CM diagnoses (principal and secondary) and procedures during hospitalization. Statistical analyses were conducted using generalized linear models to evaluate trends over time. The unit of observation was hospitalization, with categorization by type.
- Results: There were 176,521 hospitalizations in 69,621 patients; of these, 13.6% of hospitalizations involved transfusion of blood products (12.7% RBCs, 1.4% platelets, 3.0% plasma). Transfusion occurred in 25.2% of surgical and 5.3% of medical hospitalizations. Transfusion use peaked in 2002 for surgical hospitalizations and declined afterwards (P < 0.001). There was no significant change in transfusion use over time (P = 0.126) for medical hospitalizations. In hospitalizations that involved transfusions, there was a 20.3% reduction in the proportion of hospitalizations in which ≥ 3 units of RBCs were given (from 51.7% to 41.1%; P < 0.001) and a 73.6% increase in the proportion in which 1 RBC unit was given (from 8.0% to 13.8%; P < 0.001) from 2000 through 2010. Of the hospitalizations with RBC transfusion, 9.6% involved the use of 1 unit over the entire study period. The most common principal diagnoses for medical patients receiving transfusion were anemia, malignancy, heart failure, pneumonia, and renal failure. Over time, transfusion utilization increased in patients who were admitted for infection (P = 0.009).
- Conclusion: Blood transfusions in 3 VHA hospitals have decreased over time for surgical patients but remained the same for medical patients. Further study examining appropriateness of blood products in medical patients appears necessary.
Key words: Transfusion; red blood cells; plasma; platelets; veterans.
Transfusion practices during hospitalization have changed considerably over the past 2 decades. Guided by evidence from randomized controlled trials, patient blood management programs have been expanded [1]. Such programs include recommendations regarding minimization of blood loss during surgery, prevention and treatment of anemia, strategies for reducing transfusions in both medical and surgical patients, improved blood utilization, education of health professionals, and standardization of blood management-related metrics [2]. Some of the guidelines have been incorporated into the Choosing Wisely initiative of the American Board of Internal Medicine Foundation, including: (a) don’t transfuse more units of blood than absolutely necessary, (b) don’t transfuse red blood cells for iron deficiency without hemodynamic instability, (c) don’t routinely use blood products to reverse warfarin, and (d) don’t perform serial blood counts on clinically stable patients [3]. Although there has been growing interest in blood management, only 37.8% of the 607 AABB (formerly, American Association of Blood Banks) facilities in the United States reported having a patient blood management program in 2013 [2].
While the importance of blood safety is recognized, data regarding the overall trends in practices are conflicting. A study using the Nationwide Inpatient Sample indicated that there was a 5.6% annual mean increase in the transfusion of blood products from 2002 to 2011 in the United States [4]. This contrasts with the experience of Kaiser Permanente in Northern California, in which the incidence of RBC transfusion decreased by 3.2% from 2009 to 2013 [5]. A decline in rates of intraoperative transfusion was also reported among elderly veterans in the United States from 1997 to 2009 [6].
We conducted a study in hospitalized veterans with 2 main objectives: (a) to evaluate trends in utilization of red blood cells (RBCs), platelets, and plasma over time, and (b) to identify those groups of veterans who received specific blood products. We were particularly interested in transfusion use in medical patients.
Methods
Participants were hospitalized veterans at 3 Department of Veterans Affairs (VA) medical centers. Data from all hospitalizations were collected from January 2000 through December 2010. Blood bank data (including the type and volume of products administered) were available electronically from each hospital. These files were linked to inpatient data, which included ICD-9-CM diagnoses (principal and secondary) and procedures during hospitalization.
Statistical analyses were conducted using generalized linear models to evaluate trends over time. The unit of observation was hospitalization, with categorization by type. Surgical hospitalizations were defined as admissions in which any surgical procedure occurred, whereas medical hospitalizations were defined as admissions without any surgery. Alpha was set at 0.05, 2-tailed. All analyses were conducted in Stata/MP 14.1 (StataCorp, College Station, TX). The study received institutional review board approval from the VA Ann Arbor Healthcare System.
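The classification rule described above (a hospitalization is surgical if any surgical procedure occurred, and medical otherwise) can be illustrated with a minimal Python sketch. This is not the authors' Stata code and it does not fit the generalized linear models; it only shows how the unit of observation and the yearly transfusion percentages might be tabulated. The record layout (`year`, `procedures`, `is_surgical`, `transfused`) and the toy records are hypothetical assumptions made for illustration.

```python
from collections import defaultdict


def hospitalization_type(procedures):
    """Return 'surgical' if any procedure is flagged surgical, else 'medical'.

    The 'is_surgical' flag is a hypothetical field; the source does not say
    how procedure codes were mapped to surgery.
    """
    return "surgical" if any(p["is_surgical"] for p in procedures) else "medical"


def transfusion_rate_by_year(hospitalizations):
    """Percentage of hospitalizations with any transfusion, per (year, type)."""
    counts = defaultdict(lambda: [0, 0])  # (year, type) -> [transfused, total]
    for h in hospitalizations:
        key = (h["year"], hospitalization_type(h["procedures"]))
        if h["transfused"]:
            counts[key][0] += 1
        counts[key][1] += 1
    return {key: 100.0 * t / n for key, (t, n) in counts.items()}


# Toy records, invented purely to exercise the functions.
toy = [
    {"year": 2000, "procedures": [{"is_surgical": True}], "transfused": True},
    {"year": 2000, "procedures": [{"is_surgical": True}], "transfused": False},
    {"year": 2000, "procedures": [], "transfused": False},
    {"year": 2000, "procedures": [{"is_surgical": False}], "transfused": True},
]
rates = transfusion_rate_by_year(toy)
```

Because the unit of observation is the hospitalization, a patient admitted several times contributes several records; yearly percentages of this kind would be the input to a trend model.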
Results
From 2000 through 2010, there were 176,521 hospitalizations in 69,621 patients. Within this cohort, 6% were < 40 years of age, 66% were 40 to 69 years of age, and 28% were 70 years or older at the time of admission. In this cohort, 96% of patients were male. Overall, 13.6% of all hospitalizations involved transfusion of a blood product (12.7% RBCs, 1.4% platelets, 3.0% plasma).
Transfusion occurred in 25.2% of surgical hospitalizations and 5.3% of medical hospitalizations. For surgical hospitalizations, transfusion use peaked in 2002 (when 30.9% of the surgical hospitalizations involved a transfusion) and significantly declined afterwards (P < 0.001). By 2010, 22.5% of the surgical hospitalizations involved a transfusion. Most of the surgeries where blood products were transfused involved cardiovascular procedures. For medical hospitalizations only, there was no significant change in transfusion use over time, either from 2000 to 2010 (P = 0.126) or from 2002 to 2010 (P = 0.072). In 2010, 5.2% of the medical hospitalizations involved a transfusion.
Rates of transfusion varied by principal diagnosis (Figure 1). For patients admitted with a principal diagnosis of infection (n = 20,981 hospitalizations), there was an increase in the percentage of hospitalizations in which transfusions (RBCs, platelets, plasma) were administered over time (P = 0.009) (Figure 1). For patients admitted with a principal diagnosis of malignancy (n = 12,904 hospitalizations), cardiovascular disease (n = 40,324 hospitalizations), and other diagnoses (n = 102,312 hospitalizations), there were no significant linear trends over the entire study period (P = 0.191, P = 0.052, P = 0.314, respectively). Rather, blood utilization peaked in 2002 and significantly declined afterwards for patients admitted for malignancy (P < 0.001) and for cardiovascular disease (P < 0.001).
The most common principal diagnoses for medical patients receiving any transfusion (RBCs, platelets, plasma) are listed in Table 1. For medical patients with a principal diagnosis of anemia, 88% of hospitalizations involved a transfusion (Table 1). Transfusion occurred in 6% to 11% of medical hospitalizations with malignancies, heart failure, pneumonia, or renal failure (Table 1). A considerable proportion (43%) of medical patients with gastrointestinal hemorrhage received a transfusion.
Of the hospitalizations with RBC transfusion, 9.6% (2154/22,344) involved the use of only 1 unit, 43.8% (9791/22,344) involved 2 units, and 46.5% (10,399/22,344) involved 3 or more units during the hospitalization. From 2000 through 2010, there was a 20.3% reduction in the proportion of hospitalizations in which 3 or more units of RBCs were given (from 51.7% to 41.1%; P < 0.001). That is, among those hospitalizations in which an RBC transfusion occurred, a smaller proportion of hospitalizations involved the administration of 3 or more units of RBCs from 2000 through 2010 (Figure 2). There was an 11.5% increase in the proportion of hospitalizations in which 2 units of RBCs were used (from 40.4% to 45.0%; P < 0.001). In addition, there was a 73.6% increase in the proportion of hospitalizations in which 1 RBC unit was given (from 8.0% to 13.8%; P = 0.001).
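The percentages in this paragraph mix two kinds of quantity: shares of the 22,344 RBC hospitalizations, and relative changes between the 2000 and 2010 proportions. The following Python sketch recomputes both from the figures printed in the text. Recomputing the relative changes from the rounded endpoints gives roughly 20.5%, 11.4%, and 72.5% rather than the reported 20.3%, 11.5%, and 73.6%; a plausible explanation, offered here only as an assumption, is that the published values were derived from unrounded proportions.

```python
def relative_change(start_pct, end_pct):
    """Relative change, in percent, between two proportions given as percentages."""
    return (end_pct - start_pct) / start_pct * 100.0


# Counts of RBC hospitalizations by units transfused, as reported in the text.
units = {"1 unit": 2154, "2 units": 9791, "3+ units": 10399}
total = sum(units.values())
shares = {k: round(100.0 * v / total, 1) for k, v in units.items()}

# Rounded (2000, 2010) proportions as reported in the text.
reported = {
    "3+ units": (51.7, 41.1),
    "2 units": (40.4, 45.0),
    "1 unit": (8.0, 13.8),
}
changes = {k: round(relative_change(a, b), 1) for k, (a, b) in reported.items()}

for label, value in changes.items():
    print(f"{label}: {value:+.1f}% relative change")
```

The distinction matters when reading the results: the share receiving 1 unit rose by 5.8 percentage points in absolute terms, which corresponds to a relative increase of more than 70% because the starting share was small.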
16.8 mL/hospitalization in 2010. For plasma, the mean mL/hospitalization was 28.9 in year 2000, increased to 50.1 mL/hospitalization in year 2008, and declined, thereafter, to 35.1 mL/hospitalization in year 2010.
Discussion
We also observed secular trends in the volume of RBCs administered. There was an increase in the percentage of hospitalizations in which 1 or 2 RBC units were used and a decline in transfusion of 3 or more units. The reduction in the use of 3 or more RBC units may reflect the adoption and integration of recommendations in patient blood management by clinicians, which encourage assessment of the patients’ symptoms in determining whether additional units are necessary [7]. Such guidelines also endorse the avoidance of routine administration of 2 units of RBCs if 1 unit is sufficient [8]. We have previously shown that, after coronary artery bypass grafting, 2 RBC units doubled the risk of pneumonia [9]; additional analyses indicated that 1 or 2 units of RBCs were associated with increased postoperative morbidity [10]. In addition, our previous research indicated that the probability of infection increased considerably between 1 and 2 RBC units, with a more gradual increase beyond 2 units [11]. With this evidence in mind, some studies at single sites have reported that there was a dramatic decline from 2 RBC units before initiation of patient blood management programs to 1 unit after the programs were implemented [12,13].
Medical patients who received a transfusion were often admitted for anemia, cancer, organ failure, or pneumonia. Some researchers are now reporting that blood use, at certain sites, is becoming more common in medical than in surgical patients, which may be due to an expansion of patient blood management procedures in surgery [16]. There are a substantial number of patient blood management programs among surgical specialties and their adoption has expanded [17]. Although there are fewer patient blood management programs in the nonsurgical setting, some have been targeted to internal medicine physicians and, specifically, to hospitalists [1,18]. For example, a toolkit from the Society of Hospital Medicine centers on anemia management and includes anemia assessment, treatment, evaluation of RBC transfusion risk, blood conservation, optimization of coagulation, and patient-centered decision-making [19]. Additionally, bundling of patient blood management strategies has been launched to help encourage a wider adoption of such programs [20].
While guidelines regarding use of RBCs are becoming increasingly recognized, recommendations for the use of platelets and plasma are hampered by the paucity of evidence from randomized controlled trials [21,22]. There is moderate-quality evidence for the use of platelets with therapy-induced hypoproliferative thrombocytopenia in hospitalized patients [21], but low-quality evidence for other uses. Moreover, a recent review of plasma transfusion in bleeding patients found no randomized controlled trials on plasma use in hospitalized patients, although several trials were underway at the time of the review [22].
Our findings need to be considered in the context of the following limitations. The data were from 3 VA hospitals, so the results may not reflect patterns of usage at other hospitals. However, AABB reports that there has been a general decrease in transfusion of allogeneic whole blood and RBC units since 2008 at the AABB-affiliated sites in the United States [2]; this is similar to the pattern that we observed in surgical patients. In addition, we report an overall view of trends without having details regarding which specific factors influenced changes in transfusion during this 11-year period. It is possible that the severity of illness among hospitalized patients changed over time, which could have influenced decisions regarding the need for transfusion.
In conclusion, the use of blood products decreased in surgical patients since 2002 but remained the same in medical patients in this VA population. Transfusions increased over time for patients who were admitted to the hospital for reason of infection, but decreased since 2002 for those admitted for cardiovascular disease or cancer. The number of RBC units per hospitalization decreased over time. Additional surveillance is needed to determine whether recent evidence regarding blood management has been incorporated into clinical practice for medical patients, as we strive to deliver optimal care to our veterans.
Corresponding author: Mary A.M. Rogers, PhD, MS, Dept. of Internal Medicine, Univ. of Michigan, 016-422W NCRC, Ann Arbor, MI 48109-2800, [email protected].
Funding/support: Department of Veterans Affairs, Clinical Sciences Research & Development Service Merit Review Award (EPID-011-11S). The contents do not represent the views of the U.S. Department of Veterans Affairs or the U.S. Government.
Financial disclosures: None.
Author contributions: conception and design, MAMR, SS; analysis and interpretation of data, MAMR, JDB, DR, LK, SS; drafting of article, MAMR; critical revision of the article, MAMR, MTG, DR, LK, SS, VC; statistical expertise, MAMR, DR; obtaining of funding, MTG, SS, VC; administrative or technical support, MTG, LK, SS, VC; collection and assembly of data, JDB, LK.
9. Likosky DS, Paone G, Zhang M, et al. Red blood cell transfusions impact pneumonia rates after coronary artery bypass grafting. Ann Thorac Surg 2015;100:794–801.
10. Paone G, Likosky DS, Brewer R, et al. Transfusion of 1 and 2 units of red blood cells is associated with increased morbidity and mortality. Ann Thorac Surg 2014;97:87–93; discussion 93–4.
11. Rogers MAM, Blumberg N, Heal JM, et al. Role of transfusion in the development of urinary tract–related bloodstream infection. Arch Intern Med 2011;171:1587–9.
12. Oliver JC, Griffin RL, Hannon T, Marques MB. The success of our patient blood management program depended on an institution-wide change in transfusion practices. Transfusion 2014;54:2617–24.
13. Yerrabothala S, Desrosiers KP, Szczepiorkowski ZM, Dunbar NM. Significant reduction in red blood cell transfusions in a general hospital after successful implementation of a restrictive transfusion policy supported by prospective computerized order auditing. Transfusion 2014;54:2640–5.
14. Rehm JP, Otto PS, West WW, et al. Hospital-wide educational program decreases red blood cell transfusions. J Surg Res 1998;75:183–6.
15. Lawler EV, Bradbury BD, Fonda JR, et al. Transfusion burden among patients with chronic kidney disease and anemia. Clin J Am Soc Nephrol 2010;5:667–72.
16. Tinegate H, Pendry K, Murphy M, et al. Where do all the red blood cells (RBCs) go? Results of a survey of RBC use in England and North Wales in 2014. Transfusion 2016;56:139–45.
17. Meybohm P, Herrmann E, Steinbicker AU, et al. Patient blood management is associated with a substantial reduction of red blood cell utilization and safe for patient’s outcome: a prospective, multicenter cohort study with a noninferiority design. Ann Surg 2016;264:203–11.
18. Corwin HL, Theus JW, Cargile CS, Lang NP. Red blood cell transfusion: impact of an education program and a clinical guideline on transfusion practice. J Hosp Med 2014;9:745–9.
19. Society of Hospital Medicine. Anemia prevention and management program implementation toolkit. Accessed at www.hospitalmedicine.org/Web/Quality___Innovation/Implementation_Toolkit/Anemia/anemia_overview.aspx on 9 June 2017.
20. Meybohm P, Richards T, Isbister J, et al. Patient blood management bundles to facilitate implementation. Transfus Med Rev 2017;31:62–71.
21. Kaufman RM, Djulbegovic B, Gernsheimer T, et al. Platelet transfusion: a clinical practice guideline from the AABB. Ann Intern Med 2015;162:205–13.
22. Levy JH, Grottke O, Fries D, Kozek-Langenecker S. Therapeutic plasma transfusion in bleeding patients: A systematic review. Anesth Analg 2017;124:1268–76.
1. Hohmuth B, Ozawa S, Ashton M, Melseth RL. Patient-centered blood management. J Hosp Med 2014;9:60–5.
2. Whitaker B, Rajbhandary S, Harris A. The 2013 AABB blood collection, utilization, and patient blood management survey report. United States Department of Health and Human Services, AABB; 2015.
3. Cassel CK, Guest JA. Choosing wisely: helping physicians and patients make smart decisions about their care. JAMA 2012;307:1801–2.
4. Pathak R, Bhatt VR, Karmacharya P, et al. Trends in blood-product transfusion among inpatients in the United States from 2002 to 2011: data from the nationwide inpatient sample. J Hosp Med 2014;9:800–1.
5. Roubinian NH, Escobar GJ, Liu V, et al. Trends in red blood cell transfusion and 30-day mortality among hospitalized patients. Transfusion 2014;54:2678–86.
6. Chen A, Trivedi AN, Jiang L, et al. Hospital blood transfusion patterns during major noncardiac surgery and surgical mortality. Medicine (Baltimore) 2015;94:e1342.
7. Carson JL, Guyatt G, Heddle NM, et al. Clinical practice guidelines from the AABB: Red blood cell transfusion thresholds and storage. JAMA 2016;316:2025–35.
8. Hicks LK, Bering H, Carson KR, et al. The ASH choosing wisely® campaign: five hematologic tests and treatments to question. Blood 2013;122:3879–83.
9. Likosky DS, Paone G, Zhang M, et al. Red blood cell transfusions impact pneumonia rates after coronary artery bypass grafting. Ann Thorac Surg 2015;100:794–801.
10. Paone G, Likosky DS, Brewer R, et al. Transfusion of 1 and 2 units of red blood cells is associated with increased morbidity and mortality. Ann Thorac Surg 2014;97:87–93; discussion 93–4.
11. Rogers MAM, Blumberg N, Heal JM, et al. Role of transfusion in the development of urinary tract–related bloodstream infection. Arch Intern Med 2011;171:1587–9.
12. Oliver JC, Griffin RL, Hannon T, Marques MB. The success of our patient blood management program depended on an institution-wide change in transfusion practices. Transfusion 2014;54:2617–24.
13. Yerrabothala S, Desrosiers KP, Szczepiorkowski ZM, Dunbar NM. Significant reduction in red blood cell transfusions in a general hospital after successful implementation of a restrictive transfusion policy supported by prospective computerized order auditing. Transfusion 2014;54:2640–5.
14. Rehm JP, Otto PS, West WW, et al. Hospital-wide educational program decreases red blood cell transfusions. J Surg Res 1998;75:183–6.
15. Lawler EV, Bradbury BD, Fonda JR, et al. Transfusion burden among patients with chronic kidney disease and anemia. Clin J Am Soc Nephrol 2010;5:667–72.
16. Tinegate H, Pendry K, Murphy M, et al. Where do all the red blood cells (RBCs) go? Results of a survey of RBC use in England and North Wales in 2014. Transfusion 2016;56:139–45.
17. Meybohm P, Herrmann E, Steinbicker AU, et al. Patient blood management is associated with a substantial reduction of red blood cell utilization and safe for patient’s outcome: a prospective, multicenter cohort study with a noninferiority design. Ann Surg 2016;264:203–11.
18. Corwin HL, Theus JW, Cargile CS, Lang NP. Red blood cell transfusion: impact of an education program and a clinical guideline on transfusion practice. J Hosp Med 2014;9:745–9.
19. Society of Hospital Medicine. Anemia prevention and management program implementation toolkit. Accessed at www.hospitalmedicine.org/Web/Quality___Innovation/Implementation_Toolkit/Anemia/anemia_overview.aspx on 9 June 2017.
20. Meybohm P, Richards T, Isbister J, et al. Patient blood management bundles to facilitate implementation. Transfus Med Rev 2017;31:62–71.
21. Kaufman RM, Djulbegovic B, Gernsheimer T, et al. Platelet transfusion: a clinical practice guideline from the AABB. Ann Intern Med 2015;162:205–13.
22. Levy JH, Grottke O, Fries D, Kozek-Langenecker S. Therapeutic plasma transfusion in bleeding patients: A systematic review. Anesth Analg 2017;124:1268–76.
Management and Prevention of Intraoperative Acetabular Fracture in Primary Total Hip Arthroplasty
Take Home Points
- IAF is an uncommon but serious complication of primary THA.
- Small (<50 mm) cups are at higher risk for causing IAF.
- Prompt recognition is critical to prevent component migration and need for revision.
- Posterior column integrity is critical to a successful outcome when IAF occurs.
- Initial stable fixation, with or without intraoperative acetabular revision, is critical for successful outcome when IAF is identified.
Intraoperative acetabular fracture (IAF) is a rare complication of primary total hip arthroplasty (THA).1-3 IAFs commonly occur with impaction of the acetabular component. Studies have found that underreaming of the acetabulum and impaction of relatively large, elliptic, or monoblock components may increase the risk of IAFs.2-5 There is a paucity of literature on risk factors, treatment strategies, and outcomes of this potentially devastating complication.
In this article, we report on the incidence of IAF in primary THA at our high-volume institution and present strategies for managing and preventing this rare fracture.
Materials and Methods
Between 1997 and 2015, more than 20 fellowship-trained arthroplasty surgeons performed 21,519 primary THAs at our institution. After obtaining Institutional Review Board approval for this study, we retrospectively searched the hospital database and identified 16 patients (16 hips) who sustained an IAF in primary THA. Mean age of the cohort (13 women, 3 men) at time of surgery was 70 years (range, 42-89 years). Of the 16 patients, 13 had a preoperative diagnosis of osteoarthritis, 2 had posttraumatic arthritis, and 1 had rheumatoid arthritis. A posterolateral approach was used with 14 patients and a modified anterolateral approach with the other 2. Surgical technique and implant selection varied among surgeons. Thirteen THAs were performed with an all-press-fit technique and 3 with a hybrid technique (uncemented acetabular component, cemented femoral component). In 9 cases, the acetabular component underwent supplemental screw fixation. Whether to use acetabular component screws or cemented femoral components was decided intraoperatively by the surgeon.
The cohort’s acetabular components were either elliptic modular or hemispheric modular. The elliptic modular component used was the Peripheral Self-Locking (PSL) implant (Stryker Howmedica Osteonics), and the hemispheric modular components used were either the Trident implant (Stryker Howmedica Osteonics) or the ZTT-II implant (DePuy Synthes). Elliptic acetabular components have a peripheral flare, in contrast to true hemispheric acetabular components. Ten elliptic modular and 6 hemispheric modular components were implanted. In all cases, the difference between the final reamer used to prepare the acetabular bed and the true largest external diameter of the impacted shell was 2 mm or less.
The cohort’s 16 femoral components consisted of 8 Secur-Fit uncemented components (Stryker Howmedica Osteonics), 3 Accolade uncemented components (Stryker Howmedica Osteonics), 3 Omnifit EON cemented components (Stryker Howmedica Osteonics), and 2 S-ROM uncemented components (DePuy Synthes).
After surgery, all patients were followed up according to individual surgeon protocol for radiographic and physical examination.
Data on IAF incidence were obtained from a hospital database and were confirmed with electronic medical record (EMR) documentation. Also obtained were IAF causes and locations recorded in operative notes. For fractures identified after surgery, location was obtained from the immediate postoperative radiograph. Fracture management (eg, supplemental screw fixation, fracture reduction and fixation, bone grafting, acetabular component revision, protected weight-bearing) was determined from EMR documentation.
Results
Sixteen patients sustained an IAF in primary THA. All IAFs occurred in cases involving cementless acetabular components. The institution’s incidence of IAF with use of cementless components was approximately 0.07% (a proportion of 0.0007).
Of the 5 IAFs (31%) identified during surgery, 4 were noted during impaction of the acetabular component, and 1 was noted during reaming. Eighty percent of these IAFs occurred directly posterior, and 60% were addressed at time of index procedure secondary to acetabular component instability. The other 11 fractures (69%) were identified on standard postoperative anteroposterior pelvis radiographs obtained in the postanesthesia care unit (PACU). Details of component characteristics, fracture location, immediate treatment, and weight-bearing precautions for all 16 patients are listed in the Table.
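The proportions reported in this section can be checked with a short calculation. The sketch below uses only counts stated in the text; it assumes, for illustration, that the denominator for incidence is all 21,519 primary THAs, since the number of cementless cases is not given separately. Under that assumption, 16 fractures correspond to a proportion of about 0.0007, or roughly 0.07%.

```python
# Counts taken from the article text.
# Assumption: the incidence denominator is all 21,519 primary THAs,
# because the number of cementless cases is not reported separately.
TOTAL_PRIMARY_THA = 21_519
IAF_TOTAL = 16
IAF_INTRAOP = 5   # fractures identified during surgery
IAF_POSTOP = 11   # fractures identified on PACU radiographs


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


incidence_pct = percent(IAF_TOTAL, TOTAL_PRIMARY_THA)
intraop_pct = percent(IAF_INTRAOP, IAF_TOTAL)
postop_pct = percent(IAF_POSTOP, IAF_TOTAL)

print(f"Incidence: {incidence_pct:.3f}% "
      f"(proportion {IAF_TOTAL / TOTAL_PRIMARY_THA:.4f})")
print(f"Identified during surgery: {intraop_pct:.0f}%")
print(f"Identified after surgery: {postop_pct:.0f}%")
```

The same helper reproduces the 80% and 60% figures for the 5 intraoperatively identified fractures, which correspond to 4 of 5 and 3 of 5 cases, respectively.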
There were additional complications. One patient sustained an intraoperative proximal femur fracture, which was addressed at the index THA with application of a cerclage wire and reinsertion of the femoral component; no further surgical intervention was required, and the femur fracture healed uneventfully. Another patient had a postoperative ileus that required nasogastric tube decompression and monitoring in the intensive care unit; the ileus resolved spontaneously. A third patient, initially treated with bone grafting and cemented cup insertion, was diagnosed with a periprosthetic joint infection 3 weeks after the index THA and was treated with explantation of all components and Girdlestone resection arthroplasty; 1 month after the resection arthroplasty, a persistently draining wound was treated with irrigation and débridement. There were no other medical complications, thromboembolic events, or dislocations.
One to 7 weeks after surgery, patients returned for initial follow-up, and radiographs were obtained for component stability assessment. Three patients presented with gross acetabular instability, and revisions were performed. Standard clinical follow-up continued for all patients per individual surgeon protocol. Mean follow-up was 4 years.
Discussion
IAF is an uncommon complication of THA. The rarity of IAFs makes it difficult to obtain a cohort large enough to study the problem. Given the increasing incidence of primary THAs and the almost ubiquitous use of press-fit acetabular components, surgeons who perform THAs undoubtedly will encounter IAFs in their own practice. In this article, we report our institution’s experience with periprosthetic IAFs and provide a framework for making decisions regarding these complications.
Anatomical locations of IAFs have been associated with variable outcomes. In a 2015 series, Laflamme and colleagues6 found posterior column stability a crucial factor in implant stability. Fractures with posterior column instability had a 67% failure rate, and patients with an intact posterior column reliably had osteointegration occur without further intervention.6 In our series, fractures that violated the posterior column had similar results. All these fractures required further operative intervention, either at the index procedure or in the early postoperative period. Loss of posterior column stability prevents secure fixation of the acetabular component, thereby preventing successful hip reconstruction. One posterior column fracture in our series was not recognized until after surgery, on a PACU radiograph, and 1 posterior column fracture was fully appreciated only after postoperative computed tomography (CT) was obtained during immediate hospitalization after the index procedure. In both cases, conservative management was unsuccessful. Revision arthroplasty (and in 1 case late posterior column fixation) was performed to achieve adequate reconstruction. There were no failures after posterior column fixation. In cases of posterior wall or column fracture, we recommend early aggressive treatment, preferably at the time of index arthroplasty, to prevent catastrophic failure.
Most commonly, periprosthetic IAFs go unnoticed until initial postoperative radiographs are examined.6 Eleven of the 16 IAFs in our series were first recognized on radiographs obtained in the PACU. Surgeons thus have difficult decisions to make. The literature has little discussion on managing early postoperative periprosthetic IAFs. Most recent studies, which consist of small series and case reports, have focused on late and often traumatic IAFs.7-9 These were initially classified by Peterson and Lewallen10 as type I, which are stable radiographically (no movement relative to previous radiographs) and do not produce pain with minor movement of the extremity, or type II, which are unstable radiographically (gross displacement of component) or produce pain with any hip motion. Type I fractures were more common and were often managed with protected weight-bearing and observation. The authors concluded that, in type I fractures, retaining the original acetabular component is difficult; however, when these fractures are treated appropriately, a functional prosthesis can be salvaged, and fracture union can be expected.
Less common are acetabular fractures detected during surgery, as in our study. In an outcome series, Haidukewych and colleagues3 reported on 21 periprosthetic acetabular fractures, all recognized during surgery and managed according to perceived stability of the component. All fractures healed uneventfully, and there were no other complications.
These studies provide a framework for addressing IAFs noticed in the early postoperative period. The diagnostic dilemma presented by these fractures was first discussed by Laflamme and colleagues.6 Nine of the 32 fractures in their series were classified as so-called type III fractures, recognized only after the early postoperative period. Additional radiographs (eg, Judet views) or CT scans were crucial in determining acetabular component stability, given the known poor outcomes associated with posterior column fracture. In our series, only 1 patient had CT performed after intraoperative recognition of fracture, and the extent of the fracture was not readily apparent on the patient’s postoperative radiograph. Given the successful recognition and treatment of these fractures in the early postoperative period in our series, it is difficult to recommend advanced imaging for all periprosthetic IAFs. Perhaps this success is attributable to our almost universal use of screws for acetabular component fixation. Of the 11 patients with fractures recognized during the postoperative period, 8 had supplemental screw fixation at time of index surgery. If there is a question of fixation during component insertion, we recommend scrutinizing the acetabular rim for fracture and placing supplemental screw fixation. Screws placed for acetabular component fixation provide initial stability and may prevent early component failure in the setting of unrecognized medial or anterior fracture. In addition, when component stability is in question after impaction, we recommend using finger palpation to evaluate the sciatic notch for cortical step-off from an otherwise unrecognized fracture. Protected weight-bearing in the postoperative period may be left to the discretion of the surgeon, and the decision should be based on intraoperative stability of the acetabular component.
In our series, there was a disproportionate representation of fractures associated with elliptic acetabular components. All 5 of the fractures recognized during surgery and 5 of the 11 recognized after surgery occurred with elliptic components. The association between elliptic cup design and periprosthetic IAF was identified earlier, by Haidukewych and colleagues.3 Their series showed a statistically significant increase in fracture incidence with impaction of an elliptic cup into a bed prepared with a hemispheric reamer. In the present series, 75% of our acetabular components were impacted into a bed underreamed by 1 mm to 2 mm. It is typical of many surgeons at our institution to underream by 1 mm to 2 mm regardless of the type of component being implanted, though they show a growing trend to overream by only 1 mm with the PSL component, which has been both safe and reliable in preventing catastrophic posterior column fractures, especially with impaction of small (<50 mm) acetabular components. We have not observed early loosening or other evidence of failure with this technique. Cup impaction generates significant hoop stresses that can easily fracture sclerotic or otherwise poor-quality bone, and the dense bone around the acetabular rim experiences increased stress with impaction of elliptic components.2,11-15 Surgeons must understand the design traits of their components and be cognizant of the true difference between the diameter of the final reamer used and the real diameter of the acetabular component. We recommend having a difference of ≤1 mm to mitigate the risk of IAF occurring with cup insertion. With use of elliptic components, slight overreaming of the acetabular bed should be considered. More study is needed to better define these outcomes.
Study Limitations
Our study had several limitations, including the inherent biases of its retrospective design, small cohort size, and inclusion of multiple surgeons. Small cohort size is unavoidable given the low incidence of these injuries, and our study encompassed the experience of a high-volume hip arthroplasty service. There is the possibility that a subset of fractures may have persistently gone unrecognized, either during or after surgery, and the actual incidence of these complications may be higher. These outcomes represent our institutional experience addressing the complexities of these injuries. The lack of standardization in the management of these fractures in our series reflects the diagnostic dilemma they present, as well as the need for more study focused on their management and outcomes.
Conclusion
IAF, an uncommon complication of primary THA, most commonly occurs during component impaction. Acetabular component and surgical technique may influence the fracture rate. Intraoperative or prompt postoperative recognition of these fractures is crucial, as their location is associated with stability and outcome. Careful examination of postoperative radiographs, judicious use of advanced imaging, and close follow-up are needed to prevent early catastrophic failure. We argue against simply observing these unstable fractures and recommend early treatment with rigid fixation and, when necessary, acetabular component revision.
1. Sharkey PF, Hozack WJ, Callaghan JJ, et al. Acetabular fractures associated with cementless acetabular cup insertion: a report of 13 cases. J Arthroplasty. 1999;14(4):426-431.
2. Kim YS, Callaghan JJ, Ahn PB, Brown TD. Fracture of the acetabulum during insertion of an oversized hemispherical component. J Bone Joint Surg Am. 1995;77(1):111-117.
3. Haidukewych GJ, Jacofsky DJ, Hanssen AD, Lewallen DG. Intraoperative fractures of the acetabulum during primary total hip arthroplasty. J Bone Joint Surg Am. 2006;88(9):1952-1956.
4. Curtis MJ, Jinnah RH, Wilson VD, Hungerford DS. The initial stability of uncemented acetabular components. J Bone Joint Surg Br. 1992;74(3):372-376.
5. Lachiewicz PF, Suh PB, Gilbert JA. In vitro initial fixation of porous-coated acetabular total hip components. A biomechanical and comparative study. J Arthroplasty. 1989;4(3):201-205.
6. Laflamme GY, Belzile EL, Fernandes JC, Vendittoli PA, Hébert-Davies J. Periprosthetic fractures of the acetabulum during component insertion: posterior column stability is crucial. J Arthroplasty. 2015;30(2):265-269.
7. Desai G, Reis MD. Early postoperative acetabular discontinuity after total hip arthroplasty. J Arthroplasty. 2011;26(8):1570.e17-e19.
8. Gelalis ID, Politis AN, Arnaoutoglou CM, Georgakopoulos N, Mitsiou D, Xenakis TA. Traumatic periprosthetic acetabular fracture treated by acute one-stage revision arthroplasty. A case report and review of the literature. Injury. 2010;41(4):421-424.
9. Gras F, Marintschev I, Klos K, Fujak A, Mückley T, Hofmann GO. Navigated percutaneous screw fixation of a periprosthetic acetabular fracture. J Arthroplasty. 2010;25(7):1169.e1-e4.
10. Peterson CA, Lewallen DG. Periprosthetic fracture of the acetabulum after total hip arthroplasty. J Bone Joint Surg Am. 1996;78(8):1206-1213.
11. Hansen TM, Koenman JB, Headley AK. 3-D FEM analysis of interface fixation of acetabular implants. Trans Orthop Res Soc. 1992;17:400.
12. Yerby SA, Taylor JK, Murzic WJ. Acetabular component interface: press-fit fixation. Trans Orthop Res Soc. 1992;17:384.
13. Callaghan JJ. The clinical results and basic science of total hip arthroplasty with porous-coated prostheses. J Bone Joint Surg Am. 1993;75(2):299-310.
14. Cheng SL, Binnington AG, Bragdon CR, Jasty M, Harris WH, Davey JR. The effect of sizing mismatch on bone ingrowth into uncemented porous coated acetabular components: an in vivo canine study. Trans Orthop Res Soc. 1990;15:442.
15. Morscher E, Bereiter H, Lampert C. Cementless press-fit cup: principles, experimental data, and three-year follow-up study. Clin Orthop Relat Res. 1989;(249):12-20.
Less common are acetabular fractures detected during surgery, as in our study. In an outcome series, Haidukewych and colleagues3 reported on 21 periprosthetic acetabular fractures, all recognized during surgery and managed according to perceived stability of the component. All fractures healed uneventfully, and there were no other complications.
These studies provide a framework for addressing IAFs noticed in the early postoperative period. The diagnostic dilemma presented by these fractures was first discussed by Laflamme and colleagues.6 Nine of the 32 fractures in their series were classified as so-called type III fractures, recognized only after the early postoperative period. Additional radiographs (eg, Judet views) or CT scans were crucial in determining acetabular component stability, given the known poor outcomes associated with posterior column fracture. In our series, only 1 patient had CT performed after intraoperative recognition of fracture, and the extent of the fracture was not readily apparent on the patient’s postoperative radiograph. Given the successful recognition and treatment of these fractures in the early postoperative period in our series, it is difficult to recommend advanced imaging for all periprosthetic IAFs. Perhaps this success is attributable to our almost universal use of screws for acetabular component fixation. Of the 11 patients with fractures recognized during the postoperative period, 8 had supplemental screw fixation at time of index surgery. If there is a question of fixation during component insertion, we recommend scrutinizing the acetabular rim for fracture and placing supplemental screw fixation. Screws placed for acetabular component fixation provide initial stability and may prevent early component failure in the setting of unrecognized medial or anterior fracture. In addition, when component stability is in question after impaction, we recommend using finger palpation to evaluate the sciatic notch for cortical step-off from an otherwise unrecognized fracture. Protected weight-bearing in the postoperative period may be left to the discretion of the surgeon, and the decision should be based on intraoperative stability of the acetabular component.
In our series, there was a disproportionate representation of fractures associated with elliptic acetabular components. All 5 of the fractures recognized during surgery and 5 of the 11 recognized after surgery occurred with elliptic components. The association between elliptic cup design and periprosthetic IAF was identified earlier, by Haidukewych and colleagues.3 Their series showed a statistically significant increase in fracture incidence with impaction of an elliptic cup into a bed prepared with a hemispheric reamer. In the present series, 75% of our acetabular components were impacted into a bed underreamed by 1 mm to 2 mm. It is typical of many surgeons at our institution to underream by 1 mm to 2 mm regardless of the type of component being implanted, though they show a growing trend to overream by only 1 mm with the PSL component, which has been both safe and reliable in preventing catastrophic posterior column fractures, especially with impaction of small (<50 mm) acetabular components. We have not observed early loosening or other evidence of failure with this technique. Cup impaction generates significant hoop stresses that can easily fracture sclerotic or otherwise poor-quality bone, and the dense bone around the acetabular rim experiences increased stress with impaction of elliptic components.2,11-15 Surgeons must understand the design traits of their components and be cognizant of the true difference between the diameter of the final reamer used and the real diameter of the acetabular component. We recommend having a difference of ≤1 mm to mitigate the risk of IAF occurring with cup insertion. With use of elliptic components, slight overreaming of the acetabular bed should be considered. More study is needed to better define these outcomes.
Study Limitations
Our study had several limitations, including the inherent biases of its retrospective design, small cohort size, and inclusion of multiple surgeons. Small cohort size is unavoidable given the low incidence of these injuries, and our study encompassed the experience of a high-volume hip arthroplasty service. There is the possibility that a subset of fractures may have persistently gone unrecognized, either during or after surgery, and the actual incidence of these complications may be higher. These outcomes represent our institutional experience addressing the complexities of these injuries. The lack of standardization in the management of these fractures in our series reflects the diagnostic dilemma they present, as well as the need for more study focused on their management and outcomes.
Conclusion
IAF, an uncommon complication of primary THA, most commonly occurs during component impaction. Acetabular component and surgical technique may influence the fracture rate. Intraoperative or prompt postoperative recognition of these fractures is crucial, as their location is associated with stability and outcome. Careful examination of postoperative radiographs, judicious use of advanced imaging, and close follow-up are needed to prevent early catastrophic failure. We argue against simply observing these unstable fractures and recommend early treatment with rigid fixation and, when necessary, acetabular component revision.
Take Home Points
- IAF is an uncommon but serious complication of primary THA.
- Small (<50 mm) cups are at higher risk for causing IAF.
- Prompt recognition is critical to prevent component migration and need for revision.
- Posterior column integrity is critical to a successful outcome when IAF occurs.
- Initial stable fixation, with or without intraoperative acetabular revision, is critical for successful outcome when IAF is identified.
References
1. Sharkey PF, Hozack WJ, Callaghan JJ, et al. Acetabular fractures associated with cementless acetabular cup insertion: a report of 13 cases. J Arthroplasty. 1999;14(4):426-431.
2. Kim YS, Callaghan JJ, Ahn PB, Brown TD. Fracture of the acetabulum during insertion of an oversized hemispherical component. J Bone Joint Surg Am. 1995;77(1):111-117.
3. Haidukewych GJ, Jacofsky DJ, Hanssen AD, Lewallen DG. Intraoperative fractures of the acetabulum during primary total hip arthroplasty. J Bone Joint Surg Am. 2006;88(9):1952-1956.
4. Curtis MJ, Jinnah RH, Wilson VD, Hungerford DS. The initial stability of uncemented acetabular components. J Bone Joint Surg Br. 1992;74(3):372-376.
5. Lachiewicz PF, Suh PB, Gilbert JA. In vitro initial fixation of porous-coated acetabular total hip components. A biomechanical and comparative study. J Arthroplasty. 1989;4(3):201-205.
6. Laflamme GY, Belzile EL, Fernandes JC, Vendittoli PA, Hébert-Davies J. Periprosthetic fractures of the acetabulum during component insertion: posterior column stability is crucial. J Arthroplasty. 2015;30(2):265-269.
7. Desai G, Reis MD. Early postoperative acetabular discontinuity after total hip arthroplasty. J Arthroplasty. 2011;26(8):1570.e17-e19.
8. Gelalis ID, Politis AN, Arnaoutoglou CM, Georgakopoulos N, Mitsiou D, Xenakis TA. Traumatic periprosthetic acetabular fracture treated by acute one-stage revision arthroplasty. A case report and review of the literature. Injury. 2010;41(4):421-424.
9. Gras F, Marintschev I, Klos K, Fujak A, Mückley T, Hofmann GO. Navigated percutaneous screw fixation of a periprosthetic acetabular fracture. J Arthroplasty. 2010;25(7):1169.e1-e4.
10. Peterson CA, Lewallen DG. Periprosthetic fracture of the acetabulum after total hip arthroplasty. J Bone Joint Surg Am. 1996;78(8):1206-1213.
11. Hansen TM, Koenman JB, Headley AK. 3-D FEM analysis of interface fixation of acetabular implants. Trans Orthop Res Soc. 1992;17:400.
12. Yerby SA, Taylor JK, Murzic WJ. Acetabular component interface: press-fit fixation. Trans Orthop Res Soc. 1992;17:384.
13. Callaghan JJ. The clinical results and basic science of total hip arthroplasty with porous-coated prostheses. J Bone Joint Surg Am. 1993;75(2):299-310.
14. Cheng SL, Binnington AG, Bragdon CR, Jasty M, Harris WH, Davey JR. The effect of sizing mismatch on bone ingrowth into uncemented porous coated acetabular components: an in vivo canine study. Trans Orthop Res Soc. 1990;15:442.
15. Morscher E, Bereiter H, Lampert C. Cementless press-fit cup: principles, experimental data, and three-year follow-up study. Clin Orthop Relat Res. 1989;(249):12-20.
1. Sharkey PF, Hozack WJ, Callaghan JJ, et al. Acetabular fractures associated with cementless acetabular cup insertion: a report of 13 cases. J Arthroplasty.1999;14(4):426-431.
2. Kim YS, Callaghan JJ, Ahn PB, Brown TD. Fracture of the acetabulum during insertion of an oversized hemispherical component. J Bone Joint Surg Am. 1995;77(1):111-117.
3. Haidukewych GJ, Jacofsky DJ, Hanssen AD, Lewallen DG. Intraoperative fractures of the acetabulum during primary total hip arthroplasty. J Bone Joint Surg Am. 2006;88(9):1952-1956.
4. Curtis MJ, Jinnah RH, Wilson VD, Hungerford DS. The initial stability of uncemented acetabular components. J Bone Joint Surg Br. 1992;74(3):372-376.
5. Lachiewicz PF, Suh PB, Gilbert JA. In vitro initial fixation of porous-coated acetabular total hip components. A biomechanical and comparative study. J Arthroplasty. 1989;4(3):201-205.
6. Laflamme GY, Belzile EL, Fernandes JC, Vendittoli PA, Hébert-Davies J. Periprosthetic fractures of the acetabulum during component insertion: posterior column stability
is crucial. J Arthroplasty. 2015;30(2):265-269.
7. Desai G, Reis MD. Early postoperative acetabular discontinuity after total hip arthroplasty. J Arthroplasty. 2011;26(8):1570.e17-e19.
8. Gelalis ID, Politis AN, Arnaoutoglou CM, Georgakopoulos N, Mitsiou D, Xenakis TA. Traumatic periprosthetic acetabular fracture treated by acute one-stage revision arthroplasty. A case report and review of the literature. Injury. 2010;41(4):421-424.
9. Gras F, Marintschev I, Klos K, Fujak A, Mückley T, Hofmann GO. Navigated percutaneous screw fixation of a periprosthetic acetabular fracture. J Arthroplasty. 2010;25(7):1169.e1-e4.
10. Peterson CA, Lewallen DG. Periprosthetic fracture of the acetabulum after total hip arthroplasty. J Bone Joint Surg Am. 1996;78(8):1206-1213.
11. Hansen TM, Koenman JB, Headley AK. 3-D FEM analysis of interface fixation of acetabular implants. Trans Orthop Res Soc. 1992;17:400.
12. Yerby SA, Taylor JK, Murzic WJ. Acetabular component interface: press-fit fixation. Trans Orthop Res Soc. 1992;17:384.
13. Callaghan JJ. The clinical results and basic science of total hip arthroplasty with porous-coated prostheses. J Bone Joint Surg Am. 1993;75(2):299-310.
14. Cheng SL, Binnington AG, Bragdon CR, Jasty M, Harris WH, Davey JR. The effect of sizing mismatch on bone ingrowth into uncemented porous coated acetabular components: an in vivo canine study. Trans Orthop Res Soc. 1990;15:442.
15. Morscher E, Bereiter H, Lampert C, Cementless press-fit cup: principles, experimental data, and three-year follow-up study. Clin Orthop Relat Res. 1989;(249):12-20.
Cervical artery dissection related to chiropractic manipulation: One institution’s experience
ABSTRACT
Purpose The purpose of this study was to determine the frequency of patients seen at a single institution who were diagnosed with a cervical vessel dissection related to chiropractic neck manipulation.
Methods We identified cases through a retrospective chart review of patients seen between April 2008 and March 2012 who had a diagnosis of cervical artery dissection following a recent chiropractic manipulation. Relevant imaging studies were reviewed by a board-certified neuroradiologist to confirm the findings of a cervical artery dissection and stroke. We conducted telephone interviews to ascertain the presence of residual symptoms in the affected patients.
Results Of the 141 patients with cervical artery dissection, 12 had documented chiropractic neck manipulation prior to the onset of the symptoms that led to medical presentation. The 12 patients had a total of 16 cervical artery dissections. All 12 patients developed symptoms of acute stroke. All strokes were confirmed with magnetic resonance imaging or computerized tomography. We obtained follow-up information on 9 patients, 8 of whom had residual symptoms and one of whom died as a result of his injury.
Conclusions In this case series, 12 patients with newly diagnosed cervical artery dissection(s) had recent chiropractic neck manipulation. Patients who are considering chiropractic cervical manipulation should be informed of the potential risk and be advised to seek immediate medical attention should they develop symptoms.
A prospective randomized controlled study published in 2012 showed chiropractic manipulation is beneficial in the treatment of neck pain compared with medical treatment, but it showed no significant difference between chiropractic manipulation and physical therapy exercises.1 Although chiropractic manipulation of the cervical spine may be effective, it may also cause harm.
Cerebellar and spinal cord injuries related to cervical chiropractic manipulation were first reported in 1947.2 By 1974, there were 12 reported cases.3 Noninvasive imaging has since greatly improved the diagnosis of cervical artery dissection and of stroke,4 and cervical artery dissection is now recognized as a cause of strokes occurring in association with chiropractic manipulation.5
A prospective series published in 2011 reported that, over 4 years, 13 patients were treated at a single institution for cervical arterial dissection following chiropractic treatment.6 That so many patients might be seen for this condition in that time frame at a single institution suggests the risk for such injury may be greater than thought. To explore that possibility, we performed a 4-year retrospective review to determine the experience at OSF Saint Francis Medical Center, which is affiliated with the University of Illinois College of Medicine, Peoria.
METHODS
Data sources. After receiving approval by the local institutional review board, we obtained data from the electronic medical records of OSF Saint Francis Medical Center, Peoria, Ill., using Epic (Epic Systems Corporation, Verona, Wis.) and IDX (General Electric Corporation, Fairfield, Conn.) systems. The records were queried using ICD-9 codes 443.21 and 443.24 to identify patients from April 2008 through March 2012 who had primary or secondary diagnoses of vertebral artery dissection (VAD) or carotid artery dissection (CAD). We reviewed all records of VAD and CAD to identify those that may have been associated with chiropractic manipulation.
Data collection. We abstracted data from 12 patients’ charts. Two patients were unavailable for direct contact: one was involved in ongoing litigation, and one had died (although we were able to speak with his wife). We attempted telephone contact with the 10 remaining patients and reached 8.
Data included the symptoms leading to chiropractic manipulation, symptoms following manipulation, timing of onset of symptoms relative to chiropractic manipulation, identifying information for the treating chiropractor, and residual patient symptoms. We also recorded patients’ ages, sex, locations of dissection, and locations of stroke. All dissections and strokes had been diagnosed during the patient’s initial hospitalization.
A board-certified radiologist (JRD) with a Certificate of Added Qualification in Neuroradiology (American Board of Medical Specialties) reviewed all pertinent imaging to confirm all dissections and strokes.
RESULTS
The medical record query yielded 141 patients with VAD or CAD, 15 of whom had undergone chiropractic manipulation prior to their presentation. The temporal association between chiropractic manipulation and arterial dissection was equivocal for 3 patients. In 12 patients, there was a verifiable temporal association between chiropractic manipulation and the arterial dissection. Three of the 12 patients were men and 9 were women. Ages ranged from 22 to 46 years, with a mean of 35.3 years.
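As a quick arithmetic check, the counts above can be expressed as proportions. This is a sketch for illustration; the percentages are derived here from the reported counts and are not stated by the authors, and the variable names are ours.

```python
# Counts reported in the text.
total_dissections = 141      # patients with VAD or CAD
any_prior_manipulation = 15  # chiropractic manipulation before presentation
equivocal_timing = 3         # temporal association unclear

# Patients with a verifiable temporal association.
verified = any_prior_manipulation - equivocal_timing


def percent(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


print(verified)                                            # 12
print(percent(any_prior_manipulation, total_dissections))  # 10.6
print(percent(verified, total_dissections))                # 8.5
print(percent(9, verified))                                # 75.0 (women among the 12)
```

These are proportions of dissection cases seen at one institution, not estimates of the risk of dissection per manipulation, because the number of manipulations performed in the population is unknown.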
Acute or chronic neck pain was the most common reason for seeking chiropractic care (TABLE 1). Immediately upon performance of cervical manipulation, 10 of the 12 patients developed acute symptoms different from those that caused them to seek chiropractic care. Two patients developed symptoms 2 to 3 days post-manipulation. Neither of the 2 had a history of neck trauma within the preceding year. Ten of the 12 patients sought immediate medical attention. Two of the 12 patients sought care when their symptoms became more severe, from 2 days to several weeks later (TABLE 2). The treating chiropractor was identified in 7 cases and was different in each of the 7 cases.
A total of 16 cervical artery dissections, 14 VAD and 2 CAD, were confirmed by computed tomography angiography (CTA), magnetic resonance angiography (MRA), or catheter angiography (FIGURE 1). All 12 patients had acute strokes confirmed by MRA or CTA, including 9 in the cerebellum (FIGURE 2), 4 in the cerebrum, 2 in the medulla, and one in the pons.
Long-term outcomes were determined for 9 patients (TABLE 2). One patient’s symptoms resolved. Three patients had dizziness, clumsiness, or balance problems; 3 had persistent headaches; 2 had bilateral visual field abnormalities; and one patient walked with a cane, was no longer driving a car, and was on disability. One patient died as a result of his injury. One of the 12 cases was previously described in a case report.7
DISCUSSION
Dissection of the cervical arteries is more common than dissection in other arteries of comparable size. This increased risk in the cervical arteries is believed to be due to their relative mobility and proximity to bony structures.4
Sudden neck movement, a feature of chiropractic treatment, is one of several known risk factors for ‘spontaneous’ cervical artery dissection.8,9 Symptom onset and stroke may be delayed after a spontaneous cervical artery dissection.10 Spontaneous dissection more commonly involves the carotid arteries;4 however, the vertebral arteries appear more prone to dissection as a consequence of chiropractic manipulation,11 likely due to their relation to the cervical spine.
The vertebral artery runs through foramina in the transverse processes of vertebral bodies C1 through C6 (FIGURE 3). On exiting the C2 transverse process, the vertebral artery has a tortuous course, making several turns over and through adjacent bony structures.12 The artery is most prone to injury between the entrance to the transverse foramen of C6 and the foramen magnum (V2 and V3 segments).13 (The area of highest vulnerability is the tortuous segment from the transverse foramen of C2 to the foramen magnum.)
Sudden movements of the cervical spine may cause arterial dissection, whether the maneuvers are performed by a physician, a chiropractor, or a physical therapist.14 Injuries reported in the literature, however, most commonly follow chiropractic manipulation. In our series of 141 dissections, we found no cases associated with manipulation by other health professionals.
A 2003 study revealed cervical spine manipulation to be an independent and strong risk factor for vertebral artery dissection. The authors believed the relationship was likely causal.5 Data from the Canadian Stroke Consortium showed a 28% incidence of chiropractic manipulation in cases of cervical artery dissection.10
A 2008 study showed an association between vertebrobasilar stroke and chiropractic visits within one month of the vascular event.15 However, the study also showed an association of similar magnitude between vertebrobasilar stroke and visits to primary care physicians within the prior month. This suggests that cervical manipulation by chiropractors poses no more risk for cervical artery dissection than visits to primary care physicians. However, it is hard to reconcile such a conclusion with other studies, including our own, in which 10 patients developed new symptoms immediately with chiropractic manipulation of their cervical spines.
Perhaps the one-month observation period of Cassidy et al was excessive. Many post-manipulation events occur within hours or at most a few days, as would be expected given the hypothesized pathogenic mechanism. Had they shortened their interval of study to the preceding 3 days, their findings might have been different.
A recent systematic review and meta-analysis demonstrated a slight association between chiropractic neck manipulation and cervical artery dissection. It stated that the quality of the published literature was very low, and it concluded there was no convincing evidence of causation.16 The fact that 10 of the 12 patients in our case series demonstrated acute symptoms immediately upon receiving spinal manipulation suggests a possible causal link; however, we agree with the authors of the meta-analysis that the quality of the literature is low.
A recent statement from the American Heart Association/American Stroke Association (and endorsed by the American Association of Neurological Surgeons and the Congress of Neurological Surgeons) has recommended that chiropractors inform patients of the statistical association between cervical artery dissection and cervical manipulation.17 In addition, it is important for chiropractors to be aware of the signs and symptoms of cervical artery dissection and stroke and to assess for these symptoms before performing neck manipulation, as illustrated in a recent case report.18 Due to the risk of death, patients who experience symptoms consistent with cervical artery dissection after chiropractic manipulation of the cervical spine should be advised to seek medical care immediately.
Our case series has several limitations. The study was retrospective. Existing documentation of associated chiropractic care was often sparse, necessitating phone calls to supplement the information. We believe it is possible that cases were missed because of inaccurate medical record documentation, deficits in the interview process concerning chiropractic care at the time of hospitalization, or failure to record information concerning chiropractic care in the chart.
A significant portion of our information came through phone contact with several of the patients. In some cases, we relied heavily on their recollection of events that had occurred anytime from a few days to a few years earlier. The accuracy and completeness of the information supplied by patients were not verified, allowing for potential recall bias.
We do not know whether our experience is consistent with that of other areas of the United States. However, the fact that a similar-size hospital in Phoenix reported similar findings suggests the experience may be more widespread.6
IMPLICATIONS OF OUR FINDINGS
Over a 4-year period at our institution, 12 patients experienced cervical vessel dissection related to chiropractic neck manipulation. A similar institution in another part of the country had previously described 13 such cases. The patients at both institutions were relatively young and incurred substantial residual morbidity. A single patient at each institution died. If these findings are representative of other institutions across the United States, the incidence of stroke secondary to chiropractic manipulation may be higher than supposed.
To assess this problem further, a randomized prospective cohort study could establish the relative risk of chiropractic manipulation of the cervical spine resulting in a cervical artery dissection. But such a study may be methodologically prohibitive. More feasible would be a case-control study similar to one carried out by Smith et al5 in which patients who had experienced cervical artery dissection were matched with subjects who had not incurred such injuries. Comparing the groups’ odds of having received chiropractic manipulation demonstrated that spinal manipulative therapy is an independent risk factor for vertebral artery dissection and is highly suggestive of a causal association. Replicating this study in a different population would be valuable.
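For orientation, the odds comparison in a case-control design can be written with a standard 2×2 table. The symbols below are ours and carry no data from any study: $a$ and $c$ are cases (patients with dissection) with and without prior manipulation, and $b$ and $d$ are controls with and without prior manipulation. This is the basic unadjusted form only; a published analysis may use matching or multivariable adjustment instead.

```latex
\[
\mathrm{OR}
  = \frac{\text{odds of exposure among cases}}{\text{odds of exposure among controls}}
  = \frac{a/c}{b/d}
  = \frac{ad}{bc}
\]
```

An odds ratio above 1 indicates that prior manipulation was more common among cases than among controls. It describes an association and does not by itself establish causation.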
Based on our findings, all patients who visit chiropractors for cervical spine manipulation should be informed of the potential risks and of the need to seek immediate medical assistance should symptoms suggestive of dissection or stroke occur during or after manipulation. Until the actual level of risk from chiropractic manipulation is known, patients with neck pain may be better served by equally effective passive physical therapy exercises.1
CORRESPONDENCE
Raymond E. Bertino, MD, 427 West Crestwood Drive, Peoria, IL 61614; [email protected].
ACKNOWLEDGEMENTS
We thank Deepak Nair, MD, for his assistance in reviewing the stroke neurology aspects of this study; Katie Groesch, MD, for her assistance in drafting portions of the Methods and Results sections; Rita Hermacinski for the generation of 3D images; and Stephanie Arthalony for her assistance in gathering information through patient telephone interviews.
1.
2. Pratt-Thomas HR, Knute EB. Cerebellar and spinal injuries after chiropractic manipulation. JAMA. 1947;133:600-603.
3. Miller RG, Burton R. Stroke following chiropractic manipulation of the spine. JAMA. 1974;229:189-190.
4. Schievink WI. Spontaneous dissection of the carotid and vertebral arteries. N Engl J Med. 2001;344:898-906.
5. Smith WS, Johnston SC, Skalabrin EJ, et al. Spinal manipulative therapy is an independent risk factor for vertebral artery dissection. Neurology. 2003;60:1424-1428.
6. Albuquerque FC, Hu YC, Dashti SR, et al. Craniocervical arterial dissections as sequelae of chiropractic manipulation: patterns of injury and management. J Neurosurg. 2011;115:1197-1205.
7. Bertino RE, Talkad AV, DeSanto JR, et al. Chiropractic manipulation of the neck and cervical artery dissection. Ann Intern Med. 2012;157:150-152.
8. Dittrich R, Rohsbach D, Heidbreder A, et al. Mild mechanical traumas are possible risk factors for cervical artery dissection. Cerebrovasc Dis. 2006;23:275-281.
9. Debette S, Leys D. Cervical-artery dissections: predisposing factors, diagnosis, and outcome. Lancet Neurol. 2009;8:668-678.
10. Norris JW, Beletsky V, Nadareishvili ZG. Sudden neck movement and cervical artery dissection. The Canadian Stroke Consortium. CMAJ. 2000;163:38-40.
11. Stevinson C, Ernst E. Risks associated with spinal manipulation. Am J Med. 2002;112:566-571.
12. Doshi AH, Aggarwal A, Patel AB. Normal vascular anatomy. In Naidich TP, Castillo M, Cha S, Smirniotopoulos JG, eds. Imaging of the Brain. Philadelphia, Pa: Saunders;
13. Arnold M, Bousser MG, Fahrni G, et al. Vertebral artery dissection: presenting findings and predictors of outcome. Stroke. 2006;37:2499-2503.
14. Reuter U, Hämling M, Kavuk I, et al. Vertebral artery dissections after chiropractic neck manipulation in Germany over three years. J Neurol. 2006;253:724-730.
15. Cassidy JD, Boyle E, Cote P, et al. Risk of vertebrobasilar stroke and chiropractic care: results of a population-based case-control and case-crossover study. Spine. 2008;17(Suppl 1):S176-S183.
16. Church EW, Sieg EP, Zalatima O, et al. Systematic review and meta-analysis of chiropractic care and cervical artery dissection: No evidence for causation. Cureus. 2016;8:e498.
17. Biller J, Sacco RL, Albuquerque FC, et al. Cervical arterial dissections and association with cervical manipulative therapy: a statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2014;45:3155-3174.
18. Tarola G, Phillips RB. Chiropractic response to a spontaneous vertebral artery dissection. J Chiropr Med. 2015;14:183-190.
ABSTRACT
Purpose The purpose of this study was to determine the frequency of patients seen at a single institution who were diagnosed with a cervical vessel dissection related to chiropractic neck manipulation.
Methods We identified cases through a retrospective chart review of patients seen between April 2008 and March 2012 who had a diagnosis of cervical artery dissection following a recent chiropractic manipulation. Relevant imaging studies were reviewed by a board-certified neuroradiologist to confirm the findings of a cervical artery dissection and stroke. We conducted telephone interviews to ascertain the presence of residual symptoms in the affected patients.
Results Of the 141 patients with cervical artery dissection, 12 had documented chiropractic neck manipulation prior to the onset of the symptoms that led to medical presentation. The 12 patients had a total of 16 cervical artery dissections. All 12 patients developed symptoms of acute stroke. All strokes were confirmed with magnetic resonance imaging or computerized tomography. We obtained follow-up information on 9 patients, 8 of whom had residual symptoms and one of whom died as a result of his injury.
Conclusions In this case series, 12 patients with newly diagnosed cervical artery dissection(s) had recent chiropractic neck manipulation. Patients who are considering chiropractic cervical manipulation should be informed of the potential risk and be advised to seek immediate medical attention should they develop symptoms.
A prospective randomized controlled study published in 2012 showed chiropractic manipulation is beneficial in the treatment of neck pain compared with medical treatment, but it showed no significant difference between chiropractic manipulation and physical therapy exercises.1 Although chiropractic manipulation of the cervical spine may be effective, it may also cause harm.
Cerebellar and spinal cord injuries related to cervical chiropractic manipulation were first reported in 1947.2 By 1974, there were 12 reported cases.3 Noninvasive imaging has since greatly improved the diagnosis of cervical artery dissection and of stroke,4 and cervical artery dissection is now recognized as a cause of strokes occurring in association with chiropractic manipulation.5
A 2003 study revealed cervical spine manipulation to be an independent and strong risk factor for vertebral artery dissection. The authors believed the relationship was likely causal.5 Data from the Canadian Stroke Consortium showed a 28% incidence of chiropractic manipulation in cases of cervical artery dissection.10
A 2008 study showed an association between vertebrobasilar stroke and chiropractic visits within one month of the vascular event.15 However, the study also showed an association of similar magnitude between vertebrobasilar stroke and visits to primary care physicians within the prior month. This suggests that cervical manipulation by chiropractors poses no more risk for cervical artery dissection than visits to primary care physicians. However, it is hard to reconcile such a conclusion with other studies, including our own, in which 10 patients developed new symptoms immediately with chiropractic manipulation of their cervical spines.
Perhaps the one-month observation period of Cassidy et al was excessive. Many post-manipulation events occur within hours or at most a few days, as would be expected given the hypothesized pathogenic mechanism. Had they shortened their interval of study to the preceding 3 days, their findings might have been different.
A recent systematic review and meta-analysis demonstrated a slight association between chiropractic neck manipulation and cervical artery dissection. It stated that the quality of the published literature was very low, and it concluded there was no convincing evidence of causation.16 The fact that 10 of the 12 patients in our case series demonstrated acute symptoms immediately upon receiving spinal manipulation suggests a possible causal link; however, we agree with the authors of the meta-analysis that the quality of the literature is low.
A recent statement from the American Heart Association/American Stroke Association (and endorsed by the American Association of Neurological Surgeons and the Congress of Neurological Surgeons) has recommended that chiropractors inform patients of the statistical association between cervical artery dissection and cervical manipulation.17 In addition, it is important for chiropractors to be aware of the signs and symptoms of cervical artery dissection and stroke and to assess for these symptoms before performing neck manipulation, as illustrated in a recent case report.18 Due to the risk of death, patients who experience symptoms consistent with cervical artery dissection after chiropractic manipulation of the cervical spine should be advised to seek medical care immediately.
Our case series has several limitations. The study was retrospective. Existing documentation of associated chiropractic care was often sparse, necessitating phone calls to supplement the information. We believe it is possible that cases may have been missed because of inaccurate medical record documentation, deficits in the interview process concerning chiropractic care at the time of hospitalization, or because information concerning chiropractic care was not recorded in the chart.
A significant portion of our information came through phone contact with several of the patients. In some cases, we relied heavily on their recollection of events that had occurred anytime from a few days to a few years earlier. The accuracy and completeness of the information supplied by patients was not verified, allowing for potential recall bias.
We do not know whether our experience is consistent with that of other areas of the United States. However, the fact that a similar-size hospital in Phoenix reported similar findings suggests the experience may be more widespread.6
IMPLICATIONS OF OUR FINDINGS
Over a 4-year period at our institution, 12 patients experienced cervical vessel dissection related to chiropractic neck manipulation. A similar institution in another part of the country had previously described 13 such cases. The patients at both institutions were relatively young and incurred substantial residual morbidity. A single patient at each institution died. If these findings are representative of other institutions across the United States, the incidence of stroke secondary to chiropractic manipulation may be higher than supposed.
To assess this problem further, a randomized prospective cohort study could establish the relative risk of chiropractic manipulation of the cervical spine resulting in a cervical artery dissection. But such a study may be methodologically prohibitive. More feasible would be a case-control study similar to one carried out by Smith et al5 in which patients who had experienced cervical artery dissection were matched with subjects who had not incurred such injuries. Comparing the groups’ odds of having received chiropractic manipulation demonstrated that spinal manipulative therapy is an independent risk factor for vertebral artery dissection and is highly suggestive of a causal association. Replicating this study in a different population would be valuable.
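The comparison of odds described above can be sketched in a few lines. This is a minimal illustration only: the counts are hypothetical and are not data from Smith et al or from this series, the function name `odds_ratio` is our own, and the calculation is the simple unmatched odds ratio with a log-based (Woolf) 95% confidence interval. A matched case-control design would more properly be analyzed with a matched method such as conditional logistic regression.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Unmatched odds ratio with a 95% Woolf (log-based) confidence interval.

    "Exposed" here means having received cervical manipulation.
    All four cell counts must be greater than zero.
    """
    estimate = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the square root of the summed reciprocals.
    se_log = math.sqrt(
        1 / exposed_cases
        + 1 / unexposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_controls
    )
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# HYPOTHETICAL 2x2 table, for illustration only:
# 10 of 50 dissection cases and 5 of 100 controls had prior manipulation.
estimate, lower, upper = odds_ratio(10, 40, 5, 95)
print(f"Odds ratio: {estimate:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

With these made-up counts the odds ratio is (10 x 95) / (40 x 5) = 4.75. An interval that excludes 1 would indicate an association, but it would not by itself establish causation.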
Based on our findings, all patients who visit chiropractors for cervical spine manipulation should be informed of the potential risks and of the need to seek immediate medical assistance should symptoms suggestive of dissection or stroke occur during or after manipulation. Until the actual level of risk from chiropractic manipulation is known, patients with neck pain may be better served by equally effective passive physical therapy exercises.1
CORRESPONDENCE
Raymond E. Bertino, MD, 427 West Crestwood Drive, Peoria, IL 61614.
ACKNOWLEDGEMENTS
We thank Deepak Nair, MD, for his assistance in reviewing the stroke neurology aspects of this study; Katie Groesch, MD, for her assistance in drafting portions of the Methods and Results sections; Rita Hermacinski for the generation of 3D images; and Stephanie Arthalony for her assistance in gathering information through patient telephone interviews.
1.
2. Pratt-Thomas HR, Knute EB. Cerebellar and spinal injuries after chiropractic manipulation. JAMA. 1947;133:600-603.
3. Miller RG, Burton R. Stroke following chiropractic manipulation of the spine. JAMA. 1974;229:189-190.
4. Schievink WI. Spontaneous dissection of the carotid and vertebral arteries. N Eng J Med. 2001;344:898-906.
5. Smith WS, Johnston SC, Skalabrin EJ, et al. Spinal manipulative therapy is an independent risk factor for vertebral artery dissection. Neurology. 2003;60:1424-1428.
6. Albuquerque FC, Hu YC, Dashti SR, et al. Craniocervical arterial dissections as sequelae of chiropractic manipulation: patterns of injury and management. J Neurosurg. 2011;115:1197-1205.
7. Bertino RE, Talkad AV, DeSanto JR, et al. Chiropractic manipulation of the neck and cervical artery dissection. Ann Intern Med. 2012;157:150-152.
8. Dittrich R, Rohsbach D, Heidbreder A, et al. Mild mechanical traumas are possible risk factors for cervical artery dissection. Cerebrovasc Dis. 2006;23:275-281.
9. Debette S, Leys D. Cervical-artery dissections: predisposing factors, diagnosis, and outcome. Lancet Neurol. 2009;8:668-678.
10. Norris JW, Beletsky V, Nadareishvili ZG. Sudden neck movement and cervical artery dissection. The Canadian Stroke Consortium. CMAJ. 2000;163:38-40.
11. Stevinson C, Ernst E. Risks associated with spinal manipulation. Am J Med. 2002;112:566–571.
12. Doshi AH, Aggarwal A, Patel AB. Normal vascular anatomy. In Naidich TP, Castillo M, Cha S, Smirniotopoulos JG, eds. Imaging of the Brain. Philadelphia, Pa: Saunders;
13. Arnold M, Bousser MG, Fahrni G, et al. Vertebral artery dissection: presenting findings and predictors of outcome. Stroke. 2006;37:2499-2503.
14. Reuter U, Hämling M, Kavuk I, et al. Vertebral artery dissections after chiropractic neck manipulation in Germany over three years. J Neurol. 2006;253:724-730.
15. Cassidy JD, Boyle E, Cote P, et al. Risk of vertebrobasilar stroke and chiropractic care: results of a population-based case-control and case-crossover study. Spine. 2008;17(Suppl 1):S176-S183.
16. Church EW, Sieg EP, Zalatima O, et al. Systematic review and meta-analysis of chiropractic care and cervical artery dissection: No evidence for causation. Cureus. 2016;8:e498.
17. Biller J, Sacco RL, Albuquerque FC, et al. Cervical arterial dissections and association with cervical manipulative therapy: a statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2014;45:3155-3174.
18. Tarola G, Phillips RB. Chiropractic response to a spontaneous vertebral artery dissection. J Chiropr Med. 2015;14:183-190.
1.
2. Pratt-Thomas HR, Knute EB. Cerebellar and spinal injuries after chiropractic manipulation. JAMA. 1947;133:600-603.
3. Miller RG, Burton R. Stroke following chiropractic manipulation of the spine. JAMA. 1974;229:189-190.
4. Schievink WI. Spontaneous dissection of the carotid and vertebral arteries. N Eng J Med. 2001;344:898-906.
5. Smith WS, Johnston SC, Skalabrin EJ, et al. Spinal manipulative therapy is an independent risk factor for vertebral artery dissection. Neurology. 2003;60:1424-1428.
6. Albuquerque FC, Hu YC, Dashti SR, et al. Craniocervical arterial dissections as sequelae of chiropractic manipulation: patterns of injury and management. J Neurosurg. 2011;115:1197-1205.
7. Bertino RE, Talkad AV, DeSanto JR, et al. Chiropractic manipulation of the neck and cervical artery dissection. Ann Intern Med. 2012;157:150-152.
8. Dittrich R, Rohsbach D, Heidbreder A, et al. Mild mechanical traumas are possible risk factors for cervical artery dissection. Cerebrovasc Dis. 2006;23:275-281.
9. Debette S, Leys D. Cervical-artery dissections: predisposing factors, diagnosis, and outcome. Lancet Neurol. 2009;8:668-678.
10. Norris JW, Beletsky V, Nadareishvili ZG. Sudden neck movement and cervical artery dissection. The Canadian Stroke Consortium. CMAJ. 2000;163:38-40.
11. Stevinson C, Ernst E. Risks associated with spinal manipulation. Am J Med. 2002;112:566–571.
12. Doshi AH, Aggarwal A, Patel AB. Normal vascular anatomy. In Naidich TP, Castillo M, Cha S, Smirniotopoulos JG, eds. Imaging of the Brain. Philadelphia, Pa: Saunders;
13. Arnold M, Bousser MG, Fahrni G, et al. Vertebral artery dissection: presenting findings and predictors of outcome. Stroke. 2006;37:2499-2503.
14. Reuter U, Hämling M, Kavuk I, et al. Vertebral artery dissections after chiropractic neck manipulation in Germany over three years. J Neurol. 2006;253:724-730.
15. Cassidy JD, Boyle E, Cote P, et al. Risk of vertebrobasilar stroke and chiropractic care: results of a population-based case-control and case-crossover study. Spine. 2008;17(Suppl 1):S176-S183.
16. Church EW, Sieg EP, Zalatima O, et al. Systematic review and meta-analysis of chiropractic care and cervical artery dissection: No evidence for causation. Cureus. 2016;8:e498.
17. Biller J, Sacco RL, Albuquerque FC, et al. Cervical arterial dissections and association with cervical manipulative therapy: a statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2014;45:3155-3174.
18. Tarola G, Phillips RB. Chiropractic response to a spontaneous vertebral artery dissection. J Chiropr Med. 2015;14:183-190.
Percutaneous Release of Trigger Digits
Take-Home Points
- The author had a 90% success rate with no complications in treating almost 600 trigger digits.
- All digits can be safely treated, including multiple fingers on one hand, all in an office setting.
- Percutaneous trigger release appears to be a safe and reliable alternative to open surgery.
- Success rate, discomfort, and cost may make a percutaneous trigger release preferable to even a trial of corticosteroid injection.
- A failed percutaneous release can be successfully treated with an open release, if needed.
Trigger finger, or stenosing flexor tenosynovitis, is a condition characterized by clicking or locking during finger movement, sometimes resulting in the freezing of a digit in flexion or extension1 (Figure 1). Tendon inflammation is thought to cause constriction of the tendon sheath and bunching of the fibrous bundles of the first annular (A1) pulley, often creating a palpable nodule at the base of the digit.2,3 Many patients experience intermittent joint pain and swelling, which may progress to triggering or complete locking of the digit.1 One of the most common conditions treated by hand surgeons, trigger finger is most often reported in the dominant hand of women in their sixth decade of life and has been associated with several conditions, including diabetes and rheumatoid arthritis.4-6 Other researchers have indicated the thumb and ring finger are most commonly affected, though all fingers can potentially trigger.7,8
Initial treatment often involves injecting corticosteroid into the flexor tendon sheath, at or proximal to the annular pulley system, to reduce inflammation and the fibrous nodule.3 Another injection study found an initial success rate of 57% with a single injection, and 86% with a second injection, but patients were monitored for only 6 months, a period that may have been too short for symptom recurrence.7
On failure of steroid injections, patients typically are treated with open tendon sheath incision.9 This procedure, usually performed in a hospital or outpatient surgery setting, requires postoperative wound care, including dressing changes, suture removal, possible hand therapy, and follow-up physician visits. Operative treatment involves making a 1-cm to 2-cm incision, releasing the A1 pulley, and skin suturing.7,8,10 The most common postoperative complaint is incisional tenderness, though long-term scar pain, infection, nerve injury, and disease recurrence have been reported.8 Overall, the procedure is very successful, providing up to 100% symptom relief.7,8,10
Endoscopic release of trigger finger has also been described as an effective operative treatment. This technique involves passing a small cannula through a palmar incision and using an endoscope and retrograde knife within the resulting 2.7-mm tunnel.10 With this treatment, reduced visibility may increase the risk of nerve injury.10 Although generally successful, endoscopic release requires anesthesia and expensive instruments and has a significant learning curve.8,10
More recently, percutaneous release of trigger finger has been described as a definitive, in-office treatment.5,6,11,12 Percutaneous release has the obvious advantages of no open incision, less scarring, less discomfort, and shorter recovery. Several studies have found comparable success rates for open and percutaneous procedures but consistently shorter recovery with the percutaneous technique.7,8,12 Given its lower recurrence rate (vs steroid injections) and shorter recovery and lower cost (vs a surgical procedure), percutaneous treatment of stenosing tenosynovitis appears to be a safe, highly successful, and minimally invasive treatment method.8 This study represents a single surgeon’s experience with percutaneous tendon sheath incision over a 10-year period.
Methods
Patients presented with symptoms of stenosing flexor tenosynovitis with severity ranging from intermittent triggering to frank locking of the digit. Most patients underwent prior conservative treatment, including corticosteroid injections and hand therapy. With each patient, the senior author discussed the pathophysiology of trigger digit; treatment options, including observation, hand therapy, corticosteroid injection, percutaneous release, and open release; and potential risks and complications. The treatment path—initial corticosteroid injection, percutaneous release, or open release—was left up to the patient. The only exclusion criterion was prior surgery to the involved digit, and there was no discrimination by finger, symptomatic period, or severity. Each released digit was recorded independently. In no case was anticoagulant therapy discontinued.
A complete medical history was obtained for each patient.
Over a 10-year period (March 2003-December 2013), percutaneous release was performed on 596 trigger fingers in 429 patients, 18 years old or older. Of these patients, 279 were female. Mean age was 62 years (range, 26-97 years). Of the 531 releases with handedness recorded, 56.3% were performed on trigger digits on dominant hands (Table 1). Mean duration of symptoms before percutaneous release was 9.7 months (range, 0.5-132 months). Of the 596 digits, 69 were reported to have previously sustained trauma, and 161 had been unsuccessfully treated with one or more cortisone injections before undergoing release. Of the suspected comorbidities examined, carpal tunnel syndrome was previously diagnosed in 79 patients and diabetes in 56 patients.1
Of the 429 patients, 313 had a single digit released and 116 had multiple digits released. Of the 116 patients in the multiple-release group, 80 had 2 fingers released, 24 had 3 released, 7 had 4 released, and 5 had 5 released. The 596 released trigger fingers consisted of 188 thumbs, 41 index fingers, 185 middle fingers, 140 ring fingers, and 42 small fingers.
Surgical Technique
In-office percutaneous trigger finger releases were performed with a local anesthetic. One milliliter of lidocaine 1% injection was used to anesthetize the skin, the subcutaneous tissues, and the flexor tendon sheath at the level of the A1 pulley. As described by Pandey and colleagues,6 the proper location of the pulley was confirmed using specific surface landmarks on each digit. After waiting several minutes to allow the anesthetic to take effect, the surgeon inserted an 18-gauge needle into the center of the pulley with the digit held in extension (Figure 2). The needle was carefully moved longitudinally along the length of the pulley with the bevel of the needle parallel to the tendon. A grating sensation was felt as the fibers of the pulley were cut. Several needle passes were made until the pulley was felt to have been released. Complete release was determined by loss of the grating sensation, along with complete relief of any further symptoms of triggering. The puncture site was cleaned and covered with a light sterile dressing (watch the Video online). There was no postoperative immobilization, and patients were encouraged to immediately return to normal use of the digit. Hand therapy was not prescribed, and pain medications were not dispensed.
A 1-week follow-up appointment was scheduled, and patients were advised to return for evaluation in the event of any recurring symptoms (eg, triggering, swelling, stiffness, pain).
Results
Of the 596 digits, 537 (90.1%) were successfully released with 1 percutaneous procedure (recurrence or failure rate, 9.9%). The thumb was the digit most reliably released (success rate, 94.7%) (Table 2). Patients with recurrent or unresolved symptoms were given the options of a second percutaneous release or an open surgical procedure. Of the 59 digits unsuccessfully released, as identified by persistent triggering or locking of the digit, 17 were treated with a second percutaneous release (15 were successful), and 40 underwent open tendon sheath incision as a second procedure (success rate, 100%); triggering persisted in the remaining 2 digits, and these were considered failures (the 2 patients did not pursue further treatment).
There were no complications: infection; nerve, artery, or tendon injury; or chronic pain. Some patients had mild stiffness, swelling, or pain for a few days after the procedure, and these effects typically resolved without treatment. In 29 digits, persistent pain or swelling without triggering was successfully treated with a corticosteroid injection.
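The headline rates in this section follow directly from the reported counts, and the arithmetic can be checked with a short script. This is only an illustrative sketch: the counts are those given in the text (596 digits treated, 59 unsuccessful first releases, and the 17, 40, and 2 digits in each follow-up group), while the variable names are our own.

```python
# Counts as reported in the text; variable names are illustrative.
total_digits = 596
failed_first_release = 59

# Follow-up of the digits that failed the first percutaneous release.
second_percutaneous = 17   # of which 15 were successful
open_release = 40          # all successful
no_further_treatment = 2   # persistent triggering, counted as failures

# The three follow-up groups should account for every failed digit.
assert second_percutaneous + open_release + no_further_treatment == failed_first_release

successful_first_release = total_digits - failed_first_release
success_rate = 100 * successful_first_release / total_digits
failure_rate = 100 * failed_first_release / total_digits

print(f"First-attempt successes: {successful_first_release} of {total_digits} ({success_rate:.1f}%)")
print(f"Recurrence or failure rate: {failure_rate:.1f}%")
```

The script reproduces the stated 9.9% recurrence or failure rate and the approximately 90% first-attempt success rate.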
Discussion
Over a 10-year period, 596 percutaneous trigger finger releases were sequentially performed by a single surgeon. The 90% success rate compares favorably with rates found in other studies (Table 3).5-9,12-14 The surgeon’s success rates for individual years vary and demonstrate no clear trend or learning curve with the procedure (Figure 3). There were no significant complications. Patient satisfaction with the procedure was high.
There were no injuries to digital nerves, arteries, or flexor tendons, either early or late, and no reports of infections or long-term pain or loss of motion. Although it is quite probable that in some procedures the longitudinal passes of the 18-gauge needle may have also slightly cut into the flexor tendon after passing through the A1 pulley, the direction of the needle passes was in line with the direction of the collagen fibers of the tendon, and thus any inadvertent superficial abrasion would not have structurally weakened the tendon. Of the 40 digits that underwent open release after incomplete or failed percutaneous release, none showed significant longitudinal lacerations of the superficialis tendon. During these revision surgeries, the typical intraoperative finding was incomplete release of the A1 pulley, usually at the distal end. Although loss of the grating sensation or relief of further triggering symptoms was considered adequate evidence of a successful release in this study, small tendon attachments could remain and potentially could lead to recurrent triggering. Given the high success rate achieved with the large sample, however, these 2 factors are considered appropriate indicators of successful release.
It is unclear why there was a relatively consistent 10% failure rate and why it did not decrease over the 10-year study period. Although the technique used does not have a significant learning curve, it appears that digits that are not actively triggering at the time of the procedure have a higher failure rate. When a patient’s digit is actively triggering, assessment of the success of the procedure is relatively straightforward, whereas when a digit intermittently triggers and locks and is not symptomatic in the office, success cannot be immediately determined.
No specific digit was significantly more prone to failed releases, though the small finger had the lowest success rate (85.7%). Given that only 56.4% of patients experienced triggering on the dominant hand, there is not enough evidence to suggest a significant relationship between likelihood of a trigger digit and a patient’s hand dominance. Similarly, there was no correlation between the duration of symptoms and the success of the percutaneous procedure.
Investigation of the relationship between the previously suggested comorbidities of carpal tunnel syndrome and diabetes was also inconclusive. Only 79 (18%) of 429 patients reported having carpal tunnel syndrome, and even fewer, 56 (13.0%), reported having diabetes. Only 69 of the 596 treated digits reportedly had sustained trauma before developing triggering symptoms, and only 12 of the 69 were unsuccessfully released. In addition, of the 161 digits in which one or more steroid injections failed to resolve triggering symptoms, 158 (87.3%) were successfully released with 1 percutaneous procedure. Collectively, these data show percutaneous release can effectively eliminate triggering symptoms in a digit that has sustained injury or that has been unsuccessfully treated with nonoperative methods. Failed percutaneous release subsequently can be reliably treated with an open procedure, and results are excellent.
This study had several limitations. It was retrospective and nonblinded, and it did not compare outcomes of percutaneous release with those of an open procedure. Data are presented to support the efficacy and safety of percutaneous release as a treatment option. Another limitation is that pre-release treatment was not controlled. Patients had been treated with a variety of nonoperative methods, including use of anti-inflammatory medication, hand therapy, splinting, and one or more corticosteroid injections, both at our office and elsewhere.
Percutaneous release appears to have an advantage in terms of pain relief, but the study did not evaluate or control for procedure discomfort. However, patients who had been treated with a corticosteroid injection before percutaneous release consistently refused corticosteroid injections for subsequent trigger digits, citing the dramatic pain reduction achieved with release relative to injection. Similarly, all patients who had a trigger digit treated with open tendon sheath incision in the past indicated a strong preference for the percutaneous release.
Follow-up on this patient population was inconsistent and incomplete. Many patients did not return, presumably because they considered the procedure a success and thought follow-up was unnecessary. However, some patients may have had a recurrence or an incomplete release and gone elsewhere for treatment.
The results of this study, to date the largest study on percutaneous release of trigger finger, provide more evidence of the safety and efficacy of this procedure as a treatment option. The success rate of percutaneous release is high, surpasses that of nonoperative treatments such as steroid injections, and approaches that of open and endoscopic surgical alternatives. Some of the obvious advantages of percutaneous release are less visible scarring, fewer incision-related complications, and shorter rehabilitation.10 In addition, post-procedure pain is possibly reduced, symptom relief is comparable, operative time is significantly shorter,8 and percutaneous release is easily performed in the office setting.
Percutaneous release is a viable treatment option for stenosing flexor tenosynovitis, regardless of previously used nonoperative treatment methods, duration or severity of symptoms, or trigger digit treated.
1. Makkouk AH, Oetgen ME, Swigart CR, Dodds SD. Trigger finger: etiology, evaluation, and treatment. Curr Rev Musculoskelet Med. 2008;1(2):92-96.
2. Fahey JJ, Bollinger JA. Trigger-finger in adults and children. J Bone Joint Surg Am. 1954;36(6):1200-1218.
3. Marks MR, Gunther SF. Efficacy of cortisone injection in treatment of trigger fingers and thumbs. J Hand Surg Am. 1989;14(4):722-727.
4. Chammas M, Bousquet P, Renard E, Poirier JL, Jaffiol C, Allieu Y. Dupuytren’s disease, carpal tunnel syndrome, trigger finger, and diabetes mellitus. J Hand Surg Am. 1995;20(1):109-114.
5. Habbu R, Putman MD, Adams JE. Percutaneous release of the A1 pulley: a cadaver study. J Hand Surg Am. 2012;37(11):2273-2277.
6. Pandey BK, Sharma S, Manandhar RR, Pradhan RL, Lakhey S, Rijal KP. Percutaneous trigger finger release. Nepal Orthop Assoc J. 2010;1(1):1-5.
7. Sato ES, Gomes dos Santos JB, Belloti JC, Albertoni WM, Faloppa F. Treatment of trigger finger: randomized clinical trial comparing the methods of corticosteroid injection, percutaneous release and open surgery. Rheumatology. 2012;51(1):93-99.
8. Dierks U, Hoffmann R, Meek MF. Open versus percutaneous release of the A1-pulley for stenosing tendovaginitis: a prospective randomized trial. Tech Hand Up Extrem Surg. 2008;12(3):183-187.
9. Tanaka J. Percutaneous trigger finger release. Tech Hand Up Extrem Surg. 1999;3(1):52-57.
10. Pegoli L, Cavalli E, Cortese P, Parolo C, Pajardi G. A comparison of endoscopic and open trigger finger release. Hand Surg. 2008;13(3):147-151.
11. Ryzewicz M, Wolf JM. Trigger digits: principles, management, and complications. J Hand Surg Am. 2006;31(1):135-146.
12. Schramm JM, Nguyen M, Wongworawat MD. The safety of percutaneous trigger finger release. Hand. 2008;3(1):44-46.
13. Paulius KL, Maguina P. Ultrasound-assisted percutaneous trigger finger release: is it safe? Hand. 2009;4(1):35-37.
14. Cihantimur B, Akin S, Ozcan M. Percutaneous treatment of trigger finger. 34 fingers followed 0.5-2 years. Acta Orthop Scand. 1998;69(2):167-168.
Take-Home Points
- The author had a 90% success rate with no complications in treating almost 600 trigger digits.
- All digits can be safely treated, including multiple fingers on one hand, all in an office setting.
- Percutaneous trigger release appears to be a safe and reliable alternative to open surgery.
- Success rate, discomfort, and cost may make a percutaneous trigger release preferable to even a trial of corticosteroid injection.
- A failed percutaneous release can be successfully treated with an open release, if needed.
Trigger finger, or stenosing flexor tenosynovitis, is a condition characterized by clicking or locking during finger movement, sometimes resulting in the freezing of a digit in flexion or extension1 (Figure 1). Tendon inflammation is thought to cause constriction of the tendon sheath and bunching of the fibrous bundles of the first annular (A1) pulley, often creating a palpable nodule at the base of the digit.2,3 Many patients experience intermittent joint pain and swelling, which may progress to triggering or complete locking of the digit.1 One of the most common conditions treated by hand surgeons, trigger finger is most often reported in the dominant hand of women in their sixth decade of life and has been associated with several conditions, including diabetes and rheumatoid arthritis.4-6 Other researchers have indicated the thumb and ring finger are most commonly affected, though all fingers can potentially trigger.7,8
Initial treatment often involves injecting corticosteroid into the flexor tendon sheath, at or proximal to the annular pulley system, to reduce inflammation and the fibrous nodule.3 Another injection study found an initial success rate of 57% with a single injection, and 86% with a second injection, but patients were monitored for only 6 months, a period that may have been too short for symptom recurrence.7
On failure of steroid injections, patients typically are treated with open tendon sheath incision.9 This procedure, usually performed in a hospital or outpatient surgery setting, requires postoperative wound care, including dressing changes, suture removal, possible hand therapy, and follow-up physician visits. Operative treatment involves making a 1-cm to 2-cm incision, releasing the A1 pulley, and skin suturing.7,8,10 The most common postoperative complaint is incisional tenderness, though long-term scar pain, infection, nerve injury, and disease recurrence have been reported.8 Overall, the procedure is very successful, providing up to 100% symptom relief.7,8,10
Endoscopic release of trigger finger has also been described as an effective operative treatment. This technique involves passing a small cannula through a palmar incision and then using an endoscope and retrograde knife within this 2.7-mm tunnel.10 With this treatment, reduced visibility may increase the risk of nerve injury.10 Although generally successful, endoscopic release requires anesthesia and expensive instruments and has a significant learning curve.8,10
More recently, percutaneous release of trigger finger has been described as a definitive, in-office treatment.5,6,11,12 Percutaneous release has the obvious advantages of no open incision, less scarring, less discomfort, and shorter recovery. Several studies have found comparable success rates for open and percutaneous procedures but consistently shorter recovery with the percutaneous technique.7,8,12 Given its lower recurrence rate (vs steroid injections) and shorter recovery and lower cost (vs a surgical procedure), percutaneous treatment of stenosing tenosynovitis appears to be a safe, highly successful, and minimally invasive treatment method.8 This study represents a single surgeon’s experience with percutaneous tendon sheath incision over a 10-year period.
Methods
Patients presented with symptoms of stenosing flexor tenosynovitis with severity ranging from intermittent triggering to frank locking of the digit. Most patients underwent prior conservative treatment, including corticosteroid injections and hand therapy. With each patient, the senior author discussed the pathophysiology of trigger digit; treatment options, including observation, hand therapy, corticosteroid injection, percutaneous release, and open release; and potential risks and complications. The treatment path—initial corticosteroid injection, percutaneous release, or open release—was left up to the patient. The only exclusion criterion was prior surgery to the involved digit, and there was no discrimination by finger, symptomatic period, or severity. Each released digit was recorded independently. In no case was anticoagulant therapy discontinued.
A complete medical history was obtained for each patient.
Over a 10-year period (March 2003-December 2013), percutaneous release was performed on 596 trigger fingers in 429 patients, 18 years old or older. Of these patients, 279 were female. Mean age was 62 years (range, 26-97 years). Of the 531 releases with handedness recorded, 56.3% were performed on trigger digits on dominant hands (Table 1). Mean duration of symptoms before percutaneous release was 9.7 months (range, 0.5-132 months). Of the 596 digits, 69 were reported to have previously sustained trauma, and 161 had been unsuccessfully treated with one or more cortisone injections before undergoing release. Of the suspected comorbidities examined, carpal tunnel syndrome was previously diagnosed in 79 patients and diabetes in 56 patients.1
Of the 429 patients, 313 had a single digit released and 116 had multiple digits released. Of the 116 patients in the multiple-release group, 80 had 2 fingers released, 24 had 3 released, 7 had 4 released, and 5 had 5 released. The 596 released trigger fingers consisted of 188 thumbs, 41 index fingers, 185 middle fingers, 140 ring fingers, and 42 small fingers.
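The counts in the preceding paragraphs can be tallied directly. The short Python sketch below is an illustration only: the variable names are hypothetical, the counts are copied from the text, and the per-digit percentages are derived here rather than reported by the author.

```python
# Digit counts as reported in the Methods text (illustrative variable names).
digit_counts = {
    "thumb": 188,
    "index": 41,
    "middle": 185,
    "ring": 140,
    "small": 42,
}

# Patient counts: single-digit releases plus the multiple-release group,
# keyed by number of digits released per patient.
single_release_patients = 313
multi_release_patients = {2: 80, 3: 24, 4: 7, 5: 5}

total_digits = sum(digit_counts.values())
total_patients = single_release_patients + sum(multi_release_patients.values())

# Share of all released digits by digit type (derived, not from the source).
digit_share_pct = {
    digit: round(100 * count / total_digits, 1)
    for digit, count in digit_counts.items()
}

print(f"Total digits released: {total_digits}")
print(f"Total patients: {total_patients}")
for digit, pct in digit_share_pct.items():
    print(f"{digit}: {digit_counts[digit]} ({pct}%)")
```

The per-digit counts sum to the 596 releases reported, and the single-release and multiple-release groups sum to the 429 patients reported.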
Surgical Technique
In-office percutaneous trigger finger releases were performed with a local anesthetic. One milliliter of lidocaine 1% injection was used to anesthetize the skin, the subcutaneous tissues, and the flexor tendon sheath at the level of the A1 pulley. As described by Pandey and colleagues,6 the proper location of the pulley was confirmed using specific surface landmarks on each digit. After waiting several minutes to allow the anesthetic to take effect, the surgeon inserted an 18-gauge needle into the center of the pulley with the digit held in extension (Figure 2). The needle was carefully moved longitudinally along the length of the pulley with the bevel of the needle parallel to the tendon. A grating sensation was felt as the fibers of the pulley were cut. Several needle passes were made until the pulley was felt to have been released. Complete release was determined by loss of the grating sensation, along with complete relief of any further symptoms of triggering. The puncture site was cleaned and covered with a light sterile dressing (watch the Video online). There was no postoperative immobilization, and patients were encouraged to immediately return to normal use of the digit. Hand therapy was not prescribed, and pain medications were not dispensed.
A 1-week follow-up appointment was scheduled, and patients were advised to return for evaluation in the event of any recurring symptoms (eg, triggering, swelling, stiffness, pain).
Results
Of the 596 digits, 537 (90.1%) were successfully released with 1 percutaneous procedure (recurrence or failure rate, 9.9%). The thumb was the digit most reliably released (success rate, 94.7%) (Table 2). Patients with recurrent or unresolved symptoms were given the options of a second percutaneous release or an open surgical procedure. Of the 59 digits unsuccessfully released, as identified by persistent triggering or locking of the digit, 17 were treated with a second percutaneous release (15 were successful), and 40 underwent open tendon sheath incision as a second procedure (success rate, 100%); triggering persisted in the remaining 2 digits, and these were considered failures (the 2 patients did not pursue further treatment).
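The first-procedure rates follow from the counts given in this paragraph. The Python sketch below reworks that arithmetic, assuming only the figures stated in the text (596 digits treated, 59 unsuccessfully released, and the breakdown of those 59 by subsequent management); the variable names are illustrative.

```python
# Counts taken from the Results text (variable names are illustrative).
total_digits = 596
failed_first_release = 59

# Subsequent management of the 59 digits not released by the first procedure.
second_percutaneous = 17
open_release = 40
no_further_treatment = 2

# The three management groups should account for every failed first release.
accounted_for = second_percutaneous + open_release + no_further_treatment

released_first_procedure = total_digits - failed_first_release
failure_rate_pct = round(100 * failed_first_release / total_digits, 1)
success_rate_pct = round(100 * released_first_procedure / total_digits, 1)

print(f"Released with 1 procedure: {released_first_procedure} of {total_digits}")
print(f"First-procedure success rate: {success_rate_pct}%")
print(f"Recurrence or failure rate: {failure_rate_pct}%")
print(f"Failed digits accounted for: {accounted_for} of {failed_first_release}")
```

The computed failure rate matches the 9.9% reported, and the three management groups account for all 59 digits.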
There were no complications: infection; nerve, artery, or tendon injury; or chronic pain. Some patients had mild stiffness, swelling, or pain for a few days after the procedure, and these effects typically resolved without treatment. In 29 digits, persistent pain or swelling without triggering was successfully treated with a corticosteroid injection.
Discussion
Over a 10-year period, 596 percutaneous trigger finger releases were sequentially performed by a single surgeon. The 90% success rate compares favorably with rates found in other studies (Table 3).5-9,12-14 The surgeon’s success rates for individual years vary and demonstrate no clear trend or learning curve with the procedure (Figure 3). There were no significant complications. Patient satisfaction with the procedure was high.
There were no injuries to digital nerves, arteries, or flexor tendons, either early or late, and no reports of infections or long-term pain or loss of motion. Although it is quite probable that in some procedures the longitudinal passes of the 18-gauge needle may have also slightly cut into the flexor tendon after passing through the A1 pulley, the direction of the needle passes was in line with the direction of the collagen fibers of the tendon, and thus any inadvertent superficial abrasion would not have structurally weakened the tendon. Of the 40 digits that underwent open release after incomplete or failed percutaneous release, none showed significant longitudinal lacerations of the superficialis tendon. During these revision surgeries, the typical intraoperative finding was incomplete release of the A1 pulley, usually at the distal end. Although loss of the grating sensation or relief of further triggering symptoms was considered adequate evidence of a successful release in this study, small tendon attachments could remain and potentially could lead to recurrent triggering. Given the high success rate achieved with the large sample, however, these 2 factors are considered appropriate indicators of successful release.
It is unclear why there was a relatively consistent 10% failure rate and why it did not decrease over the 10-year study period. Although the technique used does not have a significant learning curve, it appears that digits that are not actively triggering at the time of the procedure have a higher failure rate. When a patient’s digit is actively triggering, assessment of the success of the procedure is relatively straightforward, whereas when a digit intermittently triggers and locks and is not symptomatic in the office, success cannot be immediately determined.
No specific digit was significantly more prone to failed releases, though the small finger had the lowest success rate (85.7%). Given that only 56.4% of patients experienced triggering on the dominant hand, there is not enough evidence to suggest a significant relationship between likelihood of a trigger digit and a patient’s hand dominance. Similarly, there was no correlation between the duration of symptoms and the success of the percutaneous procedure.
Investigation of the relationship between trigger finger and the previously suggested comorbidities of carpal tunnel syndrome and diabetes was also inconclusive. Only 79 (18%) of 429 patients reported having carpal tunnel syndrome, and even fewer, 56 (13.0%), reported having diabetes. Only 69 of the 596 treated digits reportedly had sustained trauma before developing triggering symptoms, and only 12 of the 69 were unsuccessfully released. In addition, of the 161 digits in which one or more steroid injections failed to resolve triggering symptoms, 158 (87.3%) were successfully released with 1 percutaneous procedure. Collectively, these data show percutaneous release can effectively eliminate triggering symptoms in a digit that has sustained injury or that has been unsuccessfully treated with nonoperative methods. Failed percutaneous release subsequently can be reliably treated with an open procedure, and results are excellent.
This study had several limitations. It was retrospective and nonblinded, and it did not compare outcomes of percutaneous release with those of an open procedure. Data are presented to support the efficacy and safety of percutaneous release as a treatment option. Another limitation is that pre-release treatment was not controlled. Patients had been treated with a variety of nonoperative methods, including use of anti-inflammatory medication, hand therapy, splinting, and one or more corticosteroid injections, both at our office and elsewhere.
Percutaneous release appears to have an advantage in terms of pain relief, but the study did not evaluate or control for procedure discomfort. However, patients who had been treated with a corticosteroid injection before percutaneous release consistently refused corticosteroid injections for subsequent trigger digits, citing the dramatic pain reduction achieved with release relative to injection. Similarly, all patients who had a trigger digit treated with open tendon sheath incision in the past indicated a strong preference for the percutaneous release.
Follow-up on this patient population was inconsistent and incomplete. Many patients did not return, presumably because they considered the procedure a success and thought follow-up was unnecessary. However, some patients may have had a recurrence or an incomplete release and gone elsewhere for treatment.
The results of this study, to date the largest study on percutaneous release of trigger finger, provide more evidence of the safety and efficacy of this procedure as a treatment option. The success rate of percutaneous release is high, surpasses that of nonoperative treatments such as steroid injections, and approaches that of open and endoscopic surgical alternatives. Some of the obvious advantages of percutaneous release are less visible scarring, fewer incision-related complications, and shorter rehabilitation.10 In addition, post-procedure pain is possibly reduced, symptom relief is comparable, operative time is significantly shorter,8 and percutaneous release is easily performed in the office setting.
Percutaneous release is a viable treatment option for stenosing flexor tenosynovitis, regardless of previously used nonoperative treatment methods, duration or severity of symptoms, or trigger digit treated.
Take-Home Points
- The author had a 90% success rate with no complications in treating almost 600 trigger digits.
- All digits can be safely treated, including multiple fingers on one hand, all in an office setting.
- Percutaneous trigger release appears to be a safe and reliable alternative to open surgery.
- Success rate, discomfort, and cost may make a percutaneous trigger release preferable to even a trial of corticosteroid injection.
- A failed percutaneous release can be successfully treated with an open release, if needed.
Trigger finger, or stenosing flexor tenosynovitis, is a condition characterized by clicking or locking during finger movement, sometimes resulting in the freezing of a digit in flexion or extension1 (Figure 1). [[{"fid":"202300","view_mode":"medstat_image_flush_left","attributes":{"class":"media-element file-medstat-image-flush-left","data-delta":"1"},"fields":{"format":"medstat_image_flush_left","field_file_image_caption[und][0][value]":"Figure 1.","field_file_image_credit[und][0][value]":"","field_file_image_caption[und][0][format]":"plain_text","field_file_image_credit[und][0][format]":"plain_text"},"type":"media","field_deltas":{"1":{"format":"medstat_image_flush_left","field_file_image_caption[und][0][value]":"Figure 1.","field_file_image_credit[und][0][value]":""}}}]]Tendon inflammation is thought to cause constriction of the tendon sheath and bunching of the fibrous bundles of the first annular (A1) pulley, often creating a palpable nodule at the base of the digit.2,3 Many patients experience intermittent joint pain and swelling, which may progress to triggering or complete locking of the digit.1 One of the most common conditions treated by hand surgeons, trigger finger is most often reported in the dominant hand of women in their sixth decade of life and has been associated with several conditions, including diabetes and rheumatoid arthritis.4-6 Other researchers have indicated the thumb and ring finger are most commonly affected, though all fingers can potentially trigger.7,8
Initial treatment often involves injecting corticosteroid into the flexor tendon sheath, at or proximal to the annular pulley system, to reduce inflammation and the fibrous nodule.3 Another injection study found an initial success rate of 57% with a single injection, and 86% with a second injection, but patients were monitored for only 6 months, a period that may have been too short for symptom recurrence.7
On failure of steroid injections, patients typically are treated with open tendon sheath incision.9 This procedure, usually performed in a hospital or outpatient surgery setting, requires postoperative wound care, including dressing changes, suture removal, possible hand therapy, and follow-up physician visits. Operative treatment involves making a 1-cm to 2-cm incision, releasing the A1 pulley, and skin suturing.7,8,10 The most common postoperative complaint is incisional tenderness, though long-term scar pain, infection, nerve injury, and disease recurrence have been reported.8 Overall, the procedure is very successful, providing up to 100% symptom relief.7,8,10
Endoscopic release of trigger finger has also been described as an effective operative treatment. This technique involves passing a small cannula through a palmar incision—using an endoscope and retrograde knife within this 2.7-mm tunnel.10 With this treatment, reduced visibility may increase the risk of nerve injury.10 Although generally successful, endoscopic release requires anesthesia and expensive instruments and has a significant learning curve.8,10
More recently, percutaneous release of trigger finger has been described as a definitive, in-office treatment.5,6,11,12 Percutaneous release has the obvious advantages of no open incision, less scarring, less discomfort, and shorter recovery. Several studies have found comparable success rates for open and percutaneous procedures but consistently shorter recovery with the percutaneous technique.7,8,12 Given its lower recurrence rate (vs steroid injections) and shorter recovery and lower cost (vs a surgical procedure), percutaneous treatment of stenosing tenosynovitis appears to be a safe, highly successful, and minimally invasive treatment method.8 This study represents a single surgeon’s experience with percutaneous tendon sheath incision over a 10-year period.
Methods
Patients presented with symptoms of stenosing flexor tenosynovitis with severity ranging from intermittent triggering to frank locking of the digit. Most patients underwent prior conservative treatment, including corticosteroid injections and hand therapy. With each patient, the senior author discussed the pathophysiology of trigger digit; treatment options, including observation, hand therapy, corticosteroid injection, percutaneous release, and open release; and potential risks and complications. The treatment path—initial corticosteroid injection, percutaneous release, or open release—was left up to the patient. The only exclusion criterion was prior surgery to the involved digit, and there was no discrimination by finger, symptomatic period, or severity. Each released digit was recorded independently. In no case was anticoagulant therapy discontinued.
A complete medical history was obtained for each patient.
Over a 10-year period (March 2003-December 2013), percutaneous release was performed on 596 trigger fingers in 429 patients, 18 years old or older. Of these patients, 279 were female. Mean age was 62 years (range, 26-97 years). Of the 531 releases with handedness recorded, 56.3% were performed on trigger digits on dominant hands (Table 1). [[{"fid":"202302","view_mode":"medstat_image_flush_right","attributes":{"class":"media-element file-medstat-image-flush-right","data-delta":"2"},"fields":{"format":"medstat_image_flush_right","field_file_image_caption[und][0][value]":"Table 1.","field_file_image_credit[und][0][value]":"","field_file_image_caption[und][0][format]":"plain_text","field_file_image_credit[und][0][format]":"plain_text"},"type":"media","field_deltas":{"2":{"format":"medstat_image_flush_right","field_file_image_caption[und][0][value]":"Table 1.","field_file_image_credit[und][0][value]":""}}}]]Mean duration of symptoms before percutaneous release was 9.7 months (range, 0.5-132 months). Of the 596 digits, 69 were reported to have previously sustained trauma, and 161 had been unsuccessfully treated with one or more cortisone injections before undergoing release. Of the suspected comorbidities examined, carpal tunnel syndrome was previously diagnosed in 79 patients and diabetes in 56 patients.1
Of the 429 patients, 313 had a single digit released and 116 had multiple digits released. Of the 116 patients in the multiple-release group, 80 had 2 fingers released, 24 had 3 released, 7 had 4 released, and 5 had 5 released. The 596 released trigger fingers consisted of 188 thumbs, 41 index fingers, 185 middle fingers, 140 ring fingers, and 42 small fingers.
Surgical Technique
In-office percutaneous trigger finger releases were performed with a local anesthetic. One milliliter of lidocaine 1% injection was used to anesthetize the skin, the subcutaneous tissues, and the flexor tendon sheath at the level of the A1 pulley. As described by Pandey and colleagues,6 the proper location of the pulley was confirmed using specific surface landmarks on each digit. After waiting several minutes to allow the anesthetic to take effect, the surgeon inserted an 18-gauge needle into the center of the pulley with the digit held in extension (Figure 2). [[{"fid":"202303","view_mode":"medstat_image_flush_left","attributes":{"class":"media-element file-medstat-image-flush-left","data-delta":"3"},"fields":{"format":"medstat_image_flush_left","field_file_image_caption[und][0][value]":"Figure 2.","field_file_image_credit[und][0][value]":"","field_file_image_caption[und][0][format]":"plain_text","field_file_image_credit[und][0][format]":"plain_text"},"type":"media","field_deltas":{"3":{"format":"medstat_image_flush_left","field_file_image_caption[und][0][value]":"Figure 2.","field_file_image_credit[und][0][value]":""}}}]]The needle was carefully moved longitudinally along the length of the pulley with the bevel of the needle parallel to the tendon. A grating sensation was felt as the fibers of the pulley were cut. Several needle passes were made until the pulley was felt to have been released. Complete release was determined by loss of the grating sensation, along with complete relief of any further symptoms of triggering. The puncture site was cleaned and covered with a light sterile dressing (watch the Video online). There was no postoperative immobilization, and patients were encouraged to immediately return to normal use of the digit. Hand therapy was not prescribed, and pain medications were not dispensed. 
A 1-week follow-up appointment was scheduled, and patients were advised to return for evaluation in the event of any recurring symptoms (eg, triggering, swelling, stiffness, pain).
Results
were successfully released with 1 percutaneous procedure (recurrence or failure rate, 9.9%). The thumb was the digit most reliably released (success rate, 94.7%) (Table 2). [[{"fid":"202306","view_mode":"medstat_image_flush_right","attributes":{"class":"media-element file-medstat-image-flush-right","data-delta":"4"},"fields":{"format":"medstat_image_flush_right","field_file_image_caption[und][0][value]":"Table 2.","field_file_image_credit[und][0][value]":"","field_file_image_caption[und][0][format]":"plain_text","field_file_image_credit[und][0][format]":"plain_text"},"type":"media","field_deltas":{"4":{"format":"medstat_image_flush_right","field_file_image_caption[und][0][value]":"Table 2.","field_file_image_credit[und][0][value]":""}}}]]Patients with recurrent or unresolved symptoms were given the options of a second percutaneous release or an open surgical procedure. Of the 59 digits unsuccessfully released, as identified by persistent triggering or locking of the digit, 17 were treated with a second percutaneous release (15 were successful), and 40 underwent open tendon sheath incision as a second procedure (success rate, 100%); triggering persisted in the remaining 2 digits, and these were considered failures (the 2 patients did not pursue further treatment).
There were no complications: infection; nerve, artery, or tendon injury; or chronic pain. Some patients had mild stiffness, swelling, or pain for a few days after the procedure, and these effects typically resolved without treatment. In 29 digits, persistent pain or swelling without triggering was successfully treated with a corticosteroid injection.
Discussion
[[{"fid":"202307","view_mode":"medstat_image_flush_left","attributes":{"class":"media-element file-medstat-image-flush-left","data-delta":"5"},"fields":{"format":"medstat_image_flush_left","field_file_image_caption[und][0][value]":"Table 3.","field_file_image_credit[und][0][value]":"","field_file_image_caption[und][0][format]":"plain_text","field_file_image_credit[und][0][format]":"plain_text"},"type":"media","field_deltas":{"5":{"format":"medstat_image_flush_left","field_file_image_caption[und][0][value]":"Table 3.","field_file_image_credit[und][0][value]":""}}}]]Over a 10-year period, 596 percutaneous trigger finger releases were sequentially performed by a single surgeon. The 90% success rate compares favorably with rates found in other studies (Table 3).5-9,12-14 The surgeon’s success rates for individual years vary and demonstrate no clear trend or learning curve with the procedure (Figure 3). There were no significant complications. Patient satisfaction with the procedure was high.[[{"fid":"202308","view_mode":"medstat_image_flush_right","attributes":{"class":"media-element file-medstat-image-flush-right","data-delta":"6"},"fields":{"format":"medstat_image_flush_right","field_file_image_caption[und][0][value]":"Figure 3.","field_file_image_credit[und][0][value]":"","field_file_image_caption[und][0][format]":"plain_text","field_file_image_credit[und][0][format]":"plain_text"},"type":"media","field_deltas":{"6":{"format":"medstat_image_flush_right","field_file_image_caption[und][0][value]":"Figure 3.","field_file_image_credit[und][0][value]":""}}}]]
There were no injuries to digital nerves, arteries, or flexor tendons, either early or late, and no reports of infections or long-term pain or loss of motion. Although it is quite probable that in some procedures the longitudinal passes of the 18-gauge needle may have also slightly cut into the flexor tendon after passing through the A1 pulley, the direction of the needle passes was in line with the direction of the collagen fibers of the tendon, and thus any inadvertent superficial abrasion would not have structurally weakened the tendon. Of the 40 digits that underwent open release after incomplete or failed percutaneous release, none showed significant longitudinal lacerations of the superficialis tendon. During these revision surgeries, the typical intraoperative finding was incomplete release of the A1 pulley, usually at the distal end. Although loss of the grating sensation or relief of further triggering symptoms was considered adequate evidence of a successful release in this study, small tendon attachments could remain and potentially could lead to recurrent triggering. Given the high success rate achieved with the large sample, however, these 2 factors are considered appropriate indicators of successful release.
It is unclear why there was a relatively consistent 10% failure rate and why it did not decrease over the 10-year study period. Although the technique used does not have a significant learning curve, it appears that digits that are not actively triggering at the time of the procedure have a higher failure rate. When a patient’s digit is actively triggering, assessment of the success of the procedure is relatively straightforward, whereas when a digit triggers and locks only intermittently and is not symptomatic in the office, success cannot be immediately determined.
No specific digit was significantly more prone to failed releases, though the small finger had the lowest success rate (85.7%). Given that only 56.4% of patients experienced triggering on the dominant hand, there is not enough evidence to suggest a significant relationship between likelihood of a trigger digit and a patient’s hand dominance. Similarly, there was no correlation between the duration of symptoms and the success of the percutaneous procedure.
Investigation of the relationship between trigger digit and the previously suggested comorbidities of carpal tunnel syndrome and diabetes was also inconclusive. Only 79 (18%) of 429 patients reported having carpal tunnel syndrome, and even fewer, 56 (13.0%), reported having diabetes. Only 69 of the 596 treated digits reportedly had sustained trauma before developing triggering symptoms, and only 12 of the 69 were unsuccessfully released. In addition, of the 161 digits in which one or more steroid injections failed to resolve triggering symptoms, 158 (87.3%) were successfully released with 1 percutaneous procedure. Collectively, these data show that percutaneous release can effectively eliminate triggering symptoms in a digit that has sustained injury or that has been unsuccessfully treated with nonoperative methods. Failed percutaneous release can subsequently be treated reliably with an open procedure, and results are excellent.
This study had several limitations. It was retrospective, nonblinded, and did not compare outcomes of percutaneous release with those of an open procedure. Data are presented to support the efficacy and safety of percutaneous release as a treatment option. Another limitation is that pre-release treatment was not controlled. Patients had been treated with a variety of nonoperative methods, including use of anti-inflammatory medication, hand therapy, splinting, and one or more corticosteroid injections, both at our office and elsewhere.
Percutaneous release appears to have an advantage in terms of pain relief, but the study did not evaluate or control for procedure discomfort. However, patients who had been treated with a corticosteroid injection before percutaneous release consistently refused corticosteroid injections for subsequent trigger digits, citing the dramatic pain reduction achieved with release relative to injection. Similarly, all patients who had a trigger digit treated with open tendon sheath incision in the past indicated a strong preference for the percutaneous release.
Follow-up on this patient population was inconsistent and incomplete. Many patients did not return, presumably because they considered the procedure a success and thought follow-up was unnecessary. However, some patients may have had a recurrence or an incomplete release and gone elsewhere for treatment.
The results of this study, to date the largest study on percutaneous release of trigger finger, provide more evidence of the safety and efficacy of this procedure as a treatment option. The success rate of percutaneous release is high, surpasses that of nonoperative treatments such as steroid injections, and approaches that of open and endoscopic surgical alternatives. Some of the obvious advantages of percutaneous release are less visible scarring, fewer incision-related complications, and shorter rehabilitation.10 In addition, post-procedure pain is possibly reduced, symptom relief is comparable, operative time is significantly shorter,8 and percutaneous release is easily performed in the office setting.
Percutaneous release is a viable treatment option for stenosing flexor tenosynovitis, regardless of previously used nonoperative treatment methods, duration or severity of symptoms, or trigger digit treated.
1. Makkouk AH, Oetgen ME, Swigart CR, Dodds SD. Trigger finger: etiology, evaluation, and treatment. Curr Rev Musculoskelet Med. 2008;1(2):92-96.
2. Fahey JJ, Bollinger JA. Trigger-finger in adults and children. J Bone Joint Surg Am. 1954;36(6):1200-1218.
3. Marks MR, Gunther SF. Efficacy of cortisone injection in treatment of trigger fingers and thumbs. J Hand Surg Am. 1989;14(4):722-727.
4. Chammas M, Bousquet P, Renard E, Poirier JL, Jaffiol C, Allieu Y. Dupuytren’s disease, carpal tunnel syndrome, trigger finger, and diabetes mellitus. J Hand Surg Am. 1995;20(1):109-114.
5. Habbu R, Putman MD, Adams JE. Percutaneous release of the A1 pulley: a cadaver study. J Hand Surg Am. 2012;37(11):2273-2277.
6. Pandey BK, Sharma S, Manandhar RR, Pradhan RL, Lakhey S, Rijal KP. Percutaneous trigger finger release. Nepal Orthop Assoc J. 2010;1(1):1-5.
7. Sato ES, Gomes dos Santos JB, Belloti JC, Albertoni WM, Faloppa F. Treatment of trigger finger: randomized clinical trial comparing the methods of corticosteroid injection, percutaneous release and open surgery. Rheumatology. 2012;51(1):93-99.
8. Dierks U, Hoffmann R, Meek MF. Open versus percutaneous release of the A1-pulley for stenosing tendovaginitis: a prospective randomized trial. Tech Hand Up Extrem Surg. 2008;12(3):183-187.
9. Tanaka J. Percutaneous trigger finger release. Tech Hand Up Extrem Surg. 1999;3(1):52-57.
10. Pegoli L, Cavalli E, Cortese P, Parolo C, Pajardi G. A comparison of endoscopic and open trigger finger release. Hand Surg. 2008;13(3):147-151.
11. Ryzewicz M, Wolf JM. Trigger digits: principles, management, and complications. J Hand Surg Am. 2006;31(1):135-146.
12. Schramm JM, Nguyen M, Wongworawat MD. The safety of percutaneous trigger finger release. Hand. 2008;3(1):44-46.
13. Paulius KL, Maguina P. Ultrasound-assisted percutaneous trigger finger release: is it safe? Hand. 2009;4(1):35-37.
14. Cihantimur B, Akin S, Ozcan M. Percutaneous treatment of trigger finger. 34 fingers followed 0.5-2 years. Acta Orthop Scand. 1998;69(2):167-168.
1. Makkouk AH, Oetgen ME, Swigart CR, Dodds SD. Trigger finger: etiology, evaluation, and treatment. Curr Rev Musculoskelet Med. 2008;1(2):92-96.
2. Fahey JJ, Bollinger JA. Trigger-finger in adults and children. J Bone Joint Surg Am. 1954;36(6):1200-1218.
3. Marks MR, Gunther SF. Efficacy of cortisone injection in treatment of trigger fingers and thumbs. J Hand Surg Am. 1989;14(4):722-727.
4. Chammas M, Bousquet P, Renard E, Poirier JL, Jaffiol C, Allieu Y. Dupuytren’s disease, carpal tunnel syndrome, trigger finger, and diabetes mellitus. J Hand Surg Am. 1995;20(1):109-114.
5. Habbu R, Putman MD, Adams JE. Percutaneous release of the A1 pulley: a cadaver study. J Hand Surg Am. 2012;37(11):2273-2277.
6. Pandey BK, Sharma S, Manandhar RR, Pradhan RL, Lakhey S, Rijal KP. Percutaneous trigger finger release. Nepal Orthop Assoc J. 2010;1(1):1-5.
7. Sato ES, Gomes dos Santos JB, Belloti JC, Albertoni WM, Faloppa F. Treatment of trigger finger: randomized clinical trial comparing the methods of corticosteroid injection, percutaneous release and open surgery. Rheumatology. 2012;51(1):93-99.
8. Dierks U, Hoffmann R, Meek MF. Open versus percutaneous release of the A1-pulley for stenosing tendovaginitis: a prospective randomized trial. Tech Hand Up Extrem Surg. 2008;12(3):183-187.
9. Tanaka J. Percutaneous trigger finger release. Tech Hand Up Extrem Surg. 1999;3(1):52-57.
10. Pegoli L, Cavalli E, Cortese P, Parolo C, Pajardi G. A comparison of endoscopic and open trigger finger release. Hand Surg. 2008;13(3):147-151.
11. Ryzewicz M, Wolf JM. Trigger digits: principles, management, and complications. J Hand Surg Am. 2006;31(1):135-146.
12. Schramm JM, Nguyen M, Wongworawat MD. The safety of percutaneous trigger finger release. Hand. 2008;3(1):44-46.
13. Paulius KL, Maguina P. Ultrasound-assisted percutaneous trigger finger release: is it safe? Hand. 2009;4(1):35-37.
14. Cihantimur B, Akin S, Ozcan M. Percutaneous treatment of trigger finger. 34 fingers followed 0.5-2 years. Acta Orthop Scand. 1998;69(2):167-168.
Patient Preference Before and After Arthroscopic Rotator Cuff Repair: Which Is More Important, Pain Relief or Strength Return?
Take-Home Points
- Pain relief and return of strength are important satisfaction variables for patients undergoing ARCR.
- Pain relief and strength return are equally desirable for about half (~50%) of patients before and after ARCR.
- Overall, patient preference for strength return outweighs preference for pain relief in the long term.
- Increasing age is associated with a stronger preference for pain relief.
- Improved understanding of patient expectations after ARCR will promote meaningful changes in patient satisfaction.
A rotator cuff tear (RCT) can cause significant pain, weakness, stiffness, and loss of function in the shoulder. In most patients, arthroscopic rotator cuff repair (ARCR) provides significant and reproducible pain relief and variable return of shoulder strength and function.1-4 ARCR outcomes are well described and well represented by validated outcome measures.5-9 However, these outcomes do not always correlate with patient satisfaction. For example, after ARCR, 2 patients with similar outcome scores may have different satisfaction levels.
Patient satisfaction involves multiple factors and varies with the patient’s preoperative expectations and the degree to which the surgery matches the patient’s desired outcomes.10-15 In clinical studies, Tashjian and colleagues,10 Henn and colleagues,11 and O’Holleran and colleagues12 found patient satisfaction correlated most highly with postoperative shoulder pain, shoulder function, general health status, and outcome scores. However, our understanding of patients’ desired outcomes and expectations of ARCR is limited, particularly regarding the importance of pain relief and strength return relative to each other. We believe patients’ preoperative expectations are influenced by their self-assessments of symptom severity and by their understanding of the outcomes of surgical procedures and of the information they receive from their surgeons during preoperative evaluation.
We conducted an observational study to determine patients’ preoperative preferences and the importance of post-ARCR pain relief and strength return relative to each other. After surgery, preferences and ratings of pain relief and strength return were reevaluated to determine if they were altered by outcomes. We also studied the influence of multiple factors, including severity of preoperative symptoms (pain, weakness), age, sex, occupation, and active sports involvement, on patients’ preoperative ratings of the importance of post-ARCR improvements in pain relief and strength return. We hypothesized that patients would vary in how they preoperatively value and desire post-ARCR pain relief and strength return.
Materials and Methods
The simple shoulder questionnaire (Figure) designed for this study had 12 items. Patients subjectively assessed the severity of their symptoms (pain level, shoulder weakness) and rated the importance of both pain relief and strength return to their occupational and personal life.
Before patients underwent surgery for symptomatic suspected RCTs, they were approached to participate in this prospective study. Sixty-five patients provided informed consent on forms approved by an Institutional Review Board. Inclusion criteria were suspected unilateral rotator cuff pathology and willingness to participate. Of the 65 patients, 60 underwent ARCR without another procedure, such as shoulder instability repair, SLAP (superior labrum anterior-to-posterior) repair, or distal clavicle excision; the other 5 patients elected nonoperative treatment and were excluded from review. At a mean (SD) follow-up of 5.2 (0.2) years, the 60 patients who had surgery completed the questionnaire again and rated the importance of pain relief and strength return relative to each other.
Patients with RCTs were divided according to age, sex, shoulder dominance, occupation type, and active sports involvement. Standard definitions for occupation types were used: blue-collar, manual labor jobs; white-collar, salaried/educated positions; and retired.
Matched-pairs t tests were used to compare preoperative and postoperative continuous variables (strength return preference, pain relief preference, SPD). One-way analysis of variance (ANOVA) was used to compare categorical variables (sex, shoulder dominance, active sports involvement) with continuous variables (SPD), and bivariate regression was used to compare groups with continuous data (age, SPD). In cases involving more than 2 groups (occupation types), the Tukey honestly significant difference (HSD) test was used to evaluate intergroup differences. P < .05 was used for statistical significance.
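As an illustration of the matched-pairs comparison described above, the following is a minimal sketch of how a paired t statistic can be computed from before-and-after ratings. The function name and the ratings are hypothetical and are not study data; the sketch returns only the t statistic and degrees of freedom and does not reproduce the ANOVA, regression, or Tukey HSD analyses.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a matched-pairs t test.

    Each position in `before` and `after` must refer to the same patient.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    if len(before) < 2:
        raise ValueError("need at least two pairs")

    # Per-patient change (after minus before).
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")

    n = len(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical pain ratings for 5 patients (illustration only, not study data).
pre_pain = [6, 7, 5, 8, 4]
post_pain = [1, 2, 1, 3, 0]

t_stat, dof = paired_t_statistic(pre_pain, post_pain)
print(f"t = {t_stat:.2f}, df = {dof}")
```

The P value would then be read from a t distribution with the returned degrees of freedom, typically with a statistics package.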
Results
ARCR Outcomes
After ARCR, there was significant improvement in patient-reported pain and subjective strength scores. Mean (SD) pain score improved from 5.9 (2.3) to 1.3 (2.3) after ARCR (P < .001), and mean (SD) strength improved from 46% (22%) of normal to 84% (17%) of normal (P < .001).
Importance of Post-ARCR Pain Relief and Strength Return
Analysis of preoperative questionnaire responses revealed that, of 60 patients, 29 (48.3%) considered pain relief and strength return equally important, 20 (33.3%) rated postoperative strength return as more important, and 11 (18.3%) rated pain relief as more important than strength return. After a mean (SD) follow-up of 5.2 (0.2) years, 33 patients (55%) valued pain relief and strength return as equally important, 17 patients (28.3%) preferred strength recovery, and 10 patients (16.7%) preferred pain relief.
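The percentages in this paragraph follow directly from the reported counts out of 60 patients. The short sketch below recomputes them; the dictionary keys and the helper name are illustrative labels, not terms from the questionnaire.

```python
TOTAL_PATIENTS = 60

# Counts as reported in the text; the keys are illustrative labels.
preoperative = {"equal": 29, "strength": 20, "pain": 11}
postoperative = {"equal": 33, "strength": 17, "pain": 10}


def as_percentages(counts, total):
    """Convert counts to percentages of `total`, rounded to 1 decimal place."""
    if sum(counts.values()) != total:
        raise ValueError("counts do not add up to the stated total")
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


pre_pct = as_percentages(preoperative, TOTAL_PATIENTS)
post_pct = as_percentages(postoperative, TOTAL_PATIENTS)
print(pre_pct)
print(post_pct)
```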
Overall, patient ratings were significantly higher for strength return than for pain relief both before surgery, mean (SD) 9.2 (2.1) vs 8.6 (2.3) (P = .02), and afterward, 8.9 (1.9) vs 8.2 (3.1) (P = .03) (Table 1).
Subgroup Analyses
Sex and Age. Of the 60 patients, 43 were male and 17 female. Mean (SD) preoperative SPD was 1.0 (2.7) for males and 0.7 (2.3) for females; the difference was not significant (P = .61). After surgery, females emphasized strength return over pain relief more than males did: Mean (SD) SPD was significantly higher (P = .04) for females, 1.7 (3.0), than for males, 0.4 (2.5). There were no preoperative–postoperative differences (P = .33) for males or females (Table 2).
Hand Dominance. RCT was found in the dominant shoulder of 31 patients (52%). Shoulder dominance did not affect SPD: Mean (SD) preoperative SPD was 1.3 (2.3) for dominant shoulders and 0.5 (2.7) for nondominant shoulders (P = .21), and postoperative SPD was 0.7 (2.6) for dominant and 0.9 (2.8) for nondominant (P = .79). SPD did not change from before surgery to after surgery for dominant (P = .14) or nondominant (P = .28) shoulders (Table 2).
Active Sports Participation. Thirty-two patients (53%) reported preoperative involvement in sports; 35 (58%) reported postoperative involvement (P = .37). Mean (SD) preoperative SPD was 1.4 (3.0) for involved patients and 0.3 (1.7) for uninvolved patients (P = .09), and postoperative SPD was 0.6 (2.8) for involved patients and 1.0 (2.6) for uninvolved patients (P = .53). SPD did not change from before surgery to after surgery for involved (P = .17) or uninvolved (P = .26) patients (Table 2).
Occupation Type. There were 9 blue-collar workers (15%), 32 white-collar workers (53%), and 19 retirees (32%). Mean (SD) preoperative SPD was 2.8 (4.2) for blue-collar workers, 1.2 (2.1) for white-collar workers, and –0.4 (0.4) for retirees. There were no significant differences in preoperative SPD between blue-collar and white-collar workers (P = .19) or between white-collar workers and retirees (P = .06), but there was a significant difference between blue-collar workers and retirees (P = .004). Mean (SD) postoperative SPD was 1.3 (2.7) for blue-collar workers, 1.2 (3.1) for white-collar workers, and –0.3 (1.6) for retirees. There were no significant differences between blue-collar and white-collar workers (P = .99), white-collar workers and retirees (P = .13), or blue-collar workers and retirees (P = .3).
Discussion
In this study, we wanted to determine patients’ pre- and postoperative preferences for pain relief and strength return after ARCR. Preoperative and postoperative preference analysis of the 60 patients who underwent ARCR revealed that about half valued pain relief and strength return equally. However, overall, there were higher ratings for strength return in the long term after ARCR, irrespective of age, sex, preoperative levels of shoulder pain and weakness, and preoperative and postoperative sports involvement.
Patients’ preoperative expectations are a function of their assessment of their symptoms, their perceptions of expected surgical outcomes, and their understanding of preoperative discussion with their surgeons. In this study, patients self-assessed their shoulder symptoms and the effect of those symptoms on their occupational and personal life. They also rated the importance of post-ARCR pain relief and strength return relative to each other. To assess whether surgical outcomes affected perceptions of pain relief and strength return, patients completed the questionnaire before and after surgery. Overall, patients rated postoperative strength return over pain relief in the long term (5 years).
Subgroup analysis revealed a weak positive correlation between patient-reported preoperative pain scores and ratings of the importance of pain relief after surgery, but there was no correlation between postoperative pain scores and ratings of the importance of pain relief after surgery. This finding was surprising because we thought pain relief would be more important than strength return for patients with higher pain scores.1-3,16-21 We would like to clarify a point about this study: That patients preferred strength return over pain relief does not mean they did not care about pain relief. A substantial subset of patients (~50%) valued pain relief and strength return equally. In rotator cuff pathology, pain and weakness are to an extent interrelated. Shoulder pain that limits a patient’s ability to perform a strenuous task can be perceived as shoulder weakness, which may explain why, despite having higher pain scores, patients preferred strength return over pain relief. Increasing age showed a positive correlation with preference for pain relief, which explains the finding that retirees preferred pain relief over strength return. We used SPD to express the preference for strength return over pain relief before and after ARCR. Unfortunately, SPD may not be used to quantitatively define the preference for strength return over pain relief.
Patient satisfaction after RCR involves multiple factors and has been well studied. In a retrospective analysis of 112 patients, Tashjian and colleagues10 found that patient satisfaction was affected by preoperative expectations, marital status, disability status, preoperative pain function, and general health status after RCR. They also found a positive but weak correlation between patient satisfaction and functional outcome scores, including visual analog scale (VAS), Simple Shoulder Test (SST), and Disabilities of the Arm, Shoulder, and Hand (DASH) scores. Henn and colleagues11 evaluated 125 patients who underwent primary RCR for a chronic RCT. Higher preoperative expectations correlated with better postoperative VAS, SST, DASH, and Short Form 36 performance, irrespective of worker compensation status, symptom duration, number of patient comorbidities, tear size, repair technique, and number of previous operations. In a prospective cohort analysis of 311 RCR patients, O’Holleran and colleagues12 found that decreased patient satisfaction was associated with postoperative pain and dysfunction. Furthermore, willingness to recommend surgery to another person was significantly related to patient satisfaction. In the present study, we did not correlate preoperative expectations with postoperative outcome scores or evaluate the effect of other known factors on RCR outcomes. Our main goal was to understand ARCR patients’ preoperative and postoperative evaluations of the importance of pain relief and strength return relative to each other. Improved understanding of patients’ expectations will allow us to identify disparities between expectations and outcomes.
Our study had several limitations. First, our questionnaire was not validated. However, we used it only as an assessment tool, to collect data, and do not propose using it to assess ARCR outcomes. Second, objective strength measurements were not performed, before or after surgery, and therefore patients’ perceptions of weakness were not tested. Third, we did not correlate preoperative or postoperative shoulder outcome scores with patients’ expectations. Our intention was to understand how ARCR patients rate the importance of pain relief and strength return relative to each other. Fourth, we did not correlate patients’ expectations of strength return and pain relief with preoperative tear size or postoperative retear status.
Our observational study results showed that, before undergoing ARCR, most patients valued postoperative pain relief and strength return equally. However, there was an overall preference for strength return over pain relief. Furthermore, this preference held up irrespective of age, sex, sports involvement, or preoperative symptom severity. These findings add to our understanding of patients’ preoperative expectations of ARCR.
Am J Orthop. 2017;46(4):E244-E250. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
1. Cole BJ, McCarty LP 3rd, Kang RW, Alford W, Lewis PB, Hayden JK. Arthroscopic rotator cuff repair: prospective functional outcome and repair integrity at minimum 2-year follow-up. J Shoulder Elbow Surg. 2007;16(5):579-585.
2. Huijsmans PE, Pritchard MP, Berghs BM, van Rooyen KS, Wallace AL, de Beer JF. Arthroscopic rotator cuff repair with double-row fixation. J Bone Joint Surg Am. 2007;89(6):1248-1257.
3. Wilson F, Hinov V, Adams G. Arthroscopic repair of full-thickness tears of the rotator cuff: 2- to 14-year follow-up. Arthroscopy. 2002;18(2):136-144.
4. Denard PJ, Jiwani AZ, Lädermann A, Burkhart SS. Long-term outcome of a consecutive series of subscapularis tendon tears repaired arthroscopically. Arthroscopy. 2012;28(11):1587-1591.
5. Richards RR, An KN, Bigliani LU, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg. 1994;3(6):347-352.
6. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res. 1991;4(4):143-149.
7. Constant CR, Murley AH. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res. 1987;(214):160-164.
8. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg. 2002;11(6):587-594.
9. Romeo AA, Bach BR Jr, O’Halloran KL. Scoring systems for shoulder conditions. Am J Sports Med. 1996;24(4):472-476.
10. Tashjian RZ, Bradley MP, Tocci S, Rey J, Henn RF, Green A. Factors influencing patient satisfaction after rotator cuff repair. J Shoulder Elbow Surg. 2007;16(6):752-758.
11. Henn RF 3rd, Kang L, Tashjian RZ, Green A. Patients’ preoperative expectations predict the outcome of rotator cuff repair. J Bone Joint Surg Am. 2007;89(9):1913-1919.
12. O’Holleran JD, Kocher MS, Horan MP, Briggs KK, Hawkins RJ. Determinants of patient satisfaction with outcome after rotator cuff surgery. J Bone Joint Surg Am. 2005;87(1):121-126.
13. Namdari S, Donegan RP, Chamberlain AM, Galatz LM, Yamaguchi K, Keener JD. Factors affecting outcome after structural failure of repaired rotator cuff tears. J Bone Joint Surg Am. 2014;96(2):99-105.
14. Nho SJ, Brown BS, Lyman S, Adler RS, Altchek DW, MacGillivray JD. Prospective analysis of arthroscopic rotator cuff repair: prognostic factors affecting clinical and ultrasound outcome. J Shoulder Elbow Surg. 2009;18(1):13-20.
15. Sonnabend DH, Watson EM. Structural factors affecting the outcome of rotator cuff repair. J Shoulder Elbow Surg. 2002;11(3):212-218.
16. Boileau P, Brassart N, Watkinson DJ, Carles M, Hatzidakis AM, Krishnan SG. Arthroscopic repair of full-thickness tears of the supraspinatus: does the tendon really heal? J Bone Joint Surg Am. 2005;87(6):1229-1240.
17. Sugaya H, Maeda K, Matsuki K, Moriishi J. Repair integrity and functional outcome after arthroscopic double-row rotator cuff repair. A prospective outcome study. J Bone Joint Surg Am. 2007;89(5):953-960.
18. DeFranco MJ, Bershadsky B, Ciccone J, Yum JK, Iannotti JP. Functional outcome of arthroscopic rotator cuff repairs: a correlation of anatomic and clinical results. J Shoulder Elbow Surg. 2007;16(6):759-765.
19. Galatz LM, Ball CM, Teefey SA, Middleton WD, Yamaguchi K. The outcome and repair integrity of completely arthroscopically repaired large and massive rotator cuff tears. J Bone Joint Surg Am. 2004;86(2):219-224.
20. Harryman DT 2nd, Mack LA, Wang KY, Jackins SE, Richardson ML, Matsen FA 3rd. Repairs of the rotator cuff. Correlation of functional results with integrity of the cuff. J Bone Joint Surg Am. 1991;73(7):982-989.
21. Romeo AA, Hang DW, Bach BR Jr, Shott S. Repair of full thickness rotator cuff tears. Gender, age, and other factors affecting outcome. Clin Orthop Relat Res. 1999;(367):243-255.
Patients’ preoperative expectations are a function of their assessment of their symptoms, their perceptions of expected surgical outcomes, and their understanding of preoperative discussion with their surgeons. In this study, patients self-assessed their shoulder symptoms and their effect on their occupational and personal life. They also rated the importance of post-ARCR pain relief and strength return relative to each other. To assess whether surgical outcomes affected perceptions of pain relief and strength return, patients completed the questionnaire before and after surgery. Overall, patients rated postoperative strength return over pain relief on long-term (5 years).
Subgroup analysis revealed a weak positive correlation between patient-reported preoperative pain scores and ratings of the importance of pain relief after surgery, but there was no correlation between postoperative pain scores and ratings of the importance of pain relief after surgery. This finding was surprising because we thought pain relief would be more important than strength return for patients with higher pain scores.1-3,16-21 We would like to clarify a point about this study: That patients preferred strength return over pain relief does not mean they did not care about pain relief. A substantial subset of patients (~50%) valued pain relief and strength return equally. In rotator cuff pathology, pain and weakness are to an extent interrelated. Shoulder pain that limits a patient’s ability to perform a strenuous task can be perceived as shoulder weakness, which may explain why, despite having higher pain scores, patients preferred strength return over pain relief. Increasing age showed a positive correlation with preference for pain relief, which explains the finding that retirees preferred pain relief over strength return. We used SPD to express the preference for strength return over pain relief before and after ARCR. Unfortunately, SPD may not be used to quantitatively define the preference for strength return over pain relief.
Patient satisfaction after RCR involves multiple factors and has been well studied. In a retrospective analysis of 112 patients, Tashjian and colleagues10 found that patient satisfaction was affected by preoperative expectations, marital status, disability status, preoperative pain function, and general health status after RCR. They also found a positive but weak correlation between patient satisfaction and functional outcome scores, including visual analog scale (VAS), Simple Shoulder Test (SST), and Disabilities of the Arm, Shoulder, and Hand (DASH) scores. Henn and colleagues11 evaluated 125 patients who underwent primary RCR for a chronic RCT. Higher preoperative expectations correlated with better postoperative VAS, SST, DASH, and Short Form 36 performance, irrespective of worker compensation status, symptom duration, number of patient comorbidities, tear size, repair technique, and number of previous operations. In a prospective cohort analysis of 311 RCR patients, O’Holleran and colleagues12 found that decreased patient satisfaction was associated with postoperative pain and dysfunction. Furthermore, willingness to recommend surgery to another person was significantly related to patient satisfaction. In the present study, we did not correlate preoperative expectations with postoperative outcome scores or evaluate the effect of other known factors on RCR outcomes. Our main goal was to understand ARCR patients’ preoperative and postoperative evaluations of the importance of pain relief and strength return relative to each other. Improved understanding of patients’ expectations will allow us to identify disparities between expectations and outcomes.
Our study had several limitations. First, our questionnaire was not validated. However, we used it only as an assessment tool, to collect data, and do not propose using it to assess ARCR outcomes. Second, objective strength measurements were not performed, before or after surgery, and therefore patients’ perceptions of weakness were not tested. Third, we did not correlate preoperative or postoperative shoulder outcome scores with patients’ expectations. Our intention was to understand how ARCR patients rate the importance of pain relief and strength return relative to each other. Fourth, we did not correlate patients’ expectations of strength return and pain relief with preoperative tear size or postoperative retear status.
Our observational study results showed that, before undergoing ARCR, most patients valued postoperative pain relief and strength return equally. However, there was an overall preference for strength return over pain relief. Furthermore, this preference held up irrespective of age, sex, sports involvement, or preoperative symptom severity. These findings add to our understanding of patients’ preoperative expectations of ARCR.
Am J Orthop. 2017;46(4):E244-E250. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
Take-Home Points
- Pain relief and return of strength are important satisfaction variables for patients undergoing ARCR.
- About half of patients (48.3% before and 55% after ARCR) consider pain relief and strength return equally important.
- Overall, patients rate strength return above pain relief in the long term.
- Increasing age is associated with a stronger preference for pain relief.
- Improved understanding of patient expectations after ARCR will promote meaningful changes in patient satisfaction.
A rotator cuff tear (RCT) can cause significant pain, weakness, stiffness, and loss of function in the shoulder. In most patients, arthroscopic rotator cuff repair (ARCR) provides significant and reproducible pain relief and variable return of shoulder strength and function.1-4 ARCR outcomes are well described and well represented by validated outcome measures.5-9 However, these outcomes do not always correlate with patient satisfaction. For example, after ARCR, 2 patients with similar outcome scores may have different satisfaction levels.
Patient satisfaction involves multiple factors and varies with the patient’s preoperative expectations and the degree to which the surgery matches the patient’s desired outcomes.10-15 In clinical studies, Tashjian and colleagues,10 Henn and colleagues,11 and O’Holleran and colleagues12 found patient satisfaction correlated most highly with postoperative shoulder pain, shoulder function, general health status, and outcome scores. However, our understanding of patients’ desired outcomes and expectations of ARCR is limited, particularly regarding the importance of pain relief and strength return relative to each other. We believe patients’ preoperative expectations are influenced by their self-assessments of symptom severity and by their understanding of the outcomes of surgical procedures and of the information they receive from their surgeons during preoperative evaluation.
We conducted an observational study to determine patients’ preoperative preferences and the importance of post-ARCR pain relief and strength return relative to each other. After surgery, preferences and ratings of pain relief and strength return were reevaluated to determine if they were altered by outcomes. We also studied the influence of multiple factors, including severity of preoperative symptoms (pain, weakness), age, sex, occupation, and active sports involvement, on patients’ preoperative ratings of the importance of post-ARCR improvements in pain relief and strength return. We hypothesized that patients would vary in how they preoperatively value and desire post-ARCR pain relief and strength return.
Materials and Methods
The simple shoulder questionnaire (Figure) designed for this study had 12 items. Patients subjectively assessed the severity of their symptoms (pain level, shoulder weakness) and rated the importance of both pain relief and strength return to their occupational and personal life.
Before patients underwent surgery for symptomatic suspected RCTs, they were approached to participate in this prospective study. Sixty-five patients provided informed consent on forms approved by an Institutional Review Board. Inclusion criteria were suspected unilateral rotator cuff pathology and willingness to participate. Of the 65 patients, 60 underwent ARCR without another procedure, such as shoulder instability repair, SLAP (superior labrum anterior-to-posterior) repair, or distal clavicle excision; the other 5 patients elected nonoperative treatment and were excluded from review. At a mean (SD) follow-up of 5.2 (0.2) years, the 60 patients who had surgery completed the questionnaire again and rated the importance of pain relief and strength return relative to each other.
Patients with RCTs were divided according to age, sex, shoulder dominance, occupation type, and active sports involvement. Standard definitions for occupation types were used: blue-collar, manual labor jobs; white-collar, salaried/educated positions; and retired.
Matched-pairs t tests were used to compare preoperative and postoperative continuous variables (strength return preference, pain relief preference, SPD). One-way analysis of variance (ANOVA) was used to compare SPD, a continuous variable, across categorical variables (sex, shoulder dominance, active sports involvement), and bivariate regression was used to relate continuous variables (age, SPD). In cases involving more than 2 groups (occupation types), the Tukey honestly significant difference (HSD) test was used to evaluate intergroup differences. Statistical significance was set at P < .05.
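As a rough illustration of the matched-pairs comparison described above, the sketch below computes a per-patient SPD and a paired t statistic using only the Python standard library. It assumes SPD is the strength-return rating minus the pain-relief rating, which this section does not spell out. The function names `spd` and `paired_t` are hypothetical, and the ratings are made-up placeholder values, not data from the study.

```python
import math
import statistics

def spd(strength_rating, pain_rating):
    """Assumed definition: strength-return rating minus pain-relief rating."""
    return strength_rating - pain_rating

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a matched-pairs t test."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# Hypothetical importance ratings (0-10) for five patients; NOT study data.
pre_strength, pre_pain = [9, 10, 8, 9, 7], [8, 10, 9, 6, 7]
post_strength, post_pain = [9, 9, 8, 10, 8], [9, 8, 8, 7, 6]

pre_spd = [spd(s, p) for s, p in zip(pre_strength, pre_pain)]
post_spd = [spd(s, p) for s, p in zip(post_strength, post_pain)]
t_stat, dof = paired_t(pre_spd, post_spd)
print(f"t = {t_stat:.2f}, df = {dof}")
```

The P value would then be read from a t distribution with `dof` degrees of freedom, for example with a statistics package. The ANOVA and Tukey HSD steps are not sketched here.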
Results
ARCR Outcomes
After ARCR, there was significant improvement in patient-reported pain and subjective strength scores. Mean (SD) pain score improved from 5.9 (2.3) to 1.3 (2.3) after ARCR (P < .001), and mean (SD) strength improved from 46% (22%) of normal to 84% (17%) of normal (P < .001).
Importance of Post-ARCR Pain Relief and Strength Return
Analysis of preoperative questionnaire responses revealed that, of 60 patients, 29 (48.3%) considered pain relief and strength return equally important, 20 (33.3%) rated postoperative strength return as more important, and 11 (18.3%) rated pain relief as more important than strength return. After a mean (SD) follow-up of 5.2 (0.2) years, 33 patients (55%) valued pain relief and strength return as equally important, 17 patients (28.3%) preferred strength return, and 10 patients (16.7%) preferred pain relief.
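The percentages in this paragraph can be re-derived from the reported counts. The short check below is only an arithmetic sketch; the helper name `percent` and the dictionary keys are illustrative choices, not part of the study.

```python
def percent(count, total, digits=1):
    """Percentage of total, rounded to the given number of decimals."""
    return round(100 * count / total, digits)

TOTAL = 60  # patients who underwent ARCR

# Counts as reported in the paragraph above.
preop = {"equal": 29, "strength": 20, "pain": 11}
postop = {"equal": 33, "strength": 17, "pain": 10}

for label, counts in (("preoperative", preop), ("postoperative", postop)):
    # Every patient falls in exactly one preference category.
    assert sum(counts.values()) == TOTAL
    shares = {k: percent(v, TOTAL) for k, v in counts.items()}
    print(label, shares)
```

Both sets of counts sum to 60, so the three preference categories account for every patient at each time point.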
Overall, patients rated strength return significantly higher than pain relief both before surgery (mean [SD], 9.2 [2.1] vs 8.6 [2.3]; P = .02) and after surgery (8.9 [1.9] vs 8.2 [3.1]; P = .03) (Table 1).
Subgroup Analyses
Sex and Age. Of the 60 patients, 43 were male and 17 female. Mean (SD) preoperative SPD was 1.0 (2.7) for males and 0.7 (2.3) for females; the difference was not significant (P = .61). After surgery, females emphasized strength return over pain relief more than males did: Mean (SD) SPD was significantly higher (P = .04) for females, 1.7 (3.0), than for males, 0.4 (2.5). There were no preoperative–postoperative differences (P = .33) for males or females (Table 2).
Hand Dominance. RCT was found in the dominant shoulder of 31 patients (52%). Shoulder dominance did not affect SPD: Mean (SD) preoperative SPD was 1.3 (2.3) for dominant shoulders and 0.5 (2.7) for nondominant shoulders (P = .21), and postoperative SPD was 0.7 (2.6) for dominant and 0.9 (2.8) for nondominant (P = .79). SPD did not change from before surgery to after surgery for dominant (P = .14) or nondominant (P = .28) shoulders (Table 2).
Active Sports Participation. Thirty-two patients (53%) reported preoperative involvement in sports; 35 (58%) reported postoperative involvement (P = .37). Mean (SD) preoperative SPD was 1.4 (3.0) for involved patients and 0.3 (1.7) for uninvolved patients (P = .09), and postoperative SPD was 0.6 (2.8) for involved patients and 1.0 (2.6) for uninvolved patients (P = .53). SPD did not change from before surgery to after surgery for involved (P = .17) or uninvolved (P = .26) patients (Table 2).
Occupation Type. There were 9 blue-collar workers (15%), 32 white-collar workers (53%), and 19 retirees (32%). Mean (SD) preoperative SPD was 2.8 (4.2) for blue-collar workers, 1.2 (2.1) for white-collar workers, and –0.4 (0.4) for retirees. There were no significant differences in preoperative SPD between blue-collar and white-collar workers (P = .19) or between white-collar workers and retirees (P = .06), but there was a significant difference between blue-collar workers and retirees (P = .004). Mean (SD) postoperative SPD was 1.3 (2.7) for blue-collar workers, 1.2 (3.1) for white-collar workers, and –0.3 (1.6) for retirees. There were no significant differences between blue-collar and white-collar workers (P = .99), white-collar workers and retirees (P = .13), or blue-collar workers and retirees (P = .3).
Discussion
In this study, we wanted to determine patients’ pre- and postoperative preferences for pain relief and strength return after ARCR. Preoperative and postoperative preference analysis of the 60 patients who underwent ARCR revealed that about half valued pain relief and strength return equally. However, overall, ratings for strength return were higher in the long term after ARCR, irrespective of age, sex, preoperative levels of shoulder pain and weakness, and preoperative and postoperative sports involvement.
Patients’ preoperative expectations are a function of their assessment of their symptoms, their perceptions of expected surgical outcomes, and their understanding of preoperative discussion with their surgeons. In this study, patients self-assessed their shoulder symptoms and their effect on their occupational and personal life. They also rated the importance of post-ARCR pain relief and strength return relative to each other. To assess whether surgical outcomes affected perceptions of pain relief and strength return, patients completed the questionnaire before and after surgery. Overall, patients rated postoperative strength return above pain relief at long-term follow-up (5 years).
Subgroup analysis revealed a weak positive correlation between patient-reported preoperative pain scores and ratings of the importance of pain relief after surgery, but there was no correlation between postoperative pain scores and ratings of the importance of pain relief after surgery. This finding was surprising because we thought pain relief would be more important than strength return for patients with higher pain scores.1-3,16-21 We would like to clarify a point about this study: That patients preferred strength return over pain relief does not mean they did not care about pain relief. A substantial subset of patients (~50%) valued pain relief and strength return equally. In rotator cuff pathology, pain and weakness are to an extent interrelated. Shoulder pain that limits a patient’s ability to perform a strenuous task can be perceived as shoulder weakness, which may explain why, despite having higher pain scores, patients preferred strength return over pain relief. Increasing age showed a positive correlation with preference for pain relief, which explains the finding that retirees preferred pain relief over strength return. We used SPD to express the preference for strength return over pain relief before and after ARCR. Unfortunately, SPD may not be used to quantitatively define the preference for strength return over pain relief.
Patient satisfaction after RCR involves multiple factors and has been well studied. In a retrospective analysis of 112 patients, Tashjian and colleagues10 found that patient satisfaction was affected by preoperative expectations, marital status, disability status, preoperative pain function, and general health status after RCR. They also found a positive but weak correlation between patient satisfaction and functional outcome scores, including visual analog scale (VAS), Simple Shoulder Test (SST), and Disabilities of the Arm, Shoulder, and Hand (DASH) scores. Henn and colleagues11 evaluated 125 patients who underwent primary RCR for a chronic RCT. Higher preoperative expectations correlated with better postoperative VAS, SST, DASH, and Short Form 36 performance, irrespective of worker compensation status, symptom duration, number of patient comorbidities, tear size, repair technique, and number of previous operations. In a prospective cohort analysis of 311 RCR patients, O’Holleran and colleagues12 found that decreased patient satisfaction was associated with postoperative pain and dysfunction. Furthermore, willingness to recommend surgery to another person was significantly related to patient satisfaction. In the present study, we did not correlate preoperative expectations with postoperative outcome scores or evaluate the effect of other known factors on RCR outcomes. Our main goal was to understand ARCR patients’ preoperative and postoperative evaluations of the importance of pain relief and strength return relative to each other. Improved understanding of patients’ expectations will allow us to identify disparities between expectations and outcomes.
Our study had several limitations. First, our questionnaire was not validated. However, we used it only as an assessment tool, to collect data, and do not propose using it to assess ARCR outcomes. Second, objective strength measurements were not performed, before or after surgery, and therefore patients’ perceptions of weakness were not tested. Third, we did not correlate preoperative or postoperative shoulder outcome scores with patients’ expectations. Our intention was to understand how ARCR patients rate the importance of pain relief and strength return relative to each other. Fourth, we did not correlate patients’ expectations of strength return and pain relief with preoperative tear size or postoperative retear status.
Our observational study results showed that, before undergoing ARCR, most patients valued postoperative pain relief and strength return equally. However, there was an overall preference for strength return over pain relief. Furthermore, this preference held up irrespective of age, sex, sports involvement, or preoperative symptom severity. These findings add to our understanding of patients’ preoperative expectations of ARCR.
Am J Orthop. 2017;46(4):E244-E250. Copyright Frontline Medical Communications Inc. 2017. All rights reserved.
1. Cole BJ, McCarty LP 3rd, Kang RW, Alford W, Lewis PB, Hayden JK. Arthroscopic rotator cuff repair: prospective functional outcome and repair integrity at minimum 2-year follow-up. J Shoulder Elbow Surg. 2007;16(5):579-585.
2. Huijsmans PE, Pritchard MP, Berghs BM, van Rooyen KS, Wallace AL, de Beer JF. Arthroscopic rotator cuff repair with double-row fixation. J Bone Joint Surg Am. 2007;89(6):1248-1257.
3. Wilson F, Hinov V, Adams G. Arthroscopic repair of full-thickness tears of the rotator cuff: 2- to 14-year follow-up. Arthroscopy. 2002;18(2):136-144.
4. Denard PJ, Jiwani AZ, Lädermann A, Burkhart SS. Long-term outcome of a consecutive series of subscapularis tendon tears repaired arthroscopically. Arthroscopy. 2012;28(11):1587-1591.
5. Richards RR, An KN, Bigliani LU, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg. 1994;3(6):347-352.
6. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res. 1991;4(4):143-149.
7. Constant CR, Murley AH. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res. 1987;(214):160-164.
8. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg. 2002;11(6):587-594.
9. Romeo AA, Bach BR Jr, O’Halloran KL. Scoring systems for shoulder conditions. Am J Sports Med. 1996;24(4):472-476.
10. Tashjian RZ, Bradley MP, Tocci S, Rey J, Henn RF, Green A. Factors influencing patient satisfaction after rotator cuff repair. J Shoulder Elbow Surg. 2007;16(6):752-758.
11. Henn RF 3rd, Kang L, Tashjian RZ, Green A. Patients’ preoperative expectations predict the outcome of rotator cuff repair. J Bone Joint Surg Am. 2007;89(9):1913-1919.
12. O’Holleran JD, Kocher MS, Horan MP, Briggs KK, Hawkins RJ. Determinants of patient satisfaction with outcome after rotator cuff surgery. J Bone Joint Surg Am. 2005;87(1):121-126.
13. Namdari S, Donegan RP, Chamberlain AM, Galatz LM, Yamaguchi K, Keener JD. Factors affecting outcome after structural failure of repaired rotator cuff tears. J Bone Joint Surg Am. 2014;96(2):99-105.
14. Nho SJ, Brown BS, Lyman S, Adler RS, Altchek DW, MacGillivray JD. Prospective analysis of arthroscopic rotator cuff repair: prognostic factors affecting clinical and ultrasound outcome. J Shoulder Elbow Surg. 2009;18(1):13-20.
15. Sonnabend DH, Watson EM. Structural factors affecting the outcome of rotator cuff repair. J Shoulder Elbow Surg. 2002;11(3):212-218.
16. Boileau P, Brassart N, Watkinson DJ, Carles M, Hatzidakis AM, Krishnan SG. Arthroscopic repair of full-thickness tears of the supraspinatus: does the tendon really heal? J Bone Joint Surg Am. 2005;87(6):1229-1240.
17. Sugaya H, Maeda K, Matsuki K, Moriishi J. Repair integrity and functional outcome after arthroscopic double-row rotator cuff repair. A prospective outcome study. J Bone Joint Surg Am. 2007;89(5):953-960.
18. DeFranco MJ, Bershadsky B, Ciccone J, Yum JK, Iannotti JP. Functional outcome of arthroscopic rotator cuff repairs: a correlation of anatomic and clinical results. J Shoulder Elbow Surg. 2007;16(6):759-765.
19. Galatz LM, Ball CM, Teefey SA, Middleton WD, Yamaguchi K. The outcome and repair integrity of completely arthroscopically repaired large and massive rotator cuff tears. J Bone Joint Surg Am. 2004;86(2):219-224.
20. Harryman DT 2nd, Mack LA, Wang KY, Jackins SE, Richardson ML, Matsen FA 3rd. Repairs of the rotator cuff. Correlation of functional results with integrity of the cuff. J Bone Joint Surg Am. 1991;73(7):982-989.
21. Romeo AA, Hang DW, Bach BR Jr, Shott S. Repair of full thickness rotator cuff tears. Gender, age, and other factors affecting outcome. Clin Orthop Relat Res. 1999;(367):243-255.
1. Cole BJ, McCarty LP 3rd, Kang RW, Alford W, Lewis PB, Hayden JK. Arthroscopic rotator cuff repair: prospective functional outcome and repair integrity at minimum 2-year follow-up. J Shoulder Elbow Surg. 2007;16(5):579-585.
2. Huijsmans PE, Pritchard MP, Berghs BM, van Rooyen KS, Wallace AL, de Beer JF. Arthroscopic rotator cuff repair with double-row fixation. J Bone Joint Surg Am. 2007;89(6):1248-1257.
3. Wilson F, Hinov V, Adams G. Arthroscopic repair of full-thickness tears of the rotator cuff: 2- to 14-year follow-up. Arthroscopy. 2002;18(2):136-144.
4. Denard PJ, Jiwani AZ, Lädermann A, Burkhart SS. Long-term outcome of a consecutive series of subscapularis tendon tears repaired arthroscopically. Arthroscopy. 2012;28(11):1587-1591.
5. Richards RR, An KN, Bigliani LU, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg. 1994;3(6):347-352.
6. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res. 1991;4(4):143-149.
7. Constant CR, Murley AH. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res. 1987;(214):160-164.
8. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg. 2002;11(6):587-594.
9. Romeo AA, Bach BR Jr, O’Halloran KL. Scoring systems for shoulder conditions. Am J Sports Med. 1996;24(4):472-476.
10. Tashjian RZ, Bradley MP, Tocci S, Rey J, Henn RF, Green A. Factors influencing patient satisfaction after rotator cuff repair. J Shoulder Elbow Surg. 2007;16(6):752-758.
11. Henn RF 3rd, Kang L, Tashjian RZ, Green A. Patients’ preoperative expectations predict the outcome of rotator cuff repair. J Bone Joint Surg Am. 2007;89(9):1913-1919.
12. O’Holleran JD, Kocher MS, Horan MP, Briggs KK, Hawkins RJ. Determinants of patient satisfaction with outcome after rotator cuff surgery. J Bone Joint Surg Am. 2005;87(1):121-126.
13. Namdari S, Donegan RP, Chamberlain AM, Galatz LM, Yamaguchi K, Keener JD. Factors affecting outcome after structural failure of repaired rotator cuff tears. J Bone Joint Surg Am. 2014;96(2):99-105.
14. Nho SJ, Brown BS, Lyman S, Adler RS, Altchek DW, MacGillivray JD. Prospective analysis of arthroscopic rotator cuff repair: prognostic factors affecting clinical and ultrasound outcome. J Shoulder Elbow Surg. 2009;18(1):13-20.
15. Sonnabend DH, Watson EM. Structural factors affecting the outcome of rotator cuff repair. J Shoulder Elbow Surg. 2002;11(3):212-218.
16. Boileau P, Brassart N, Watkinson DJ, Carles M, Hatzidakis AM, Krishnan SG. Arthroscopic repair of full-thickness tears of the supraspinatus: does the tendon really heal? J Bone Joint Surg Am. 2005;87(6):1229-1240.
17. Sugaya H, Maeda K, Matsuki K, Moriishi J. Repair integrity and functional outcome after arthroscopic double-row rotator cuff repair. A prospective outcome study. J Bone Joint Surg Am. 2007;89(5):953-960.
18. DeFranco MJ, Bershadsky B, Ciccone J, Yum JK, Iannotti JP. Functional outcome of arthroscopic rotator cuff repairs: a correlation of anatomic and clinical results. J Shoulder Elbow Surg. 2007;16(6):759-765.
19. Galatz LM, Ball CM, Teefey SA, Middleton WD, Yamaguchi K. The outcome and repair integrity of completely arthroscopically repaired large and massive rotator cuff tears. J Bone Joint Surg Am. 2004;86(2):219-224.
20. Harryman DT 2nd, Mack LA, Wang KY, Jackins SE, Richardson ML, Matsen FA 3rd. Repairs of the rotator cuff. Correlation of functional results with integrity of the cuff. J Bone Joint Surg Am. 1991;73(7):982-989.
21. Romeo AA, Hang DW, Bach BR Jr, Shott S. Repair of full thickness rotator cuff tears. Gender, age, and other factors affecting outcome. Clin Orthop Relat Res. 1999;(367):243-255.