Inpatient Staffing in Pediatric Programs
Resident duty hour restrictions were initially implemented in New York in 1989 with New York State Code 405 in response to a patient death in a New York City Emergency Department.1 This case initiated an evaluation of potential risks to patient safety when residents were inadequately supervised and overfatigued. In 2003, the Accreditation Council for Graduate Medical Education (ACGME) implemented resident duty hour restrictions nationally due to concerns for patient safety and quality of care.2 These restrictions involved the implementation of the 80‐hour work week (averaged over 4 weeks), a maximum duty length of 30 hours, and prescriptive supervision guidelines. In December 2008, the Institute of Medicine (IOM) proposed additional changes to further restrict resident duty hours, which also included overnight protected sleep periods and additional days off per month.3 The ACGME responded by mandating new resident duty hour restrictions in October 2010, which will be implemented in July 2011. The ACGME's new changes include a maximum duty length of 16 hours for residents in their first year of training (PGY‐1). Residents in their second year of training (PGY‐2) and above may work a maximum of 24 hours with an additional 4 hours for transition of care and resident education. The ACGME strongly recommends strategic napping but does not have a protected overnight sleep period in place4 (Table 1).
Requirement | Current Guidelines | IOM Proposed Changes (December 2008) | ACGME Mandated Changes (October 2010) |
---|---|---|---|
Maximum hours of work per week | 80 hr averaged over 4 wk | 80 hr averaged over 4 wk | 80 hr averaged over 4 wk |
Maximum duty length | 30 hr (admitting patients for up to 24 hr, then additional 6 hr for transition of care) | 30 hr with 5 hr protected sleep period (admitting patients for up to 16 hr), or 16 hr with no protected sleep period | PGY‐1 residents, 16 hr; PGY‐2 residents, 24 hr with additional 4 hr for transition of care |
Strategic napping | None | 5 hr protected sleep period for 30 hr shifts | Highly recommended after 16 hr of continuous duty |
Time off between duty periods | 10 hr after shift | 10 hr after day shift; 12 hr after night shift; 14 hr after 30 hr shifts | Recommend 10 hr, but must have at least 8 hr off; in their final years, residents can have less than 8 hr |
Maximum consecutive nights of night float | None | 4 consecutive nights maximum | 6 consecutive nights maximum |
Frequency of in‐house call | Every third night, on average | Every third night, no averaging | Every third night, no averaging |
Days off per month | 4 days off | 5 days off, at least one 48 hr period per month | 4 days off |
Moonlighting restrictions | Internal moonlighting counts against 80 hr cap | Both internal and external moonlighting count against 80 hr cap | Both internal and external moonlighting count against 80 hr cap |
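The duty length and weekly hour limits in the ACGME column of Table 1 can be read as simple rules. The following is a minimal illustrative sketch of those two rules only, not an official compliance tool. The function and constant names are hypothetical, and it assumes that PGY‐1 residents receive no additional transition‐of‐care time, since the table lists none.

```python
from typing import Sequence, Tuple

# Limits taken from the ACGME (October 2010) column of Table 1.
WEEKLY_CAP_HOURS = 80   # maximum weekly hours, averaged over 4 weeks
AVERAGING_WEEKS = 4


def max_duty_hours(pgy_level: int) -> Tuple[int, int]:
    """Return (maximum duty length, additional transition-of-care hours)."""
    if pgy_level < 1:
        raise ValueError("pgy_level must be 1 or greater")
    if pgy_level == 1:
        # Assumption: no extra transition time for PGY-1 (none is listed).
        return (16, 0)
    # PGY-2 and above: 24 hr plus up to 4 hr for transition of care.
    return (24, 4)


def shift_within_limit(pgy_level: int, duty_hours: float,
                       transition_hours: float = 0.0) -> bool:
    """Check one duty period against the limits for the given PGY level."""
    max_duty, max_transition = max_duty_hours(pgy_level)
    return duty_hours <= max_duty and transition_hours <= max_transition


def weekly_average_within_cap(weekly_hours: Sequence[float]) -> bool:
    """Check the 80 hr cap, averaged over exactly 4 weeks of hours."""
    if len(weekly_hours) != AVERAGING_WEEKS:
        raise ValueError("expected hours for exactly 4 weeks")
    return sum(weekly_hours) / AVERAGING_WEEKS <= WEEKLY_CAP_HOURS
```

For example, `shift_within_limit(1, 20)` is false because a 20 hr shift exceeds the 16 hr PGY‐1 limit, while `weekly_average_within_cap([88, 72, 80, 80])` is true because the 4‐week average is exactly 80 hr.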
There is growing concern regarding the impact of these new resident duty hour restrictions on the coverage of inpatient services, particularly during the overnight period. To our knowledge, there is no published national data on how pediatric inpatient teaching services are staffed at night. The objective of this study was to survey the current landscape of pediatric resident coverage of noncritical care inpatient teaching services. In addition, we sought to explore how changes in work hour restrictions might affect the role of pediatric hospitalists in training programs.
METHODS
We developed an institutional review board (IRB)‐approved Web‐based electronic survey. The survey consisted of 17 questions. The survey obtained information regarding the demographics of the program including: number of residents, daily patient census per ward intern, information regarding staff‐only pediatric ward services, overnight coverage, and current attending in‐house overnight coverage (see Appendix). We also examined the prevalence of pediatric hospitalists in training programs, their current role in staffing patients, and how that role may change with the implementation of additional resident duty hour restrictions. Initially, the survey was reviewed and tested by several pediatric hospitalists and program directors. It was then reviewed and approved by the Association of Pediatric Program Directors (APPD) research task force. The survey was sent out to 196 US pediatric residency programs via the APPD listserv in January 2010. Program directors were given the option of completing it themselves or specifically designating someone else to complete it. Two reminders were sent. We then sent an additional request for program participation on the pediatric hospitalist listserv. All data was collected by February 2010.
RESULTS
One hundred twenty unique responses were received (61% of total pediatric residency programs). As of 2009, this represented 5201 pediatric residents (58% of total pediatric residents). The average program size was 43 residents (range: 12‐156 residents, median 43). The average daily patient census per ward intern during daytime hours was 6.65 patients (range: 3‐17, median 6). Twenty percent of training programs had staff‐only (no residents) pediatric ward services during daytime hours. In the programs with both staff‐only and resident pediatric ward services, only 19% of patients were covered by the staff‐only teams and 81% of patients were covered by resident teams.
During the overnight period, 86% of resident teams did not have caps on the number of new patient admissions. An average of 3.6 providers per training program were in‐house overnight to accept patient admissions to pediatric wards. Ninety‐four percent of these providers in‐house were residents (399 residents in‐house/425 total providers in‐house each night).
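As a quick arithmetic check, the reported proportions can be recomputed from the counts given in the text: 120 responses from 196 programs surveyed, and 399 residents among 425 total in‐house providers. The sketch below is illustrative only, and the helper name `percent` is hypothetical.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts as reported in the Methods and Results sections.
response_rate = percent(120, 196)          # programs responding
resident_share = percent(399, 425)         # residents among in-house providers
non_resident_providers = 425 - 399         # in-house providers who were not residents
non_resident_share = percent(non_resident_providers, 425)

print(f"Response rate: {response_rate:.1f}%")                  # 61.2%
print(f"Residents among providers: {resident_share:.1f}%")     # 93.9%
print(f"Non-resident providers: {non_resident_providers} "
      f"({non_resident_share:.1f}%)")                          # 26 (6.1%)
```

These values round to the 61% and 94% reported above and leave 26 non‐resident providers, about 6% of the total.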
Twenty‐five percent of the training programs that responded to the survey had pediatric hospitalist attendings in‐house at night. This included both overnight and partial nights (ie, until midnight). Other attendings in‐house at night included neonatal intensive care unit (NICU) attendings (53% of programs), pediatric intensive care unit (PICU) attendings (46% of programs), Pediatric Emergency Medicine attendings (65% of programs), and Pediatric Surgery attendings (6.4% of programs). Twenty‐two percent of programs had no in‐house attendings at night (Figure 1).
Pediatric hospitalists were involved with 84% (n = 97) of training programs. Sixty percent (n = 58) of the pediatric hospitalist teams were staffed with both teaching attendings and residents. Fourteen percent (n = 14) of the pediatric hospitalist teams did not involve residents (staff‐only) and 25% (n = 25) had both types of teams. Specifically, of the programs that had pediatric hospitalists, 20% (n = 19) of them had hospitalist attendings in‐house 24 hours per day and 13% (n = 12) of teams had hospitalist attendings in‐house into the evening hours for a varying amount of time. Of the programs with hospitalist attendings in‐house 24 hours per day, 52% (n = 11) had started this coverage within the past 3 years.
Looking towards the future, and prior to the enactment of the October 2010 ACGME standards, 31% (n = 35) of the training programs that lacked 24/7 hospitalist in‐house coverage in January 2010 anticipated adding this level of coverage within the next 5 years. Notably, 70% (n = 81) of training programs felt that further resident work hour restrictions, which have since been enacted, would likely require the addition of more hospitalist attendings at night. Our survey allowed program directors to make open‐ended comments on how further work hour restrictions may change inpatient staffing in noncritical care inpatient teaching services.
DISCUSSION
To our knowledge, this was the first national study of pediatric resident coverage in noncritical care inpatient teaching services. While there was significant variation in how inpatient teaching services were covered across these programs, in January 2010, residents were involved in the majority of patient care with only 20% of programs having attending‐only hospitalist teams during the daytime. During the overnight period, the proportion of patient care provided by residents became even more significant with residents representing 94% of the total in‐house providers accepting new admissions. While pediatric hospitalists were prevalent at these training programs, their role in direct patient care overnight was limited. Only 6% of total in‐house providers accepting admissions at night were pediatric hospitalists.
The comments made by program directors are representative of the overall concerns regarding changes to resident work hours (see Table 2). In its position statement on the IOM recommendations, the Association of Pediatric Program Directors raised the concern that the recommendations of the IOM Committee are intended to enhance patient safety without appropriate consideration for the educational and professional development of trainees.5 While the newly mandated ACGME standards are different from the previous IOM recommendations, it is clear that there will be very significant changes to accommodate these new standards. Our study was done prior to the ACGME's new standards. At the time of the survey, fewer than a third of programs were anticipating the addition of 24/7 pediatric hospitalist coverage; however, 70% of programs felt that additional hospitalists would be needed if resident work hours were further restricted. This is a significant increase in the previously anticipated need for overnight attending hospitalist coverage, especially in light of the further restrictions mandated by the ACGME. We know that the response of New York State programs to the 405 regulations varied by program size, but all made significant changes to accommodate the new standards.6 It is clear that many program directors nationally are anticipating significant changes to their residencies when these new restrictions are enacted. The respondents in our survey felt that pediatric hospitalists are going to have to play an even bigger role at night when additional resident work hour restrictions are put into place.
Table 2. Comments from program directors on further resident work hour restrictions |
▪ If the new duty hours are mandated, we would have to go to a night float system to be in compliance. This would require more residents and we do not have the funding to hire more residents. |
▪ Restrictions will be costly. It will increase shift work mentality, and increase pt errors due to handovers. If these (work restrictions) are not applied to all doctors (neurosurgeons, ICU doctors), they should not apply to resident doctors. |
▪ The additional restrictions may make the hospital consider giving up its residency program in favor of a hospitalist‐only model. |
▪ We do not have enough residents to care for the current patient load. |
▪ Additional work hour restrictions will lead to more hand‐over care and less ownership of patients by residents who identify themselves as primary patient physicians. Both situations are associated with increased rates of complications and possible sentinel events. |
▪ If the hours are reduced, the hospital will be forced to hire physicians for the care of patients. The administration of the hospital is now beginning to ask why they should financially support the training program if the residents are not providing a substantial portion of the hospital care for the patients. |
Pediatric hospital medicine remains a rapidly growing field.7 Eighty‐four percent of pediatric training programs utilize pediatric hospitalists. Over 60% of these pediatric hospitalist teams are involved in teaching teams with residents. While we did not directly study the supply and demand of pediatric hospitalists, there is some concern that, despite its rapid growth, the supply of pediatric hospitalists will not keep up with the demand when further resident work hour restrictions are implemented. At the time of submission, a cost analysis of the ACGME's new changes had not yet been published. There is data available based on the IOM's 2008 recommendations. A study by Nuckols and Escarce8 suggests that if the IOM's recommendations were implemented, the entire healthcare system nationally would have to develop and fill new full‐time positions equal to 5001 attending physicians, 5984 midlevel providers (nurse practitioners or physician assistants), 320 licensed vocational nurses, 229 nursing aides, and 45 laboratory technicians. This would be equivalent to adding an additional 8247 residency positions across all specialties.8‐10 While the ACGME's new mandated changes are different from the IOM's recommendations, they will also restrict resident duty hours, which we believe could lead to gaps in patient care requiring significant personnel changes in the healthcare system.
There are several limitations to our study. We did not study the role of pediatric subspecialty fellows and their involvement in pediatric inpatient services in these training programs. We also did not study the prevalence and use of resident night float systems. While night floats may be used in some programs, they may become more prevalent with the possible restriction of intern work hours to 16 hours. As with any survey, there remain both volunteer and nonresponse biases among the programs that decided to complete or disregard the survey. In addition, there remains some concern over the data collection after the survey was sent out to the hospitalist listserv. Pediatric hospitalists may have incorrectly filled out the data for their program after their program director had already completed the survey. We attempted to minimize this problem by specifically instructing hospitalists to encourage their program director to fill out the survey if they had not already done so. We also compared computer Internet Protocol (IP) addresses and actual program responses, before and after the hospitalist e‐mail was sent, in an attempt to minimize the chance of including duplicated responses from the same program. Lastly, the January 2010 survey predated the October 2010 ACGME response to the IOM recommendations, and the responses may be different now that the specific restrictions have been mandated with an actual implementation date.
CONCLUSIONS
This study shows that pediatric teaching services varied significantly in how they provided overnight coverage in 2010 prior to new ACGME recommendations. Overall, residents were providing the overwhelming majority of the patient care overnight in pediatric training programs. While hospitalists were prevalent in pediatric training programs, in 2010 they had limited roles in direct patient care at night. The ACGME has now mandated additional residency work hour restrictions to be implemented July 2011. With these restrictions, hospitalists will likely need to expand their services, and additional hospitalists will be needed to provide overnight coverage. It is unclear where those hospitalists will come from and what their role will be. It is also unclear what the impact of increased demand and changed job description will be on the continued evolution of the field of Pediatric Hospital Medicine.
Future work needs to be done to establish benchmarks for inpatient coverage. The benchmarks could include guidelines on balancing patient safety with resident education. This may also involve the implementation of resident night float models. Monitoring is needed of how changes in resident work hours and staffing affect coverage and, ultimately, how those changes affect patient and resident outcomes.
APPENDIX
INPATIENT STAFFING WITHIN PEDIATRIC RESIDENCY PROGRAMS SURVEY
|
Demographics |
How many residents are in your residency program? (total, categorical, Med‐Peds, other combined Peds) |
What is your average daily patient census per ward intern during daytime hours? |
Does your hospital have a staff‐only (no residents) pediatric ward service during the daytime hours? |
If your hospital has a staff‐only pediatric ward service, what are the proportion of patients cared for by residents vs staff‐only during daytime hours? |
Do your residents cap the number of new patient admissions at night? |
Providers in‐house overnight |
How many providers do you have in‐house at night until midnight/overnight to accept patient admissions to pediatric wards? (residents, hospitalists, nurse practitioners, other) |
Do you have attendings in‐house at night? (pediatric hospitalists, NICU, PICU, Peds EM, Peds Surgery, no attendings, other) |
Pediatric hospitalists |
Does your hospital have pediatric hospitalists? |
Are your pediatric hospitalist teams staffed by: (teaching attendings and residents, hospitalist‐staff only, both) |
If you have a staff‐only hospitalist team (no residents), how long has it been in existence? (less than 1 year, 1‐3 years, 4‐10 years, over 10 years) |
Are your hospitalist attendings in‐house: (daytime only, 24 hours/day, other) |
If your hospitalist attendings are in‐house 24/7, how many years has that coverage been available? (less than 1 year, 1‐3 years, 4‐10 years, over 10 years, not available) |
Future pediatric hospitalist coverage |
Do you anticipate that your hospital will be adding 24/7 hospitalist attending coverage? (next year, next 2 years, next 5 years, not anticipating adding coverage, 24/7 hospitalist coverage already in place) |
In your opinion, would further resident work hour restrictions make your hospital more likely to add additional hospitalist attendings at night? (very likely, somewhat likely, neutral, not likely) |
REFERENCES
- The Bell Commission: ethical implications for the training of physicians. Mt Sinai J Med. 2000;67(2):136–139.
- Restricted duty hours for surgeons and impact on residents quality of life, education, and patient care: a literature review. Patient Saf Surg. 2009;3(1):3.
- Institute of Medicine. Resident Duty Hours: Enhancing Sleep, Supervision, and Safety. Released December 2, 2008. Available at: http://www.iom.edu/Reports/2008/Resident‐Duty‐Hours‐Enhancing‐Sleep‐Supervision‐and‐Safety.aspx. Accessed September 20, 2009.
- ACGME 2010 Standards “Common Program Requirements.” Available at: http://acgme‐2010standards.org/pdf/Common_Program_Requirements_07012011.pdf. Accessed January 27, 2011.
- Association of Pediatric Program Directors. Association of Pediatric Program Directors (APPD) Position Statement in Response to the IOM Recommendations on Resident Duty Hours. 2009. Available at: http://www.appd.org/PDFs/APPD_IOM%20_Duty_Hours_Report_Position_Paper_4–30‐09.pdf. Accessed March 27, 2010.
- Lessons learned from New York state: fourteen years of experience with work hour limitations. Acad Med. 2005;80(5):467–472.
- Health care market trends and the evolution of hospitalist use and roles. J Gen Intern Med. 2005;20(2):101–107.
- Cost implications of reduced work hours and workloads for resident physicians. N Engl J Med. 2009;360:2202–2215.
- Revisiting duty‐hour length—IOM recommendations for patient safety and resident education. N Engl J Med. 2008;359:2633–2635.
- Resident duty hour restrictions: is less really more? J Pediatr. 2009;154:631–632.
Resident duty hour restrictions were initially implemented in New York in 1989 with New York State Code 405 in response to a patient death in a New York City Emergency Department.1 This case initiated an evaluation of potential risks to patient safety when residents were inadequately supervised and overfatigued. In 2003, the Accreditation Council for Graduate Medical Education (ACGME) implemented resident duty hours nationally due to concerns for patient safety and quality of care.2 These restrictions involved the implementation of the 80‐hour work week (averaged over 4 weeks), a maximum duty length of 30 hours, and prescriptive supervision guidelines. In December 2008, the Institute of Medicine (IOM) proposed additional changes to further restrict resident duty hours which also included overnight protected sleep periods and additional days off per month.3 The ACGME responded by mandating new resident duty hour restrictions in October 2010 which will be implemented in July 2011. The ACGME's new changes include a change in the maximum duty hour length for residents in their first year of training (PGY‐1) of 16 hours. Residents in their second year of training (PGY‐2) level and above may work a maximum of 24 hours with an additional 4 hours for transition of care and resident education. The ACGME strongly recommends strategic napping, but do not have a protected overnight sleep period in place4 (Table 1).
Current Guidelines | IOM Proposed Changes | ACGME Mandated Changes | |
---|---|---|---|
December 2008 | October 2010 | ||
| |||
Maximum hours of work per week | 80 hr averaged over 4 wk | 80 hr averaged over 4 wk | 80 hr averaged over 4 wk |
Maximum duty length | 30 hr (admitting patients for up to 24 hr, then additional 6 hr for transition of care) | 30 hr with 5 hr protected sleep period (admitting patients for up to 16 hr) | PGY‐1 residents, 16 hr |
Or | PGY‐2 residents, 24 hr with additional 4 hr for transition of care | ||
16 hr with no protected sleep period | |||
Strategic napping | None | 5 hr protected sleep period for 30 hr shifts | Highly recommended after 16 hr of continuous duty |
Time off between duty periods | 10 hr after shift | 10 hr after day shift | Recommend 10 hr, but must have at least 8 hr off |
12 hr after night shift | In their final years, residents can have less than 8 hr | ||
14 hr after 30 hr shifts | |||
Maximum consecutive nights of night float | None | 4 consecutive nights maximum | 6 consecutive nights maximum |
Frequency of in‐house call | Every third night, on average | Every third night, no averaging | Every third night, no averaging |
Days off per month | 4 days off | 5 days off, at least one 48 hr period per month | 4 days off |
Moonlighting restrictions | Internal moonlighting counts against 80 hr cap | Both internal and external moonlighting count against 80 hr cap | Both internal and external moonlighting count against 80 hr cap |
There is growing concern regarding the impact of these new resident duty hour restrictions on the coverage of inpatient services, particularly during the overnight period. To our knowledge, there is no published national data on how pediatric inpatient teaching services are staffed at night. The objective of this study was to survey the current landscape of pediatric resident coverage of noncritical care inpatient teaching services. In addition, we sought to explore how changes in work hour restrictions might affect the role of pediatric hospitalists in training programs.
METHODS
We developed an institutional review board (IRB)‐approved Web‐based electronic survey. The survey consisted of 17 questions. The survey obtained information regarding the demographics of the program including: number of residents, daily patient census per ward intern, information regarding staff‐only pediatric ward services, overnight coverage, and current attending in‐house overnight coverage (see Appendix). We also examined the prevalence of pediatric hospitalists in training programs, their current role in staffing patients, and how that role may change with the implementation of additional resident duty hour restrictions. Initially, the survey was reviewed and tested by several pediatric hospitalists and program directors. It was then reviewed and approved by the Association of Pediatric Program Director (APPD) research task force. The survey was sent out to 196 US pediatric residency programs via the APPD listserve in January 2010. Program directors were given the option of completing it themselves or specifically designating someone else to complete it. Two reminders were sent. We then sent an additional request for program participation on the pediatric hospitalist listserve. All data was collected by February 2010.
RESULTS
One hundred twenty unique responses were received (61% of total pediatric residency programs). As of 2009, this represented 5201 pediatric residents (58% of total pediatric residents). The average program size was 43 residents (range: 12‐156 residents, median 43). The average daily patient census per ward intern during daytime hours was 6.65 patients (range: 3‐17, median 6). Twenty percent of training programs had staff‐only (no residents) pediatric ward services during daytime hours. In the programs with both staff‐only and resident pediatric ward services, only 19% of patients were covered by the staff‐only teams and 81% of patients were covered by resident teams.
During the overnight period, 86% of resident teams did not have caps on the number of new patient admissions. An average of 3.6 providers per training program were in‐house overnight to accept patient admissions to pediatric wards. Ninety‐four percent of these providers in‐house were residents (399 residents in‐house/425 total providers in‐house each night).
Twenty‐five percent of the training programs that responded to the survey had pediatric hospitalist attendings in‐house at night. This included both overnight and partial nights (ie, until midnight). Other attendings in‐house at night include: neonatal intensive care unit (NICU) attendings (53% of programs), pediatric intensive care unit (PICU) attendings (46% of programs), Pediatric Emergency Medicine attendings (65% of programs), and Pediatric Surgery attendings (6.4% of programs). Twenty‐two percent of programs had no in‐house attendings at night (Figure 1).

Pediatric hospitalists were involved with 84% (n = 97) of training programs. Sixty percent (n = 58) of the pediatric hospitalist teams were staffed with both teaching attendings and residents. Fourteen percent (n = 14) of the pediatric hospitalist teams did not involve residents (staff‐only) and 25% (n = 25) had both types of teams. Specifically, of the programs that had pediatric hospitalists, 20% (n = 19) of them had hospitalist attendings in‐house 24 hours per day and 13% (n = 12) of teams had hospitalist attendings in‐house into the evening hours for a varying amount of time. Of the programs with hospitalist attendings in‐house 24 hours per day, 52% (n = 11) had started this coverage within the past 3 years.
Looking towards the future, and prior to the enactment of the October 2010 ACGME standards, 31% (n = 35) of the training programs that lacked 24/7 hospitalist in‐house coverage in January 2010 anticipated adding this level of coverage within the next 5 years. Notably, 70% (n = 81) of training programs felt that further resident work hour restrictions, which have since been enacted, would likely require the addition of more hospitalist attendings at night. Our survey allowed program directors to make open‐ended comments on how further work hour restrictions may change inpatient staffing in noncritical care inpatient teaching services.
DISCUSSION
To our knowledge, this was the first national study of pediatric resident coverage in noncritical care inpatient teaching services. While there was significant variation in how inpatient teaching services were covered across these programs, in January 2010, residents were involved in the majority of patient care with only 20% of programs having attending‐only hospitalist teams during the daytime. During the overnight period, the proportion of patient care provided by residents became even more significant with residents representing 94% of the total in‐house providers accepting new admissions. While pediatric hospitalists were prevalent at these training programs, their role in direct patient care overnight was limited. Only 6% of total in‐house providers accepting admissions at night were pediatric hospitalists.
The comments made by program directors are representative of the overall concerns regarding changes to resident work hours (see Table 2). In a position statement by the Association of Pediatric Program Directors in regards to the IOM recommendations, concerns were raised stating that the recommendations of the IOM Committee are intended to enhance patient safety without appropriate consideration for the educational and professional development of trainees.5 While the newly mandated ACGME standards are different than the previous IOM recommendations, it is clear that there will be very significant changes to accommodate these new standards. Our study was done prior to the new ACGME's standards. At the time of the survey, less than a third of programs were anticipating the addition of 24/7 pediatric hospitalist coverage; however, if resident work hours were further restricted, 70% of programs felt that additional hospitalists would be needed. This is a significant increase in the previously anticipated need for overnight attending hospitalist coverage, especially in light of the further restrictions mandated by the ACGME. We know that the response of New York State programs to the 405 regulations varied by program size, but all made significant changes to accommodate the new standards.6 It is clear that many program directors nationally are anticipating significant changes to their residencies when these new restrictions are enacted. The respondents in our survey felt that pediatric hospitalists are going to have to play an even bigger role at night when additional resident work hour restrictions are put into place.
|
▪ If the new duty hours are mandated, we would have to go to a night float system to be in compliance. This would require more residents and we do not have the funding to hire more residents. |
▪ Restrictions will be costly. It will increase shift work mentality, and increase pt errors due to handovers. If these (work restrictions) are not applied to all doctors (neurosurgeons, ICU doctors), they should not apply to resident doctors. |
▪ The additional restrictions may make the hospital consider giving up its residency program in favor of a hospitalist‐only model. |
▪ We do not have enough residents to care for the current patient load. |
▪ Additional work hour restrictions will lead to more hand‐over care and less ownership of patients by residents who identify themselves as primary patient physicians. Both situations are associated with increased rates of complications and possible sentinel events. |
▪ If the hours are reduced, the hospital will be forced to hire physicians for the care of patients. The administration of the hospital is now beginning to ask why they should financially support the training program if the residents are not providing a substantial portion of the hospital care for the patients. |
Pediatric hospital medicine remains a rapidly growing field.7 Eighty‐four percent of pediatric training programs utilize pediatric hospitalists. Over 60% of these pediatric hospitalist teams are involved in teaching teams with residents. While we did not directly study the supply and demand of pediatric hospitalists, there is some concern that even despite its rapid growth, the supply of pediatric hospitalists will not keep up with the demand when further resident work hours restrictions are implemented. At time of submission, a cost‐analysis has not yet been publicly published on the ACGME's new changes. There is data available based on the IOM's 2008 recommendations. A study by Nuckols and Escarce8 suggests that if the IOM's recommendations were implemented, the entire healthcare system nationally would have to develop and fill new full‐time positions equal to 5001 attending physicians, 5984 midlevel providers (nurse practitioners or physician assistants), 320 licensed vocational nurses, 229 nursing aides, and 45 laboratory technicians. This would be equivalent to adding an additional 8247 residency positions across all specialties.810 While the ACGME's new mandated changes are different than the IOM's recommendations, they will also restrict resident duty hours that we believe could lead to gaps in patient care requiring significant personnel changes in the healthcare system.
There are several limitations to our study. We did not study the role of pediatric subspecialty fellows and their involvement in pediatric inpatient services in these training programs. We also did not study the prevalence and use of resident night float systems. While night floats may be used in some programs, they may become more prevalent with the possible restriction in intern work hours down to 16 hours. As with any survey, there remains the potential for both volunteer and nonresponse bias among the programs that chose to complete or disregard the survey. In addition, there remains some concern over the data collected after the survey was sent out to the hospitalist listserve. Pediatric hospitalists may have incorrectly filled out the data for their program after their program director had already completed the survey. We attempted to minimize this problem by specifically instructing hospitalists to encourage their program director to fill out the survey if they had not already done so. We also compared computer Internet Protocol (IP) addresses and actual program responses, before and after the hospitalist e‐mail was sent, in an attempt to minimize the chance of including duplicated responses from the same program. Lastly, the January 2010 survey predated the October 2010 ACGME response to the IOM recommendations, and the responses may be different now that the specific restrictions have been mandated with an actual implementation date.
CONCLUSIONS
This study shows that pediatric teaching services varied significantly in how they provided overnight coverage in 2010, prior to the new ACGME recommendations. Overall, residents were providing the overwhelming majority of patient care overnight in pediatric training programs. While hospitalists were prevalent in pediatric training programs, in 2010 they had limited roles in direct patient care at night. The ACGME has now mandated additional residency work hour restrictions to be implemented in July 2011. With these restrictions, hospitalists will likely need to expand their services, and additional hospitalists will be needed to provide overnight coverage. It is unclear where those hospitalists will come from and what their role will be. It is also unclear what impact the increased demand and changed job description will have on the continued evolution of the field of Pediatric Hospital Medicine.
Future work needs to be done to establish benchmarks for inpatient coverage. The benchmarks could include guidelines on balancing patient safety with resident education. This may also involve the implementation of resident night float models. Monitoring is needed to determine how changes in resident work hours and staffing affect coverage and, ultimately, how those changes affect patient and resident outcomes.
APPENDIX
INPATIENT STAFFING WITHIN PEDIATRIC RESIDENCY PROGRAMS SURVEY
|
Demographics |
How many residents are in your residency program? (total, categorical, Med‐Peds, other combined Peds) |
What is your average daily patient census per ward intern during daytime hours? |
Does your hospital have a staff‐only (no residents) pediatric ward service during the daytime hours? |
If your hospital has a staff‐only pediatric ward service, what are the proportion of patients cared for by residents vs staff‐only during daytime hours? |
Do your residents cap the number of new patient admissions at night? |
Providers in‐house overnight |
How many providers do you have in‐house at night until midnight/overnight to accept patient admissions to pediatric wards? (residents, hospitalists, nurse practitioners, other) |
Do you have attendings in‐house at night? (pediatric hospitalists, NICU, PICU, Peds EM, Peds Surgery, no attendings, other) |
Pediatric hospitalists |
Does your hospital have pediatric hospitalists? |
Are your pediatric hospitalist teams staffed by: (teaching attendings and residents, hospitalist‐staff only, both) |
If you have a staff‐only hospitalist team (no residents), how long has it been in existence? (less than 1 year, 1‐3 years, 4‐10 years, over 10 years) |
Are your hospitalist attendings in‐house: (daytime only, 24 hours/day, other) |
If your hospitalist attendings are in‐house 24/7, how many years has that coverage been available? (less than 1 year, 1‐3 years, 4‐10 years, over 10 years, not available) |
Future pediatric hospitalist coverage |
Do you anticipate that your hospital will be adding 24/7 hospitalist attending coverage? (next year, next 2 years, next 5 years, not anticipating adding coverage, 24/7 hospitalist coverage already in place) |
In your opinion, would further resident work hour restrictions make your hospital more likely to add additional hospitalist attendings at night? (very likely, somewhat likely, neutral, not likely) |
- The Bell Commission: ethical implications for the training of physicians. Mt Sinai J Med. 2000;67(2):136–139.
- Restricted duty hours for surgeons and impact on residents quality of life, education, and patient care: a literature review. Patient Saf Surg. 2009;3(1):3.
- Institute of Medicine. Resident Duty Hours: Enhancing Sleep, Supervision, and Safety. Released December 02, 2008. Available at: http://www.iom.edu/Reports/2008/Resident-Duty-Hours-Enhancing-Sleep-Supervision-and-Safety.aspx. Accessed September 20, 2009.
- ACGME 2010 Standards “Common Program Requirements.” Available at: http://acgme-2010standards.org/pdf/Common_Program_Requirements_07012011.pdf. Accessed January 27, 2011.
- Association of Pediatric Program Directors. Association of Pediatric Program Directors (APPD) Position Statement in Response to the IOM Recommendations on Resident Duty Hours. 2009. Available at: http://www.appd.org/PDFs/APPD_IOM%20_Duty_Hours_Report_Position_Paper_4-30-09.pdf. Accessed March 27, 2010.
- Lessons learned from New York state: fourteen years of experience with work hour limitations. Acad Med. 2005;80(5):467–472.
- Health care market trends and the evolution of hospitalist use and roles. J Gen Intern Med. 2005;20(2):101–107.
- Cost implications of reduced work hours and workloads for resident physicians. N Engl J Med. 2009;360:2202–2215.
- Revisiting duty‐hour length: IOM recommendations for patient safety and resident education. N Engl J Med. 2008;359:2633–2635.
- Resident duty hour restrictions: is less really more? J Pediatr. 2009;154:631–632.
Copyright © 2011 Society of Hospital Medicine
Improving Feedback to Ward Residents
Feedback has long been recognized as pivotal to the attainment of clinical acumen and skills in medical training.1 Formative feedback can give trainees insight into their strengths and weaknesses, and provide them with clear goals and methods to attain those goals.1, 2 In fact, feedback given regularly over time by a respected figure has been shown to improve physician performance.3 However, most faculty are not trained to provide effective feedback. As a result, supervisors often believe they are giving more feedback than trainees believe they are receiving, and residents receive little feedback that they perceive as useful.4 Most residents receive little to no feedback on their communication skills4 or professionalism,5 and rarely receive corrective feedback.6, 7
Faculty may fail to give feedback to residents for a number of reasons. The barriers most commonly cited in the literature are discomfort with criticizing residents,6, 7 lack of time,4 and lack of direct observation of residents in clinical settings.8–10 Several studies have looked at tools to guide feedback and address the barrier of discomfort with criticism.6, 7, 11 Some showed improvements in overall feedback, though often supervisors gave only positive feedback and avoided giving feedback about weaknesses.6, 7, 11 Despite the recognition of lack of time as a barrier to feedback,4 most studies on feedback interventions thus far have not included setting aside time for the feedback to occur.6, 7, 11, 12 Finally, a number of studies utilized objective structured clinical examinations (OSCEs) coupled with immediate feedback to improve direct observation of residents, with success in improving feedback related to the encounter.9, 10, 13 To address these gaps in the current literature, our study targeted 2 specific barriers to feedback for residents: lack of time and discomfort with giving feedback.
The aim of this study was to improve Internal Medicine (IM) residents' and attendings' experiences with feedback on the wards using a pocket card and a dedicated feedback session. We developed and evaluated the pocket feedback card and session for faculty to improve the quality and frequency of their feedback to residents in the inpatient setting. We performed a randomized trial to evaluate our intervention. We hypothesized that the intervention would: 1) improve the quality and quantity of attendings' feedback given to IM ward residents; and 2) improve attendings' comfort with feedback delivery on the wards.
PARTICIPANTS AND METHODS
Setting
The study was performed at Mount Sinai Medical Center in New York City, New York, between July 2008 and January 2009.
Participants
Participants in this study were IM residents and ward teaching attendings on inpatient ward teams at Mount Sinai Medical Center from July 2008 to January 2009. There are 12 ward teams on 3 inpatient services (each service has 4 teams) during each block at our hospital. Ward teams are made up of 1 teaching attending, 1 resident, 1 to 3 interns, and 1 to 2 medical students. The majority of attendings are on the ward service for 4‐week blocks, but some are only on for 1 or 2 weeks. Teams included in the randomization were the General Medicine and Gastroenterology/Cardiology service teams. Half of the General Medicine service attendings are hospitalists. Ward teams were excluded from the study randomization if the attending on the team was on the wards for less than 2 weeks, or if the attending had already been assigned to the experimental group in a previous block, given the influence of having used the card and feedback session previously. Since residents were unaware of the intervention and random assignments were based on attendings, residents could be assigned to the intervention group or the control group on any given inpatient rotation. Therefore, a resident could be in the control group in 1 block and the intervention group in his/her next block on the wards or vice versa, or could be assigned to either the intervention or the control group on more than 1 occasion. Because resident participants were blinded to their team's assignment (as intervention or control) and all surveys were anonymous (tracked as intervention or control by the team name only), it was not possible to exclude residents based on their prior participation or to match the surveys completed by the same residents.
Study Design
We performed a prospective randomized study to evaluate our educational innovation. The unit of randomization was the ward team. For each block, approximately half of the 6‐8 teams were randomized to the intervention group and half to the control group. Randomization assignments were completed the day prior to the start of the block using random allocation software based on the ward team letters (blind to the attending and resident names). Of the 48 possible ward teams (8 teams per block over 6 blocks), 36 teams were randomized to the intervention or control groups, and 12 teams were not randomized because they met the exclusion criteria described above. Of the 36 teams, 16 (composed of 16 attendings and 48 residents and interns) were randomized to the intervention group, and 20 (composed of 20 attendings and 63 residents and interns) were randomized to the control group.
The study was blinded such that residents and attendings in the control group were unaware of the study. The study was deemed exempt from IRB review by the Mount Sinai Institutional Review Board and Grants and Contracts Office, as an evaluation of the effectiveness of an instructional technique in medical education.
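The team-level allocation described above can be illustrated with a short sketch. This is not the random allocation software used in the study, which is not named here; it is a minimal stand-in using Python's standard `random` module. The function name `randomize_block`, the `excluded` and `seed` parameters, and the rule that an odd team goes to the control arm are all assumptions made for illustration.

```python
import random


def randomize_block(team_letters, excluded=(), seed=None):
    """Split eligible ward teams into intervention and control arms.

    Hypothetical helper: teams are identified only by letter, so the
    allocation is blind to attending and resident names.
    """
    rng = random.Random(seed)  # seeded generator makes the allocation reproducible
    excluded_set = set(excluded)
    # Drop teams that meet the exclusion criteria before shuffling.
    eligible = sorted(t for t in team_letters if t not in excluded_set)
    rng.shuffle(eligible)
    half = len(eligible) // 2  # assumption: any odd team goes to control
    return {
        "intervention": sorted(eligible[:half]),
        "control": sorted(eligible[half:]),
    }


if __name__ == "__main__":
    # Eight teams in a block, with team "C" excluded for illustration.
    print(randomize_block(list("ABCDEFGH"), excluded=["C"], seed=2008))
```

Running the allocation the day before each block, with excluded teams removed first, mirrors the order of steps in the text; the fixed seed is only there so the example can be repeated.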
Intervention Design
We designed a pocket feedback card to guide a feedback session and assist attendings in giving useful feedback to IM residents on the wards (Figure 1).14 The individual items and categories were adapted from the Accreditation Council for Graduate Medical Education (ACGME) Common Program Requirements Core Competencies section and were revised via the expert consensus of the authors.14 We included 20 items related to resident skills, knowledge, attitudes, and behaviors important to the care of hospitalized patients, grouped under the 6 ACGME core competency domains.14 Many of these items correspond to competencies in the Society of Hospital Medicine (SHM) Core Competencies; in particular, the categories of Systems‐Based Practice and Practice‐Based Learning mirror competencies in the SHM Core Competencies Healthcare Systems chapter.15 Each item utilized a 5‐point Likert scale (1 = very poor, 3 = at expected level, 5 = superior) to evaluate resident performance (Figure 1). We created this card to serve as a directive and specific guide for attendings to provide feedback about specific domains and to give more constructive feedback. The card was to be used during a specific dedicated feedback session in order to overcome the commonly cited barrier of lack of time.

Program Implementation
On the first day of the block, both groups of attendings received the standard inpatient ward orientation given by the program director, including instructions about teaching and administrative responsibilities, and explicit instructions to provide mid‐rotation feedback to residents. Attendings randomized to the intervention group had an additional 5‐minute orientation given by 1 of the investigators. The orientation included a brief discussion on the importance of feedback and an introduction to the items on the card.2 In addition, faculty were instructed to dedicate 1 mid‐rotation attending rounds as a feedback session, to meet individually for 10‐15 minutes with each of the 3‐4 residents on their team, and to use the card to provide feedback on skills in each domain. As noted on the feedback card, if a resident scored less than 3 on a skill set, the attending was instructed to give examples of skills within that domain needing improvement and to offer suggestions for improvement. The intervention group was also asked not to discuss the card or session with others. No other instructions were provided.
Survey Design
At the end of each block, residents and attendings in both groups completed questionnaires to assess satisfaction with, and attitudes toward, feedback (Supporting Information Appendices 1 and 2 in the online version of this article). Survey questions were based on the competency areas included in the feedback card, previously published surveys evaluating feedback interventions,5, 9, 11 and expert opinion. The resident survey was designed to address the impact of feedback on the domains of resident knowledge, clinical and communication skills, and attitudes about feedback from supervisors and peers. We utilized a 5‐point Likert scale including: strongly disagree, disagree, neutral, agree, and strongly agree. The attending survey addressed attendings' satisfaction with feedback encounters and resident performance. At the completion of the study, investigators compared responses in intervention and control groups.
Statistical Analysis
For purposes of analysis, due to the relatively small number of responses for certain answer choices, the Likert scale was converted to a dichotomous variable. The responses of agree and strongly agree were coded as agree; disagree, strongly disagree, and neutral were coded as disagree. Neutral was coded as disagree in order to avoid overestimating positive attitudes and, in effect, to bias our results toward the null hypothesis. Differences between groups were analyzed using the chi‐square or Fisher's exact test (2‐sided).
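A minimal sketch of this analysis, assuming the responses are available as text labels, is shown below. The function names `dichotomize` and `fisher_exact_two_sided` and the example counts are hypothetical; the 2‐sided Fisher's exact P value is computed here with the common convention of summing the probabilities of all tables that are no more likely than the observed one, which may differ from the software the authors used.

```python
from math import comb

AGREE = {"agree", "strongly agree"}


def dichotomize(response: str) -> bool:
    """Collapse a 5-point Likert response to agree (True) or disagree (False).

    Neutral is deliberately counted as disagree, as described in the text.
    """
    return response.strip().lower() in AGREE


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are the study groups; columns are agree / disagree counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of every table sharing the observed margins.
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical example: 12 of 20 agree in one group vs 5 of 20 in the other.
responses = ["Agree", "Neutral", "Strongly agree", "Disagree"]
print([dichotomize(r) for r in responses])
print(fisher_exact_two_sided(12, 8, 5, 15))
```

The exact test is a natural choice here because several cells in the attending comparisons are small, where the chi‐square approximation is less reliable.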
Qualitative Interviews
In order to understand the relative contribution of the feedback card versus the feedback session, we performed a qualitative survey of attendings in the intervention group. Following the conclusion of the study period, we selected a convenience sample of 8 attendings from the intervention group for these brief qualitative interviews. We asked 3 basic questions: (1) Was the intervention of the feedback card and dedicated time for feedback useful? (2) Did you find one component, either the card or the dedicated time for feedback, more useful than the other? (3) Were there any negative effects on patient care, education, or other areas, from using an attending rounds as a feedback session? These data were coded and analyzed for common themes.
RESULTS
During the 6‐month study period, 34 teaching attendings (over 36 attending inpatient blocks) and 93 IM residents (over 111 resident inpatient blocks) participated in the study. Thirty‐four of 36 attending surveys and 94 of 111 resident surveys were completed. The overall survey response rates for residents and attendings were 85% and 94%, respectively. Two attendings participated during 2 separate blocks, first in the control group and then in the intervention group, and 18 residents participated during 2 separate blocks. No attendings or residents participated more than twice.
Resident survey response rate was 81.2% in the intervention group and 87.3% in the control group (Table 1). Residents in the intervention group reported receiving more feedback regarding skills they did well (89.7% vs 63.6%, P = 0.004) and skills needing improvement (51.3% vs 25.5%, P = 0.02) than those in the control group. In addition, more intervention residents reported receiving useful information regarding how to improve their skills (53.8% vs 27.3%, P = 0.01), and reported actually improving both their clinical skills (61.5% vs 27.8%, P = 0.001) and their professionalism/communication skills (51.3% vs 29.1%, P = 0.03) based on feedback received from attendings.
Survey Item | Resident Intervention Agree* % (No.) N = 39 | Resident Control Agree* % (No.) N = 55 | P Value |
---|---|---|---|
I did NOT receive a sufficient amount of feedback from my attending supervisor(s) this block. | 20.5 (8) | 38.2 (21) | 0.08 |
I received feedback from my attending regarding skills I did well during this block. | 89.7 (35) | 63.6 (35) | 0.004 |
I received feedback from my attending regarding specific skills that needed improvement during this block. | 51.3 (20) | 25.5 (14) | 0.02 |
I received useful information from my attending about how to improve my skills during this block. | 53.8 (21) | 27.3 (15) | 0.01 |
I improved my clinical skills based on feedback I received from my attending this block. | 61.5 (24) | 27.8 (15) | 0.001 |
I improved my professionalism/communication skills based on feedback I received from my attending this block. | 51.3 (20) | 29.1 (16) | 0.03 |
I improved my knowledge base because of feedback I received from my attending this block. | 64.1 (25) | 60.0 (33) | 0.83 |
The feedback I received from my attending this block gave me an overall sense of my performance more than it helped me identify specific areas for improvement. | 64.1 (25) | 65.5 (36) | 1.0 |
Feedback from colleagues (other interns and residents) is more helpful than feedback from attendings. | 41.0 (16) | 43.6 (24) | 0.84 |
Independent of feedback received from others, I am able to identify areas in which I need improvement. | 84.6 (33) | 80.0 (44) | 0.60 |
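The resident response rates and the percentages in the table above can be cross‐checked with simple arithmetic. The sketch below is illustrative only; the helper `percent` and the dictionary names are hypothetical, while the counts (48 and 63 residents and interns on randomized teams, 39 and 55 respondents) come from the text and table headers.

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Resident blocks on randomized teams and surveys returned, by group.
enrolled = {"intervention": 48, "control": 63}
responded = {"intervention": 39, "control": 55}

for group in enrolled:
    print(group, percent(responded[group], enrolled[group]))

# Combined resident response rate across both groups.
print("overall", percent(sum(responded.values()), sum(enrolled.values())))

# One table row: feedback on skills done well, 35 of 39 vs 35 of 55.
print(percent(35, 39), percent(35, 55))
```

The 39 and 55 respondents sum to 94 of the 111 resident blocks, which corresponds to the overall resident response rate of about 85%. A few rows use a slightly smaller denominator (for example, 15 agreeing responses reported as 27.8% implies 54 control respondents for that item), presumably because an item was left unanswered.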
The attending survey response rates for the intervention and control groups were 100% and 90%, respectively. In general, both groups of attendings reported that they were comfortable giving feedback and that they did, in fact, give feedback in each area during their ward block (Table 2). More intervention attendings felt that at least 1 of their residents improved their professionalism/communication skills based on the feedback given (76.9% vs 31.1%, P = 0.02). There were no other significant differences between the groups of attendings.
Survey Item | Attending Intervention Agree* % (No.) N = 16 | Attending Control Agree* % (No.) N = 18 | P Value |
---|---|---|---|
Giving feedback to housestaff was DIFFICULT this block. | 6.3 (1) | 16.7 (3) | 0.60 |
I was comfortable giving feedback to my housestaff this block. | 93.8 (15) | 94.4 (17) | 1.00 |
I did NOT give a sufficient amount of feedback to my housestaff this block. | 18.8 (3) | 38.9 (7) | 0.27 |
My skills in giving feedback improved during this block. | 50 (8) | 16.7 (3) | 0.07 |
I gave feedback to housestaff regarding skills they did well during this block. | 100 (16) | 94.4 (17) | 1.00 |
I gave feedback to housestaff which targeted specific areas for their improvement. | 81.3 (13) | 70.6 (12) | 0.69 |
At least one of my housestaff improved his/her clinical skills based on feedback I gave this block. | 68.8 (11) | 47.1 (8) | 0.30 |
At least one of my housestaff improved his/her professionalism/communication skills based on feedback I gave this block. | 76.9 (10) | 31.1 (5) | 0.02 |
At least one of my housestaff improved his/her fund of knowledge based on feedback I gave this block. | 50.0 (8) | 52.9 (9) | 1.00 |
Housestaff found the feedback I gave them useful. | 66.7 (10) | 62.5 (10) | 1.00 |
I find it DIFFICULT to find time during inpatient rotations to give feedback to residents regarding their performance. | 50.0 (8) | 33.3 (6) | 0.49 |
Intervention attendings also shared their attitudes toward the feedback card and session. A majority felt that using 1 attending rounds as a feedback session helped create a dedicated time for giving feedback (68.8%), and that the feedback card helped them to give specific, constructive feedback (62.5%). Most attendings reported they would use the feedback card and session again during future inpatient blocks (81%), and would recommend them to other attendings (75%).
Qualitative data from intervention attending interviews revealed further thoughts about the feedback card and feedback session. Most attendings interviewed (7/8) felt that the card was useful for the structure and topic guidance it provided. Half felt that setting aside time for feedback was the more useful component. The other half reported that, because they usually set aside time for feedback regardless, the card was more useful. None of the attendings felt that the feedback card or session was detrimental for patient care or education, and many said that the intervention had positive effects on these areas. For example, 1 attending said that the session added to patient care "because I used particular [patient] cases as examples for giving feedback."
DISCUSSION
In this randomized study, we found that a simple pocket feedback card and dedicated feedback session were acceptable to ward attendings and improved resident satisfaction with feedback. Unlike most prior studies of feedback, we demonstrated more feedback around skills needing improvement, and intervention residents felt the feedback they received helped them improve their skills. Our educational intervention was unique in that it combined a pocket card to structure feedback content and dedicated time to structure the feedback process, to address 2 of the major barriers to giving feedback: lack of time and lack of comfort.
The pocket card itself as a tool for improving feedback is innovative and valuable. As a short but directive guide, the card supports attendings' delivery of relevant and specific feedback about residents' performance, and because it is based on the ACGME competencies, it may help attendings focus feedback on areas in which they will later evaluate residents. The inclusion of a prespecified time for giving feedback was important as well, in that it allowed for face‐to‐face feedback to occur, as opposed to a passing comment after a presentation or brief notes in a written final evaluation. Both the card and the feedback session seemed equally important for the success of this intervention, with attitudes varying based on individual attending preferences. Those who usually set aside time for feedback on their own found the card more useful, whereas those who had more trouble finding time for feedback found the specific session more useful. Most attendings found the intervention as a whole helpful, and without any detrimental effects on patient care or education. The card and session may be particularly valuable for hospital attendings, given their growing presence as teachers and supervisors for residents, and their busy days on the wards.
Our study results have important implications for resident training in the hospital. Improving resident receipt of feedback about strengths and weaknesses is an ACGME training requirement, and specific guidance about how to improve skills is critical for focusing improvement efforts. Previous studies have demonstrated that directive feedback in medical training can lead to a variety of performance improvements, including better evaluations by other professionals,9, 16 and objective improvements in resident communication skills,17 chart documentation,18 and clinical management of patients.11, 15, 19 By improving the quality of feedback across several domains and facilitating the feedback process, our intervention may lead to similar improvements. Future studies should examine the global impact of guided feedback as in our study. Perhaps most importantly, attendings found the intervention acceptable and would recommend its use, implying longer term sustainability of its integration into the hospital routine.
One strength of our study was its prospective randomized design. Despite the importance of rigor in medical education research, there remains a paucity of randomized studies to evaluate educational interventions for residents in inpatient settings. Few studies of feedback interventions in particular have been randomized trials,5, 6, 11 and only one has examined a feedback intervention in a randomized fashion in the inpatient setting.12 That study evaluated a 20‐minute intervention and a reminder card for supervising attendings, intended to improve written and verbal feedback to residents. It modestly improved the amount of verbal feedback given to residents, but did not affect the number of residents receiving mid‐rotation feedback or feedback overall, as our intervention did by resident report.12
There were several important limitations to our study. First, because this was a single institution study, we only achieved modest sample sizes, particularly in the attending groups, and were unable to assess all of the differences in attending attitudes related to feedback. Second, control and intervention participants were on service simultaneously, which may have led to contamination of the control group and an underestimation of the true impact of our intervention. Since residents were not exclusive to 1 study group on 1 occasion (18 of the 93 residents participated during 2 separate blocks), our results may be biased. In particular, those residents who had the intervention first, and were subsequently in the control group, may have rated the control experience worse than they would have otherwise, creating a bias in favor of a positive result for our intervention. Nonetheless, we believe this situation was uncommon and the potential associated bias minimal. Further, this study assessed attitudes related to feedback and self‐reported knowledge and skills, but did not directly assess resident knowledge, skills, or patient outcomes. We recognize the importance of these outcomes and hope that future interventions can determine these important downstream effects of feedback. We were also unable to assess the card and session's impact on attendings' comfort with feedback, because most attendings in both groups reported feeling comfortable giving feedback. This result may indicate that attendings actually are comfortable giving feedback, or may suggest some element of social desirability bias. Finally, in this study, we designed an intervention which combined the pocket card and dedicated feedback time. We did not quantitatively examine the effect of either component alone, and it is unclear if offering the feedback card without protected time or offering protected time without a guide would have impacted feedback on the wards. However, qualitative data from our study support the use of both components, and implementing the 2 components together is feasible in any inpatient teaching setting.
Despite these limitations, protected time for feedback guided by a pocket feedback card is a simple intervention that appears to improve feedback quantity and quality for ward residents, and guides them to improve their performance. Our low‐intensity intervention helped attendings give residents the tools to improve their clinical and communication skills. An opportunity to make a positive impact on resident education with such a small intervention is rare. The use of a feedback card with protected feedback time could be easily implemented in any training program, and is a valuable tool for busy hospitalists, who are increasingly supervising residents on their inpatient rotations.
1. Feedback in clinical medical education. JAMA. 1983;250(6):777–781.
2. Giving feedback in medical education: verification of recommended techniques. J Gen Intern Med. 1998;13(2):111–116.
3. Systematic review of the literature on assessment, feedback and physicians' clinical performance: BEME Guide No. 7. Med Teach. 2006;28(2):117–128.
4. Missed opportunities: a descriptive assessment of teaching and attitudes regarding communication skills in a surgical residency. Curr Surg. 2006;63(6):401–409.
5. Impact of a 360‐degree professionalism assessment on faculty comfort and skills in feedback delivery. J Gen Intern Med. 2008;23(7):969–972.
6. Daily encounter cards facilitate competency‐based feedback while leniency bias persists. CJEM. 2008;10(1):44–50.
7. Teaching compassion and respect. Attending physicians' responses to problematic behaviors. J Gen Intern Med. 1999;14(1):49–55.
8. Faculty and the observation of trainees' clinical skills: problems and opportunities. Acad Med. 2004;79(1):16–22.
9. Direct observation of residents in the emergency department: a structured educational program. Acad Emerg Med. 2009;16(4):343–351.
10. Evaluation of a novel assessment form for observing medical residents: a randomised, controlled trial. Med Educ. 2008;42(12):1234–1242.
11. Resident evaluations: the use of daily evaluation forms in rheumatology ambulatory care. J Rheumatol. 2009;36(6):1298–1303.
12. Effectiveness of a focused educational intervention on resident evaluations from faculty: a randomized controlled trial. J Gen Intern Med. 2001;16(7):427–434.
13. Effects of training in direct observation of medical residents' clinical competence: a randomized trial. Ann Intern Med. 2004;140(11):874–881.
14. Internal Medicine Program Requirements. ACGME. July 1, 2009. Available at: http://www.acgme.org/acWebsite/downloads/RRC_progReq/140_internal_medicine_07012009.pdf. Accessed November 8, 2009.
15. How to use the core competencies in hospital medicine: a framework for curriculum development. J Hosp Med. 2006;1(suppl 1):57–67.
16. Debriefing in the intensive care unit: a feedback tool to facilitate bedside teaching. Crit Care Med. 2007;35(3):738–754.
17. Use of an innovative video feedback technique to enhance communication skills training. Med Educ. 2004;38(2):145–157.
18. The impact of feedback to medical housestaff on chart documentation and quality of care in the outpatient setting. J Gen Intern Med. 1997;12(6):352–356.
19. Feedback and the mini clinical evaluation exercise. J Gen Intern Med. 2004;19(5 pt 2):558–561.
Feedback has long been recognized as pivotal to the attainment of clinical acumen and skills in medical training.1 Formative feedback can give trainees insight into their strengths and weaknesses, and provide them with clear goals and methods to attain those goals.1, 2 In fact, feedback given regularly over time by a respected figure has been shown to improve physician performance.3 However, most faculty are not trained to provide effective feedback. As a result, supervisors often believe they are giving more feedback than trainees believe they are receiving, and residents receive little feedback that they perceive as useful.4 Most residents receive little to no feedback on their communication skills4 or professionalism,5 and rarely receive corrective feedback.6, 7
Faculty may fail to give feedback to residents for a number of reasons. Those barriers most commonly cited in the literature are discomfort with criticizing residents,6, 7 lack of time,4 and lack of direct observation of residents in clinical settings.8‐10 Several studies have looked at tools to guide feedback and address the barrier of discomfort with criticism.6, 7, 11 Some showed improvements in overall feedback, though often supervisors gave only positive feedback and avoided giving feedback about weaknesses.6, 7, 11 Despite the recognition of lack of time as a barrier to feedback,4 most studies on feedback interventions thus far have not included setting aside time for the feedback to occur.6, 7, 11, 12 Finally, a number of studies utilized objective structured clinical examinations (OSCEs) coupled with immediate feedback to improve direct observation of residents, with success in improving feedback related to the encounter.9, 10, 13 To address the gaps in the current literature, the goals of our study were to address 2 specific barriers to feedback for residents: lack of time and discomfort with giving feedback.
The aim of this study was to improve Internal Medicine (IM) residents' and attendings' experiences with feedback on the wards using a pocket card and a dedicated feedback session. We developed and evaluated the pocket feedback card and session for faculty to improve the quality and frequency of their feedback to residents in the inpatient setting. We performed a randomized trial to evaluate our intervention. We hypothesized that the intervention would: 1) improve the quality and quantity of attendings' feedback given to IM ward residents; and 2) improve attendings' comfort with feedback delivery on the wards.
PARTICIPANTS AND METHODS
Setting
The study was performed at Mount Sinai Medical Center in New York City, New York, between July 2008 and January 2009.
Participants
Participants in this study were IM residents and ward teaching attendings on inpatient ward teams at Mount Sinai Medical Center from July 2008 to January 2009. There are 12 ward teams on 3 inpatient services (each service has 4 teams) during each block at our hospital. Ward teams are made up of 1 teaching attending, 1 resident, 1 to 3 interns, and 1 to 2 medical students. The majority of attendings are on the ward service for 4‐week blocks, but some are only on for 1 or 2 weeks. Teams included in the randomization were the General Medicine and Gastroenterology/Cardiology service teams. Half of the General Medicine service attendings are hospitalists. Ward teams were excluded from the study randomization if the attending on the team was on the wards for less than 2 weeks, or if the attending had already been assigned to the experimental group in a previous block, given the influence of having used the card and feedback session previously. Since residents were unaware of the intervention and random assignments were based on attendings, residents could be assigned to the intervention group or the control group on any given inpatient rotation. Therefore, a resident could be in the control group in 1 block and the intervention group in his/her next block on the wards or vice versa, or could be assigned to either the intervention or the control group on more than 1 occasion. Because resident participants were blinded to their team's assignment (as intervention or control) and all surveys were anonymous (tracked as intervention or control by the team name only), it was not possible to exclude residents based on their prior participation or to match the surveys completed by the same residents.
Study Design
We performed a prospective randomized study to evaluate our educational innovation. The unit of randomization was the ward team. For each block, approximately half of the 6‐8 teams were randomized to the intervention group and half to the control group. Randomization assignments were completed the day prior to the start of the block using the random allocation software based on the ward team letters (blind to the attending and resident names). Of the 48 possible ward teams (8 teams per block over 6 blocks), 36 teams were randomized to the intervention or control groups, and 12 teams were not based on the above exclusion criteria. Of the 36 teams, 16 (composed of 16 attendings and 48 residents and interns) were randomized to the intervention group, and 20 (composed of 20 attendings and 63 residents and interns) were randomized to the control group.
The study was blinded such that residents and attendings in the control group were unaware of the study. The study was exempt from IRB review by the Mount Sinai Institutional Review Board, and Grants and Contracts Office, as an evaluation of the effectiveness of an instructional technique in medical education.
Intervention Design
We designed a pocket feedback card to guide a feedback session and assist attendings in giving useful feedback to IM residents on the wards (Figure 1).14 The individual items and categories were adapted from the Accreditation Council for Graduate Medical Education (ACGME) Common Program Requirements Core Competencies section and were revised via the expert consensus of the authors.14 We included 20 items related to resident skills, knowledge, attitudes, and behaviors important to the care of hospitalized patients, grouped under the 6 ACGME core competency domains.14 Many of these items correspond to competencies in the Society of Hospital Medicine (SHM) Core Competencies; in particular, the categories of Systems‐Based Practice and Practice‐Based Learning mirror competencies in the SHM Core Competencies Healthcare Systems chapter.15 Each item utilized a 5‐point Likert scale (1 = very poor, 3 = at expected level, 5 = superior) to evaluate resident performance (Figure 1). We created this card to serve as a directive and specific guide for attendings to provide feedback about specific domains and to give more constructive feedback. The card was to be used during a specific dedicated feedback session in order to overcome the commonly cited barrier of lack of time.

Program Implementation
On the first day of the block, both groups of attendings received the standard inpatient ward orientation given by the program director, including instructions about teaching and administrative responsibilities, and explicit instructions to provide mid‐rotation feedback to residents. Attendings randomized to the intervention group had an additional 5‐minute orientation given by 1 of the investigators. The orientation included a brief discussion on the importance of feedback and an introduction to the items on the card.2 In addition, faculty were instructed to dedicate 1 mid‐rotation attending rounds as a feedback session, to meet individually for 10‐15 minutes with each of the 3‐4 residents on their team, and to use the card to provide feedback on skills in each domain. As noted on the feedback card, if a resident scored less than 3 on a skill set, the attending was instructed to give examples of skills within that domain needing improvement and to offer suggestions for improvement. The intervention group was also asked not to discuss the card or session with others. No other instructions were provided.
Survey Design
At the end of each block, residents and attendings in both groups completed questionnaires to assess satisfaction with, and attitudes toward, feedback (Supporting Information Appendices 1 and 2 in the online version of this article). Survey questions were based on the competency areas included in the feedback card, previously published surveys evaluating feedback interventions,5, 9, 11 and expert opinion. The resident survey was designed to address the impact of feedback on the domains of resident knowledge, clinical and communication skills, and attitudes about feedback from supervisors and peers. We utilized a 5‐point Likert scale including: strongly disagree, disagree, neutral, agree, and strongly agree. The attending survey addressed attendings' satisfaction with feedback encounters and resident performance. At the completion of the study, investigators compared responses in intervention and control groups.
Statistical Analysis
For purposes of analysis, due to the relatively small number of responses for certain answer choices, the Likert scale was converted to a dichotomous variable. The responses of agree and strongly agree were coded as agree; and disagree, strongly disagree, and neutral were coded as disagree. Neutral was coded as disagree in order to avoid overestimating positive attitudes and, in effect, bias our results toward the null hypothesis. Differences between groups were analyzed using chi‐square Fisher's exact test (2‐sided).
Qualitative Interviews
In order to understand the relative contribution of the feedback card versus the feedback session, we performed a qualitative survey of attendings in the intervention group. Following the conclusion of the study period, we selected a convenience sample of 8 attendings from the intervention group for these brief qualitative interviews. We asked 3 basic questions. Was the intervention of the feedback card and dedicated time for feedback useful? Did you find one component, either the card or the dedicated time for feedback, more useful than the other? Were there any negative effects on patient care, education, or other areas, from using an attending rounds as a feedback session? This data was coded and analyzed for common themes.
RESULTS
During the 6‐month study period, 34 teaching attendings (over 36 attending inpatient blocks) and 93 IM residents (over 111 resident inpatient blocks) participated in the study. Thirty‐four of 36 attending surveys and 96 of 111 resident surveys were completed. The overall survey response rates for residents and attendings were 85% and 94%, respectively. Two attendings participated during 2 separate blocks, first in the control group and then in the intervention group, and 18 residents participated during 2 separate blocks. No attendings or residents participated more than twice.
Resident survey response rate was 81.2% in the intervention group and 87.3% in the control group (Table 1). Residents in the intervention group reported receiving more feedback regarding skills they did well (89.7% vs 63.6%, P = 0.004) and skills needing improvement (51.3% vs 25.5%, P = 0.02) than those in the control group. In addition, more intervention residents reported receiving useful information regarding how to improve their skills (53.8% vs 27.3%, P = 0.01), and reported actually improving both their clinical skills (61.5% vs 27.8%, P = 0.001) and their professionalism/communication skills (51.3% vs 29.1%, P = 0.03) based on feedback received from attendings.
Survey Item | Resident Intervention Agree* % (No.) N = 39 | Resident Control Agree*% (No.) N = 55 | P Value |
---|---|---|---|
| |||
I did NOT receive a sufficient amount of feedback from my attending supervisor(s) this block. | 20.5 (8) | 38.2 (21) | 0.08 |
I received feedback from my attending regarding skills I did well during this block. | 89.7 (35) | 63.6 (35) | 0.004 |
I received feedback from my attending regarding specific skills that needed improvement during this block. | 51.3 (20) | 25.5 (14) | 0.02 |
I received useful information from my attending about how to improve my skills during this block. | 53.8 (21) | 27.3 (15) | 0.01 |
I improved my clinical skills based on feedback I received from my attending this block. | 61.5 (24) | 27.8 (15) | 0.001 |
I improved my professionalism/communication skills based on feedback I received from my attending this block. | 51.3 (20) | 29.1 (16) | 0.03 |
I improved my knowledge base because of feedback I received from my attending this block. | 64.1 (25) | 60.0 (33) | 0.83 |
The feedback I received from my attending this block gave me an overall sense of my performance more than it helped me identify specific areas for improvement. | 64.1 (25) | 65.5 (36) | 1.0 |
Feedback from colleagues (other interns and residents) is more helpful than feedback from attendings. | 41.0 (16) | 43.6 (24) | 0.84 |
Independent of feedback received from others, I am able to identify areas in which I need improvement. | 84.6 (33) | 80.0 (44) | 0.60 |
The attending survey response rates for the intervention and control groups were 100% and 90%, respectively. In general, both groups of attendings reported that they were comfortable giving feedback and that they did, in fact, give feedback in each area during their ward block (Table 2). More intervention attendings felt that at least 1 of their residents improved their professionalism/communication skills based on the feedback given (76.9% vs 31.1%, P = 0.02). There were no other significant differences between the groups of attendings.
Survey Item | Attending Intervention Agree* % (No.) N = 16 | Attending Control Agree* % (No.) N = 18 | P Value |
---|---|---|---|
Giving feedback to housestaff was DIFFICULT this block. | 6.3 (1) | 16.7 (3) | 0.60 |
I was comfortable giving feedback to my housestaff this block. | 93.8 (15) | 94.4 (17) | 1.00 |
I did NOT give a sufficient amount of feedback to my housestaff this block. | 18.8 (3) | 38.9 (7) | 0.27 |
My skills in giving feedback improved during this block. | 50.0 (8) | 16.7 (3) | 0.07 |
I gave feedback to housestaff regarding skills they did well during this block. | 100 (16) | 94.4 (17) | 1.00 |
I gave feedback to housestaff which targeted specific areas for their improvement. | 81.3 (13) | 70.6 (12) | 0.69 |
At least one of my housestaff improved his/her clinical skills based on feedback I gave this block. | 68.8 (11) | 47.1 (8) | 0.30 |
At least one of my housestaff improved his/her professionalism/communication skills based on feedback I gave this block. | 76.9 (10) | 31.1 (5) | 0.02 |
At least one of my housestaff improved his/her fund of knowledge based on feedback I gave this block. | 50.0 (8) | 52.9 (9) | 1.00 |
Housestaff found the feedback I gave them useful. | 66.7 (10) | 62.5 (10) | 1.00 |
I find it DIFFICULT to find time during inpatient rotations to give feedback to residents regarding their performance. | 50.0 (8) | 33.3 (6) | 0.49 |
Intervention attendings also shared their attitudes toward the feedback card and session. A majority felt that using 1 attending rounds as a feedback session helped create a dedicated time for giving feedback (68.8%), and that the feedback card helped them to give specific, constructive feedback (62.5%). Most attendings reported they would use the feedback card and session again during future inpatient blocks (81%), and would recommend them to other attendings (75%).
Qualitative data from intervention attending interviews revealed further thoughts about the feedback card and feedback session. Most attendings interviewed (7/8) felt that the card was useful for the structure and topic guidance it provided. Half felt that setting aside time for feedback was the more useful component. The other half reported that, because they usually set aside time for feedback regardless, the card was more useful. None of the attendings felt that the feedback card or session was detrimental to patient care or education, and many said that the intervention had positive effects on these areas. For example, 1 attending said that the session added to patient care "because I used particular [patient] cases as examples for giving feedback."
DISCUSSION
In this randomized study, we found that a simple pocket feedback card and dedicated feedback session were acceptable to ward attendings and improved resident satisfaction with feedback. Unlike most prior studies of feedback, we demonstrated more feedback around skills needing improvement, and intervention residents felt the feedback they received helped them improve their skills. Our educational intervention was unique in that it combined a pocket card to structure feedback content and dedicated time to structure the feedback process, to address 2 of the major barriers to giving feedback: lack of time and lack of comfort.
The pocket card itself as a tool for improving feedback is innovative and valuable. As a short but directive guide, the card supports attendings' delivery of relevant and specific feedback about residents' performance, and because it is based on the ACGME competencies, it may help attendings focus feedback on areas in which they will later evaluate residents. The inclusion of a prespecified time for giving feedback was important as well, in that it allowed for face‐to‐face feedback to occur, as opposed to a passing comment after a presentation or brief notes in a written final evaluation. Both the card and the feedback session seemed equally important for the success of this intervention, with attitudes varying based on individual attending preferences. Those who usually set aside time for feedback on their own found the card more useful, whereas those who had more trouble finding time for feedback found the specific session more useful. Most attendings found the intervention as a whole helpful, and without any detrimental effects on patient care or education. The card and session may be particularly valuable for hospital attendings, given their growing presence as teachers and supervisors for residents, and their busy days on the wards.
Our study results have important implications for resident training in the hospital. Improving resident receipt of feedback about strengths and weaknesses is an ACGME training requirement, and specific guidance about how to improve skills is critical for focusing improvement efforts. Previous studies have demonstrated that directive feedback in medical training can lead to a variety of performance improvements, including better evaluations by other professionals,9, 16 and objective improvements in resident communication skills,17 chart documentation,18 and clinical management of patients.11, 15, 19 By improving the quality of feedback across several domains and facilitating the feedback process, our intervention may lead to similar improvements. Future studies should examine the global impact of guided feedback as in our study. Perhaps most importantly, attendings found the intervention acceptable and would recommend its use, implying longer term sustainability of its integration into the hospital routine.
One strength of our study was its prospective randomized design. Despite the importance of rigor in medical education research, there remains a paucity of randomized studies to evaluate educational interventions for residents in inpatient settings. Few studies of feedback interventions in particular have performed randomized trials,5, 6, 11 and only one has examined a feedback intervention in a randomized fashion in the inpatient setting.12 That evaluation of a 20‐minute intervention and a reminder card for supervising attendings, intended to improve written and verbal feedback to residents, modestly improved the amount of verbal feedback given to residents but, unlike our study (by participant report), did not affect the number of residents receiving mid‐rotation feedback or feedback overall.12
There were several important limitations to our study. First, because this was a single institution study, we only achieved modest sample sizes, particularly in the attending groups, and were unable to assess all of the differences in attending attitudes related to feedback. Second, control and intervention participants were on service simultaneously, which may have led to contamination of the control group and an underestimation of the true impact of our intervention. Because residents were not restricted to a single study group or a single block (18 of the 93 residents participated during 2 separate blocks), our results may be biased. In particular, those residents who had the intervention first, and were subsequently in the control group, may have rated the control experience worse than they would have otherwise, creating a bias in favor of a positive result for our intervention. Nonetheless, we believe this situation was uncommon and the potential associated bias minimal. Further, this study assessed attitudes related to feedback and self‐reported knowledge and skills, but did not directly assess resident knowledge, skills, or patient outcomes. We recognize the importance of these outcomes and hope that future interventions can determine these important downstream effects of feedback. We were also unable to assess the card and session's impact on attendings' comfort with feedback, because most attendings in both groups reported feeling comfortable giving feedback. This result may indicate that attendings actually are comfortable giving feedback, or may suggest some element of social desirability bias. Finally, in this study, we designed an intervention which combined the pocket card and dedicated feedback time. We did not quantitatively examine the effect of either component alone, and it is unclear if offering the feedback card without protected time or offering protected time without a guide would have impacted feedback on the wards. However, qualitative data from our study support the use of both components, and implementing the 2 components together is feasible in any inpatient teaching setting.
Despite these limitations, protected time for feedback guided by a pocket feedback card is a simple intervention that appears to improve feedback quantity and quality for ward residents, and guides them to improve their performance. Our low‐intensity intervention helped attendings give residents the tools to improve their clinical and communication skills. An opportunity to make a positive impact on resident education with such a small intervention is rare. The use of a feedback card with protected feedback time could be easily implemented in any training program, and is a valuable tool for busy hospitalists, who are increasingly supervising residents on their inpatient rotations.
Feedback has long been recognized as pivotal to the attainment of clinical acumen and skills in medical training.1 Formative feedback can give trainees insight into their strengths and weaknesses, and provide them with clear goals and methods to attain those goals.1, 2 In fact, feedback given regularly over time by a respected figure has been shown to improve physician performance.3 However, most faculty are not trained to provide effective feedback. As a result, supervisors often believe they are giving more feedback than trainees believe they are receiving, and residents receive little feedback that they perceive as useful.4 Most residents receive little to no feedback on their communication skills4 or professionalism,5 and rarely receive corrective feedback.6, 7
Faculty may fail to give feedback to residents for a number of reasons. Those barriers most commonly cited in the literature are discomfort with criticizing residents,6, 7 lack of time,4 and lack of direct observation of residents in clinical settings.8–10 Several studies have looked at tools to guide feedback and address the barrier of discomfort with criticism.6, 7, 11 Some showed improvements in overall feedback, though often supervisors gave only positive feedback and avoided giving feedback about weaknesses.6, 7, 11 Despite the recognition of lack of time as a barrier to feedback,4 most studies on feedback interventions thus far have not included setting aside time for the feedback to occur.6, 7, 11, 12 Finally, a number of studies utilized objective structured clinical examinations (OSCEs) coupled with immediate feedback to improve direct observation of residents, with success in improving feedback related to the encounter.9, 10, 13 To address the gaps in the current literature, the goals of our study were to address 2 specific barriers to feedback for residents: lack of time and discomfort with giving feedback.
The aim of this study was to improve Internal Medicine (IM) residents' and attendings' experiences with feedback on the wards using a pocket card and a dedicated feedback session. We developed and evaluated the pocket feedback card and session for faculty to improve the quality and frequency of their feedback to residents in the inpatient setting. We performed a randomized trial to evaluate our intervention. We hypothesized that the intervention would: 1) improve the quality and quantity of attendings' feedback given to IM ward residents; and 2) improve attendings' comfort with feedback delivery on the wards.
PARTICIPANTS AND METHODS
Setting
The study was performed at Mount Sinai Medical Center in New York City, New York, between July 2008 and January 2009.
Participants
Participants in this study were IM residents and ward teaching attendings on inpatient ward teams at Mount Sinai Medical Center from July 2008 to January 2009. There are 12 ward teams on 3 inpatient services (each service has 4 teams) during each block at our hospital. Ward teams are made up of 1 teaching attending, 1 resident, 1 to 3 interns, and 1 to 2 medical students. The majority of attendings are on the ward service for 4‐week blocks, but some are only on for 1 or 2 weeks. Teams included in the randomization were the General Medicine and Gastroenterology/Cardiology service teams. Half of the General Medicine service attendings are hospitalists. Ward teams were excluded from the study randomization if the attending on the team was on the wards for less than 2 weeks, or if the attending had already been assigned to the experimental group in a previous block, given the influence of having used the card and feedback session previously. Since residents were unaware of the intervention and random assignments were based on attendings, residents could be assigned to the intervention group or the control group on any given inpatient rotation. Therefore, a resident could be in the control group in 1 block and the intervention group in his/her next block on the wards or vice versa, or could be assigned to either the intervention or the control group on more than 1 occasion. Because resident participants were blinded to their team's assignment (as intervention or control) and all surveys were anonymous (tracked as intervention or control by the team name only), it was not possible to exclude residents based on their prior participation or to match the surveys completed by the same residents.
Study Design
We performed a prospective randomized study to evaluate our educational innovation. The unit of randomization was the ward team. For each block, approximately half of the 6‐8 teams were randomized to the intervention group and half to the control group. Randomization assignments were completed the day prior to the start of the block using random allocation software based on the ward team letters (blind to the attending and resident names). Of the 48 possible ward teams (8 teams per block over 6 blocks), 36 teams were randomized to the intervention or control groups, and 12 teams were excluded based on the above exclusion criteria. Of the 36 teams, 16 (composed of 16 attendings and 48 residents and interns) were randomized to the intervention group, and 20 (composed of 20 attendings and 63 residents and interns) were randomized to the control group.
The study was blinded such that residents and attendings in the control group were unaware of the study. The study was deemed exempt from review by the Mount Sinai Institutional Review Board and Grants and Contracts Office, as an evaluation of the effectiveness of an instructional technique in medical education.
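The team‐level randomization described in this section can be illustrated with a short sketch. It is not the random allocation software the investigators used; the function name, the seed, the team letters A through H, and the excluded team are all hypothetical, chosen only to show how eligible ward teams could be split roughly in half for a block while being identified by letter alone.

```python
import random


def randomize_teams(team_letters, seed=None):
    """Split eligible ward teams into intervention and control arms.

    Teams are identified by letter only, so the allocation never sees
    attending or resident names. With an odd number of teams, the extra
    team goes to the control arm (an arbitrary choice for this sketch).
    """
    rng = random.Random(seed)
    shuffled = list(team_letters)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "intervention": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical block: 8 ward teams, 1 of which meets an exclusion criterion.
excluded = {"C"}
eligible = [team for team in "ABCDEFGH" if team not in excluded]
allocation = randomize_teams(eligible, seed=2008)
print(allocation)
```

Fixing the seed makes the allocation for a block reproducible for auditing; leaving it as `None` draws a fresh allocation each time.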
Intervention Design
We designed a pocket feedback card to guide a feedback session and assist attendings in giving useful feedback to IM residents on the wards (Figure 1).14 The individual items and categories were adapted from the Accreditation Council for Graduate Medical Education (ACGME) Common Program Requirements Core Competencies section and were revised via the expert consensus of the authors.14 We included 20 items related to resident skills, knowledge, attitudes, and behaviors important to the care of hospitalized patients, grouped under the 6 ACGME core competency domains.14 Many of these items correspond to competencies in the Society of Hospital Medicine (SHM) Core Competencies; in particular, the categories of Systems‐Based Practice and Practice‐Based Learning mirror competencies in the SHM Core Competencies Healthcare Systems chapter.15 Each item utilized a 5‐point Likert scale (1 = very poor, 3 = at expected level, 5 = superior) to evaluate resident performance (Figure 1). We created this card to serve as a directive and specific guide for attendings to provide feedback about specific domains and to give more constructive feedback. The card was to be used during a specific dedicated feedback session in order to overcome the commonly cited barrier of lack of time.

Program Implementation
On the first day of the block, both groups of attendings received the standard inpatient ward orientation given by the program director, including instructions about teaching and administrative responsibilities, and explicit instructions to provide mid‐rotation feedback to residents. Attendings randomized to the intervention group had an additional 5‐minute orientation given by 1 of the investigators. The orientation included a brief discussion on the importance of feedback and an introduction to the items on the card.2 In addition, faculty were instructed to dedicate 1 mid‐rotation attending rounds as a feedback session, to meet individually for 10‐15 minutes with each of the 3‐4 residents on their team, and to use the card to provide feedback on skills in each domain. As noted on the feedback card, if a resident scored less than 3 on a skill set, the attending was instructed to give examples of skills within that domain needing improvement and to offer suggestions for improvement. The intervention group was also asked not to discuss the card or session with others. No other instructions were provided.
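The card's scoring rule (a 5‐point scale, with scores below 3 prompting specific examples and suggestions) can be sketched as a small data structure. The six domain names are the ACGME core competencies; grouping item scores by domain, treating a domain as flagged when any of its items falls below 3, and the example scores themselves are assumptions made for illustration, since the card's individual items are not reproduced here.

```python
ACGME_DOMAINS = (
    "Patient Care",
    "Medical Knowledge",
    "Practice-Based Learning and Improvement",
    "Interpersonal and Communication Skills",
    "Professionalism",
    "Systems-Based Practice",
)


def domains_needing_examples(item_scores, threshold=3):
    """Return the domains in which any item was scored below the threshold.

    item_scores maps a domain name to a list of 1-5 Likert scores
    (1 = very poor, 3 = at expected level, 5 = superior).
    """
    flagged = []
    for domain in ACGME_DOMAINS:
        scores = item_scores.get(domain, [])
        for score in scores:
            if not 1 <= score <= 5:
                raise ValueError(f"score {score} in {domain!r} is outside 1-5")
        if any(score < threshold for score in scores):
            flagged.append(domain)
    return flagged


# Hypothetical scores for one resident (not data from the study).
example_scores = {
    "Patient Care": [4, 3, 4],
    "Medical Knowledge": [3, 3],
    "Interpersonal and Communication Skills": [2, 3],
    "Professionalism": [5, 4],
}
print(domains_needing_examples(example_scores))
```

For each flagged domain, the attending would then give concrete examples of skills needing improvement and offer suggestions, as the orientation instructed.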
Survey Design
At the end of each block, residents and attendings in both groups completed questionnaires to assess satisfaction with, and attitudes toward, feedback (Supporting Information Appendices 1 and 2 in the online version of this article). Survey questions were based on the competency areas included in the feedback card, previously published surveys evaluating feedback interventions,5, 9, 11 and expert opinion. The resident survey was designed to address the impact of feedback on the domains of resident knowledge, clinical and communication skills, and attitudes about feedback from supervisors and peers. We utilized a 5‐point Likert scale including: strongly disagree, disagree, neutral, agree, and strongly agree. The attending survey addressed attendings' satisfaction with feedback encounters and resident performance. At the completion of the study, investigators compared responses in intervention and control groups.
Statistical Analysis
For purposes of analysis, due to the relatively small number of responses for certain answer choices, the Likert scale was converted to a dichotomous variable. The responses of agree and strongly agree were coded as agree; and disagree, strongly disagree, and neutral were coded as disagree. Neutral was coded as disagree in order to avoid overestimating positive attitudes and, in effect, to bias our results toward the null hypothesis. Differences between groups were analyzed using chi‐square Fisher's exact test (2‐sided).
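The analysis steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, and the 2‐sided Fisher's exact test is implemented with the common convention of summing the probabilities of all tables no more likely than the observed one. The example counts are the agree counts and group sizes reported in the resident table for feedback on skills done well.

```python
from math import comb

AGREE_RESPONSES = {"agree", "strongly agree"}


def dichotomize(response):
    """Code a 5-point Likert response as agree (True) or disagree (False).

    Neutral is coded as disagree, as described in the text.
    """
    return response.strip().lower() in AGREE_RESPONSES


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (intervention, control); columns are agree, disagree.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "agree" responses in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as extreme (no more probable) as observed;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Agree counts from the resident table: 35 of 39 (intervention), 35 of 55 (control).
p = fisher_exact_two_sided(35, 39 - 35, 35, 55 - 35)
print(f"two-sided P = {p:.3f}")
```

The printed value can be compared with the P value reported for that survey item; small differences are possible if the original software used a different two‐sided convention.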
Qualitative Interviews
In order to understand the relative contribution of the feedback card versus the feedback session, we performed a qualitative survey of attendings in the intervention group. Following the conclusion of the study period, we selected a convenience sample of 8 attendings from the intervention group for these brief qualitative interviews. We asked 3 basic questions. Was the intervention of the feedback card and dedicated time for feedback useful? Did you find one component, either the card or the dedicated time for feedback, more useful than the other? Were there any negative effects on patient care, education, or other areas, from using an attending rounds as a feedback session? These data were coded and analyzed for common themes.
RESULTS
During the 6‐month study period, 34 teaching attendings (over 36 attending inpatient blocks) and 93 IM residents (over 111 resident inpatient blocks) participated in the study. Thirty‐four of 36 attending surveys and 94 of 111 resident surveys were completed. The overall survey response rates for residents and attendings were 85% and 94%, respectively. Two attendings participated during 2 separate blocks, first in the control group and then in the intervention group, and 18 residents participated during 2 separate blocks. No attendings or residents participated more than twice.
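The response rates can be rechecked from counts given elsewhere in the article. The sketch below takes the respondent counts from the table headers (N = 39 and N = 55 residents; N = 16 and N = 18 attendings) and the denominators from the Study Design section (48 and 63 residents and interns; 16 and 20 attendings). The helper name is hypothetical.

```python
def response_rate(completed, distributed):
    """Return the fraction of distributed surveys that were completed."""
    if distributed <= 0:
        raise ValueError("distributed must be positive")
    return completed / distributed


# (completed, distributed) pairs taken from the tables and Study Design text.
surveys = {
    "residents, intervention": (39, 48),
    "residents, control": (55, 63),
    "attendings, intervention": (16, 16),
    "attendings, control": (18, 20),
}
surveys["residents, overall"] = (39 + 55, 48 + 63)
surveys["attendings, overall"] = (16 + 18, 16 + 20)

for group, (completed, distributed) in surveys.items():
    rate = response_rate(completed, distributed)
    print(f"{group}: {completed}/{distributed} = {rate:.1%}")
```

Summing the group‐level counts this way gives the overall numerators and denominators directly, which makes it easy to spot a mismatch between the text and the tables.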
The resident survey response rate was 81.2% in the intervention group and 87.3% in the control group (Table 1). Residents in the intervention group reported receiving more feedback regarding skills they did well (89.7% vs 63.6%, P = 0.004) and skills needing improvement (51.3% vs 25.5%, P = 0.02) than those in the control group. In addition, more intervention residents reported receiving useful information regarding how to improve their skills (53.8% vs 27.3%, P = 0.01), and reported actually improving both their clinical skills (61.5% vs 27.8%, P = 0.001) and their professionalism/communication skills (51.3% vs 29.1%, P = 0.03) based on feedback received from attendings.
- Feedback in clinical medical education. JAMA. 1983;250(6):777–781.
- Giving feedback in medical education: verification of recommended techniques. J Gen Intern Med. 1998;13(2):111–116.
- Systematic review of the literature on assessment, feedback and physicians' clinical performance: BEME Guide No. 7. Med Teach. 2006;28(2):117–128.
- Missed opportunities: a descriptive assessment of teaching and attitudes regarding communication skills in a surgical residency. Curr Surg. 2006;63(6):401–409.
- Impact of a 360‐degree professionalism assessment on faculty comfort and skills in feedback delivery. J Gen Intern Med. 2008;23(7):969–972.
- Daily encounter cards facilitate competency‐based feedback while leniency bias persists. CJEM. 2008;10(1):44–50.
- Teaching compassion and respect. Attending physicians' responses to problematic behaviors. J Gen Intern Med. 1999;14(1):49–55.
- Faculty and the observation of trainees' clinical skills: problems and opportunities. Acad Med. 2004;79(1):16–22.
- Direct observation of residents in the emergency department: a structured educational program. Acad Emerg Med. 2009;16(4):343–351.
- Evaluation of a novel assessment form for observing medical residents: a randomised, controlled trial. Med Educ. 2008;42(12):1234–1242.
- Resident evaluations: the use of daily evaluation forms in rheumatology ambulatory care. J Rheumatol. 2009;36(6):1298–1303.
- Effectiveness of a focused educational intervention on resident evaluations from faculty: a randomized controlled trial. J Gen Intern Med. 2001;16(7):427–434.
- Effects of training in direct observation of medical residents' clinical competence: a randomized trial. Ann Intern Med. 2004;140(11):874–881.
- Internal Medicine Program Requirements. ACGME. July 1, 2009. Available at: http://www.acgme.org/acWebsite/downloads/RRC_progReq/140_internal_medicine_07012009.pdf. Accessed November 8, 2009.
- How to use the core competencies in hospital medicine: a framework for curriculum development. J Hosp Med. 2006;1(suppl 1):57–67.
- Debriefing in the intensive care unit: a feedback tool to facilitate bedside teaching. Crit Care Med. 2007;35(3):738–754.
- Use of an innovative video feedback technique to enhance communication skills training. Med Educ. 2004;38(2):145–157.
- The impact of feedback to medical housestaff on chart documentation and quality of care in the outpatient setting. J Gen Intern Med. 1997;12(6):352–356.
- Feedback and the mini clinical evaluation exercise. J Gen Intern Med. 2004;19(5 pt 2):558–561.
Copyright © 2011 Society of Hospital Medicine
Thromboprophylaxis: Survey on Barriers
Each year in North America, over 7 million adults are hospitalized with a medical illness.1 Acute illness and decreased mobility in hospital place patients at increased risk for venous thromboembolism (VTE), which includes deep vein thrombosis (DVT) and life‐threatening pulmonary embolism (PE).2 Since VTE remains the most preventable cause of death in hospitalized patients, numerous studies have aimed at reducing the incidence of hospital‐acquired DVT. Aside from cost, the impact of VTE on the healthcare system is felt not only by those who diagnose and treat VTE, but also by those responsible for correcting the severe bleeding that can result from inappropriate use of thromboprophylaxis. Approximately 60% of symptomatic VTE occurs in medical patients, and recent hospitalization for medical illness accounts for 25% of all community‐diagnosed VTE. The Agency for Healthcare Research and Quality ranks DVT prevention as the top priority out of 79 patient safety initiatives, and expert consensus groups provide a strong recommendation that DVT prophylaxis with a low‐dose anticoagulant should be administered to at‐risk hospitalized medical patients.2, 3
Despite the availability, efficacy, and safety of DVT prophylaxis,2 it is discouraging that only 21% to 62% of medical patients receive prophylaxis,4–9 and only 16% to 40% receive appropriate prophylaxis.4–6, 10–12 However, 70% to 90% of patients in other at‐risk groups, such as surgical patients or critically ill patients, receive prophylaxis.13–16 The reason why DVT prophylaxis is so underutilized in medical patients is unclear, as explanations for low rates of clinical practice guideline utilization are multifaceted,17 and few studies have investigated the barriers to optimal thromboprophylaxis.18–20
To explore possible reasons for this disparity between evidence and practice, we conducted a cross‐sectional survey of 4 clinician groups involved in the care of hospitalized medical patients. Our objective was to identify barriers and potential solutions to the underutilization of DVT prophylaxis in hospitalized medical patients.
METHODS
Instrument Development
The survey focused on 3 domains: perceived importance, effectiveness, and safety of DVT prophylaxis; perceived barriers to implementation; and perceived potential success and feasibility of interventions to optimize DVT prophylaxis. The survey cover letter outlined background information, study design, and a statement on confidentiality. A prior survey of DVT prophylaxis administered to thrombosis experts was used to generate survey questions.21
Only survey respondents who answered "yes" to the first question, "Are you involved in any aspect of the care of hospitalized general medical patients for whom DVT prophylaxis is considered?", were asked to complete the remaining sections. Subsequent questions required respondents to check the box on a 7‐point Likert‐type scale that most accurately reflected their perception (Table 1). A successful intervention was defined as one that, if implemented, would yield the anticipated effect, and a feasible intervention as one that was easy to implement without major logistical burden. Respondents were also asked which clinician group was best able to provide a daily assessment of patients' need for DVT prophylaxis, ensure DVT prophylaxis is prescribed, and ensure adherence.
|
Section 1: Perceptions regarding DVT prophylaxis in hospitalized medical patients* |
1. How important an issue is the prevention of DVT in hospitalized general medical patients? |
2. To your knowledge, how effective are currently used anticoagulant strategies for the prevention of DVT in hospitalized medical patients? |
3. How safe are currently used anticoagulant strategies for the prevention of DVT in hospitalized medical patients? |
4. Current anticoagulant prophylaxis strategies are: 1 = underutilized, 4 = appropriately utilized, 7 = overutilized. |
Section 2: Perceptions regarding barriers to the optimal use of DVT prophylaxis |
1. Lack of time to consider DVT prophylaxis in every patient |
2. Lack of clear indications for DVT prophylaxis (ie, who should get prophylaxis) |
3. Lack of clear contraindications for DVT prophylaxis (ie, who should not get prophylaxis) |
4. Lack of awareness about effectiveness of DVT prophylaxis |
5. Lack of physician agreement with current DVT prophylaxis guidelines |
6. Patient discomfort from subcutaneous injections of anticoagulants |
7. Clinician concerns about increased bleeding risk from anticoagulant administration |
Section 3: Perceptions of interventions relating to DVT prophylaxis |
1. Yearly multidisciplinary educational meetings: to engage a wide spectrum of healthcare professionals to review DVT prophylaxis in hospitalized medical patients |
2. Posters on the wards: to remind healthcare professionals about DVT prophylaxis and patients who are eligible or ineligible for this treatment |
3. Laminated pocket cards: to remind healthcare professionals about DVT prophylaxis and patients who are eligible and ineligible for this treatment |
4. Preprinted order sheets: to remind healthcare professionals about DVT prophylaxis and patients who are eligible and ineligible for this treatment |
5. Periodic audit and feedback to healthcare providers: E‐mails to physicians containing reports on compliance with DVT prevention practice guidelines over recent years |
6. Computerized reminders (to the physicians): to prompt the physician to consider DVT prophylaxis upon opening a patient's electronic medical record |
7. Nurse reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
8. Pharmacist reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
9. Physiotherapist reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
10. Use of a local opinion leader (within the hospital) to promote evidence‐based use of DVT prophylaxis guidelines: to educate healthcare professionals on best practices for DVT prophylaxis |
Survey Administration
The survey was distributed between April and July 2007 in both paper‐based and web‐based formats using Survey Monkey software. Ontario members of the Canadian Society of Internal Medicine (n = 193) received a direct electronic invitation (from N.S.L., on behalf of J.D.D.) to participate, while members of the Canadian Society of Hospital Pharmacists (CSHP) (n = 1002) received an electronic invitation from an administrator for the CSHP to participate. The CSHP could not ensure that all members receiving the survey were hospital‐based pharmacists, so it was expected that the response rate from this group would be low. Nurse and physiotherapy managers at a convenience sample of 8 hospitals in Ontario, Canada, distributed paper‐based surveys to their staff using stamped, preaddressed envelopes. Nonresponders in all groups were sent reminders at 2 and 4 weeks.22 Data from all completed surveys were entered into an electronic database by a research coordinator (N.S.L.). A research assistant entered paper‐based survey data in duplicate, with discrepancies resolved by consensus and mediation by a third person (J.C.). The study was conducted with Institutional Ethics Review Board approval, and all respondents provided informed consent to participate. All responses were anonymous and confidential.
Statistical Considerations
Given the exploratory nature of this survey, there was no prespecified hypothesis‐driven respondent sample size. Proportions were used to describe response rates. Survey responses scored on the 7‐point Likert‐type scale were expressed as a mean and 95% confidence interval (CI). Important barriers, and highly potentially successful or highly potentially feasible interventions, were defined as those with a mean score ≥5 points. Questions without responses, questions with multiple responses, and questions with illegible responses were treated as missing values. All statistical analyses were done using SAS version 9 (Cary, NC).
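As an illustration of the summary statistics described above, the following is a minimal Python sketch, not the authors' SAS code. It assumes a normal‐approximation interval (the CI method is not stated in the text), assumes the cutoff is a mean of 5 points or higher, and uses made‐up responses; the function name `likert_summary` and the use of `None` for missing values are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def likert_summary(responses, threshold=5.0, level=0.95):
    """Summarize 7-point Likert responses as a mean with a confidence interval.

    None marks a missing value (blank, multiple, or illegible answer) and is
    dropped before any calculation. The interval uses a normal approximation,
    which is an assumption; the source does not name the method used.
    """
    valid = [r for r in responses if r is not None]
    n = len(valid)
    m = mean(valid)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * stdev(valid) / sqrt(n)
    return {
        "n": n,
        "mean": m,
        "ci": (m - half_width, m + half_width),
        "meets_threshold": m >= threshold,  # assumed cutoff: mean of 5 or more
    }


# Hypothetical responses to one survey item; two answers are missing.
scores = [6, 7, 5, None, 6, 4, 7, 6, None, 5]
summary = likert_summary(scores)
print(summary)
```

With small samples a t‐based interval would be wider than this approximation, so the sketch is best read as a description of the calculation rather than a reproduction of the reported intervals.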
RESULTS
Survey Responses
The overall response rate was 36.3% (563/1553), with 65.5% (211/322) of nurses, 40.4% (78/193) of physicians, 24.1% (242/1002) of pharmacists, and 88.8% (32/36) of physiotherapists completing surveys. When pharmacists were removed from the response rate calculation (since it was expected that many of those receiving the survey were not in a primarily hospital‐based practice), the overall response rate rose to 58.3% (321/551). Excluded were 9.2% (52/563) of returned surveys, as respondents indicated the topic was not relevant to their practice. Five hundred eleven surveys were included in the final analysis (Figure 1).
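The response‐rate arithmetic above can be checked with a short Python sketch. The counts are taken from the paragraph above; the helper names `response_rate` and `pooled_rate` are hypothetical and are not part of the study's analysis.

```python
# Counts reported in the Survey Responses paragraph: (returned, distributed).
groups = {
    "nurses": (211, 322),
    "physicians": (78, 193),
    "pharmacists": (242, 1002),
    "physiotherapists": (32, 36),
}


def response_rate(returned, distributed):
    """Response rate as a percentage."""
    return 100.0 * returned / distributed


def pooled_rate(selected):
    """Pooled response rate (%) over a mapping of group -> (returned, distributed)."""
    returned = sum(r for r, _ in selected.values())
    distributed = sum(d for _, d in selected.values())
    return response_rate(returned, distributed)


overall = pooled_rate(groups)
without_pharmacists = pooled_rate(
    {name: counts for name, counts in groups.items() if name != "pharmacists"}
)

returned_total = sum(r for r, _ in groups.values())
not_relevant = 52  # returned surveys excluded as not relevant to practice
analyzed = returned_total - not_relevant

for name, (r, d) in groups.items():
    print(f"{name}: {response_rate(r, d):.1f}% ({r}/{d})")
print(f"overall: {overall:.1f}%")
print(f"excluding pharmacists: {without_pharmacists:.1f}%")
print(f"surveys analyzed: {analyzed}")
```

Per‐group percentages computed this way can differ from the printed figures in the last decimal place, depending on whether values were rounded or truncated.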

Importance, Effectiveness, Safety, and Appropriateness of DVT Prophylaxis Strategies
DVT prophylaxis was perceived across clinician groups as important (mean score 6.4; 95% CI 6.3 to 6.5), safe (mean 5.5; 95% CI 5.4 to 5.6), and effective (mean 5.5; 95% CI 5.4 to 6.6) (Figure 2). The mean score for the appropriateness of current DVT prophylaxis practices was 3.5 (95% CI 3.4 to 3.7), suggesting an overall perception of underutilization. However, by respondent groups, DVT prophylaxis was considered to be underutilized by physicians (mean 2.5; 95% CI 2.3 to 2.7) and pharmacists (mean 3.1; 95% CI 2.9 to 3.2), while nurses (mean 4.3; 95% CI 4.2 to 4.5) and physiotherapists (mean 3.8; 95% CI, 3.4 to 4.2) tended to consider current strategies as appropriate.

Potential Barriers to DVT Prophylaxis Utilization
Figure 3 demonstrates that no single barrier to DVT prophylaxis utilization was dominant and no barriers were considered very important. Perceived barriers carrying comparable weight were: concerns about bleeding (mean 4.8; 95% CI 4.6 to 4.9); lack of clear indications (mean 4.6; 95% CI 4.5 to 4.8) and contraindications to DVT prophylaxis (mean 4.4; 95% CI 4.3 to 4.6); lack of awareness about effectiveness of DVT prophylaxis (mean 4.5; 95% CI 4.3 to 4.7); and lack of time to consider DVT prophylaxis in every patient (mean 4.4; 95% CI 4.3 to 4.6). Patient discomfort from subcutaneous injections was perceived as the least important barrier (mean 3.8; 95% CI 3.6 to 4.0). Physicians perceived lack of awareness about the effectiveness of DVT prophylaxis as the most important barrier (mean 4.0; 95% CI 3.5 to 4.4), whereas concern about bleeding was dominant among non‐physicians (nurses' mean 5.2; 95% CI 5.0 to 5.5; pharmacists' mean 4.7; 95% CI 4.5 to 4.9; physiotherapists' mean 4.6; 95% CI 3.9 to 5.3).

Potential Success and Feasibility of Interventions to Optimize DVT Prophylaxis Utilization
Interventions considered across clinician groups as highly potentially successful were: preprinted order sheets (mean 5.7; 95% CI 5.6 to 5.8); pharmacist reminders to physicians (mean 5.3; 95% CI 5.1 to 5.4); computerized reminders to physicians (mean 5.0; 95% CI 4.9 to 5.2); and use of a local opinion leader (mean 5.0; 95% CI 4.9 to 5.2). Interventions considered highly potentially feasible were: posters (mean 5.7; 95% CI 5.6 to 5.8); preprinted order sheets (mean 5.5; 95% CI 5.4 to 5.7); laminated pocket cards (mean 5.4; 95% CI 5.2 to 5.5); multidisciplinary educational meetings (mean 5.0; 95% CI 4.9 to 5.2); and pharmacist reminders to physicians (mean 5.0; 95% CI 4.9 to 5.1). Preprinted orders and pharmacist reminders were perceived by all clinician groups as having both high potential success and feasibility (Figure 4).

Perceptions on Which Clinician Group Is Best Able to Assess and Implement DVT Prophylaxis
Respondents were divided between considering the attending physician and the bedside nurse as best able to perform a daily assessment of patients' need for DVT prophylaxis (43.4% [204/470] vs 44.0% [207/470], respectively). Respondents from these groups each predominantly thought this responsibility was theirs, with 68.1% (49/72) of physicians and 61.5% (123/200) of nurses perceiving this as their responsibility (Figure 5).

Forty‐one percent (193/471) of respondents perceived the attending physician as best able to ensure that DVT prophylaxis is ordered, while 31.2% (147/471) identified the pharmacist and 23.3% (110/471) identified the bedside nurse as best suited to this role. Among pharmacists, 66.3% (114/172) perceived that the attending pharmacist is best able to perform this task. Among respondents, 61.9% (296/478) felt the bedside nurse is best able to ensure adherence to DVT prophylaxis, with good agreement among all clinician groups.
DISCUSSION
Our survey identified several perceived barriers to optimizing DVT prophylaxis, consistent with those reported in the White Paper sponsored by the American Public Health Association.23 While no single barrier outlined in our survey was dominant, 2 novel barriers were identified: misperception of DVT prophylaxis underutilization, and confusion about roles and responsibilities in the area of DVT prophylaxis. Attention to these barriers may be helpful in developing an intervention aimed at bridging the gap between evidence and practice.
While our survey demonstrates agreement across clinician groups on the importance, efficacy, and safety of DVT prophylaxis, the discordant perceptions about whether DVT prophylaxis is utilized appropriately are an important concern. Physician and pharmacist respondents demonstrated awareness that thromboprophylaxis is underutilized in medical patients. However, despite overwhelming published evidence to the contrary, nurses responding to our survey did not tend to recognize the problem of DVT prophylaxis underutilization in hospitalized medical patients. This knowledge deficit may be a significant barrier, particularly since the pooled group of respondents indicated that nurses are among those caregivers best able to conduct a daily assessment of patients' need for DVT prophylaxis. A possible explanation for the finding that nurses and physiotherapists demonstrated a relative lack of awareness of the problem of DVT prophylaxis underutilization is ward‐specific healthcare priorities. Nursing and physiotherapy care on surgical wards is aimed at preventing postoperative complications, including DVT. On medical wards, however, the primary focus of this care is the management of acute medical problems, and prevention of hospital‐related complications, such as DVT, is often a secondary focus. Therefore, ensuring that all clinician groups are educated about the problem of DVT prophylaxis underutilization is necessary to drive quality improvement. A physician‐based survey on antithrombotic therapies demonstrated a similar need for education on guideline recommendations.20
A second important barrier identified in our survey is that both bedside nurses and attending physicians feel that daily assessment of a patient's need for DVT prophylaxis is their responsibility. Confusion about roles and responsibilities in this area of patient care was reported by Cook et al., who identified that multidisciplinary care was perceived as a barrier to effective VTE prevention.18 Uncertainty as to which group should take ownership of DVT prophylaxis can lead to a diffusion of responsibility, a lack of accountability, and a gap in care. A resolution to whether DVT risk assessment is a nursing or a physician role could be reached through increased interdisciplinary communication and provision of clear definitions of roles to hospital staff.
Survey respondents felt that preprinted orders and pharmacist reminders to physicians were potentially successful and feasible strategies to optimize DVT prophylaxis. These components could be part of a simple tool to initiate prophylaxis. While electronic alerts have been shown to increase prophylaxis rates,24 we suspect that many respondents did not view these as highly important because of limited use of computerized order entry at their facilities. Interestingly, survey respondents did not perceive audit‐and‐feedback systems or local opinion leaders as potentially successful, though previous studies have demonstrated that they can change clinician behavior.25, 26 This may be because respondents are unaware of the strength of technology‐based interventions (eg, electronic orders), the role of opinion leaders, and the evidence in support of such interventions.24, 26 A systematic review of studies to improve DVT prophylaxis in hospitals reported that a combination of multiple active strategies is most effective, particularly those that link physician reminders with audit‐and‐feedback.27 For example, in the BEHAVE study, a multicomponent intervention consisting of interactive educational sessions, verbal and computerized prompts, and individual performance feedback significantly improved adherence to DVT prophylaxis guidelines in critically ill patients.28 Whether a similar intervention could improve adherence to DVT prophylaxis guidelines in hospitalized medical patients merits further study. Any intervention must be paired with better education about which patients should, and should not, receive prophylaxis, as this may address many reported barriers in our survey (including concerns about bleeding). Respondents' uncertainty about these issues is not surprising, as studies of DVT prophylaxis in medical patients are not plentiful.2 However, recent guidelines do identify subgroups of medically ill patients in whom DVT prophylaxis is indicated.2 A clear and simple DVT risk assessment algorithm that identifies medical patients in whom DVT prophylaxis should (or should not) be administered may help to overcome respondents' concerns.
A limitation of our survey is the overall response rate of 36.3%, largely driven by the considerable number of nonresponding pharmacists (n = 760, reflecting 49% of the entire sample). However, the majority of the pharmacists were likely not hospital‐based, were thus not a target of this study, and their low response rate is not surprising. After excluding pharmacists, the response rate was 58.3% (321/551), which is consistent with response rates of other large‐sample surveys.29 The lower response rate for physicians and pharmacists may also reflect web‐based survey dissemination which, despite its feasibility, has lower response rates than paper‐based dissemination.30–32 While the sample of physicians was relatively small compared to the other respondent groups surveyed, we aimed to identify barriers to actually implementing VTE prophylaxis, not just ordering prophylaxis, which is a multidisciplinary concern.
Although this survey was based on Canadian healthcare providers' perspectives, we believe the results are generalizable since both US and Canadian‐based studies have found that VTE prophylaxis is underutilized among hospitalized medical patients.4, 6 Furthermore, the American College of Chest Physicians (ACCP) guidelines on VTE prophylaxis, which are well‐recognized in both the United States and Canada, were developed with input from Canadian and American content experts.2 And while the US and Canadian healthcare systems are organized differently, at the patient‐care level, the roles of healthcare professionals are very similar. The generalizability of our findings is, however, limited by the institutional characteristics of respondents. We do not purport that the responses of any of the 4 clinician groups are generalizable to those groups as a whole. Although we surveyed clinicians in teaching and nonteaching, urban and rural practices, perceptions about DVT prophylaxis may be influenced by other factors, including the availability of local preprinted orders, electronic medical records, and quality improvement programs. Another potential limitation is that we did not assess all possible strategies to improve DVT prophylaxis, such as nurse practitioners and computerized decision support systems. These were purposely excluded, as they are not financially feasible in all centers, and thus not generalizable. Finally, like all self‐administered surveys, our findings reflect respondents' perceptions rather than objective observations about practice.
In conclusion, we identified novel and important barriers to optimal DVT prophylaxis utilization and potential interventions to address this important safety concern in hospitalized medical patients. To overcome some of these barriers, we propose an educational intervention prior to delivery of a top‐down, evidence‐based intervention to first increase healthcare providers' knowledge of the safety of DVT prophylaxis, system and team‐based approaches, and which interventions are most likely to be successful so as to encourage greater compliance with the intervention. A top‐down, system‐wide approach, involving the entire healthcare team and hospital administrators, can help drive this communication. As DVT prophylaxis becomes an increasingly important component in hospital accreditation, such solutions become appealing to facilitate change in practices. Results of this survey may inform future knowledge translation interventions by eliminating perceived barriers to DVT prophylaxis and by incorporating strategies that are perceived by healthcare professionals to be successful, feasible, and supported by evidence.
- National hospital discharge survey: annual summary, 1996. Vital Health Stat. 1999;13:1–46.
- Prevention of venous thromboembolism: American College of Chest Physicians Evidence‐Based Clinical Practice Guidelines (8th ed). Chest. 2008;133:381S–443S.
- Making health care safer: a critical analysis of patient safety practices. Evidence Report/Technology Assessment: No. 43. AHRQ Publication No. 01‐E058, July 2001. Rockville, MD: Agency for Healthcare Research and Quality. Available at: http://www.ahrq.gov/clinic/ptsafety/. Accessed October 9, 2007.
- Multicenter evaluation of the use of venous thromboembolism prophylaxis in acutely ill medical patients in Canada. Thromb Res. 2007;119:145–155.
- Hospitals' compliance with prophylaxis guidelines for venous thromboembolism. Am J Health Syst Pharm. 2007;64:69–76.
- Thromboprophylaxis rates in US medical centers: success or failure? J Thromb Haemost. 2007;5:1610–1616.
- A retrospective evaluation of adherence to guidelines for prevention of thromboembolic events in general medical inpatients. Can J Hosp Pharm. 2006;59:258–263.
- Venous thromboembolism prophylaxis in acutely ill hospitalized medical patients: findings from the International Medical Prevention Registry on Venous Thromboembolism. Chest. 2007;132:936–945.
- Venous thromboembolism prophylaxis in medical inpatients: a retrospective chart review. Thromb Res. 2003;111:215–219.
- Missed opportunities for prevention of venous thromboembolism: an evaluation of the use of DVT prophylaxis guidelines. Chest. 2001;120:1964–1971.
- Thrombosis prophylaxis in medical patients: a retrospective review of clinical practice patterns. Haematologica. 2002;87:746–750.
- Venous thromboembolism risk and prophylaxis in the acute hospital care setting (ENDORSE study): a multinational cross‐sectional study. Lancet. 2008;371:387–394.
- The use of low molecular weight heparins for the prevention of postoperative venous thromboembolism in general surgery. A survey of practice in the United States. Int Angiol. 2002;1:78–85.
- Venous thromboembolic disease management patterns in total hip arthroplasty and total knee arthroplasty patients: a survey of the AAHKS membership. J Arthroplasty. 2001;6:679–688.
- Thromboprophylaxis in medical‐surgical intensive care unit patients. J Crit Care. 2005;20:320–323.
- Utilization of venous thromboembolism prophylaxis in a medical‐surgical ICU. Chest. 1998;113:162–164.
- Why don't physicians follow clinical practice guidelines? A framework for improvement. JAMA. 1999;282:1458–1465.
- Thromboprophylaxis for hospitalized medical patients: a multicenter qualitative study. J Hosp Med. 2009;4:269–275.
- Definition of immobility in studies of thromboprophylaxis in hospitalized medical patients: a systematic review. J Vasc Nurs. 2010;28:54–66.
- The use of antithrombotic therapies in the prevention and treatment of arterial and venous thrombosis: a survey of current knowledge and practice supporting the need for clinical education. Crit Pathw Cardiol. 2010;9:41–48.
- Antithrombotic and thrombolytic therapy: from evidence to application: the Seventh ACCP Conference on Antithrombotic and Thrombolytic Therapy. Chest. 2004;126:688S–696S.
- Mail and internet surveys: the tailored design method. New York, NY: John Wiley; 2000.
- Deep‐vein thrombosis: advancing awareness to protect patient lives. Public Health Leadership Conference on Deep‐Vein Thrombosis. American Public Health Association. Available at: http://www.apha.org/NR/rdonlyres/A209F84A‐7C0E‐4761–9ECF‐61D22E1E11F7/0/DVT_White_Paper.pdf. Accessed May 28, 2008.
- Electronic alerts to prevent venous thromboembolism among hospitalized patients. N Engl J Med. 2005;352:969–977.
- Getting a validated guideline into local practice: implementation and audit of the SIGN guideline on the prevention of deep vein thrombosis in a district general hospital. Scott Med J. 1998;43:23–25.
- Local opinion leaders: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2007;24(1):CD000125.
- A systematic review of strategies to improve prophylaxis for venous thromboembolism in hospitals. Ann Surg. 2005;241:397–415.
- Minimizing errors of omission: behavioural reinforcement of heparin to avert venous emboli: the BEHAVE study. Crit Care Med. 2006;34:694–699.
- Using the Internet to conduct surveys of health professionals: a valid alternative? Fam Pract. 2003;20:545–551.
- Use of new technology in endourology and laparoscopy by American urologists: Internet and postal survey. Urology. 2000;56:760–765.
- E‐mail versus conventional postal mail survey of geriatric chiefs. Gerontologist. 2001;41:799–804.
- Internet versus mailed questionnaires: a randomized comparison. J Med Internet Res. 2004;6:e30.
Each year in North America, over 7 million adults are hospitalized with a medical illness.1 Acute illness and decreased mobility in hospital places patients at increased risk for venous thromboembolism (VTE), which includes deep vein thrombosis (DVT) and life‐threatening pulmonary embolism (PE).2 Since VTE remains the most preventable cause of death in hospitalized patients, numerous studies have aimed at reducing the incidence of hospital‐acquired DVT. Aside from cost, the impact of VTE to the healthcare system is felt not only by those who diagnose and treat VTE, but also by those responsible for correcting the severe bleeding that can result from inappropriate use of thromboprophylaxis. Approximately 60% of symptomatic VTE occurs in medical patients, and recent hospitalization for medical illness accounts for 25% of all community‐diagnosed VTE. The Agency for Health Research and Quality ranks DVT prevention as the top priority out of 79 patient safety initiatives, and expert consensus groups provide a strong recommendation that DVT prophylaxis with a low‐dose anticoagulant should be administered to at‐risk hospitalized medical patients.2, 3
Despite the availability, efficacy, and safety of DVT prophylaxis,2 it is discouraging that only 21% to 62% of medical patients receive prophylaxis,4–9 and only 16% to 40% receive appropriate prophylaxis.4–6, 10–12 However, 70% to 90% of patients in other at‐risk groups, such as surgical patients or critically ill patients, receive prophylaxis.13–16 The reason why DVT prophylaxis is so underutilized in medical patients is unclear, as explanations for low rates of clinical practice guideline utilization are multifaceted,17 and few studies have investigated the barriers to optimal thromboprophylaxis.18–20
To explore possible reasons for this disparity between evidence and practice, we conducted a cross‐sectional survey of 4 clinician groups involved in the care of hospitalized medical patients. Our objective was to identify barriers and potential solutions to the underutilization of DVT prophylaxis in hospitalized medical patients.
METHODS
Instrument Development
The survey focused on 3 domains: perceived importance, effectiveness, and safety of DVT prophylaxis; perceived barriers to implementation; and perceived potential success and feasibility of interventions to optimize DVT prophylaxis. The survey cover letter outlined background information, study design, and a statement on confidentiality. A prior survey of DVT prophylaxis administered to thrombosis experts was used to generate survey questions.21
Only survey respondents who answered "yes" to the first question, "Are you involved in any aspect of the care of hospitalized general medical patients for whom DVT prophylaxis is considered?", were asked to complete the remaining sections. Subsequent questions required respondents to check the box on a 7‐point Likert‐type scale that most accurately reflected their perception (Table 1). A successful intervention was defined as one that, if implemented, would yield the anticipated effect, and a feasible intervention as one that was easy to implement without major logistical burden. Respondents were also asked which clinician group was best able to provide a daily assessment of patients' need for DVT prophylaxis, ensure DVT prophylaxis is prescribed, and ensure adherence.
|
Section 1: Perceptions regarding DVT prophylaxis in hospitalized medical patients* |
1. How important an issue is the prevention of DVT in hospitalized general medical patients? |
2. To your knowledge, how effective are currently used anticoagulant strategies for the prevention of DVT in hospitalized medical patients? |
3. How safe are currently used anticoagulant strategies for the prevention of DVT in hospitalized medical patients? |
4. Current anticoagulant prophylaxis strategies are: 1 = underutilized, 4 = appropriately utilized, 7 = overutilized. |
Section 2: Perceptions regarding barriers to the optimal use of DVT prophylaxis |
1. Lack of time to consider DVT prophylaxis in every patient |
2. Lack of clear indications for DVT prophylaxis (ie, who should get prophylaxis) |
3. Lack of clear contraindications for DVT prophylaxis (ie, who should not get prophylaxis) |
4. Lack of awareness about effectiveness of DVT prophylaxis |
5. Lack of physician agreement with current DVT prophylaxis guidelines |
6. Patient discomfort from subcutaneous injections of anticoagulants |
7. Clinician concerns about increased bleeding risk from anticoagulant administration |
Section 3: Perceptions of interventions relating to DVT prophylaxis |
1. Yearly multidisciplinary educational meetings: to engage a wide spectrum of healthcare professionals to review DVT prophylaxis in hospitalized medical patients |
2. Posters on the wards: to remind healthcare professionals about DVT prophylaxis and patients who are eligible or ineligible for this treatment |
3. Laminated pocket cards: to remind healthcare professionals about DVT prophylaxis and patients who are eligible and ineligible for this treatment |
4. Preprinted order sheets: to remind healthcare professionals about DVT prophylaxis and patients who are eligible and ineligible for this treatment |
5. Periodic audit and feedback to healthcare providers: E‐mails to physicians containing reports on compliance with DVT prevention practice guidelines over recent years |
6. Computerized reminders (to the physicians): to prompt the physician to consider DVT prophylaxis upon opening a patient's electronic medical record |
7. Nurse reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
8. Pharmacist reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
9. Physiotherapist reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
10. Use of a local opinion leader (within the hospital) to promote evidence‐based use of DVT prophylaxis guidelines: to educate healthcare professionals on best practices for DVT prophylaxis |
Survey Administration
The survey was distributed between April and July 2007 in both paper‐based and web‐based formats using Survey Monkey software. Ontario members of the Canadian Society of Internal Medicine (n = 193) received a direct electronic invitation (from N.S.L., on behalf of J.D.D.) to participate, while members of the Canadian Society of Hospital Pharmacists (CSHP) (n = 1002) received an electronic invitation from an administrator for the CSHP to participate. The CSHP could not ensure that all members receiving the survey were hospital‐based pharmacists, so it was expected that the response rate from this group would be low. Nurse and physiotherapy managers at a convenience sample of 8 hospitals in Ontario, Canada, distributed paper‐based surveys to their staff using stamped, preaddressed envelopes. Nonresponders in all groups were sent reminders at 2 and 4 weeks.22 Data from all completed surveys were entered into an electronic database by a research coordinator (N.S.L.). A research assistant entered paper‐based survey data in duplicate, with discrepancies resolved by consensus and mediation by a third person (J.C.). The study was conducted with Institutional Ethics Review Board approval, and all respondents provided informed consent to participate. All responses were anonymous and confidential.
Statistical Considerations
Given the exploratory nature of this survey, there was no prespecified hypothesis‐driven respondent sample size. Proportions were used to describe response rates. Survey responses scored on the 7‐point Likert‐type scale were expressed as a mean and 95% confidence interval (CI). Important barriers, and highly potentially successful or highly potentially feasible interventions, were defined as those with a mean score ≥5 points. Questions without responses, questions with multiple responses, and questions with illegible responses were treated as missing values. All statistical analyses were done using SAS version 9 (Cary, NC).
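The summary statistics described above can be illustrated with a short sketch. This is not the authors' SAS code. It is a minimal Python illustration that assumes a normal‐approximation interval (z = 1.96), represents missing values as None, and treats the cutoff as a mean of 5 or more. The function name likert_summary and the example responses are hypothetical.

```python
import math
import statistics


def likert_summary(responses, threshold=5.0, z=1.96):
    """Mean and approximate 95% CI for 7-point Likert responses.

    None stands for a missing value (blank, multiple, or illegible
    answer) and is dropped before any statistic is computed.
    The z-based interval is an assumption; the article does not say
    how its confidence intervals were calculated.
    """
    valid = [r for r in responses if r is not None]
    if len(valid) < 2:
        raise ValueError("need at least two valid responses")
    if any(r < 1 or r > 7 for r in valid):
        raise ValueError("responses must lie on the 1-7 scale")
    mean = statistics.mean(valid)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(valid) / math.sqrt(len(valid))
    return {
        "n": len(valid),
        "mean": mean,
        "ci_low": mean - z * sem,
        "ci_high": mean + z * sem,
        # Assumed reading of the cutoff: mean of 5 points or more.
        "meets_threshold": mean >= threshold,
    }


# Hypothetical responses to a single survey item; None marks missing data.
example = [6, 5, 7, None, 4, 6, 5, None, 6, 5]
summary = likert_summary(example)
```

Dropping missing values per item, rather than discarding a respondent's whole survey, means the number of valid responses can differ from question to question, which is consistent with the varying denominators reported in the results.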
RESULTS
Survey Responses
The overall response rate was 36.3% (563/1553), with 65.5% (211/322) of nurses, 40.4% (78/193) of physicians, 24.1% (242/1002) of pharmacists, and 88.9% (32/36) of physiotherapists completing surveys. When pharmacists were removed from the response rate calculation (since it was expected that many of those receiving the survey were not in a primarily hospital‐based practice), the overall response rate rose to 58.3% (321/551). Excluded were 9.2% (52/563) of returned surveys, as respondents indicated the topic was not relevant to their practice. Five hundred eleven surveys were included in the final analysis (Figure 1).
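The response‐rate arithmetic can be checked with a few lines of Python. The counts are those reported in the paragraph above; the variable and function names are illustrative only.

```python
# (completed, distributed) survey counts per clinician group, as reported.
groups = {
    "nurses": (211, 322),
    "physicians": (78, 193),
    "pharmacists": (242, 1002),
    "physiotherapists": (32, 36),
}


def response_rate(counts):
    """Percentage of distributed surveys that were completed."""
    counts = list(counts)  # allow generators; we iterate twice below
    completed = sum(c for c, _ in counts)
    distributed = sum(d for _, d in counts)
    return 100.0 * completed / distributed


overall = response_rate(groups.values())
without_pharmacists = response_rate(
    v for name, v in groups.items() if name != "pharmacists"
)

# Returned surveys minus those excluded as not relevant to practice.
returned = sum(c for c, _ in groups.values())
included = returned - 52
```

Rounded to one decimal place, overall and without_pharmacists reproduce the reported 36.3% and 58.3%, and included equals the 511 surveys analyzed.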

Importance, Effectiveness, Safety, and Appropriateness of DVT Prophylaxis Strategies
DVT prophylaxis was perceived across clinician groups as important (mean score 6.4; 95% CI 6.3 to 6.5), safe (mean 5.5; 95% CI 5.4 to 5.6), and effective (mean 5.5; 95% CI 5.4 to 6.6) (Figure 2). The mean score for the appropriateness of current DVT prophylaxis practices was 3.5 (95% CI 3.4 to 3.7), suggesting an overall perception of underutilization. However, by respondent groups, DVT prophylaxis was considered to be underutilized by physicians (mean 2.5; 95% CI 2.3 to 2.7) and pharmacists (mean 3.1; 95% CI 2.9 to 3.2), while nurses (mean 4.3; 95% CI 4.2 to 4.5) and physiotherapists (mean 3.8; 95% CI 3.4 to 4.2) tended to consider current strategies as appropriate.

Potential Barriers to DVT Prophylaxis Utilization
Figure 3 demonstrates that no single barrier to DVT prophylaxis utilization was dominant and no barriers were considered very important. Perceived barriers carrying comparable weight were: concerns about bleeding (mean 4.8; 95% CI 4.6 to 4.9); lack of clear indications (mean 4.6; 95% CI 4.5 to 4.8) and contraindications to DVT prophylaxis (mean 4.4; 95% CI 4.3 to 4.6); lack of awareness about effectiveness of DVT prophylaxis (mean 4.5; 95% CI 4.3 to 4.7); and lack of time to consider DVT prophylaxis in every patient (mean 4.4; 95% CI 4.3 to 4.6). Patient discomfort from subcutaneous injections was perceived as the least important barrier (mean 3.8; 95% CI 3.6 to 4.0). Physicians perceived lack of awareness about the effectiveness of DVT prophylaxis as the most important barrier (mean 4.0; 95% CI 3.5 to 4.4), whereas concern about bleeding was dominant among non‐physicians (nurses' mean 5.2; 95% CI 5.0 to 5.5; pharmacists' mean 4.7; 95% CI 4.5 to 4.9; physiotherapists' mean 4.6; 95% CI 3.9 to 5.3).

Potential Success and Feasibility of Interventions to Optimize DVT Prophylaxis Utilization
Interventions considered across clinician groups as highly potentially successful were: preprinted order sheets (mean 5.7; 95% CI 5.6 to 5.8); pharmacist reminders to physicians (mean 5.3; 95% CI 5.1 to 5.4); computerized reminders to physicians (mean 5.0; 95% CI 4.9 to 5.2); and use of a local opinion leader (mean 5.0; 95% CI 4.9 to 5.2). Interventions considered highly potentially feasible were: posters (mean 5.7; 95% CI 5.6 to 5.8); preprinted order sheets (mean 5.5; 95% CI 5.4 to 5.7); laminated pocket cards (mean 5.4; 95% CI 5.2 to 5.5); multidisciplinary educational meetings (mean 5.0; 95% CI 4.9 to 5.2); and pharmacist reminders to physicians (mean 5.0; 95% CI 4.9 to 5.1). Preprinted orders and pharmacist reminders were perceived by all clinician groups as having both high potential success and feasibility (Figure 4).

Perceptions on Which Clinician Group Is Best Able to Assess and Implement DVT Prophylaxis
Respondents were divided between considering the attending physician and the bedside nurse as best able to perform a daily assessment of patients' need for DVT prophylaxis (43.4% [204/470] vs 44.0% [207/470], respectively). Respondents from these groups each predominantly thought this responsibility was theirs, with 68.1% (49/72) of physicians and 61.5% (123/200) of nurses perceiving this as their responsibility (Figure 5).

Forty‐one percent (193/471) of respondents perceived the attending physician as best able to ensure that DVT prophylaxis is ordered, while 31.2% (147/471) identified the pharmacist and 23.4% (110/471) identified the bedside nurse as best suited to this role. Among pharmacists, 66.3% (114/172) perceived that the attending pharmacist is best able to perform this task. Among respondents, 61.9% (296/478) felt the bedside nurse is best able to ensure adherence to DVT prophylaxis, with good agreement among all clinician groups.
DISCUSSION
Our survey identified several perceived barriers to optimizing DVT prophylaxis, consistent with those reported in the White Paper sponsored by the American Public Health Association.23 While no single barrier outlined in our survey was dominant, 2 novel barriers were identified: misperception of DVT prophylaxis underutilization, and confusion about roles and responsibilities in the area of DVT prophylaxis. Attention to these barriers may be helpful in developing an intervention aimed at bridging the gap between evidence and practice.
While our survey demonstrates agreement across clinician groups on the importance, efficacy, and safety of DVT prophylaxis, the discordant perceptions about whether DVT prophylaxis is utilized appropriately are an important concern. Physician and pharmacist respondents demonstrated awareness that thromboprophylaxis is underutilized in medical patients. However, despite overwhelming published evidence to the contrary, nurses responding to our survey did not tend to recognize the problem of DVT prophylaxis underutilization in hospitalized medical patients. This knowledge deficit may be a significant barrier, particularly since the pooled group of respondents indicated that nurses are among those caregivers best able to conduct a daily assessment of patients' need for DVT prophylaxis. A possible explanation for the relative lack of awareness of DVT prophylaxis underutilization among nurses and physiotherapists is ward‐specific healthcare priorities. Nursing and physiotherapy care on surgical wards is aimed at preventing postoperative complications, including DVT. On medical wards, however, the primary focus of this care is the management of acute medical problems, and prevention of hospital‐related complications, such as DVT, is often a secondary focus. Therefore, ensuring that all clinician groups are educated about the problem of DVT prophylaxis underutilization is necessary to drive quality improvement. A physician‐based survey on antithrombotic therapies demonstrated a similar need for education on guideline recommendations.20
A second important barrier identified in our survey is that both bedside nurses and attending physicians feel that daily assessment of a patient's need for DVT prophylaxis is their responsibility. Confusion about roles and responsibilities in this area of patient care was reported by Cook et al., who identified that multidisciplinary care was perceived as a barrier to effective VTE prevention.18 Uncertainty as to which group should take ownership of DVT prophylaxis can lead to a diffusion of responsibility, a lack of accountability, and a gap in care. Whether DVT risk assessment is a nursing or a physician role could be resolved through increased interdisciplinary communication and provision of clear definitions of roles to hospital staff.
Survey respondents felt that preprinted orders and pharmacist reminders to physicians were potentially successful and feasible strategies to optimize DVT prophylaxis. These components could be part of a simple tool to initiate prophylaxis. While electronic alerts have been shown to increase prophylaxis rates,24 we suspect that many respondents did not view these as highly important because of limited use of computerized order entry at their facilities. Interestingly, survey respondents did not perceive audit‐and‐feedback systems or local opinion leaders as potentially successful, though previous studies have demonstrated that they can change clinician behavior.25, 26 Respondents may not be aware of the strength of technology‐based interventions (eg, electronic orders), of the role of opinion leaders, or of the evidence in support of such interventions.24, 26 A systematic review of studies to improve DVT prophylaxis in hospitals reported that a combination of multiple active strategies is most effective, particularly those that link physician reminders with audit‐and‐feedback.27 For example, in the BEHAVE study, a multicomponent intervention consisting of interactive educational sessions, verbal and computerized prompts, and individual performance feedback significantly improved adherence to DVT prophylaxis guidelines in critically ill patients.28 Whether a similar intervention could improve adherence to DVT prophylaxis guidelines in hospitalized medical patients merits further study. Any intervention must be paired with better education about which patients should, and should not, receive prophylaxis, as this may address many reported barriers in our survey (including concerns about bleeding).
Respondents' uncertainty about these issues is not surprising, as studies of DVT prophylaxis in medical patients are not plentiful.2 However, recent guidelines do identify subgroups of medically ill patients in whom DVT prophylaxis is indicated.2 A clear and simple DVT risk assessment algorithm that identifies medical patients in whom DVT prophylaxis should (or should not) be administered may help to overcome respondents' concerns.
A limitation of our survey is the overall response rate of 36.3%, largely driven by the considerable number of nonresponding pharmacists (n = 760, reflecting 49% of the entire sample). However, the majority of the pharmacists were likely not hospital‐based and were thus not a target of this study, so their low response rate is not surprising. After excluding pharmacists, the response rate was 58.3% (321/551), which is consistent with response rates of other large‐sample surveys.29 The lower response rate for physicians and pharmacists may also reflect web‐based survey dissemination which, despite its feasibility, has lower response rates than paper‐based dissemination.30–32 While the sample of physicians was relatively small compared to the other respondent groups surveyed, we aimed to identify barriers to actually implementing VTE prophylaxis, not just ordering prophylaxis, which is a multidisciplinary concern.
Although this survey was based on Canadian healthcare providers' perspectives, we believe the results are generalizable since both US and Canadian‐based studies have found that VTE prophylaxis is underutilized among hospitalized medical patients.4, 6 Furthermore, the American College of Chest Physicians (ACCP) guidelines on VTE prophylaxis, which are well‐recognized in both the United States and Canada, were developed with input from Canadian and American content experts.2 And while the US and Canadian healthcare systems are organized differently, at the patient‐care level, the roles of healthcare professionals are very similar. The generalizability of our findings is, however, limited by the institutional characteristics of respondents. We do not purport that the responses of any of the 4 clinician groups are generalizable to those groups as a whole. Although we surveyed clinicians in teaching and nonteaching, urban and rural practices, perceptions about DVT prophylaxis may be influenced by other factors, including the availability of local preprinted orders, electronic medical records, and quality improvement programs. Another potential limitation is that we did not assess all possible strategies to improve DVT prophylaxis, such as nurse practitioners and computerized decision support systems. These were purposely excluded, as they are not financially feasible in all centers, and thus not generalizable. Finally, like all self‐administered surveys, our findings reflect respondents' perceptions rather than objective observations about practice.
In conclusion, we identified novel and important barriers to optimal DVT prophylaxis utilization and potential interventions to address this important safety concern in hospitalized medical patients. To overcome some of these barriers, we propose that delivery of a top‐down, evidence‐based intervention be preceded by an educational intervention that increases healthcare providers' knowledge of the safety of DVT prophylaxis, of system and team‐based approaches, and of which interventions are most likely to succeed, so as to encourage greater compliance with the intervention. A top‐down, system‐wide approach, involving the entire healthcare team and hospital administrators, can help drive this communication. As DVT prophylaxis becomes an increasingly important component in hospital accreditation, such solutions become appealing to facilitate change in practices. Results of this survey may inform future knowledge translation interventions by eliminating perceived barriers to DVT prophylaxis and by incorporating strategies that are perceived by healthcare professionals to be successful, feasible, and supported by evidence.
Each year in North America, over 7 million adults are hospitalized with a medical illness.1 Acute illness and decreased mobility in hospital places patients at increased risk for venous thromboembolism (VTE), which includes deep vein thrombosis (DVT) and life‐threatening pulmonary embolism (PE).2 Since VTE remains the most preventable cause of death in hospitalized patients, numerous studies have aimed at reducing the incidence of hospital‐acquired DVT. Aside from cost, the impact of VTE to the healthcare system is felt not only by those who diagnose and treat VTE, but also by those responsible for correcting the severe bleeding that can result from inappropriate use of thromboprophylaxis. Approximately 60% of symptomatic VTE occurs in medical patients, and recent hospitalization for medical illness accounts for 25% of all community‐diagnosed VTE. The Agency for Health Research and Quality ranks DVT prevention as the top priority out of 79 patient safety initiatives, and expert consensus groups provide a strong recommendation that DVT prophylaxis with a low‐dose anticoagulant should be administered to at‐risk hospitalized medical patients.2, 3
Despite the availability, efficacy, and safety of DVT prophylaxis,2 it is discouraging that only 21% to 62% of medical patients receive prophylaxis,49 and only 16% to 40% receive appropriate prophylaxis.46, 1012 However, 70% to 90% of patients in other at‐risk groups, such as surgical patients or critically ill patients, receive prophylaxis.1316 The reason why DVT prophylaxis is so underutilized in medical patients is unclear, as explanations for low rates of clinical practice guideline utilization are multifaceted,17 and few studies have investigated the barriers to optimal thromboprophylaxis.1820
To explore possible reasons for this disparity between evidence and practice, we conducted a cross‐sectional survey of 4 clinician groups involved in the care of hospitalized medical patients. Our objective was to identify barriers and potential solutions to the underutilization of DVT prophylaxis in hospitalized medical patients.
METHODS
Instrument Development
The survey focused on 3 domains: perceived importance, effectiveness, and safety of DVT prophylaxis; perceived barriers to implementation; and perceived potential success and feasibility of interventions to optimize DVT prophylaxis. The survey cover letter outlined background information, study design, and a statement on confidentiality. A prior survey of DVT prophylaxis administered to thrombosis experts was used to generate survey questions.21
Only survey respondents who answered yes to the first question, Are you involved in any aspect of the care of hospitalized general medical patients for whom DVT prophylaxis is considered? were asked to complete the remaining sections. Subsequent questions required respondents to check the box on a 7‐point Likert‐type scale that most accurately reflected their perception (Table 1). A successful intervention was defined as one that, if implemented, would yield the anticipated effect and a feasible intervention as one that was easy to implement without major logistical burden. Respondents were also asked which clinician group was best able to provide a daily assessment of patients' need for DVT prophylaxis, ensure DVT prophylaxis is prescribed, and ensure adherence.
|
Section 1: Perceptions regarding DVT prophylaxis in hospitalized medical patients* |
1. How important an issue is the prevention of DVT in hospitalized general medical patients? |
2. To your knowledge, how effective are currently used anticoagulant strategies for the prevention of DVT in hospitalized medical patients? |
3. How safe are currently used anticoagulant strategies for the prevention of DVT in hospitalized medical patients? |
4. Current anticoagulant prophylaxis strategies are: 1 = underutilized, 4 = appropriately utilized, 7 = overutilized. |
Section 2: Perceptions regarding barriers to the optimal use of DVT prophylaxis |
1. Lack of time to consider DVT prophylaxis in every patient |
2. Lack of clear indications for DVT prophylaxis (ie, who should get prophylaxis) |
3. Lack of clear contraindications for DVT prophylaxis (ie, who should not get prophylaxis) |
4. Lack of awareness about effectiveness of DVT prophylaxis |
5. Lack of physician agreement with current DVT prophylaxis guidelines |
6. Patient discomfort from subcutaneous injections of anticoagulants |
7. Clinician concerns about increased bleeding risk from anticoagulant administration |
Section 3: Perceptions of interventions relating to DVT prophylaxis |
1. Yearly multidisciplinary educational meetings: to engage a wide spectrum of healthcare professionals to review DVT prophylaxis in hospitalized medical patients |
2. Posters on the wards: to remind healthcare professionals about DVT prophylaxis and patients who are eligible or ineligible for this treatment |
3. Laminated pocket cards: to remind healthcare professionals about DVT prophylaxis and patients who are eligible and ineligible for this treatment |
4. Preprinted order sheets: to remind healthcare professionals about DVT prophylaxis and patients who are eligible and ineligible for this treatment |
5. Periodic audit and feedback to healthcare providers: E‐mails to physicians containing reports on compliance with DVT prevention practice guidelines over recent years |
6. Computerized reminders (to the physicians): to prompt the physician to consider DVT prophylaxis upon opening a patient's electronic medical record |
7. Nurse reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
8. Pharmacist reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
9. Physiotherapist reminders (to the physician): to remind the physician about DVT prophylaxis using written or verbal reminders |
10. Use of a local opinion leader (within the hospital) to promote evidence‐based use of DVT prophylaxis guidelines: to educate healthcare professionals on best practices for DVT prophylaxis |
Survey Administration
The survey was distributed between April and July 2007 in both paper‐based and web‐based formats using Survey Monkey software. Ontario members of the Canadian Society of Internal Medicine (n = 193) received a direct electronic invitation (from N.S.L., on behalf of J.D.D.) to participate, while members of the Canadian Society of Hospital Pharmacists (CSHP) (n = 1002) received an electronic invitation from an administrator for the CSHP to participate. The CSHP could not ensure that all members receiving the survey were hospital‐based pharmacists, so it was expected that the response rate from this group would be low. Nurse and physiotherapy managers at a convenience sample of 8 hospitals in Ontario, Canada, distributed paper‐based surveys to their staff using stamped, preaddressed envelopes. Nonresponders in all groups were sent reminders at 2 and 4 weeks.22 Data from all completed surveys were entered into an electronic database by a research coordinator (N.S.L.). A research assistant entered paper‐based survey data in duplicate, with discrepancies resolved by consensus and mediation by a third person (J.C.). The study was conducted with Institutional Ethics Review Board approval, and all respondents provided informed consent to participate. All responses were anonymous and confidential.
Statistical Considerations
Given the exploratory nature of this survey, there was no prespecified hypothesis‐driven respondent sample size. Proportions were used to describe response rates. Survey responses scored on the 7‐point Likert‐type scale were expressed as a mean and 95% confidence interval (CI). Important, highly potentially successful, and highly potentially feasible barriers were defined as those with a mean 5 points. Questions without responses, questions with multiple responses, and questions with illegible responses were treated as missing values. All statistical analyses were done using SAS version 9 (Cary, NC).
RESULTS
Survey Responses
The overall response rate was 36.3% (563/1553), with 65.5% (211/322) of nurses, 40.4% (78/193) of physicians, 24.1% (242/1002) of pharmacists, and 88.8% (32/36) of physiotherapists completing surveys. When pharmacists were removed from the response rate calculation (since it was expected that many of those receiving the survey were not in a primarily hospital‐based practice), the overall response rate rose to 58.3% (321/551). Excluded were 9.2% (52/563) of returned surveys, as respondents indicated the topic was not relevant to their practice. Five hundred eleven surveys were included in the final analysis (Figure 1).

Importance, Effectiveness, Safety, and Appropriateness of DVT Prophylaxis Strategies
DVT prophylaxis was perceived across clinician groups as important (mean score 6.4; 95% CI 6.3 to 6.5), safe (mean 5.5; 95% CI 5.4 to 5.6), and effective (mean 5.5; 95% CI 5.4 to 6.6) (Figure 2). The mean score for the appropriateness of current DVT prophylaxis practices was 3.5 (95% CI 3.4 to 3.7), suggesting an overall perception of underutilization. However, by respondent groups, DVT prophylaxis was considered to be underutilized by physicians (mean 2.5; 95% CI 2.3 to 2.7) and pharmacists (mean 3.1; 95% CI 2.9 to 3.2), while nurses (mean 4.3; 95% CI 4.2 to 4.5) and physiotherapists (mean 3.8; 95% CI, 3.4 to 4.2) tended to consider current strategies as appropriate.

Potential Barriers to DVT Prophylaxis Utilization
Figure 3 demonstrates that no single barrier to DVT prophylaxis utilization was dominant and no barriers were considered very important. Perceived barriers carrying comparable weight were: concerns about bleeding (mean 4.8; 95% CI 4.6 to 4.9); lack of clear indications (mean 4.6; 95% CI 4.5 to 4.8) and contraindications to DVT prophylaxis (mean 4.4; 95% CI 4.3 to 4.6); lack of awareness about effectiveness of DVT prophylaxis (mean 4.5; 95% CI 4.3 to 4.7); and lack of time to consider DVT prophylaxis in every patient (mean 4.4; 95% CI 4.3 to 4.6). Patient discomfort from subcutaneous injections was perceived as the least important barrier (mean 3.8; 95% CI 3.6 to 4.0). Physicians perceived lack of awareness about the effectiveness of DVT prophylaxis as the most important barrier (mean 4.0; 95% CI 3.5 to 4.4), whereas concern about bleeding was dominant among non‐physicians (nurses' mean 5.2; 95% CI 5.0 to 5.5; pharmacists' mean 4.7; 95% CI 4.5 to 4.9; physiotherapists' mean 4.6; 95% CI 3.9 to 5.3).

Potential Success and Feasibility of Interventions to Optimize DVT Prophylaxis Utilization
Interventions considered across clinician groups as highly potentially successful were: preprinted order sheets (5.7; 95% CI 5.6 to 5.8); pharmacist reminders to physicians (mean 5.3; 95% CI 5.1 to 5.4); computerized reminders to physicians (mean 5.0; 95% CI 4.9 to 5.2); and use of a local opinion leader (mean 5.0; 95% CI 4.9 to 5.2). Interventions considered highly potentially feasible were: posters (mean 5.7; CI 5.6 to 5.8); preprinted order sheets (mean 5.5; 95% CI 5.4 to 5.7); laminated pocket cards (mean 5.4; 95% CI 5.2 to 5.5); multidisciplinary educational meetings (mean 5.0; 95% CI 4.9 to 5.2); and pharmacist reminders to physicians (mean 5.0; 95% CI 4.9 to 5.1). Preprinted orders and pharmacist reminders were perceived by all clinician groups as having both high potential success and feasibility (Figure 4).

Perceptions on Which Clinician Group Is Best Able to Assess and Implement DVT Prophylaxis
Respondents were divided between considering the attending physician and the bedside nurse as best able to perform a daily assessment of patients' need for DVT prophylaxis (43.4% [204/470] vs 44.0% [207/470], respectively). Respondents from these groups each predominantly thought this responsibility was theirs, with 68.1% (49/72) of physicians and 61.5% (123/200) of nurses perceiving this as their responsibility (Figure 5).

Forty‐one percent (193/471) of respondents perceived the attending physician as best able to ensure that DVT prophylaxis is ordered, while 31.2% (147/471) identified the pharmacist and 23.3% (110/471) identified the bedside nurse as best suited to this role. Among pharmacists, 66.3% (114/172) perceived that the attending pharmacist is best able to perform this task. Among respondents, 61.9% (296/478) felt the bedside nurse is best able to ensure adherence to DVT prophylaxis, with good agreement among all clinician groups.
DISCUSSION
Our survey identified several perceived barriers to optimizing DVT prophylaxis, consistent with those reported in the White Paper sponsored by the American Public Health Association.23 While no single barrier outlined in our survey was dominant, 2 novel barriers were identified: misperception of DVT prophylaxis underutilization, and confusion about roles and responsibilities in the area of DVT prophylaxis. Attention to these barriers may be helpful in developing an intervention aimed at bridging the gap between evidence and practice.
While our survey demonstrates agreement across clinician groups on the importance, efficacy, and safety of DVT prophylaxis, the discordant perceptions about whether DVT prophylaxis is utilized appropriately are an important concern. Physician and pharmacist respondents demonstrated awareness that thromboprophylaxis is underutilized in medical patients. However, despite overwhelming published evidence to the contrary, nurses responding to our survey did not tend to recognize the problem of DVT prophylaxis underutilization in hospitalized medical patients. This knowledge deficit may be a significant barrier, particularly since the pooled group of respondents indicated that nurses are among those caregivers best able to conduct a daily assessment of patients' need for DVT prophylaxis. A possible explanation for the finding that nurses and physiotherapists demonstrated a relative lack of awareness of the problem of DVT prophylaxis underutilization is ward‐specific healthcare priorities. Nursing and physiotherapy care on surgical wards is aimed at preventing postoperative complications, including DVT. On medical wards, however, its primary focus is the management of acute medical problems, and prevention of hospital‐related complications, such as DVT, is often a secondary focus. Therefore, ensuring that all clinician groups are educated about the problem of DVT prophylaxis underutilization is necessary to drive quality improvement. A physician‐based survey on antithrombotic therapies demonstrated a similar need for education on guideline recommendations.20
A second important barrier identified in our survey is that both bedside nurses and attending physicians feel that daily assessment of a patient's need for DVT prophylaxis is their responsibility. Confusion about roles and responsibilities in this area of patient care was reported by Cook et al., who identified that multidisciplinary care was perceived as a barrier to effective VTE prevention.18 Uncertainty as to which group should take ownership of DVT prophylaxis can lead to a diffusion of responsibility, a lack of accountability, and a gap in care. The question of whether DVT risk assessment is a nursing or a physician role could be resolved through increased interdisciplinary communication and provision of clear definitions of roles to hospital staff.
Survey respondents felt that preprinted orders and pharmacist reminders to physicians were potentially successful and feasible strategies to optimize DVT prophylaxis. These components could be part of a simple tool to initiate prophylaxis. While electronic alerts have been shown to increase prophylaxis rates,24 we suspect that many respondents did not view these as highly important because of limited use of computerized order entry at their facilities. Interestingly, survey respondents did not perceive audit‐and‐feedback systems or local opinion leaders as potentially successful, though previous studies have demonstrated that they can change clinician behavior.25, 26 This may be because respondents are not aware of the strength of technology‐based interventions (eg, electronic orders), the role of opinion leaders, and the evidence in support of such interventions.24, 26 A systematic review of studies to improve DVT prophylaxis in hospitals reported that a combination of multiple active strategies is most effective, particularly those that link physician reminders with audit‐and‐feedback.27 For example, in the BEHAVE study, a multicomponent intervention consisting of interactive educational sessions, verbal and computerized prompts, and individual performance feedback significantly improved adherence to DVT prophylaxis guidelines in critically ill patients.28 Whether a similar intervention could improve adherence to DVT prophylaxis guidelines in hospitalized medical patients merits further study. Any intervention must be paired with better education about which patients should, and should not, receive prophylaxis, as this may address many reported barriers in our survey (including concerns about bleeding).
Respondents' uncertainty about these issues is not surprising, as studies of DVT prophylaxis in medical patients are not plentiful.2 However, recent guidelines do identify subgroups of medically ill patients in whom DVT prophylaxis is indicated.2 A clear and simple DVT risk assessment algorithm that identifies medical patients in whom DVT prophylaxis should (or should not) be administered may help to overcome respondents' concerns.
A limitation of our survey is the overall response rate of 36.3%, largely driven by the considerable number of nonresponding pharmacists (n = 760, reflecting 49% of the entire sample). However, the majority of the pharmacists were likely not hospital‐based, were thus not a target of this study, and their low response rate is not surprising. After excluding pharmacists, the response rate was 58.3% (321/551), which is consistent with response rates of other large‐sample surveys.29 The lower response rate for physicians and pharmacists may also reflect web‐based survey dissemination which, despite its feasibility, has lower response rates than paper‐based dissemination.30–32 While the sample of physicians was relatively small compared to the other respondent groups surveyed, we aimed to identify barriers to actually implementing VTE prophylaxis, not just ordering prophylaxis, which is a multidisciplinary concern.
Although this survey was based on Canadian healthcare providers' perspectives, we believe the results are generalizable since both US and Canadian‐based studies have found that VTE prophylaxis is underutilized among hospitalized medical patients.4, 6 Furthermore, the American College of Chest Physicians (ACCP) guidelines on VTE prophylaxis, which are well‐recognized in both the United States and Canada, were developed with input from Canadian and American content experts.2 And while the US and Canadian healthcare systems are organized differently, at the patient‐care level, the roles of healthcare professionals are very similar. The generalizability of our findings is, however, limited by the institutional characteristics of respondents. We do not purport that the responses of any of the 4 clinician groups are generalizable to those groups as a whole. Although we surveyed clinicians in teaching and nonteaching, urban and rural practices, perceptions about DVT prophylaxis may be influenced by other factors, including the availability of local preprinted orders, electronic medical records, and quality improvement programs. Another potential limitation is that we did not assess all possible strategies to improve DVT prophylaxis, such as nurse practitioners and computerized decision support systems. These were purposely excluded, as they are not financially feasible in all centers, and thus not generalizable. Finally, like all self‐administered surveys, our findings reflect respondents' perceptions rather than objective observations about practice.
In conclusion, we identified novel and important barriers to optimal DVT prophylaxis utilization and potential interventions to address this important safety concern in hospitalized medical patients. To overcome some of these barriers, we propose an educational intervention prior to delivery of a top‐down, evidence‐based intervention to first increase healthcare providers' knowledge of the safety of DVT prophylaxis, system and team‐based approaches, and which interventions are most likely to be successful so as to encourage greater compliance with the intervention. A top‐down, system‐wide approach, involving the entire healthcare team and hospital administrators, can help drive this communication. As DVT prophylaxis becomes an increasingly important component in hospital accreditation, such solutions become appealing to facilitate change in practices. Results of this survey may inform future knowledge translation interventions by eliminating perceived barriers to DVT prophylaxis and by incorporating strategies that are perceived by healthcare professionals to be successful, feasible, and supported by evidence.
- National hospital discharge survey: annual summary, 1996. Vital Health Stat. 1999;13:1–46.
- Prevention of venous thromboembolism: American College of Chest Physicians Evidence‐Based Clinical Practice Guidelines (8th ed). Chest. 2008;133:381S–443S.
- Making health care safer: a critical analysis of patient safety practices. Evidence Report/Technology Assessment: No. 43. AHRQ Publication No. 01‐E058, July 2001. Rockville, MD: Agency for Healthcare Research and Quality. Available at: http://www.ahrq.gov/clinic/ptsafety/. Accessed October 9, 2007.
- Multicenter evaluation of the use of venous thromboembolism prophylaxis in acutely ill medical patients in Canada. Thromb Res. 2007;119:145–155.
- Hospitals' compliance with prophylaxis guidelines for venous thromboembolism. Am J Health Syst Pharm. 2007;64:69–76.
- Thromboprophylaxis rates in US medical centers: success or failure? J Thromb Haemost. 2007;5:1610–1616.
- A retrospective evaluation of adherence to guidelines for prevention of thromboembolic events in general medical inpatients. Can J Hosp Pharm. 2006;59:258–263.
- Venous thromboembolism prophylaxis in acutely ill hospitalized medical patients: findings from the International Medical Prevention Registry on Venous Thromboembolism. Chest. 2007;132:936–945.
- Venous thromboembolism prophylaxis in medical inpatients: a retrospective chart review. Thromb Res. 2003;111:215–219.
- Missed opportunities for prevention of venous thromboembolism: an evaluation of the use of DVT prophylaxis guidelines. Chest. 2001;120:1964–1971.
- Thrombosis prophylaxis in medical patients: a retrospective review of clinical practice patterns. Haematologica. 2002;87:746–750.
- Venous thromboembolism risk and prophylaxis in the acute hospital care setting (ENDORSE study): a multinational cross‐sectional study. Lancet. 2008;371:387–394.
- The use of low molecular weight heparins for the prevention of postoperative venous thromboembolism in general surgery. A survey of practice in the United States. Int Angiol. 2002;1:78–85.
- Venous thromboembolic disease management patterns in total hip arthroplasty and total knee arthroplasty patients: a survey of the AAHKS membership. J Arthroplasty. 2001;6:679–688.
- Thromboprophylaxis in medical‐surgical intensive care unit patients. J Crit Care. 2005;20:320–323.
- Utilization of venous thromboembolism prophylaxis in a medical‐surgical ICU. Chest. 1998;113:162–164.
- Why don't physicians follow clinical practice guidelines? A framework for improvement. JAMA. 1999;282:1458–1465.
- Thromboprophylaxis for hospitalized medical patients: a multicenter qualitative study. J Hosp Med. 2009;4:269–275.
- Definition of immobility in studies of thromboprophylaxis in hospitalized medical patients: a systematic review. J Vasc Nurs. 2010;28:54–66.
- The use of antithrombotic therapies in the prevention and treatment of arterial and venous thrombosis: a survey of current knowledge and practice supporting the need for clinical education. Crit Pathw Cardiol. 2010;9:41–48.
- Antithrombotic and thrombolytic therapy: from evidence to application: the Seventh ACCP Conference on Antithrombotic and Thrombolytic Therapy. Chest. 2004;126:688S–696S.
- Mail and internet surveys: the tailored design method. New York, NY: John Wiley; 2000.
- Deep‐vein thrombosis: advancing awareness to protect patient lives. Public Health Leadership Conference on Deep‐Vein Thrombosis. American Public Health Association. Available at: http://www.apha.org/NR/rdonlyres/A209F84A‐7C0E‐4761–9ECF‐61D22E1E11F7/0/DVT_White_Paper.pdf. Accessed May 28, 2008.
- Electronic alerts to prevent venous thromboembolism among hospitalized patients. N Engl J Med. 2005;352:969–977.
- Getting a validated guideline into local practice: implementation and audit of the SIGN guideline on the prevention of deep vein thrombosis in a district general hospital. Scott Med J. 1998;43:23–25.
- Local opinion leaders: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2007;24(1):CD000125.
- A systematic review of strategies to improve prophylaxis for venous thromboembolism in hospitals. Ann Surg. 2005;241:397–415.
- Minimizing errors of omission: behavioural reinforcement of heparin to avert venous emboli: the BEHAVE study. Crit Care Med. 2006;34:694–699.
- Using the Internet to conduct surveys of health professionals: a valid alternative? Fam Pract. 2003;20:545–551.
- Use of new technology in endourology and laparoscopy by American urologists: Internet and postal survey. Urology. 2000;56:760–765.
- E‐mail versus conventional postal mail survey of geriatric chiefs. Gerontologist. 2001;41:799–804.
- Internet versus mailed questionnaires: a randomized comparison. J Med Internet Res. 2004;6:e30.
Copyright © 2011 Society of Hospital Medicine
Accuracy of GoogleTranslate™
The population of patients in the US with limited English proficiency (LEP), those who speak English less than "very well,"1 is substantial and continues to grow.1, 2 Patients with LEP are at risk for lower quality health care overall than their English‐speaking counterparts.3–8 Professional in‐person interpreters greatly improve spoken communication and quality of care for these patients,4, 9 but their assistance is typically based on the clinical encounter. Particularly if interpreting by phone, interpreters are unlikely to be able to help with materials such as discharge instructions or information sheets meant for family members. Professional written translations of patient educational material help to bridge this gap, allowing clinicians to convey detailed written instructions to patients. However, professional translations must be prepared well in advance of any encounter and can only be used for easily anticipated problems.
The need to translate less common, patient‐specific instructions arises spontaneously in clinical practice, and formally prepared written translations are not useful in these situations. Online translation tools such as GoogleTranslate (available at
We conducted a pilot evaluation of an online translation tool as it relates to detailed, complex patient educational material. Our primary goal was to compare the accuracy of a Spanish translation generated by the online tool to that done by a professional agency. Our secondary goals were: 1) to assess whether sentence word length or complexity mediated the accuracy of GT; and 2) to lay the foundation for a more comprehensive study of the accuracy of online translation tools, with respect to patient educational material.
Methods
Translation Tool and Language Choice
We selected Google Translate (GT) since it is one of the more commonly used online translation tools and because Google is the most widely used search engine in the United States.13 GT uses statistical translation methodology to convert text, documents, and websites between languages; statistical translation involves the following three steps. First, the translation program recognizes a sentence to translate. Second, it compares the words and phrases within that sentence to the billions of words in its library (drawn from bilingual professionally translated documents, such as United Nations proceedings). Third, it uses this comparison to generate a translation combining the words and phrases deemed most equivalent between the source sentence and the target language. If there are multiple sentences, the program recognizes and translates each independently. As the body of bilingual work grows, the program learns and refines its rules automatically.14 In contrast, in rule‐based translation, a program would use manually prespecified rules regarding word choice and grammar to generate a translation.15 We assessed GT's accuracy translating from English to Spanish because Spanish is the predominant non‐English language spoken in the US.1
Document Selection and Preparation
We selected the instruction manual regarding warfarin use prepared by the Agency for Healthcare Research and Quality (AHRQ) for this accuracy evaluation. We selected this manual,16 written at a 6th grade reading level, because a professional Spanish translation was available (completed by ASET International Service, LLC, before and independently of this study), and because patient educational material regarding warfarin has been associated with fewer bleeding events.17 We downloaded the English document on October 19, 2009 and used the GT website to translate it en bloc. We then copied the resulting Spanish output into a text file. The English document and the professional Spanish translation (downloaded the same day) were both converted into text files in the same manner.
Grading Methodology
We scored the translation chosen using both manual and automated evaluation techniques. These techniques are widely used in the machine translation literature and are explained below.
Manual Evaluation: Evaluators, Domains, Scoring
We recruited three nonclinician, bilingual, native Spanish‐speaking research assistants as evaluators. The evaluators were all college educated with a Bachelor's degree or higher and were of Mexican, Nicaraguan, and Guatemalan ancestry. Each evaluator received a brief orientation regarding the project, as well as an explanation of the scores, and then proceeded to the blinded evaluation independently.
We asked evaluators to score sentences on Likert scales along five primary domains: fluency, adequacy, meaning, severity, and preference. Fluency and adequacy are well accepted components of machine translation evaluation,18 with fluency being an assessment of grammar and readability ranging from 5 ("Perfect fluency; like reading a newspaper") to 1 ("No fluency; no appreciable grammar, not understandable") and adequacy being an assessment of information preservation ranging from 5 ("100% of information conveyed from the original") to 1 ("0% of information conveyed from the original"). Given that a sentence can be highly adequate but drastically change the connotation and intent of the sentence (eg, a sentence that contains 75% of the correct words but changes a sentence from "take this medication twice a day" to "take this medication once every two days"), we asked evaluators to assess meaning, a measure of connotation and intent maintenance, with scores ranging from 5 ("Same meaning as original") to 1 ("Totally different meaning from the original").19 Evaluators also assessed severity, a new measure of potential harm if a given sentence was assessed as having errors of any kind, ranging from 5 ("Error, no effect on patient care") to 1 ("Error, dangerous to patient") with an additional option of N/A ("Sentence basically accurate"). Finally, evaluators rated a blinded preference (also a new measure) for either of two translated sentences, ranging from "Strongly prefer translation #1" to "Strongly prefer translation #2." The order of the sentences was random (eg, sometimes the professional translation was first and sometimes the GT translation was). We subsequently converted this to preference for the professional translation, ranging from 5 ("Strongly prefer the professional translation") to 1 ("Strongly prefer the GT translation") in order to standardize the responses (Figures 1 and 2).
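The conversion of raw preference ratings to a standardized scale can be illustrated with a minimal sketch. The numeric coding of the raw ratings (1 = strongly prefer translation #1, 3 = no preference, 5 = strongly prefer translation #2) and the function name are assumptions made for illustration; the study does not report how the raw responses were coded.

```python
def to_professional_preference(raw_score, professional_is_first):
    """Map a raw 1-5 rating onto the 'preference for professional' scale.

    Assumed raw coding (hypothetical): 1 = strongly prefer translation #1,
    3 = no preference, 5 = strongly prefer translation #2.
    Output scale (from the text): 5 = strongly prefer the professional
    translation, 1 = strongly prefer the GT translation.
    """
    if raw_score not in (1, 2, 3, 4, 5):
        raise ValueError("raw_score must be an integer from 1 to 5")
    # When the professional translation was shown first, a low raw score
    # favors it, so the scale is flipped; otherwise it is kept as-is.
    return 6 - raw_score if professional_is_first else raw_score


# A rater strongly preferring translation #1, which was the professional one:
print(to_professional_preference(1, professional_is_first=True))
```

Under this coding, a rating of 3 maps to 3 regardless of presentation order, which matches the text's use of 3 as the "no preference" value.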


The overall flow of the study is given in Figure 3. Each evaluator initially scored 20 sentences translated by GT and 10 sentences translated professionally along the first four domains. All 30 of these sentences were randomly selected from the original, 263‐sentence pamphlet. For fluency, evaluators had access only to the translated sentence to be scored; for adequacy, meaning, and severity, they had access to both the translated sentence and the original English sentence. Ten of the 30 sentences were further selected randomly for scoring on the preference domain. For these 10 sentences, evaluators compared the GT and professional translations of the same sentence (with the original English sentence available as a reference) and indicated a preference, for any reason, for one translation or the other. Evaluators were blinded to the technique of translation (GT or professional) for all scored sentences and domains. We chose twice as many sentences from the GT preparations for the first four domains to maximize measurements for the translation technology we were evaluating, with the smaller number of professional translations serving as controls.

After scoring the first 30 sentences, evaluators met with one of the authors (R.R.K.) to discuss and consolidate their approach to scoring. They then scored an additional 10 GT‐translated sentences and 5 professionally translated sentences for the first four domains, and 9 of these 15 sentences for preference, to see if the meeting changed their scoring approach. These sentences were selected randomly from the original, 263‐sentence pamphlet, excluding the 30 evaluated in the previous step.
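The two rounds of random sentence selection described above can be sketched as follows. This is an illustration only: the seed, the function name, and the use of sentence index numbers are assumptions, and the study does not describe the software or procedure used for randomization.

```python
import random

TOTAL_SENTENCES = 263  # size of the pamphlet, as stated in the text


def draw_round(available, n_gt, n_prof, n_pref, rng):
    """Draw one scoring round, without replacement, from `available` indices.

    Returns the sentences scored as GT translations, those scored as
    professional translations, and the subset also scored for preference.
    """
    chosen = rng.sample(sorted(available), n_gt + n_prof)
    gt, prof = chosen[:n_gt], chosen[n_gt:]
    pref = rng.sample(chosen, n_pref)
    return gt, prof, pref


rng = random.Random(2009)  # arbitrary seed for reproducibility (not from the study)
pool = set(range(1, TOTAL_SENTENCES + 1))

# Round 1: 20 GT + 10 professional sentences, 10 of them scored for preference.
gt1, prof1, pref1 = draw_round(pool, 20, 10, 10, rng)

# Round 2: exclude round 1, then 10 GT + 5 professional, 9 scored for preference.
pool -= set(gt1) | set(prof1)
gt2, prof2, pref2 = draw_round(pool, 10, 5, 9, rng)

print(len(gt1) + len(prof1) + len(gt2) + len(prof2), "sentences scored in total")
print(len(pref1) + len(pref2), "sentence pairs scored for preference")
```

The totals of 45 sentences and 19 preference pairs follow directly from the counts given in the text.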
Automated Machine Translation Evaluation
Machine translation researchers have developed automated measures allowing the rapid and inexpensive scoring and rescoring of translations. These automated measures supplement more time‐ and resource‐intensive manual evaluations. The automated measures are based upon how well the translation compares to one or, ideally, multiple professionally prepared reference translations. They correlate well with human judgments on the domains above, especially when multiple reference translations are used (increasing the number of reference translations increases the variability allowed for words and phrases in the machine translation, improving the likelihood that differences in score are related to differences in quality rather than differences in translator preference).20 For this study, we used Metric for Evaluation of Translation with Explicit Ordering (METEOR), a machine translation evaluation system that allows additional flexibility for the machine translation in terms of grading individual sentences and being sensitive to synonyms, word stemming, and word order.21 We obtained a METEOR score for each of the GT‐translated sentences using the professional translation as our reference, and assessed correlation between this automated measure and the manual evaluations for the GT sentences, with the aim of assessing the feasibility of using METEOR in future work on patient educational material translation.
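To give a sense of how a reference-based automated score works, the sketch below computes exact-match unigram precision, recall, and a recall-weighted harmonic mean for a candidate translation against one reference. It is a deliberately simplified stand-in, not the METEOR tool: it omits stemming, synonym matching, and the word-order penalty, and the recall weight `alpha` and the example sentences are illustrative assumptions.

```python
from collections import Counter


def unigram_scores(hypothesis, reference, alpha=0.9):
    """Return (precision, recall, weighted harmonic mean) of unigram overlap.

    Matching is exact and case-insensitive; repeated words are counted at
    most as often as they appear in the other sentence.
    """
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    matches = sum((Counter(hyp) & Counter(ref)).values())
    if matches == 0:
        return 0.0, 0.0, 0.0
    precision = matches / len(hyp)
    recall = matches / len(ref)
    # alpha close to 1 weights recall more heavily than precision.
    f_mean = precision * recall / (alpha * precision + (1 - alpha) * recall)
    return precision, recall, f_mean


# Hypothetical sentences, not taken from the study materials.
reference = "tome este medicamento dos veces al día"
candidate = "tome este medicamento una vez cada dos días"
p, r, f = unigram_scores(candidate, reference)
print(f"precision={p:.2f} recall={r:.2f} f_mean={f:.2f}")
```

The example also shows why word overlap alone cannot capture meaning: the candidate shares more than half of its words with the reference yet gives a different dosing instruction, which is the situation the manual "meaning" domain was designed to detect.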
Outcomes and Statistical Analysis
We compared the scores assigned to GT‐translated sentences for each of the five manually scored domains with the scores of the professionally translated sentences, and assessed the impact of word count and sentence complexity on the scores achieved specifically by the GT‐translated sentences, using clustered linear regression to account for the fact that each of the 45 sentences was scored by each of the three evaluators. Sentences were classified as simple if they contained one or fewer clauses and complex if they contained more than one clause.22 We also assessed interrater reliability for the manual scoring system using intraclass correlation coefficients and repeatability. Repeatability is an estimate of the maximum difference, with 95% confidence, between scores assigned to the same sentence on the same domain by two different evaluators;23 lower scores indicate greater agreement between evaluators. Since we did not have clinical data or a gold standard, we used repeatability to estimate the value above which a difference between two scores might be clinically significant and not simply due to interrater variability.24 Finally, we assessed the correlation of the manual scores with those calculated by the METEOR automated evaluation tool using Pearson correlation coefficients. All analyses were conducted using Stata 11 (College Station, TX).
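The two agreement measures can be illustrated with a short sketch. The text does not state which form of the intraclass correlation or which repeatability formula was used, so the choices here are assumptions: a one-way random-effects ICC, and a Bland-Altman style repeatability coefficient of 1.96 × √2 × the within-sentence standard deviation. The ratings are invented for illustration.

```python
import math


def one_way_mean_squares(scores):
    """Between- and within-sentence mean squares from a one-way ANOVA.

    `scores` is a list of rows, one per sentence, each holding one score
    per evaluator (same number of evaluators in every row).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    means = [sum(row) / k for row in scores]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, means) for x in row
    ) / (n * (k - 1))
    return ms_between, ms_within, k


def icc_one_way(scores):
    """One-way random-effects intraclass correlation (assumed ICC form)."""
    msb, msw, k = one_way_mean_squares(scores)
    return (msb - msw) / (msb + (k - 1) * msw)


def repeatability(scores):
    """Largest difference expected, with 95% confidence, between two raters."""
    _, msw, _ = one_way_mean_squares(scores)
    return 1.96 * math.sqrt(2) * math.sqrt(msw)


# Hypothetical fluency ratings: 5 sentences, 3 evaluators each.
ratings = [[5, 5, 4], [3, 3, 3], [4, 5, 4], [2, 3, 2], [5, 4, 5]]
print(f"ICC = {icc_one_way(ratings):.2f}, repeatability = {repeatability(ratings):.2f}")
```

Read this way, a repeatability value near 1 means two evaluators rarely differ by more than one scale point on the same sentence, while a value near 4 on a 5-point scale means their scores can fall at opposite ends.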
Results
Sentence Description
A total of 45 sentences were evaluated by the bilingual research assistants. The initial 30 sentences and the subsequent, post‐consolidation meeting 15 sentences were scored similarly in all outcomes, after adjustment for word length and complexity, so we pooled all 45 sentences (as well as the 19 total sentence pairs scored for preference) for the final analysis. Average sentence lengths were 14.2 words, 15.5 words, and 16.6 words for the English source text, professionally translated sentences, and GT‐translated sentences, respectively. Thirty‐three percent of the English source sentences were simple and 67% were complex.
Manual Evaluation Scores
Sentences translated by GT received worse scores on fluency as compared to the professional translations (3.4 vs 4.7, P < 0.0001). Comparisons for adequacy and meaning were not statistically significantly different. GT‐translated sentences contained more errors of any severity as compared to the professional translations (39% vs 22%, P = 0.05), but a similar number of serious, clinically impactful errors (severity scores of 3, 2, or 1; 4% vs 2%, P = 0.61). However, one GT‐translated sentence was considered erroneous with a severity level of 1 ("Error, dangerous to patient"). This particular sentence was 25 words long and complex in structure in the original English document; all three evaluators considered the GT translation nonsensical ("La hemorragia mayor, llame a su médico, o ir a la emergencia de un hospital habitación si usted tiene cualquiera de los siguientes: Red N, oscuro, café o cola de orina de color."). Evaluators had no overall preference for the professional translation (3.2, 95% confidence interval = 2.7 to 3.7, with 3 indicating no preference; P = 0.36) (Table 1).
| GoogleTranslate Translation | Professional Translation | P Value |
---|---|---|---|
Fluency* | 3.4 | 4.7 | <0.0001 |
Adequacy* | 4.5 | 4.8 | 0.19 |
Meaning* | 4.2 | 4.5 | 0.29 |
Severity | | | |
Any error | 39% | 22% | 0.05 |
Serious error | 4% | 2% | 0.61 |
Preference* | 3.2 | | 0.36 |
Mediation of Scores by Sentence Length or Complexity
We found that sentence length was not associated with scores for fluency, adequacy, meaning, severity, or preference (P > 0.30 in each case). Complexity, however, was significantly associated with preference: evaluators preferred the professional translation for complex English sentences while being more ambivalent about simple English sentences (3.6 vs 2.6, P = 0.03).
Interrater Reliability and Repeatability
We assessed the interrater reliability for each domain using intraclass correlation coefficients and repeatability. For fluency, the intraclass correlation was best at 0.70; for adequacy, it was 0.58; for meaning, 0.42; for severity, 0.48; and for preference, 0.37. The repeatability scores were 1.4 for fluency, 0.6 for adequacy, 2.2 for meaning, 1.2 for severity, and 3.8 for preference, indicating that two evaluators might give a sentence almost the same score (at most, 1 point apart from one another) for adequacy, but might have opposite preferences regarding which translation of a sentence was superior.
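For readers unfamiliar with the intraclass correlation, the following Python sketch shows one common variant, the one‐way random‐effects ICC for single ratings. Which ICC form was computed in Stata is not stated here, so the choice of variant, the function name, and the example data are all assumptions made for illustration.

```python
def icc_oneway(scores_by_sentence):
    """One-way random-effects ICC for single ratings (assumed variant).
    Each inner list holds the k evaluator scores for one sentence."""
    n = len(scores_by_sentence)
    k = len(scores_by_sentence[0])
    grand_mean = sum(sum(row) for row in scores_by_sentence) / (n * k)
    row_means = [sum(row) / k for row in scores_by_sentence]
    # Between-sentence and within-sentence mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (s - m) ** 2
        for row, m in zip(scores_by_sentence, row_means)
        for s in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical scores: 4 sentences, 3 evaluators each.
hypothetical_scores = [[5, 4, 5], [3, 3, 4], [2, 3, 2], [4, 4, 4]]
icc_value = icc_oneway(hypothetical_scores)
```

Values near 1 indicate that most of the variation in scores comes from differences between sentences rather than disagreement between evaluators.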
Correlation with METEOR
Correlations between the first four domains and the METEOR scores were lower than in prior studies.21 Fluency correlated best with METEOR at 0.53; adequacy correlated least with METEOR at 0.29. The remaining scores were in‐between. All correlations were statistically significant at P < 0.01 (Table 2).
Domain | Correlation with METEOR | P value |
---|---|---|
Fluency | 0.53 | <0.0001 |
Adequacy | 0.29 | 0.006 |
Meaning | 0.33 | 0.002 |
Severity | 0.39 | 0.002 |
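The Pearson correlation coefficients reported in Table 2 follow the standard formula, sketched below in Python for illustration. The function name and the example values are hypothetical and are not study data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical manual fluency scores and automated scores for five sentences.
manual_scores = [5, 4, 3, 2, 4]
automated_scores = [0.9, 0.7, 0.6, 0.3, 0.5]
r_value = pearson_r(manual_scores, automated_scores)
```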
Discussion
In this preliminary study comparing the accuracy of GT to professional translation for patient educational material, we found that GT was inferior to the professional translation in grammatical fluency but generally preserved the content and sense of the original text. Out of 30 GT sentences assessed, there was one substantially erroneous translation that was considered potentially dangerous. Evaluators preferred the professionally translated sentences for complex sentences, but when the English source sentence was simple (containing a single clause), this preference disappeared.
Like Sharif and Tse,12 we found that for information not arranged in sentences, automated translation sometimes produced nonsensical sentences. In our study, these resulted from an English sentence fragment followed by a bulleted list; in their study, the nonsensical translations resulted from pharmacy labels. The difference in frequency of these errors between our studies may have resulted partly from the translation tool evaluated (GT vs programs used by pharmacies in the Bronx), but may have also been due to our use of machine translation for complete sentences, the purpose for which it is optimally designed. The hypothesis that machine translations of clinical information are most understandable when used for simple, complete sentences concurs with the methodology used by these tools and requires further study.
GT has the potential to be very useful to clinicians, particularly for those instances when the communication required is both spontaneous and routine or noncritical. For example, in the inpatient setting, patients could communicate diet and other nonclinical requests, as well as ask or answer simple, short questions when the interpreter is not available. In such situations, the low cost and ease of using online translations and machine translation more generally may help to circumvent the tendency of clinicians to get by with inadequate language skills or to avoid communication altogether.25 If used wisely, GT and other online tools could supplement the use of standardized translations and professional interpreters in helping clinicians to overcome language barriers and linguistic inertia, though this will require further assessment.
Ours is a pilot study, and while it suggests a more promising way to use online translation tools, significant further evaluation is required regarding accuracy and applicability prior to widespread use of any machine translation tools for patient care. First, the document we utilized for evaluation was a professionally translated patient educational brochure provided to individuals starting a complex medication. As online translation tools would most likely not be used in this setting, but rather for spontaneous and less critical patient‐specific instructions, further testing of GT as applied to such scenarios should be considered. Second, we only evaluated GT for English translated into Spanish; its usefulness in other languages will need to be evaluated. It also remains to be seen how easily GT translations will be understood by patients, who may have variable medical understanding and educational attainment as compared to our evaluators. Finally, in this evaluation, we only assessed automated written translation, not automated spoken translation services such as those now available on cellular phones and other mobile devices.11 The latter are based upon translation software with an additional speech recognition interface. These applications may prove to be even more useful than online translation, but the speech recognition component will add a further layer of potential error and these applications will need to be evaluated on their own merits.
The domains chosen for this study had only moderate interrater reliability as assessed by intraclass correlation and repeatability, with meaning and preference scoring particularly poorly. The latter domains in particular will require more thorough assessment before routine use in online translation assessment. The variability in all domains may have resulted partly from the choice of nonclinicians of different ancestral backgrounds as evaluators. However, this variability is likely more representative of the wide range of patient backgrounds. Because our evaluators were not professional translators, we asked a professional interpreter to grade all sentences to assess the quality of their evaluation. While the interpreter noted slightly fewer errors among the professionally translated sentences (13% vs 22%) and slightly more errors among the GT‐translated sentences (50% vs 39%), and preferred the professional translation slightly more (3.8 vs 3.2), his scores for all of the other measures were almost identical, increasing our confidence in our primary findings (Appendix A). Additionally, since statistical translation is conducted sentence by sentence, in our study evaluators only scored translations at the sentence level. The accuracy of GT for whole paragraphs or entire documents will need to be assessed separately. The correlation between METEOR and the manual evaluation scores was lower than in prior studies; while inexpensive to assess, METEOR will have to be recalibrated in optimal circumstances (with several reference translations available rather than just one) before it can be used to supplement the assessment of new languages, new materials, other translation technologies, and improvements in a given technology over time for patient educational material.
In summary, GT scored worse in grammar but similarly in content and sense to the professional translation, committing one critical error in translating a complex, fragmented sentence as nonsense. We believe that, with further study and judicious use, GT has the potential to substantially improve clinicians' communication with patients with limited English proficiency in the area of brief spontaneous patient‐specific information, supplementing well the role that professional spoken interpretation and standardized written translations already play.
- Language use and English‐speaking ability: 2000. In: Census 2000 Brief. Washington, DC: US Census Bureau; 2003. p. 2. http://www.census.gov/prod/2003pubs/c2kbr‐29.pdf.
- The need for more research on language barriers in health care: a proposed research agenda. Milbank Q. 2006;84(1):111–133.
- Language proficiency and adverse events in US hospitals: a pilot study. Int J Qual Health Care. 2007;19(2):60–67.
- The impact of medical interpreter services on the quality of health care: a systematic review. Med Care Res Rev. 2005;62(3):255–299.
- Errors in medical interpretation and their potential clinical consequences in pediatric encounters. Pediatrics. 2003;111(1):6–14.
- The effect of English language proficiency on length of stay and in‐hospital mortality. J Gen Intern Med. 2004;19(3):221–228.
- Influence of language barriers on outcomes of hospital care for general medicine inpatients. J Hosp Med. 2010;5(5):276–282.
- Hospitals, language, and culture: a snapshot of the nation. Los Angeles, CA: The California Endowment, the Joint Commission; 2007. p. 51–52. http://www.jointcommission.org/assets/1/6/hlc_paper.pdf.
- Do professional interpreters improve clinical care for patients with limited English proficiency? A systematic review of the literature. Health Serv Res. 2007;42(2):727–754.
- Google's Computing Power Refines Translation Tool. New York Times; March 9, 2010. Accessed March 24, 2010. http://www.nytimes.com/2010/03/09/technology/09translate.html?_r=1.
- I, Translator. New York Times; March 20, 2010. Accessed March 24, 2010. http://www.nytimes.com/2010/03/21/opinion/21bellos.html.
- Accuracy of computer‐generated, Spanish‐language medicine labels. Pediatrics. 2010;125(5):960–965.
- Nielsen NetRatings Search Engine Ratings. SearchEngineWatch; August 22, 2006. Accessed March 24, 2010. http://searchenginewatch.com/2156451.
- Google. Google Translate Help; 2010. Accessed March 24, 2010. http://translate.google.com/support/?hl=en.
- Chapter 4: Basic strategies. In: An Introduction to Machine Translation; 1992. Accessed April 22, 2010. http://www.hutchinsweb.me.uk/IntroMT‐4.pdf.
- Your Guide to Coumadin®/Warfarin Therapy. Agency for Healthcare Research and Quality; August 21, 2008. Accessed October 19, 2009. http://www.ahrq.gov/consumer/btpills.htm.
- Patient reported receipt of medication instructions for warfarin is associated with reduced risk of serious bleeding events. J Gen Intern Med. 2008;23(10):1589–1594.
- The ARPA MT evaluation methodologies: evolution, lessons, and future approaches. In: Proceedings of AMTA, 1994, Columbia, MD; October 1994.
- Overview of the IWSLT 2005 evaluation campaign. In: Proceedings of IWSLT 2005, Pittsburgh, PA; October 2005.
- BLEU: a method for automatic evaluation of machine translation. In: ACL‐2002: 40th Annual Meeting of the Association for Computational Linguistics. 2002:311–318.
- METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments. In: Proceedings of the Second Workshop on Statistical Machine Translation at ACL, Prague, Czech Republic; June 2007.
- The Structure of a Sentence. Ottawa: The Writing Centre, University of Ottawa; 2007.
- Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–310.
- Measurement, reproducibility, and validity. In: Epidemiologic Methods 203. San Francisco: Department of Biostatistics and Epidemiology, University of California; 2009.
- Getting by: underuse of interpreters by resident physicians. J Gen Intern Med. 2009;24(2):256–262.
The population of patients in the US with limited English proficiency (LEP), those who speak English less than "very well,"1 is substantial and continues to grow.1, 2 Patients with LEP are at risk for lower quality health care overall than their English‐speaking counterparts.3–8 Professional in‐person interpreters greatly improve spoken communication and quality of care for these patients,4, 9 but their assistance is typically based on the clinical encounter. Particularly if interpreting by phone, interpreters are unlikely to be able to help with materials such as discharge instructions or information sheets meant for family members. Professional written translations of patient educational material help to bridge this gap, allowing clinicians to convey detailed written instructions to patients. However, professional translations must be prepared well in advance of any encounter and can only be used for easily anticipated problems.
The need to translate less common, patient‐specific instructions arises spontaneously in clinical practice, and formally prepared written translations are not useful in these situations. Online translation tools such as GoogleTranslate (available at
We conducted a pilot evaluation of an online translation tool as it relates to detailed, complex patient educational material. Our primary goal was to compare the accuracy of a Spanish translation generated by the online tool to that done by a professional agency. Our secondary goals were: 1) to assess whether sentence word length or complexity mediated the accuracy of GT; and 2) to lay the foundation for a more comprehensive study of the accuracy of online translation tools, with respect to patient educational material.
Methods
Translation Tool and Language Choice
We selected Google Translate (GT) since it is one of the more commonly used online translation tools and because Google is the most widely used search engine in the United States.13 GT uses statistical translation methodology to convert text, documents, and websites between languages; statistical translation involves the following three steps. First, the translation program recognizes a sentence to translate. Second, it compares the words and phrases within that sentence to the billions of words in its library (drawn from bilingual professionally translated documents, such as United Nations proceedings). Third, it uses this comparison to generate a translation combining the words and phrases deemed most equivalent between the source sentence and the target language. If there are multiple sentences, the program recognizes and translates each independently. As the body of bilingual work grows, the program learns and refines its rules automatically.14 In contrast, in rule‐based translation, a program would use manually prespecified rules regarding word choice and grammar to generate a translation.15 We assessed GT's accuracy translating from English to Spanish because Spanish is the predominant non‐English language spoken in the US.1
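The three steps described above can be sketched with a toy example in Python. This is a deliberately simplified illustration and not GT's actual system: the phrase table, its probabilities, and the function names are all hypothetical, and a real statistical system learns millions of such entries from bilingual text.

```python
# Hypothetical miniature phrase table: English phrase -> candidate Spanish
# phrases with made-up probabilities standing in for learned statistics.
PHRASE_TABLE = {
    ("take", "this", "medication"): [("tome este medicamento", 0.8),
                                     ("tomar esta medicación", 0.2)],
    ("twice", "a", "day"): [("dos veces al día", 0.9),
                            ("dos veces por día", 0.1)],
    ("call", "your", "doctor"): [("llame a su médico", 1.0)],
}

def translate_sentence(sentence, table=PHRASE_TABLE, max_len=3):
    """Greedy longest-match lookup, choosing the most probable candidate."""
    words = sentence.lower().rstrip(".").split()
    output = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = tuple(words[i:i + n])
            if phrase in table:
                best = max(table[phrase], key=lambda cand: cand[1])[0]
                output.append(best)
                i += n
                break
        else:
            output.append(words[i])  # unknown word passes through untranslated
            i += 1
    return " ".join(output)

def translate_document(text):
    """Split on periods and translate each sentence independently."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [translate_sentence(s) for s in sentences]
```

Because each sentence is handled on its own, text that is not arranged in complete sentences gives a lookup of this kind little to work with.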
Document Selection and Preparation
We selected the instruction manual regarding warfarin use prepared by the Agency for Healthcare Research and Quality (AHRQ) for this accuracy evaluation. We selected this manual,16 written at a 6th grade reading level, because a professional Spanish translation was available (completed by ASET International Service, LLC, before and independently of this study), and because patient educational material regarding warfarin has been associated with fewer bleeding events.17 We downloaded the English document on October 19, 2009 and used the GT website to translate it en bloc. We then copied the resulting Spanish output into a text file. The English document and the professional Spanish translation (downloaded the same day) were both converted into text files in the same manner.
Grading Methodology
We scored the translation chosen using both manual and automated evaluation techniques. These techniques are widely used in the machine translation literature and are explained below.
Manual Evaluation: Evaluators, Domains, Scoring
We recruited three nonclinician, bilingual, native Spanish‐speaking research assistants as evaluators. The evaluators were all college educated with a Bachelor's degree or higher and were of Mexican, Nicaraguan, and Guatemalan ancestry. Each evaluator received a brief orientation regarding the project, as well as an explanation of the scores, and then proceeded to the blinded evaluation independently.
We asked evaluators to score sentences on Likert scales along five primary domains: fluency, adequacy, meaning, severity, and preference. Fluency and adequacy are well accepted components of machine translation evaluation,18 with fluency being an assessment of grammar and readability ranging from 5 (Perfect fluency; like reading a newspaper) to 1 (No fluency; no appreciable grammar, not understandable) and adequacy being an assessment of information preservation ranging from 5 (100% of information conveyed from the original) to 1 (0% of information conveyed from the original). Given that a sentence can be highly adequate but drastically change the connotation and intent of the sentence (eg, a sentence that contains 75% of the correct words but changes a sentence from "take this medication twice a day" to "take this medication once every two days"), we asked evaluators to assess meaning, a measure of connotation and intent maintenance, with scores ranging from 5 (Same meaning as original) to 1 (Totally different meaning from the original).19 Evaluators also assessed severity, a new measure of potential harm if a given sentence was assessed as having errors of any kind, ranging from 5 (Error, no effect on patient care) to 1 (Error, dangerous to patient) with an additional option of N/A (Sentence basically accurate). Finally, evaluators rated a blinded preference (also a new measure) for either of two translated sentences, ranging from "Strongly prefer translation #1" to "Strongly prefer translation #2." The order of the sentences was random (eg, sometimes the professional translation was first and sometimes the GT translation was). We subsequently converted this to preference for the professional translation, ranging from 5 (Strongly prefer the professional translation) to 1 (Strongly prefer the GT translation) in order to standardize the responses (Figures 1 and 2).
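The conversion of blinded preference ratings to a standardized scale can be illustrated with a short Python sketch. The raw numeric coding is an assumption (1 for strongly preferring translation #1 through 5 for strongly preferring translation #2), and the function name is hypothetical.

```python
def to_professional_preference(raw_score, professional_first):
    """Convert a blinded rating to preference for the professional translation.

    raw_score: assumed coding, 1 = strongly prefer translation #1,
               5 = strongly prefer translation #2.
    professional_first: True if the professional translation was shown as #1.
    Returns 5 (strongly prefer professional) to 1 (strongly prefer GT).
    """
    if not 1 <= raw_score <= 5:
        raise ValueError("raw_score must be between 1 and 5")
    # When the professional translation was #1, the scale must be reversed.
    return 6 - raw_score if professional_first else raw_score
```

A rating of 3 (no preference) is unchanged by the conversion regardless of presentation order.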


The overall flow of the study is given in Figure 3. Each evaluator initially scored 20 sentences translated by GT and 10 sentences translated professionally along the first four domains. All 30 of these sentences were randomly selected from the original, 263‐sentence pamphlet. For fluency, evaluators had access only to the translated sentence to be scored; for adequacy, meaning, and severity, they had access to both the translated sentence and the original English sentence. Ten of the 30 sentences were further selected randomly for scoring on the preference domain. For these 10 sentences, evaluators compared the GT and professional translations of the same sentence (with the original English sentence available as a reference) and indicated a preference, for any reason, for one translation or the other. Evaluators were blinded to the technique of translation (GT or professional) for all scored sentences and domains. We chose twice as many sentences from the GT preparations for the first four domains to maximize measurements for the translation technology we were evaluating, with the smaller number of professional translations serving as controls.
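The first‐round sampling design described above could be reproduced along the following lines. This Python sketch is illustrative only: the random seed, the function name, and the use of integer sentence identifiers are assumptions, and it does not reflect the actual randomization procedure used.

```python
import random

def draw_first_round(n_sentences=263, seed=0):
    """Randomly pick 30 sentences: 20 scored as GT translations,
    10 as professional translations, and 10 of the 30 for preference."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    chosen = rng.sample(range(1, n_sentences + 1), 30)
    gt_ids = chosen[:20]            # sample() returns a random ordering,
    professional_ids = chosen[20:]  # so slicing gives a random split
    preference_ids = rng.sample(chosen, 10)
    return gt_ids, professional_ids, preference_ids

gt_ids, professional_ids, preference_ids = draw_first_round()
```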

After scoring the first 30 sentences, evaluators met with one of the authors (R.R.K.) to discuss and consolidate their approach to scoring. They then scored an additional 10 GT‐translated sentences and 5 professionally translated sentences for the first four domains, and 9 of these 15 sentences for preference, to see if the meeting changed their scoring approach. These sentences were selected randomly from the original, 263‐sentence pamphlet, excluding the 30 evaluated in the previous step.
Automated Machine Translation Evaluation
Machine translation researchers have developed automated measures allowing the rapid and inexpensive scoring and rescoring of translations. These automated measures supplement more time‐ and resource‐intensive manual evaluations. The automated measures are based upon how well the translation compares to one or, ideally, multiple professionally prepared reference translations. They correlate well with human judgments on the domains above, especially when multiple reference translations are used (increasing the number of reference translations increases the variability allowed for words and phrases in the machine translation, improving the likelihood that differences in score are related to differences in quality rather than differences in translator preference).20 For this study, we used Metric for Evaluation of Translation with Explicit Ordering (METEOR), a machine translation evaluation system that allows additional flexibility for the machine translation in terms of grading individual sentences and being sensitive to synonyms, word stemming, and word order.21 We obtained a METEOR score for each of the GT‐translated sentences using the professional translation as our reference, and assessed correlation between this automated measure and the manual evaluations for the GT sentences, with the aim of assessing the feasibility of using METEOR in future work on patient educational material translation.
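To give a sense of how a reference‐based automated measure works, the Python sketch below scores a candidate translation by exact word overlap with a single reference. It is not METEOR: it omits synonym matching, word stemming, and the word‐order component, and it uses a simple balanced harmonic mean. The function name and example sentences are hypothetical.

```python
from collections import Counter

def unigram_f_score(candidate, reference):
    """Harmonic mean of unigram precision and recall (exact matches only)."""
    cand_words = candidate.lower().split()
    ref_words = reference.lower().split()
    # Count each word at most as often as it appears in both sentences.
    overlap = sum((Counter(cand_words) & Counter(ref_words)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_words)
    recall = overlap / len(ref_words)
    return 2 * precision * recall / (precision + recall)

# Hypothetical machine output scored against a hypothetical reference.
score = unigram_f_score("tome el medicamento", "tome este medicamento")
```

With only one reference translation, a legitimate alternative word choice lowers the score, which is one reason multiple references are preferred.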
Results
Sentence Description
A total of 45 sentences were evaluated by the bilingual research assistants. The initial 30 sentences and the subsequent, post‐consolidation meeting 15 sentences were scored similarly in all outcomes, after adjustment for word length and complexity, so we pooled all 45 sentences (as well as the 19 total sentence pairs scored for preference) for the final analysis. Average sentence lengths were 14.2 words, 15.5 words, and 16.6 words for the English source text, professionally translated sentences, and GT‐translated sentences, respectively. Thirty‐three percent of the English source sentences were simple and 67% were complex.
Manual Evaluation Scores
Sentences translated by GT received worse scores on fluency as compared to the professional translations (3.4 vs 4.7, P < 0.0001). Comparisons for adequacy and meaning were not statistically significantly different. GT‐translated sentences contained more errors of any severity as compared to the professional translations (39% vs 22%, P = 0.05), but a similar number of serious, clinically impactful errors (severity scores of 3, 2, or 1; 4% vs 2%, P = 0.61). However, one GT‐translated sentence was considered erroneous with a severity level of 1 (Error, dangerous to patient). This particular sentence was 25 words long and complex in structure in the original English document; all three evaluators considered the GT translation nonsensical (La hemorragia mayor, llame a su mdico, o ir a la emergencia de un hospital habitacin si usted tiene cualquiera de los siguientes: Red N, oscuro, caf o cola de orina de color.) Evaluators had no overall preference for the professional translation (3.2, 95% confidence interval = 2.7 to 3.7, with 3 indicating no preference; P = 0.36) (Table 1).
GoogleTranslate Translation | Professional Translation | P Value | |
---|---|---|---|
| |||
Fluency* | 3.4 | 4.7 | <0.0001 |
Adequacy* | 4.5 | 4.8 | 0.19 |
Meaning* | 4.2 | 4.5 | 0.29 |
Severity | |||
Any error | 39% | 22% | 0.05 |
Serious error | 4% | 2% | 0.61 |
Preference* | 3.2 | 0.36 |
Mediation of Scores by Sentence Length or Complexity
We found that sentence length was not associated with scores for fluency, adequacy, meaning, severity, or preference (P > 0.30 in each case). Complexity, however, was significantly associated with preference: evaluators' preferred the professional translation for complex English sentences while being more ambivalent about simple English sentences (3.6 vs 2.6, P = 0.03).
Interrater Reliability and Repeatability
We assessed the interrater reliability for each domain using intraclass correlation coefficients and repeatability. For fluency, the intraclass correlation was best at 0.70; for adequacy, it was 0.58; for meaning, 0.42; for severity, 0.48; and for preference, 0.37. The repeatability scores were 1.4 for fluency, 0.6 for adequacy, 2.2 for meaning, 1.2 for severity, and 3.8 for preference, indicating that two evaluators might give a sentence almost the same score (at most, 1 point apart from one another) for adequacy, but might have opposite preferences regarding which translation of a sentence was superior.
Correlation with METEOR
Correlation between the first four domains and the METEOR scores were less than in prior studies.21 Fluency correlated best with METEOR at 0.53; adequacy correlated least with METEOR at 0.29. The remaining scores were in‐between. All correlations were statistically significant at P < 0.01 (Table 2).
Correlation with METEOR | P value | |
---|---|---|
| ||
Fluency | 0.53 | <0.0001 |
Adequacy | 0.29 | 0.006 |
Meaning | 0.33 | 0.002 |
Severity | 0.39 | 0.002 |
Discussion
In this preliminary study comparing the accuracy of GT to professional translation for patient educational material, we found that GT was inferior to the professional translation in grammatical fluency but generally preserved the content and sense of the original text. Out of 30 GT sentences assessed, there was one substantially erroneous translation that was considered potentially dangerous. Evaluators preferred the professionally translated sentences for complex sentences, but when the English source sentence was simplecontaining a single clausethis preference disappeared.
Like Sharif and Tse,12 we found that for information not arranged in sentences, automated translation sometimes produced nonsensical sentences. In our study, these resulted from an English sentence fragment followed by a bulleted list; in their study, the nonsensical translations resulted from pharmacy labels. The difference in frequency of these errors between our studies may have resulted partly from the translation tool evaluated (GT vs programs used by pharmacies in the Bronx), but may have also been due to our use of machine translation for complete sentencesthe purpose for which it is optimally designed. The hypothesis that machine translations of clinical information are most understandable when used for simple, complete sentences concurs with the methodology used by these tools and requires further study.
GT has the potential to be very useful to clinicians, particularly for those instances when the communication required is both spontaneous and routine or noncritical. For example, in the inpatient setting, patients could communicate diet and other nonclinical requests, as well as ask or answer simple, short questions when the interpreter is not available. In such situations, the low cost and ease of using online translations and machine translation more generally may help to circumvent the tendency of clinicians to get by with inadequate language skills or to avoid communication altogether.25 If used wisely, GT and other online tools could supplement the use of standardized translations and professional interpreters in helping clinicians to overcome language barriers and linguistic inertia, though this will require further assessment.
Ours is a pilot study, and while it suggests a more promising way to use online translation tools, significant further evaluation is required regarding accuracy and applicability prior to widespread use of any machine translation tools for patient care. The document we utilized for evaluation was a professionally translated patient educational brochure provided to individuals starting a complex medication. As online translation tools would most likely not be used in this setting, but rather for spontaneous and less critical patient‐specific instructions, further testing of GT as applied to such scenarios should be considered. Second, we only evaluated GT for English translated into Spanish; its usefulness in other languages will need to be evaluated. It also remains to be seen how easily GT translations will be understood by patients, who may have variable medical understanding and educational attainment as compared to our evaluators. Finally, in this evaluation, we only assessed automated written translation, not automated spoken translation services such as those now available on cellular phones and other mobile devices.11 The latter are based upon translation software with an additional speech recognition interface. These applications may prove to be even more useful than online translation, but the speech recognition component will add an additional layer of potential error and these applications will need to be evaluated on their own merits.
The domains chosen for this study had only moderate interrater reliability as assessed by intraclass correlation and repeatability, with meaning and preference scoring particularly poorly. The latter domains in particular will require more thorough assessment before routine use in online translation assessment. The variability in all domains may have resulted partly from the choice of nonclinicians of different ancestral backgrounds as evaluators. However, this variability is likely better representative of the wide range of patient backgrounds. Because our evaluators were not professional translators, we asked a professional interpreter to grade all sentences to assess the quality of their evaluation. While the interpreter noted slightly fewer errors among the professionally translated sentences (13% vs 22%) and slightly more errors among the GT‐translated sentences (50% vs 39%), and preferred the professional translation slightly more (3.8 vs 3.2), his scores for all of the other measures were almost identical, increasing our confidence in our primary findings (Appendix A). Additionally, since statistical translation is conducted sentence by sentence, in our study evaluators only scored translations at the sentence level. The accuracy of GT for whole paragraphs or entire documents will need to be assessed separately. The correlation between METEOR and the manual evaluation scores was less than in prior studies; while inexpensive to assess, METEOR will have to be recalibrated in optimal circumstanceswith several reference translations available rather than just onebefore it can be used to supplement the assessment of new languages, new materials, other translation technologies, and improvements in a given technology over time for patient educational material.
In summary, GT scored worse in grammar but similarly in content and sense to the professional translation, committing one critical error in translating a complex, fragmented sentence as nonsense. We believe that, with further study and judicious use, GT has the potential to substantially improve clinicians' communication with patients with limited English proficiency in the area of brief spontaneous patient‐specific information, supplementing well the role that professional spoken interpretation and standardized written translations already play.
The population of patients in the US with limited English proficiency (LEP), defined as those who speak English less than "very well,"1 is substantial and continues to grow.1, 2 Patients with LEP are at risk for lower quality health care overall than their English‐speaking counterparts.3–8 Professional in‐person interpreters greatly improve spoken communication and quality of care for these patients,4, 9 but their assistance is typically based on the clinical encounter. Particularly if interpreting by phone, interpreters are unlikely to be able to help with materials such as discharge instructions or information sheets meant for family members. Professional written translations of patient educational material help to bridge this gap, allowing clinicians to convey detailed written instructions to patients. However, professional translations must be prepared well in advance of any encounter and can only be used for easily anticipated problems.
The need to translate less common, patient‐specific instructions arises spontaneously in clinical practice, and formally prepared written translations are not useful in these situations. Online translation tools such as GoogleTranslate (available at
We conducted a pilot evaluation of an online translation tool as it relates to detailed, complex patient educational material. Our primary goal was to compare the accuracy of a Spanish translation generated by the online tool to that done by a professional agency. Our secondary goals were: 1) to assess whether sentence word length or complexity mediated the accuracy of GT; and 2) to lay the foundation for a more comprehensive study of the accuracy of online translation tools, with respect to patient educational material.
Methods
Translation Tool and Language Choice
We selected Google Translate (GT) since it is one of the more commonly used online translation tools and because Google is the most widely used search engine in the United States.13 GT uses statistical translation methodology to convert text, documents, and websites between languages; statistical translation involves the following three steps. First, the translation program recognizes a sentence to translate. Second, it compares the words and phrases within that sentence to the billions of words in its library (drawn from bilingual professionally translated documents, such as United Nations proceedings). Third, it uses this comparison to generate a translation combining the words and phrases deemed most equivalent between the source sentence and the target language. If there are multiple sentences, the program recognizes and translates each independently. As the body of bilingual work grows, the program learns and refines its rules automatically.14 In contrast, in rule‐based translation, a program would use manually prespecified rules regarding word choice and grammar to generate a translation.15 We assessed GT's accuracy translating from English to Spanish because Spanish is the predominant non‐English language spoken in the US.1
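The three steps above can be illustrated with a toy sketch. This is not Google's implementation: the phrase table, its probabilities, and the function names below are hypothetical, and a production system would estimate its table from very large bilingual corpora and would also model word order. The sketch only shows sentence recognition, phrase lookup, and selection of the most probable candidate, with each sentence translated independently.

```python
import re

# Hypothetical toy phrase table: English phrase -> {Spanish candidate: probability}.
# A real statistical system estimates these values from bilingual documents.
PHRASE_TABLE = {
    ("take",): {"tome": 0.7, "tomar": 0.3},
    ("this", "medication"): {"este medicamento": 0.9, "esta medicación": 0.1},
    ("twice", "a", "day"): {"dos veces al día": 0.95, "dos veces un día": 0.05},
}
MAX_PHRASE_LEN = 3


def split_sentences(text):
    """Step 1: recognize individual sentences (naive split on end punctuation)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def translate_sentence(sentence):
    """Steps 2 and 3: look up phrases and keep the most probable candidate."""
    words = re.findall(r"\w+", sentence.lower())
    output, i = [], 0
    while i < len(words):
        # Try the longest phrase first, then progressively shorter ones.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            candidates = PHRASE_TABLE.get(tuple(words[i:i + n]))
            if candidates:
                output.append(max(candidates, key=candidates.get))
                i += n
                break
        else:
            # Unknown word: pass it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)


def translate(text):
    """Translate each sentence independently, as described in the text."""
    return [translate_sentence(s) for s in split_sentences(text)]
```

Because each sentence is handled on its own, a fragment that depends on its neighbors for meaning (for example, a lead-in followed by a bulleted list) gives this kind of approach very little to work with.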
Document Selection and Preparation
We selected the instruction manual regarding warfarin use prepared by the Agency for Healthcare Research and Quality (AHRQ) for this accuracy evaluation. We selected this manual,16 written at a 6th grade reading level, because a professional Spanish translation was available (completed by ASET International Service, LLC, before and independently of this study), and because patient educational material regarding warfarin has been associated with fewer bleeding events.17 We downloaded the English document on October 19, 2009 and used the GT website to translate it en bloc. We then copied the resulting Spanish output into a text file. The English document and the professional Spanish translation (downloaded the same day) were both converted into text files in the same manner.
Grading Methodology
We scored the translation chosen using both manual and automated evaluation techniques. These techniques are widely used in the machine translation literature and are explained below.
Manual Evaluation: Evaluators, Domains, Scoring
We recruited three nonclinician, bilingual, native Spanish‐speaking research assistants as evaluators. The evaluators were all college educated with a Bachelor's degree or higher and were of Mexican, Nicaraguan, and Guatemalan ancestry. Each evaluator received a brief orientation regarding the project, as well as an explanation of the scores, and then proceeded to the blinded evaluation independently.
We asked evaluators to score sentences on Likert scales along five primary domains: fluency, adequacy, meaning, severity, and preference. Fluency and adequacy are well accepted components of machine translation evaluation,18 with fluency being an assessment of grammar and readability ranging from 5 ("Perfect fluency; like reading a newspaper") to 1 ("No fluency; no appreciable grammar, not understandable") and adequacy being an assessment of information preservation ranging from 5 ("100% of information conveyed from the original") to 1 ("0% of information conveyed from the original"). Given that a sentence can be highly adequate but drastically change the connotation and intent of the sentence (eg, a sentence that contains 75% of the correct words but changes a sentence from "take this medication twice a day" to "take this medication once every two days"), we asked evaluators to assess meaning, a measure of connotation and intent maintenance, with scores ranging from 5 ("Same meaning as original") to 1 ("Totally different meaning from the original").19 Evaluators also assessed severity, a new measure of potential harm if a given sentence was assessed as having errors of any kind, ranging from 5 ("Error, no effect on patient care") to 1 ("Error, dangerous to patient") with an additional option of N/A ("Sentence basically accurate"). Finally, evaluators rated a blinded preference (also a new measure) for either of two translated sentences, ranging from "Strongly prefer translation #1" to "Strongly prefer translation #2." The order of the sentences was random (eg, sometimes the professional translation was first and sometimes the GT translation was). We subsequently converted this to preference for the professional translation, ranging from 5 ("Strongly prefer the professional translation") to 1 ("Strongly prefer the GT translation") in order to standardize the responses (Figures 1 and 2).
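The preference conversion described above can be sketched as a small function. The raw coding is an assumption made for illustration (1 for "Strongly prefer translation #1" through 5 for "Strongly prefer translation #2"), and the function and parameter names are hypothetical; the article does not state how the raw responses were coded.

```python
def preference_for_professional(raw_score, professional_shown_first):
    """Map a raw rating onto the standardized preference scale.

    Assumed raw coding (hypothetical): 1 = "Strongly prefer translation #1",
    5 = "Strongly prefer translation #2".
    Returned scale: 5 = strongly prefer the professional translation,
    1 = strongly prefer the GT translation.
    """
    if not 1 <= raw_score <= 5:
        raise ValueError("raw_score must be between 1 and 5")
    # If the professional translation was shown as #1, a low raw score favors
    # it, so the scale is mirrored. Otherwise the raw score is already aligned.
    return 6 - raw_score if professional_shown_first else raw_score
```

A neutral rating of 3 maps to 3 regardless of presentation order, which is why 3 can be read as "no preference" in the pooled results.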


The overall flow of the study is given in Figure 3. Each evaluator initially scored 20 sentences translated by GT and 10 sentences translated professionally along the first four domains. All 30 of these sentences were randomly selected from the original, 263‐sentence pamphlet. For fluency, evaluators had access only to the translated sentence to be scored; for adequacy, meaning, and severity, they had access to both the translated sentence and the original English sentence. Ten of the 30 sentences were further selected randomly for scoring on the preference domain. For these 10 sentences, evaluators compared the GT and professional translations of the same sentence (with the original English sentence available as a reference) and indicated a preference, for any reason, for one translation or the other. Evaluators were blinded to the technique of translation (GT or professional) for all scored sentences and domains. We chose twice as many sentences from the GT preparations for the first four domains to maximize measurements for the translation technology we were evaluating, with the smaller number of professional translations serving as controls.

After scoring the first 30 sentences, evaluators met with one of the authors (R.R.K.) to discuss and consolidate their approach to scoring. They then scored an additional 10 GT‐translated sentences and 5 professionally translated sentences for the first four domains, and 9 of these 15 sentences for preference, to see if the meeting changed their scoring approach. These sentences were selected randomly from the original, 263‐sentence pamphlet, excluding the 30 evaluated in the previous step.
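The two sampling rounds can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: sentences are represented by hypothetical integer IDs 1 to 263, the seed is arbitrary, and the way sampled sentences were assigned to the GT or professional arm is not described in the text, so the split below is an assumption.

```python
import random


def draw_rounds(n_sentences=263, seed=0):
    """Draw the two scoring rounds from a pamphlet of n_sentences sentences."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ids = list(range(1, n_sentences + 1))

    # Round 1: 30 sentences, 20 scored as GT output and 10 as professional output.
    first = rng.sample(ids, 30)
    round1 = {
        "gt": first[:20],
        "professional": first[20:],
        "preference": rng.sample(first, 10),  # 10 of the 30 scored for preference
    }

    # Round 2: 15 sentences drawn only from those not used in round 1.
    used = set(first)
    remaining = [i for i in ids if i not in used]
    second = rng.sample(remaining, 15)
    round2 = {
        "gt": second[:10],
        "professional": second[10:],
        "preference": rng.sample(second, 9),  # 9 of the 15 scored for preference
    }
    return round1, round2
```

Sampling the second round from the remaining sentences guarantees that no sentence is scored in both rounds, which matches the exclusion described above.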
Automated Machine Translation Evaluation
Machine translation researchers have developed automated measures allowing the rapid and inexpensive scoring and rescoring of translations. These automated measures supplement more time‐ and resource‐intensive manual evaluations. The automated measures are based upon how well the translation compares to one or, ideally, multiple professionally prepared reference translations. They correlate well with human judgments on the domains above, especially when multiple reference translations are used (increasing the number of reference translations increases the variability allowed for words and phrases in the machine translation, improving the likelihood that differences in score are related to differences in quality rather than differences in translator preference).20 For this study, we used Metric for Evaluation of Translation with Explicit Ordering (METEOR), a machine translation evaluation system that allows additional flexibility for the machine translation in terms of grading individual sentences and being sensitive to synonyms, word stemming, and word order.21 We obtained a METEOR score for each of the GT‐translated sentences using the professional translation as our reference, and assessed correlation between this automated measure and the manual evaluations for the GT sentences, with the aim of assessing the feasibility of using METEOR in future work on patient educational material translation.
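To give a sense of what a reference-based automated measure computes, the sketch below scores a candidate sentence against a single reference using unigram precision and recall combined into a recall-weighted harmonic mean. It is a deliberately simplified stand-in, not METEOR itself: it uses exact word matches only and omits the stemming, synonym matching, and word-order penalty that METEOR adds. The function name and the weighting parameter `alpha` are assumptions for illustration.

```python
from collections import Counter


def unigram_fmean(candidate, reference, alpha=0.9):
    """Recall-weighted harmonic mean of unigram precision and recall.

    alpha close to 1 weights recall more heavily than precision.
    Exact matches only; this is a simplification, not the METEOR metric.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Multiset intersection counts each word at most as often as it
    # appears in both the candidate and the reference.
    matches = sum((Counter(cand) & Counter(ref)).values())
    if matches == 0:
        return 0.0
    precision = matches / len(cand)
    recall = matches / len(ref)
    return precision * recall / (alpha * precision + (1 - alpha) * recall)
```

With only one reference translation, any legitimate rewording in the candidate is counted as a mismatch, which is one reason additional references tend to make such scores track quality more closely.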
Outcomes and Statistical Analysis
We compared the scores assigned to GT‐translated sentences for each of the five manually scored domains with the scores of the professionally translated sentences, and assessed the impact of word count and sentence complexity on the scores achieved specifically by the GT‐translated sentences, using clustered linear regression to account for the fact that each of the 45 sentences was scored by each of the three evaluators. Sentences were classified as simple if they contained one or fewer clauses and complex if they contained more than one clause.22 We also assessed interrater reliability for the manual scoring system using intraclass correlation coefficients and repeatability. Repeatability is an estimate of the maximum difference, with 95% confidence, between scores assigned to the same sentence on the same domain by two different evaluators;23 lower scores indicate greater agreement between evaluators. Since we did not have clinical data or a gold standard, we used repeatability to estimate the value above which a difference between two scores might be clinically significant and not simply due to interrater variability.24 Finally, we assessed the correlation of the manual scores with those calculated by the METEOR automated evaluation tool using Pearson correlation coefficients. All analyses were conducted using Stata 11 (College Station, TX).
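The two reliability measures can be sketched from a one-way analysis of variance. The article does not state which form of the intraclass correlation or which exact repeatability calculation was used, so the choices below are assumptions: a one-way random-effects, single-rater intraclass correlation, and a repeatability coefficient of 1.96 × √2 × the within-sentence standard deviation. The function name and input layout (one row of rater scores per sentence) are hypothetical.

```python
import math


def icc_and_repeatability(scores):
    """Return (one-way ICC, repeatability) for a sentences-by-raters table.

    scores: list of rows, one per sentence, each holding the k raters' scores.
    Requires at least two sentences and two raters, and some variation in scores.
    """
    n = len(scores)      # number of sentences
    k = len(scores[0])   # number of raters per sentence
    row_means = [sum(row) / k for row in scores]
    grand_mean = sum(row_means) / n

    # Between-sentence and within-sentence mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))

    icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    # Two raters' scores for the same sentence are expected to differ by
    # less than this amount about 95% of the time.
    repeatability = 1.96 * math.sqrt(2 * ms_within)
    return icc, repeatability
```

On a 5‐point scale, a repeatability near 1 means two evaluators rarely differ by more than one point, whereas a value approaching 4 means they can land at opposite ends of the scale.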
Results
Sentence Description
A total of 45 sentences were evaluated by the bilingual research assistants. The initial 30 sentences and the subsequent, post‐consolidation meeting 15 sentences were scored similarly in all outcomes, after adjustment for word length and complexity, so we pooled all 45 sentences (as well as the 19 total sentence pairs scored for preference) for the final analysis. Average sentence lengths were 14.2 words, 15.5 words, and 16.6 words for the English source text, professionally translated sentences, and GT‐translated sentences, respectively. Thirty‐three percent of the English source sentences were simple and 67% were complex.
Manual Evaluation Scores
Sentences translated by GT received worse scores on fluency as compared to the professional translations (3.4 vs 4.7, P < 0.0001). Comparisons for adequacy and meaning were not statistically significantly different. GT‐translated sentences contained more errors of any severity as compared to the professional translations (39% vs 22%, P = 0.05), but a similar number of serious, clinically impactful errors (severity scores of 3, 2, or 1; 4% vs 2%, P = 0.61). However, one GT‐translated sentence was considered erroneous with a severity level of 1 ("Error, dangerous to patient"). This particular sentence was 25 words long and complex in structure in the original English document; all three evaluators considered the GT translation nonsensical ("La hemorragia mayor, llame a su médico, o ir a la emergencia de un hospital habitación si usted tiene cualquiera de los siguientes: Red N, oscuro, café o cola de orina de color."). Evaluators had no overall preference for the professional translation (3.2, 95% confidence interval = 2.7 to 3.7, with 3 indicating no preference; P = 0.36) (Table 1).
| | GoogleTranslate Translation | Professional Translation | P Value |
|---|---|---|---|
| Fluency* | 3.4 | 4.7 | <0.0001 |
| Adequacy* | 4.5 | 4.8 | 0.19 |
| Meaning* | 4.2 | 4.5 | 0.29 |
| Severity | | | |
| Any error | 39% | 22% | 0.05 |
| Serious error | 4% | 2% | 0.61 |
| Preference* | 3.2 | | 0.36 |
Mediation of Scores by Sentence Length or Complexity
We found that sentence length was not associated with scores for fluency, adequacy, meaning, severity, or preference (P > 0.30 in each case). Complexity, however, was significantly associated with preference: evaluators preferred the professional translation for complex English sentences while being more ambivalent about simple English sentences (3.6 vs 2.6, P = 0.03).
Interrater Reliability and Repeatability
We assessed the interrater reliability for each domain using intraclass correlation coefficients and repeatability. For fluency, the intraclass correlation was best at 0.70; for adequacy, it was 0.58; for meaning, 0.42; for severity, 0.48; and for preference, 0.37. The repeatability scores were 1.4 for fluency, 0.6 for adequacy, 2.2 for meaning, 1.2 for severity, and 3.8 for preference, indicating that two evaluators might give a sentence almost the same score (at most, 1 point apart from one another) for adequacy, but might have opposite preferences regarding which translation of a sentence was superior.
Correlation with METEOR
Correlations between the first four domains and the METEOR scores were lower than in prior studies.21 Fluency correlated best with METEOR at 0.53; adequacy correlated least with METEOR at 0.29. The remaining scores were in‐between. All correlations were statistically significant at P < 0.01 (Table 2).
| | Correlation with METEOR | P value |
|---|---|---|
| Fluency | 0.53 | <0.0001 |
| Adequacy | 0.29 | 0.006 |
| Meaning | 0.33 | 0.002 |
| Severity | 0.39 | 0.002 |
Discussion
In this preliminary study comparing the accuracy of GT to professional translation for patient educational material, we found that GT was inferior to the professional translation in grammatical fluency but generally preserved the content and sense of the original text. Out of 30 GT sentences assessed, there was one substantially erroneous translation that was considered potentially dangerous. Evaluators preferred the professionally translated sentences for complex sentences, but when the English source sentence was simple (containing a single clause), this preference disappeared.
Like Sharif and Tse,12 we found that for information not arranged in sentences, automated translation sometimes produced nonsensical sentences. In our study, these resulted from an English sentence fragment followed by a bulleted list; in their study, the nonsensical translations resulted from pharmacy labels. The difference in frequency of these errors between our studies may have resulted partly from the translation tool evaluated (GT vs programs used by pharmacies in the Bronx), but may have also been due to our use of machine translation for complete sentences, the purpose for which it is optimally designed. The hypothesis that machine translations of clinical information are most understandable when used for simple, complete sentences concurs with the methodology used by these tools and requires further study.
GT has the potential to be very useful to clinicians, particularly for those instances when the communication required is both spontaneous and routine or noncritical. For example, in the inpatient setting, patients could communicate diet and other nonclinical requests, as well as ask or answer simple, short questions when the interpreter is not available. In such situations, the low cost and ease of using online translations and machine translation more generally may help to circumvent the tendency of clinicians to get by with inadequate language skills or to avoid communication altogether.25 If used wisely, GT and other online tools could supplement the use of standardized translations and professional interpreters in helping clinicians to overcome language barriers and linguistic inertia, though this will require further assessment.
Ours is a pilot study, and while it suggests a more promising way to use online translation tools, significant further evaluation is required regarding accuracy and applicability prior to widespread use of any machine translation tools for patient care. The document we utilized for evaluation was a professionally translated patient educational brochure provided to individuals starting a complex medication. As online translation tools would most likely not be used in this setting, but rather for spontaneous and less critical patient‐specific instructions, further testing of GT as applied to such scenarios should be considered. Second, we only evaluated GT for English translated into Spanish; its usefulness in other languages will need to be evaluated. It also remains to be seen how easily GT translations will be understood by patients, who may have variable medical understanding and educational attainment as compared to our evaluators. Finally, in this evaluation, we only assessed automated written translation, not automated spoken translation services such as those now available on cellular phones and other mobile devices.11 The latter are based upon translation software with an additional speech recognition interface. These applications may prove to be even more useful than online translation, but the speech recognition component will add an additional layer of potential error and these applications will need to be evaluated on their own merits.
The domains chosen for this study had only moderate interrater reliability as assessed by intraclass correlation and repeatability, with meaning and preference scoring particularly poorly. The latter domains in particular will require more thorough assessment before routine use in online translation assessment. The variability in all domains may have resulted partly from the choice of nonclinicians of different ancestral backgrounds as evaluators. However, this variability is likely more representative of the wide range of patient backgrounds. Because our evaluators were not professional translators, we asked a professional interpreter to grade all sentences to assess the quality of their evaluation. While the interpreter noted slightly fewer errors among the professionally translated sentences (13% vs 22%) and slightly more errors among the GT‐translated sentences (50% vs 39%), and preferred the professional translation slightly more (3.8 vs 3.2), his scores for all of the other measures were almost identical, increasing our confidence in our primary findings (Appendix A). Additionally, since statistical translation is conducted sentence by sentence, in our study evaluators only scored translations at the sentence level. The accuracy of GT for whole paragraphs or entire documents will need to be assessed separately. The correlation between METEOR and the manual evaluation scores was lower than in prior studies; while inexpensive to assess, METEOR will have to be recalibrated in optimal circumstances (with several reference translations available rather than just one) before it can be used to supplement the assessment of new languages, new materials, other translation technologies, and improvements in a given technology over time for patient educational material.
In summary, GT scored worse in grammar but similarly in content and sense to the professional translation, committing one critical error in translating a complex, fragmented sentence as nonsense. We believe that, with further study and judicious use, GT has the potential to substantially improve clinicians' communication with patients with limited English proficiency in the area of brief spontaneous patient‐specific information, supplementing well the role that professional spoken interpretation and standardized written translations already play.
1. Language use and English‐speaking ability: 2000. In: Census 2000 Brief. Washington, DC: US Census Bureau; 2003. p. 2. http://www.census.gov/prod/2003pubs/c2kbr-29.pdf.
2. The need for more research on language barriers in health care: a proposed research agenda. Milbank Q. 2006;84(1):111–133.
3. Language proficiency and adverse events in US hospitals: a pilot study. Int J Qual Health Care. 2007;19(2):60–67.
4. The impact of medical interpreter services on the quality of health care: a systematic review. Med Care Res Rev. 2005;62(3):255–299.
5. Errors in medical interpretation and their potential clinical consequences in pediatric encounters. Pediatrics. 2003;111(1):6–14.
6. The effect of English language proficiency on length of stay and in‐hospital mortality. J Gen Intern Med. 2004;19(3):221–228.
7. Influence of language barriers on outcomes of hospital care for general medicine inpatients. J Hosp Med. 2010;5(5):276–282.
8. Hospitals, language, and culture: a snapshot of the nation. Los Angeles, CA: The California Endowment, the Joint Commission; 2007. p. 51–52. http://www.jointcommission.org/assets/1/6/hlc_paper.pdf.
9. Do professional interpreters improve clinical care for patients with limited English proficiency? A systematic review of the literature. Health Serv Res. 2007;42(2):727–754.
10. Google's Computing Power Refines Translation Tool. New York Times; March 9, 2010. Accessed March 24, 2010. http://www.nytimes.com/2010/03/09/technology/09translate.html?_r=1.
11. I, Translator. New York Times; March 20, 2010. Accessed March 24, 2010. http://www.nytimes.com/2010/03/21/opinion/21bellos.html.
12. Accuracy of computer‐generated, Spanish‐language medicine labels. Pediatrics. 2010;125(5):960–965.
13. Nielsen NetRatings Search Engine Ratings. SearchEngineWatch; August 22, 2006. Accessed March 24, 2010. http://searchenginewatch.com/2156451.
14. Google. Google Translate Help; 2010. Accessed March 24, 2010. http://translate.google.com/support/?hl=en.
15. Chapter 4: Basic strategies. In: An Introduction to Machine Translation; 1992. Accessed April 22, 2010. http://www.hutchinsweb.me.uk/IntroMT-4.pdf.
16. Your Guide to Coumadin®/Warfarin Therapy. Agency for Healthcare Research and Quality; August 21, 2008. Accessed October 19, 2009. http://www.ahrq.gov/consumer/btpills.htm.
17. Patient reported receipt of medication instructions for warfarin is associated with reduced risk of serious bleeding events. J Gen Intern Med. 2008;23(10):1589–1594.
18. The ARPA MT evaluation methodologies: evolution, lessons, and future approaches. In: Proceedings of AMTA, 1994, Columbia, MD; October 1994.
19. Overview of the IWSLT 2005 evaluation campaign. In: Proceedings of IWSLT 2005, Pittsburgh, PA; October 2005.
20. BLEU: a method for automatic evaluation of machine translation. In: ACL‐2002: 40th Annual Meeting of the Association for Computational Linguistics. 2002:311–318.
21. METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments. In: Proceedings of the Second Workshop on Statistical Machine Translation at ACL, Prague, Czech Republic; June 2007.
22. The Structure of a Sentence. Ottawa: The Writing Centre, University of Ottawa; 2007.
23. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–310.
24. Measurement, reproducibility, and validity. In: Epidemiologic Methods 203. San Francisco: Department of Biostatistics and Epidemiology, University of California; 2009.
25. Getting by: underuse of interpreters by resident physicians. J Gen Intern Med. 2009;24(2):256–262.
Copyright © 2011 Society of Hospital Medicine
Teachable Moments
With World Stroke Day scheduled for Saturday, a frequent speaker for the National Stroke Association (NSA) wants to remind hospitalists to push their patients to know their risk factors.
"They have an excellent opportunity to be an educator, particularly because of that captive audience," says David Willis, MD, a primary-care physician in Ocala, Fla., who frequently holds educational events for the NSA.
Dr. Willis cites data from a 2010 survey (PDF) compiled by NSA and Boehringer Ingelheim Pharmaceuticals that shows that while more than 75% of healthcare providers reported talking to patients about atrial fibrillation (AF) and stroke, nearly half of patients don't recall the conversation. And just 40% of patients initiate the discussion.
Dr. Willis, who served on the steering committee that interpreted the survey results, says that hospitalists dealing with AF patients can "quarterback" care plans and help improve communication with post-discharge physicians, be they primary care or specialists.
"We may not be getting that thought across as well as we think we are," he says.
Improved communication and transitions will become more important as unnecessary readmissions related to AF or stroke financially impact physicians because the government may reduce reimbursements for repeated hospital visits. Dr. Willis suggests that hospitalists take the reins of integrating their patient education efforts into checklists, health information technology, or some formalized process.
"My experience is, if you create protocols, they usually work better than educating people at a provider level," he says.
Specialty Hospitalists to Meet in Vegas
Medical professionals from across the country will attend the first national meeting on the topic of specialty hospitalists Nov. 4 at the Mandalay Bay Resort and Casino in Las Vegas. Sponsored by SHM, the American Hospital Association, the Neurohospitalist Society, and OBGynHospitalist.com, the gathering is for anyone interested in adopting a hospital-focused model of practice, including physician and nonphysician clinicians, as well as those in medical support industries, such as insurance carriers, policymakers, and healthcare media.
According to organizers, the one-day meeting will be structured to encourage networking and exchange of ideas among attendees, and will include presentations, panel discussions, and Q&A sessions.
"This is less 'Come hear from people who have this all figured out' … it's 'Come hear from people who are thinking about this a lot.' But the attendees are a big part of the knowledge base," says John Nelson, MD, MHM, hospitalist medical director at Overlake Hospital in Bellevue, Wash.
Dr. Nelson, cofounder and past president of SHM as well as the Nov. 4 meeting director, says he hopes to bring together healthcare leaders from diverse backgrounds to share their experiences and insights. Since this movement is growing organically rather than descending from a central agency, organizers expect to centralize the sharing of ideas and best practices.
Nearly 60 interested parties have pre-registered for the meeting, according to SHM. Attendees will take what they have learned back to their own hospitals or businesses, Dr. Nelson says, and continue the conversation with their colleagues.
The cost to attend the meeting is $350 and seats remain available; register by phone, 800-843-3360, or via the SHM website.
Mortality Among Elders With Pneumonia
Pneumonia occurs more commonly among older persons.1 With advancing age, the frequency of hospitalizations and mortality for pneumonia are higher.2 Among the tools developed to predict short‐term mortality is the pneumonia severity index (PSI), which is the best known among severity of illness indices for pneumonia.3 Its ability to predict short‐term mortality for community‐acquired pneumonia (CAP), particularly in identifying those at low risk, was previously demonstrated.4 More recently, the extension of its utility in predicting 30‐day mortality for healthcare‐associated pneumonia (HCAP) was demonstrated.5
Severity of illness is one of several risk factors for adverse outcomes among older persons with acute illness. Besides comorbidity, other factors include functional impairment and atypical presentation. Information on physical functioning was as important as laboratory data in the prognostication of in‐hospital mortality.6 In addition, walking impairment was 1 of 5 components of a risk adjustment index developed to predict 1‐year mortality for hospitalized older persons.7 Atypical presentations of illness, such as delirium and falls, independently predicted poor outcomes among hospitalized older patients.8
Specifically for pneumonia, functional status has also been shown to be an independent predictor of short‐term mortality among older patients hospitalized with CAP.9–13 Among atypical presentations, only absence of chills was an independent prognostic factor for CAP.9 Bacteremia was an independent factor related to death among adults with CAP, albeit for severe disease resulting in intensive care unit admission.14 It was also included in a severity assessment score in which higher scores were associated with early mortality.15 However, blood culture results are only available 2 to 3 days into the hospital episode. Therefore, bacteremia is a potential risk factor for mortality that is not identifiable at the start of hospitalization.
While PSI is a comprehensive collection of demographic, clinical, and investigative measures, it does not include items on functional status or atypical presentation. Neither does it account for recent hospitalization or comorbid conditions of significance to older persons, such as dementia and depression. It is plausible that at least some of these factors hold added prognostic value.
With all these in mind, we conducted a study with the following objectives: 1) to determine whether functional impairment, recent hospitalization, comorbid conditions of particular significance with advancing age, and atypical presentation are significantly associated with short‐term mortality among older patients hospitalized for CAP and HCAP, after taking into account PSI; and 2) if so, to estimate the magnitude of increased mortality risk with these factors. We tested our null hypotheses that, after adjustment for PSI class, 1) recent hospitalization, 2) pre‐morbid functional impairment, 3) dementia and depression, and 4) atypical presentation of illness have no association with 30‐day mortality for older persons hospitalized for CAP and HCAP, both combined and alone.
PATIENTS AND METHODS
Design and Setting
This was a retrospective cohort study that employed secondary analyses of chart and administrative data. The setting was 3 acute care public hospitals of the National Healthcare Group (NHG) cluster in Singapore. We merged data from hospital charts, the NHG Operations Data Store administrative database, and the national death registry. The local Institutional Review Board (IRB) approved waiver of consent, and all other study procedures were consistent with the principles of the Helsinki Declaration.
Patient Population
We included first hospital episodes of adults aged 65 years or older with the principal diagnosis of pneumonia in 2007. These episodes were identified by their primary International Classification of Diseases, 9th revision, Clinical Modification (ICD‐9‐CM) codes of 480 to 486 in the administrative data. Next, we applied our study definition of pneumonia, which required the presence of acute symptoms or signs of pneumonia at the point of hospital admission, and a chest radiograph with features consistent with pneumonia that was obtained during the period from 24 hours before, to 48 hours after, hospital admission. In doing so, we included patients with community‐acquired pneumonia (CAP)16 and healthcare‐associated pneumonia (HCAP),17 but not hospital‐acquired pneumonia (HAP). We excluded patients whose charts were not accessible for review because of human immunodeficiency virus/acquired immune deficiency syndrome (HIV/AIDS) and those whose charts were unavailable for other reasons. The study flow diagram is shown in Figure 1.

We assigned the diagnosis of HCAP to patients who were admitted to an acute care hospital for 2 or more days in the prior 90 days, resided in a nursing home or long‐term care facility, or received intravenous antibiotic therapy, chemotherapy, wound care, or hemodialysis in the prior 30 days.18 Remaining patients were assigned CAP.
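The assignment rule above can be sketched in code. This is only an illustration of the stated criteria: the record type `AdmissionHistory` and all of its field names are hypothetical and do not come from the study's abstraction protocol.

```python
from dataclasses import dataclass


@dataclass
class AdmissionHistory:
    """Hypothetical record of the exposures named in the HCAP definition."""
    acute_hospital_days_prior_90d: int = 0   # length of acute care stay in the prior 90 days
    nursing_home_or_ltc_resident: bool = False
    iv_antibiotics_prior_30d: bool = False
    chemotherapy_prior_30d: bool = False
    wound_care_prior_30d: bool = False
    hemodialysis_prior_30d: bool = False


def classify_pneumonia(history: AdmissionHistory) -> str:
    """Return "HCAP" if any criterion is met, otherwise "CAP"."""
    recent_admission = history.acute_hospital_days_prior_90d >= 2
    recent_therapy = (
        history.iv_antibiotics_prior_30d
        or history.chemotherapy_prior_30d
        or history.wound_care_prior_30d
        or history.hemodialysis_prior_30d
    )
    if recent_admission or history.nursing_home_or_ltc_resident or recent_therapy:
        return "HCAP"
    return "CAP"


if __name__ == "__main__":
    print(classify_pneumonia(AdmissionHistory()))
    print(classify_pneumonia(AdmissionHistory(nursing_home_or_ltc_resident=True)))
```

Because the criteria are joined by "or", a single positive exposure is enough to move a patient from CAP to HCAP.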
Data Collection
Trained research nurses used an abstraction protocol to collect demographic and clinical information from the charts, and to extract laboratory results and chest radiograph reports from the computerized clinical records. Where radiological reports were equivocal with respect to features of pneumonia, we obtained the opinion of one of our respiratory physician investigators whose decision was final. A researcher with bio‐informatics expertise extracted admission‐related information from the administrative data. Chart, administrative, and mortality data were merged to assemble the study database.
Outcome and Explanatory Variables
The outcome (dependent) variable was 30‐day all‐cause mortality. The following explanatory (independent) variables were examined:
Pneumonia severity index (PSI): We used PSI class as specified in the original studies.4
Recent hospitalization: Hospitalization in the prior 90 days and hospitalization in the prior 30 days were both explored.
Atypical presentation of illness: Acute geriatric syndromes (falls or acute impairment of mobility), and absence of cough and purulent sputum were examined. Delirium was not one of the syndromes because PSI includes altered mental state as an item.4
Functional impairment: Pre‐morbid ambulation impairment and feeding impairment were examined. Impairment was defined as needing assistance or being totally dependent.
Additional comorbid conditions: We selected dementia and depression, as they may have impact on mortality in older persons but were not included in PSI.
We did not include bacteremia, because its presence cannot be determined at the time of illness presentation.
From previous experience, we anticipated missing values for functional status measures in up to 5% of charts. Where values were missing, we used the simple imputation strategy of assigning no ambulation or feeding impairment.
Sample Size Calculation
With a sample size of 1400 patients and a 30‐day mortality rate of 25%, 350 deaths were expected. Using the rule of thumb of at least 10 cases per independent variable,19 we were able to work with 35 candidate explanatory variables in logistic regression for the entire group. Assuming that the subpopulations of CAP and HCAP consist of 700 patients each, with mortality rates of 20% and 30%, respectively, 14 candidate variables could be explored for CAP and 21 for HCAP.
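The arithmetic behind these limits can be reproduced with a short sketch. The function name and its parameters are illustrative only; the rule of 10 outcome events per candidate variable is the one cited above.

```python
def max_candidate_variables(n_patients: int, mortality_rate: float,
                            events_per_variable: int = 10) -> int:
    """Number of candidate variables supported by the expected number of deaths."""
    expected_deaths = round(n_patients * mortality_rate)
    return expected_deaths // events_per_variable


if __name__ == "__main__":
    # Planning assumptions stated in the text: (label, sample size, 30-day mortality)
    for label, n, rate in [("All", 1400, 0.25), ("CAP", 700, 0.20), ("HCAP", 700, 0.30)]:
        print(label, max_candidate_variables(n, rate))
```

With the planning figures from the text, this yields 35, 14, and 21 candidate variables for the entire group, CAP, and HCAP, respectively.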
Data Analyses
Pre‐morbid ambulation impairment and feeding impairment probably represent different points along the continuum of functional impairment. During preliminary analyses when both variables were adjusted for each other in logistic regression, pre‐morbid ambulation impairment (odds ratio [OR] 4.94, 95% confidence interval [CI] 3.79 to 6.43) was associated with 30‐day mortality, whereas pre‐morbid feeding impairment was not (OR 0.82, 95% CI 0.61 to 1.09). As such, pre‐morbid ambulation impairment was selected as the variable to represent functional impairment. Hospitalization in the prior 30 days was more strongly associated with 30‐day mortality (OR 2.38, 95% CI 1.77 to 3.21) than was hospitalization in the prior 90 days (OR 1.90, 95% CI 1.49 to 2.41). Therefore, hospitalization in the prior 30 days was selected as the variable to reflect recent hospitalization.
We used logistic regression analysis and regressed 30‐day mortality on PSI class and other explanatory variables. OR estimates and their 95% CI were used to quantify the strength of associations of the explanatory variables with mortality, and to test their statistical significance. In addition, we explored the possibility of interactions between PSI class and the patient factors. To this end, we constructed additional regression models that included appropriate interaction terms and tested their statistical significance. As a form of sensitivity analysis, we repeated the regression analyses only for hospital episodes with complete functional data and observed the extent to which OR estimates changed. Furthermore, we performed 2‐level hierarchical modeling to account for clustering at the hospital level and re‐examined the OR and 95% CI for the patient factors. We conducted these analyses for the entire group, and repeated them separately for CAP and HCAP. Finally, to estimate the extent to which the patient factors would increase predicted 30‐day mortality, we performed marginal effects analyses for the entire group to quantify the increased risk when individual factors were present.
We used STATA version 9.2 (Stata Corp, College Station, TX) for all statistical analyses. Hierarchical modeling was performed using the xtlogit command. STATA post‐estimation commands mfx and prvalue were employed to estimate marginal effects and predicted probabilities, respectively. The unit of analysis was patients. Statistical significance was defined by P values of less than 0.05.
RESULTS
Among 1607 patients included, 890 (55.4%) had CAP and 717 (44.6%) had HCAP. Baseline characteristics of patients with CAP and HCAP are shown in Table 1. The 30‐day mortality rate was 28.1% for the entire group, and 20.6% and 37.4% for patients with CAP and HCAP, respectively. When stratified according to PSI classes 2, 3, 4, and 5, this rate was 0%, 8.2%, 24.4%, and 56.0%, respectively. Because there were no deaths among those with PSI class 2, this category was merged with class 3 for the regression analyses. Missing data on pre‐morbid ambulation impairment and feeding impairment occurred for 39 (2.4%) and 69 (4.6%) patients, respectively.
Characteristic | Whole Study Population (n = 1607) | Those With CAP (n = 890) | Those With HCAP (n = 717) |
---|---|---|---|
Median age, years (IQR) | 80 (74–87) | 79 (73–85) | 82 (75–88) |
Male, n (%) | 876 (54.5) | 477 (53.6) | 399 (55.7) |
Median pneumonia severity index (PSI) score (IQR) | 109 (87–134) | 100 (82–121) | 120 (99–144) |
PSI class, n (%): | | | |
2 | 98 (6.1) | 84 (9.4) | 14 (2.0) |
3 | 353 (22.0) | 260 (29.2) | 93 (13.0) |
4 | 713 (44.4) | 386 (43.4) | 327 (45.6) |
5 | 443 (27.6) | 160 (18.0) | 283 (39.5) |
Pre‐morbid ambulation impairment, n (%) | 798 (49.7) | 287 (32.3) | 511 (71.3) |
Pre‐morbid feeding impairment, n (%) | 298 (18.5) | 74 (8.3) | 224 (31.2) |
Hospitalization in prior 30 days, n (%) | 209 (13.0) | 0 (0) | 209 (29.2) |
Nursing home residence, n (%) | 362 (22.5) | 0 (0) | 362 (50.5) |
Acute geriatric syndromes, n (%) | 442 (27.5) | 241 (27.1) | 201 (28.0) |
Absence of both cough and purulent sputum, n (%) | 559 (34.8) | 226 (25.4) | 333 (46.4) |
Dementia, n (%) | 307 (19.1) | 121 (13.6) | 178 (25.8) |
Depression, n (%) | 165 (10.3) | 53 (6.0) | 109 (15.8) |
Neoplastic disease, n (%) | 108 (6.7) | 33 (3.7) | 75 (10.5) |
Liver disease, n (%) | 48 (3.0) | 25 (2.8) | 23 (3.2) |
Congestive heart failure, n (%) | 257 (16.0) | 129 (14.5) | 128 (17.9) |
Stroke, n (%) | 490 (30.5) | 215 (24.2) | 275 (38.4) |
Renal failure, n (%) | 220 (13.7) | 97 (10.9) | 123 (17.2) |
Chronic lung disease, n (%) | 316 (19.7) | 177 (19.9) | 139 (19.4) |
Diabetes mellitus, n (%) | 515 (32.1) | 273 (30.7) | 242 (33.8) |
Emergency department diagnosis of pneumonia, n (%) | 857 (53.3) | 494 (55.5) | 363 (50.6) |
For CAP and HCAP together, pre‐morbid ambulation impairment was associated with increased 30‐day mortality (339/798 [42.5%] vs 112/809 [13.8%], unadjusted OR 4.60, 95% CI 3.60 to 5.87, P < 0.01), as was hospitalization in the prior 30 days (94/209 [45.0%] vs 357/1398 [25.5%], unadjusted OR 2.38, 95% CI 1.77 to 3.21, P = 0.02). This was also the case for dementia (118/307 [38.4%] vs 333/1300 [25.6%], unadjusted OR 1.81, 95% CI 1.40 to 2.35, P < 0.01), acute geriatric syndromes (163/442 [36.9%] vs 288/1165 [24.7%], unadjusted OR 1.78, 95% CI 1.41 to 2.25, P < 0.01), and absence of cough and purulent sputum (226/559 [40.4%] vs 225/1048 [21.5%], unadjusted OR 2.48, 95% CI 1.98 to 3.11, P < 0.01). However, depression was not significantly associated with 30‐day mortality (57/165 [34.6%] vs 394/1442 [27.3%], unadjusted OR 1.40, 95% CI 1.00 to 1.97, P = 0.05).
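The unadjusted odds ratios above can be recomputed from the reported counts. The sketch below assumes a Wald (log‐odds) confidence interval, which the text does not state explicitly; the function and argument names are illustrative.

```python
import math


def odds_ratio_wald_ci(deaths_exposed: int, n_exposed: int,
                       deaths_unexposed: int, n_unexposed: int,
                       z: float = 1.96):
    """Unadjusted odds ratio with an assumed Wald 95% CI from 2x2 counts."""
    a = deaths_exposed                    # exposed, died
    b = n_exposed - deaths_exposed        # exposed, survived
    c = deaths_unexposed                  # unexposed, died
    d = n_unexposed - deaths_unexposed    # unexposed, survived
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Pre-morbid ambulation impairment: 339/798 deaths vs 112/809 deaths
    or_, lo, hi = odds_ratio_wald_ci(339, 798, 112, 809)
    print(f"OR {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Under that assumption, the counts for pre‐morbid ambulation impairment reproduce the reported estimate of 4.60 (3.60 to 5.87) to two decimal places.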
Table 2 summarizes the results of logistic regression. It shows that pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum were all independently associated with 30‐day mortality after adjustment for PSI class for the entire group. These associations remained statistically significant when CAP and HCAP were examined separately. Because none of those with CAP could have hospitalization in the prior 30 days, this factor was not included in the CAP model. The strength of association for the same patient factor varied by pneumonia subtype. This was markedly so for pre‐morbid ambulation impairment, with the OR estimate being almost 3‐fold higher for CAP than for HCAP. Dementia, depression, and acute geriatric syndromes were not associated with 30‐day mortality. When the analyses were repeated after excluding hospital episodes with missing values for pre‐morbid ambulation impairment, the same 3 variables were significantly associated with 30‐day mortality, with trivial differences in strength of association compared to when imputation was performed. The OR estimates for pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum were 2.82 (95% CI 2.12 to 3.76), 1.83 (95% CI 1.42 to 2.83), and 1.47 (95% CI 1.14 to 1.91), respectively.
Baseline Patient Factor | All Patients (n = 1607): Adjusted OR (95% CI) | Patients With CAP (n = 890): Adjusted OR (95% CI) | Patients With HCAP (n = 717): Adjusted OR (95% CI) |
---|---|---|---|
Pneumonia severity index (PSI) class (reference: PSI classes 2 and 3 combined): | | | |
4 | 3.37* (2.20 to 5.17) | 4.02* (2.29 to 7.08) | 2.69* (1.38 to 5.26) |
5 | 11.19* (7.14 to 17.55) | 13.03* (7.00 to 24.24) | 9.73* (4.86 to 19.46) |
Pre‐morbid ambulation impairment | 2.61* (1.98 to 3.45) | 4.56* (3.06 to 6.78) | 1.60* (1.06 to 2.42) |
Hospitalization in the prior 30 days | 1.93* (1.38 to 2.71) | Not included | 2.13* (1.47 to 3.09) |
Dementia | 1.00 (0.74 to 1.37) | 0.82 (0.49 to 1.38) | 1.15 (0.78 to 1.69) |
Depression | 0.83 (0.56 to 1.23) | 1.03 (0.48 to 2.18) | 0.83 (0.53 to 1.31) |
Acute geriatric syndromes | 0.96 (0.72 to 1.26) | 1.26 (0.83 to 1.92) | 0.74 (0.50 to 1.08) |
Absence of cough and purulent sputum | 1.47* (1.14 to 1.90) | 1.64* (1.08 to 2.46) | 1.45* (1.04 to 2.03) |
Two‐level hierarchical modeling to account for clustering at the hospital level produced negligible change in the OR estimates of the patient factors and their 95% CI. There were no statistically significant interactions between PSI class and the 3 patient factors (results not shown).
The model‐predicted increase in mortality risk with presence of individual patient factors for the entire group is shown in Table 3. Across the 3 factors, 30‐day mortality increased by 1.9% to 6.1% for those with PSI class 2 and 3, and by 9.0% to 23.2% for those with PSI class 5. The upper end of these ranges represented the effect of pre‐morbid ambulation impairment, while the lower end was that for absence of cough and purulent sputum. With reference to the predicted mortality rates for PSI class which are listed in the footnotes of Table 3, the adverse prognosis conferred by individual patient factors amounted to relative risk inflation of 27% to 145% depending on the specific factor and PSI class.
Single Baseline Patient Factor Present | PSI Classes 2 and 3 (n = 449): Predicted Increase in 30‐Day Mortality, % (95% CI) | PSI Class 4 (n = 700): Predicted Increase in 30‐Day Mortality, % (95% CI) | PSI Class 5 (n = 413): Predicted Increase in 30‐Day Mortality, % (95% CI) |
---|---|---|---|
Pre‐morbid ambulation impairment | 6.1 (3.2 to 9.0) | 15.0 (10.2 to 19.7) | 23.2 (16.8 to 29.7) |
Hospitalization in the prior 30 days | 3.6 (0.9 to 6.3) | 9.3 (3.6 to 15.1) | 15.7 (7.3 to 24.2) |
Absence of cough and purulent sputum | 1.9 (0.4 to 3.4) | 5.0 (1.4 to 8.6) | 9.0 (3.0 to 15.0) |
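To see why the same odds ratio translates into a larger absolute increase at higher baseline risk, the conversion from odds ratio to predicted probability can be sketched as follows. This is a simplified illustration, not the STATA marginal effects procedure used in the study: the baseline risks below are hypothetical placeholders (the model‐predicted rates in the Table 3 footnotes are not reproduced here), and only the odds ratio of 2.61 is taken from Table 2.

```python
def risk_with_factor(baseline_risk: float, odds_ratio: float) -> float:
    """Predicted risk after multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


def absolute_increase(baseline_risk: float, odds_ratio: float) -> float:
    """Absolute change in predicted risk when the factor is present."""
    return risk_with_factor(baseline_risk, odds_ratio) - baseline_risk


if __name__ == "__main__":
    adjusted_or = 2.61  # pre-morbid ambulation impairment, all patients (Table 2)
    # Hypothetical baseline risks chosen only to illustrate the pattern
    for baseline in (0.05, 0.25, 0.50):
        delta = absolute_increase(baseline, adjusted_or)
        print(f"baseline {baseline:.0%}: absolute increase {delta:.1%}")
```

The absolute increase grows with baseline risk over this range, which is consistent with the pattern across PSI classes in Table 3.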
DISCUSSION
After accounting for PSI class, we found 3 additional patient factors that were independently associated with 30‐day mortality among older persons hospitalized for pneumonia. Firstly, our study confirms that impaired physical function reflected by pre‐morbid ambulation impairment increases mortality risk, as previously demonstrated by Torres et al.10 It is likely that impaired function reflects an underlying vulnerability for adverse outcomes that is seen across primary diagnoses.7 Secondly, recent hospitalization often indicates clinical, functional, and social complexities, as well as increased likelihood of infection by more virulent organisms commonly associated with healthcare‐related infections. Together, these 2 factors could increase mortality risk. Thirdly, atypical presentations may be associated with increased mortality, because these often occur in frail older persons who are vulnerable to adverse outcomes8 due to diseases suffered and treatment received. Atypical presentations may also result in delayed diagnosis and treatment of pneumonia.
Pilotto et al. found that a multidimensional index comprising functional status, comorbidity burden, mental status, and nutritional assessment, among others, had a higher predictive accuracy for 30‐day mortality than did PSI.20 While there was a previous attempt to combine PSI with independent predictors to identify low‐risk older patients with CAP,21 we could not find similar work on the range of patient factors examined in this study. Indeed, the most important contribution that our study brings to the growing body of literature on short‐term mortality among older persons hospitalized for pneumonia is the prognostic importance of these 3 additional patient factors over and above severity of illness measured by PSI. With reference to the baseline predicted risk for different PSI class categories shown in Table 3, we have demonstrated that the predicted increase in mortality risk with the presence of these 3 factors is often not trivial, particularly for those with more severe pneumonia.
These 3 patient factors retained prognostic significance after accounting for PSI class for HCAP. However, only 2 factors were associated with mortality for CAP, because by definition recent hospitalization does not occur. A relevant discussion point is whether CAP and HCAP should be grouped together or classified separately. It is pertinent to reflect that the utility of making a distinction between CAP and HCAP appears to lie largely in the domain of therapeutics regarding the initial choice of antibiotics,18, 22–25 although there has been some debate on this point.26 Moreover, the major features of HCAP, namely recent hospitalization (albeit in the prior 30 days, rather than 90 days) and nursing home residence (an item in PSI) were included in our regression analyses. Therefore, it seems reasonable to consider CAP and HCAP as a single group for risk stratification at the clinical frontline. We also argue that combining CAP and HCAP for risk adjustment will result in larger sample sizes that can minimize uncertainty around treatment effect estimates, when comparing across different interventions or providers. The same approach of analyzing CAP and HCAP together was adopted in a recent study that compared US hospitals on their risk‐adjusted performance for pneumonia among Medicare beneficiaries.27
The 30‐day mortality rates in this study are higher than those in the original PSI studies, even when stratified according to PSI class. However, more recent studies also registered relatively high mortality rates ranging from 18% to 19%.12, 28 There are a number of possible reasons for the higher mortality rates observed in our study. Firstly, we included both CAP and HCAP, whereas some other studies focused only on CAP. Secondly, the original PSI studies excluded patients with previous hospitalization within 7 days of admission, while we included them. Thirdly, our study population was relatively old (median age: 80 years) and had a higher proportion from nursing homes (22%). Although age and nursing home residence are variables in the PSI, the weights assigned to these 2 items may not adequately reflect the magnitude of mortality risk they confer. Finally, our understanding is that the study population comprises a relatively high proportion of patients who have do‐not‐resuscitate (DNR) instructions, though this was not measured. All these patient characteristics are likely to be associated with higher mortality risk.
The major strength of this study relates to its real world setting, where there were no major exclusion criteria except for HIV/AIDS. In addition, the clinical data at our disposal allowed selection from a relatively wide range of patient factors, beyond that commonly available in administrative data alone.
However, a few important limitations need to be acknowledged. Firstly, the retrospective nature of the study restricted data to those routinely collected, rather than that specifically acquired for research. Important unmeasured factors include inflammatory markers such as C‐reactive protein (CRP) or procalcitonin levels which have been shown to have prognostic value.29 Others include frailty, socioeconomic status, and social support.20 Secondly, increased likelihood of measurement error associated with retrospectively collected data could result in bias with uncertain direction. Thirdly, our strategy of assuming no functional impairment in the absence of documentation raises the possibility of underidentification and consequent bias in the direction of underestimation of the strength of association between pre‐morbid ambulation impairment and mortality. If so, the true association could even be stronger. Finally, we did not capture do‐not‐resuscitate (DNR) decisions because these were not consistently documented in the charts. We concede that DNR status is expected to be associated with short‐term mortality30 and therefore remains an unobserved factor that may explain a proportion of the mortality risk attributed to other factors in our study, such as pre‐morbid ambulation impairment.
Where do we proceed from here? Given our findings, further work that examines the unmeasured factors mentioned should be done. CRP and procalcitonin levels can be extracted from the laboratory results database when they are measured. However, specification of the other 3 factors is more challenging, given that these represent clinical or social constructs wherein optimal measurement is less certain. It would be important to estimate how much these factors improve the prediction of short‐term mortality beyond that achieved by PSI and the patient factors we have identified.
Nonetheless, the clinical implications of our work are clear. While PSI class is a time‐tested tool, addition of pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum can further improve risk stratification for short‐term mortality, when older persons present initially with clinical and radiological features of pneumonia. Information on these factors should be available in routine clinical care and, therefore, their use in risk stratification should be considered. For more valid and credible risk adjustment, these 3 factors could be considered in addition to severity of illness indices where data availability permits.
CONCLUSION
Recent hospitalization, pre‐morbid ambulation impairment, and atypical clinical presentation were independently associated with higher 30‐day mortality among older persons hospitalized for pneumonia, after adjusting for severity of illness with PSI class. These factors could be considered in addition to PSI, when performing risk stratification and adjustment in this setting.
Acknowledgements
The authors thank Clinical Associate Professor Sin Fai Lam for his assistance in the study, and the medical board chairmen of the 3 study hospitals for their support and encouragement.
1. Community‐acquired pneumonia in the elderly. Clin Infect Dis. 2000;31:1066–1078.
2. Hospitalized community‐acquired pneumonia in the elderly: age‐ and sex‐related patterns of care and outcome in the United States. Am J Respir Crit Care Med. 2002;165:766–772.
3. Validation of a pneumonia prognostic index using the MedisGroups Comparative Hospital Database. Am J Med. 1993;94:153–159.
4. A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336:243–250.
5. Application and comparison of scoring indices to predict outcomes in patients with healthcare‐associated pneumonia. Crit Care. 2011;15:R32.
6. Predicting in‐hospital mortality: the importance of functional status information. Med Care. 1995;33:906–921.
7. Burden of illness score for elderly persons: risk adjustment incorporating the cumulative impact of diseases, physiologic abnormalities, and functional impairments. Med Care. 2003;41:70–83.
8. Illness presentation in elderly patients. Arch Intern Med. 1995;155:1060–1064.
9. Community‐acquired pneumonia in the elderly: Spanish multicentre study. Eur Respir J. 2003;21:294–302.
10. Outcome predictors of pneumonia in elderly patients: importance of functional assessment. J Am Geriatr Soc. 2004;52:1603–1609.
11. Factors influencing in‐hospital mortality in community‐acquired pneumonia: a prospective study of patients not initially admitted to the ICU. Chest. 2005;127:1260–1270.
12. Assessment of pneumonia in older adults: effect of functional status. J Am Geriatr Soc. 2006;54:1062–1067.
13. Only severely limited, premorbid functional status is associated with short‐ and long‐term mortality in patients with pneumonia who are critically ill: a prospective observational study. Chest. 2011;139:88–94.
14. Severe community‐acquired pneumonia: assessment of microbial aetiology as mortality factor. Eur Respir J. 2004;24:779–785.
15. PIRO score for community‐acquired pneumonia: a new prediction rule for assessment of severity in intensive care unit patients with community‐acquired pneumonia. Crit Care Med. 2009;37:456–462.
16. Practice guidelines for the management of community‐acquired pneumonia. Clin Infect Dis. 2000;31:347–382.
17. Epidemiology and outcomes of health‐care–associated pneumonia: results from a large US database of culture‐positive pneumonia. Chest. 2005;128:3854–3862.
18. American Thoracic Society and Infectious Diseases Society of America. Guidelines for the management of adults with hospital‐acquired, ventilator‐associated, and healthcare‐associated pneumonia. Am J Respir Crit Care Med. 2005;171:388–416.
19. Conceptual and practical issues in developing risk‐adjustment methods. In: Iezzoni LI, editor. Risk Adjustment for Measuring Health Care Outcomes. 3rd ed. Chicago, IL: Health Administration Press; 2003:179–205.
20. The multidimensional prognostic index predicts short‐ and long‐term mortality in hospitalized geriatric patients with pneumonia. J Gerontol A Biol Sci Med Sci. 2009;64A:880–887.
21. A validation and potential modification of the pneumonia severity index in elderly patients with community‐acquired pneumonia. J Am Geriatr Soc. 2006;54:1212–1219.
22. Health care‐associated pneumonia: a new therapeutic paradigm. Chest. 2005;128:3784–3786.
23. Health care‐associated pneumonia requiring hospital admission. Arch Intern Med. 2007;167:1393–1399.
24. Health care‐associated pneumonia (HCAP): a critical appraisal to improve identification, management, and outcomes: Proceedings of the HCAP Summit. Clin Infect Dis. 2008;46(suppl 4):S296–S334.
25. For the Study Group of the Italian Society of Internal Medicine. Outcomes of patients hospitalized with community‐acquired, health care‐associated, and hospital‐acquired pneumonia. Ann Intern Med. 2009;150:19–26.
26. Healthcare‐associated pneumonia is a heterogeneous disease, and all patients do not need the same broad‐spectrum antibiotic therapy as complex nosocomial pneumonia. Curr Opin Infect Dis. 2009;22:316–325.
27. The performance of US hospitals as reflected in risk‐standardized 30‐day mortality and readmission rates for Medicare beneficiaries with pneumonia. J Hosp Med. 2010;5:E12–E18.
28. Temporal trends in outcomes of older patients with pneumonia. Arch Intern Med. 2000;160:3385–3391.
29. Clinical review: the role of biomarkers in the diagnosis and management of community‐acquired pneumonia. Crit Care. 2010;14:203.
30. Community‐acquired pneumonia and do‐not‐resuscitate orders. J Am Geriatr Soc. 2002;50:290–299.
Pneumonia occurs more commonly among older persons.1 With advancing age, the frequency of hospitalization and mortality for pneumonia are higher.2 Among the tools developed to predict short‐term mortality is the pneumonia severity index (PSI), which is the best known among severity of illness indices for pneumonia.3 Its ability to predict short‐term mortality for community‐acquired pneumonia (CAP), particularly in identifying those at low risk, was previously demonstrated.4 More recently, its utility was extended to predicting 30‐day mortality for healthcare‐associated pneumonia (HCAP).5
Severity of illness is one of several risk factors for adverse outcomes among older persons with acute illness. Besides comorbidity, other factors include functional impairment and atypical presentation. Information on physical functioning was as important as laboratory data in the prognostication of in‐hospital mortality.6 In addition, walking impairment was 1 of 5 components of a risk adjustment index developed to predict 1‐year mortality for hospitalized older persons.7 Atypical presentations of illness, such as delirium and falls, independently predicted poor outcomes among hospitalized older patients.8
Specifically for pneumonia, functional status has also been shown to be an independent predictor of short‐term mortality among older patients hospitalized with CAP.9–13 Among atypical presentations, only absence of chills was an independent prognostic factor for CAP.9 Bacteremia was an independent factor related to death among adults with CAP, albeit for severe disease resulting in intensive care unit admission.14 It was also included in a severity assessment score, higher values of which were associated with early mortality.15 However, blood culture results are only available 2 to 3 days into the hospital episode. Therefore, bacteremia is a potential risk factor for mortality that is not identifiable at the start of hospitalization.
While PSI is a comprehensive collection of demographic, clinical, and investigative measures, it does not include items on functional status or atypical presentation. Neither does it account for recent hospitalization or comorbid conditions of significance to older persons, such as dementia and depression. It is plausible that at least some of these factors hold added prognostic value.
With these considerations in mind, we conducted a study with the following objectives: 1) to determine whether functional impairment, recent hospitalization, comorbid conditions of particular significance with advancing age, and atypical presentation are significantly associated with short‐term mortality among older patients hospitalized for CAP and HCAP, after taking into account PSI; and 2) if so, to estimate the magnitude of increased mortality risk with these factors. We tested our null hypotheses that, after adjustment for PSI class, 1) recent hospitalization, 2) pre‐morbid functional impairment, 3) dementia and depression, and 4) atypical presentation of illness have no association with 30‐day mortality for older persons hospitalized for CAP and HCAP, both combined and separately.
PATIENTS AND METHODS
Design and Setting
This was a retrospective cohort study that employed secondary analyses of chart and administrative data. The setting was 3 acute care public hospitals of the National Healthcare Group (NHG) cluster in Singapore. We merged data from hospital charts, the NHG Operations Data Store administrative database, and the national death registry. The local Institutional Review Board (IRB) approved waiver of consent, and all other study procedures were consistent with the principles of the Helsinki Declaration.
Patient Population
We included first hospital episodes of adults aged 65 years or older with the principal diagnosis of pneumonia in 2007. These episodes were identified by their primary International Classification of Diseases, 9th revision, Clinical Modification (ICD‐9‐CM) codes of 480 to 486 in the administrative data. Next, we applied our study definition of pneumonia, which required the presence of acute symptoms or signs of pneumonia at the point of hospital admission, and a chest radiograph with features consistent with pneumonia that was obtained during the period from 24 hours before, to 48 hours after, hospital admission. In doing so, we included patients with community‐acquired pneumonia (CAP)16 and healthcare‐associated pneumonia (HCAP),17 but not hospital‐acquired pneumonia (HAP). We excluded patients whose charts were not accessible for review because of human immunodeficiency virus/acquired immune deficiency syndrome (HIV/AIDS) and those whose charts were unavailable for other reasons. The study flow diagram is shown in Figure 1.

We assigned the diagnosis of HCAP to patients who were admitted to an acute care hospital for 2 or more days in the prior 90 days, resided in a nursing home or long‐term care facility, or received intravenous antibiotic therapy, chemotherapy, wound care, or hemodialysis in the prior 30 days.18 The remaining patients were assigned CAP.
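The assignment rule above can be written as a small classifier. The following is a minimal sketch, not the study's actual abstraction logic: the `AdmissionHistory` record and all of its field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdmissionHistory:
    """Hypothetical per-patient record; field names are illustrative only."""
    acute_hospital_days_prior_90d: int = 0  # length of acute care stay in the prior 90 days
    nursing_home_or_ltc_resident: bool = False
    iv_antibiotics_prior_30d: bool = False
    chemotherapy_prior_30d: bool = False
    wound_care_prior_30d: bool = False
    hemodialysis_prior_30d: bool = False


def classify_pneumonia(history: AdmissionHistory) -> str:
    """Return "HCAP" if any of the stated criteria is met, otherwise "CAP"."""
    meets_hcap = (
        history.acute_hospital_days_prior_90d >= 2
        or history.nursing_home_or_ltc_resident
        or history.iv_antibiotics_prior_30d
        or history.chemotherapy_prior_30d
        or history.wound_care_prior_30d
        or history.hemodialysis_prior_30d
    )
    return "HCAP" if meets_hcap else "CAP"
```

Because the criteria are joined by "or", a single positive criterion is enough to assign HCAP, and CAP is the residual category.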
Data Collection
Trained research nurses used an abstraction protocol to collect demographic and clinical information from the charts, and to extract laboratory results and chest radiograph reports from the computerized clinical records. Where radiological reports were equivocal with respect to features of pneumonia, we obtained the opinion of one of our respiratory physician investigators whose decision was final. A researcher with bio‐informatics expertise extracted admission‐related information from the administrative data. Chart, administrative, and mortality data were merged to assemble the study database.
Outcome and Explanatory Variables
The outcome (dependent) variable was 30‐day all‐cause mortality. The following explanatory (independent) variables were examined:
Pneumonia severity index (PSI): We used PSI class as specified in the original studies.4
Recent hospitalization: Hospitalization in the prior 90 days and 30 days were explored.
Atypical presentation of illness: Acute geriatric syndromes (falls or acute impairment of mobility), and absence of cough and purulent sputum were examined. Delirium was not one of the syndromes because PSI includes altered mental state as an item.4
Functional impairment: Pre‐morbid ambulation impairment and feeding impairment were examined. Impairment was defined as needing assistance or being totally dependent.
Additional comorbid conditions: We selected dementia and depression, as they may have impact on mortality in older persons but were not included in PSI.
We did not include bacteremia, because its presence cannot be determined at the time of illness presentation.
From previous experience, we anticipated missing values for functional status measures in up to 5% of charts. Where values were missing, we used the simple imputation strategy of assigning no ambulation or feeding impairment.
Sample Size Calculation
With a sample size of 1400 patients and a 30‐day mortality rate of 25%, 350 deaths were expected. Using the rule of thumb of at least 10 cases per independent variable,19 we were able to work with 35 candidate explanatory variables in logistic regression for the entire group. Assuming that the subpopulations of CAP and HCAP consist of 700 patients each, with mortality rates of 20% and 30%, respectively, 14 candidate variables could be explored for CAP and 21 for HCAP.
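The arithmetic behind these figures is the events‐per‐variable rule of thumb. The sketch below is only an illustration of that calculation; the function name and its parameters are assumptions introduced here, not part of the study's methods.

```python
def max_candidate_variables(n_patients: int, mortality_rate: float,
                            events_per_variable: int = 10) -> int:
    """Number of candidate variables supported by the expected number of deaths."""
    # round() guards against floating-point error in the product (e.g. 700 * 0.3)
    expected_deaths = round(n_patients * mortality_rate)
    return expected_deaths // events_per_variable


entire_group = max_candidate_variables(1400, 0.25)  # 350 expected deaths
cap_only = max_candidate_variables(700, 0.20)       # 140 expected deaths
hcap_only = max_candidate_variables(700, 0.30)      # 210 expected deaths
```

With the planning assumptions stated in the text, the three calls give 35, 14, and 21 candidate variables, respectively.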
Data Analyses
Pre‐morbid ambulation impairment and feeding impairment probably represent different points along the continuum of functional impairment. During preliminary analyses when both variables were adjusted for each other in logistic regression, pre‐morbid ambulation impairment (odds ratio [OR] 4.94, 95% confidence interval [CI] 3.79 to 6.43) was associated with 30‐day mortality, whereas pre‐morbid feeding impairment was not (OR 0.82, 95% CI 0.61 to 1.09). As such, pre‐morbid ambulation impairment was selected as the variable to represent functional impairment. Hospitalization in the prior 30 days was more strongly associated with 30‐day mortality (OR 2.38, 95% CI 1.77 to 3.21) than was hospitalization in the prior 90 days (OR 1.90, 95% CI 1.49 to 2.41). Therefore, hospitalization in the prior 30 days was selected as the variable to reflect recent hospitalization.
We used logistic regression analysis and regressed 30‐day mortality on PSI class and other explanatory variables. OR estimates and their 95% CI were used to quantify the strength of associations of the explanatory variables with mortality, and to test their statistical significance. In addition, we explored the possibility of interactions between PSI class and the patient factors. To this end, we constructed additional regression models that included appropriate interaction terms and tested their statistical significance. As a form of sensitivity analysis, we repeated the regression analyses only for hospital episodes with complete functional data and observed the extent to which OR estimates changed. Furthermore, we performed 2‐level hierarchical modeling to account for clustering at the hospital level and re‐examined the OR and 95% CI for the patient factors. We conducted these analyses for the entire group, and repeated them separately for CAP and HCAP. Finally, to estimate the extent to which the patient factors would increase predicted 30‐day mortality, we performed marginal effects analyses for the entire group to quantify the increased risk when individual factors were present.
We used STATA version 9.2 (Stata Corp, College Station, TX) for all statistical analyses. Hierarchical modeling was performed using the xtlogit command. STATA post‐estimation commands mfx and prvalue were employed to estimate marginal effects and predicted probabilities, respectively. The unit of analysis was patients. Statistical significance was defined by P values of less than 0.05.
RESULTS
Among 1607 patients included, 890 (55.4%) had CAP and 717 (44.6%) had HCAP. Baseline characteristics of patients with CAP and HCAP are shown in Table 1. The 30‐day mortality rate was 28.1% for the entire group, and 20.6% and 37.4% for patients with CAP and HCAP, respectively. When stratified according to PSI classes 2, 3, 4, and 5, this rate was 0%, 8.2%, 24.4%, and 56.0%, respectively. Because there were no deaths among those with PSI class 2, this category was merged with class 3 for the regression analyses. Missing data on pre‐morbid ambulation impairment and feeding impairment occurred for 39 (2.4%) and 69 (4.6%) patients, respectively.

Characteristic | Whole Study Population (n = 1607) | Those With CAP (n = 890) | Those With HCAP (n = 717) |
---|---|---|---|
Median age, years (IQR) | 80 (74–87) | 79 (73–85) | 82 (75–88) |
Male, n (%) | 876 (54.5) | 477 (53.6) | 399 (55.7) |
Median pneumonia severity index (PSI) score (IQR) | 109 (87–134) | 100 (82–121) | 120 (99–144) |
PSI class 2, n (%) | 98 (6.1) | 84 (9.4) | 14 (2.0) |
PSI class 3, n (%) | 353 (22.0) | 260 (29.2) | 93 (13.0) |
PSI class 4, n (%) | 713 (44.4) | 386 (43.4) | 327 (45.6) |
PSI class 5, n (%) | 443 (27.6) | 160 (18.0) | 283 (39.5) |
Pre‐morbid ambulation impairment, n (%) | 798 (49.7) | 287 (32.3) | 511 (71.3) |
Pre‐morbid feeding impairment, n (%) | 298 (18.5) | 74 (8.3) | 224 (31.2) |
Hospitalization in prior 30 days, n (%) | 209 (13.0) | 0 (0) | 209 (29.2) |
Nursing home residence, n (%) | 362 (22.5) | 0 (0) | 362 (50.5) |
Acute geriatric syndromes, n (%) | 442 (27.5) | 241 (27.1) | 201 (28.0) |
Absence of both cough and purulent sputum, n (%) | 559 (34.8) | 226 (25.4) | 333 (46.4) |
Dementia, n (%) | 307 (19.1) | 121 (13.6) | 178 (25.8) |
Depression, n (%) | 165 (10.3) | 53 (6.0) | 109 (15.8) |
Neoplastic disease, n (%) | 108 (6.7) | 33 (3.7) | 75 (10.5) |
Liver disease, n (%) | 48 (3.0) | 25 (2.8) | 23 (3.2) |
Congestive heart failure, n (%) | 257 (16.0) | 129 (14.5) | 128 (17.9) |
Stroke, n (%) | 490 (30.5) | 215 (24.2) | 275 (38.4) |
Renal failure, n (%) | 220 (13.7) | 97 (10.9) | 123 (17.2) |
Chronic lung disease, n (%) | 316 (19.7) | 177 (19.9) | 139 (19.4) |
Diabetes mellitus, n (%) | 515 (32.1) | 273 (30.7) | 242 (33.8) |
Emergency department diagnosis of pneumonia, n (%) | 857 (53.3) | 494 (55.5) | 363 (50.6) |
For CAP and HCAP together, pre‐morbid ambulation impairment was associated with increased 30‐day mortality (339/798 [42.5%] vs 112/809 [13.8%], unadjusted OR 4.60, 95% CI 3.60 to 5.87, P < 0.01), as was hospitalization in the prior 30 days (94/209 [45.0%] vs 357/1398 [25.5%], unadjusted OR 2.38, 95% CI 1.77 to 3.21, P = 0.02). This was also the case for dementia (118/307 [38.4%] vs 333/1300 [25.6%], unadjusted OR 1.81, 95% CI 1.40 to 2.35, P < 0.01), acute geriatric syndromes (163/442 [36.9%] vs 288/1165 [24.7%], unadjusted OR 1.78, 95% CI 1.41 to 2.25, P < 0.01), and absence of cough and purulent sputum (226/559 [40.4%] vs 225/1048 [21.5%], unadjusted OR 2.48, 95% CI 1.98 to 3.11, P < 0.01). However, depression was not significantly associated with 30‐day mortality (57/165 [34.6%] vs 394/1442 [27.3%], unadjusted OR 1.40, 95% CI 1.00 to 1.97, P = 0.05).
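For readers who want to check the unadjusted estimates, an odds ratio and its approximate 95% CI can be recomputed from the counts given above. The sketch below uses the standard log‐odds (Woolf) interval; whether the authors used exactly this interval is an assumption, and the function name is illustrative.

```python
import math


def odds_ratio_with_ci(deaths_exposed: int, n_exposed: int,
                       deaths_unexposed: int, n_unexposed: int,
                       z: float = 1.96) -> tuple[float, float, float]:
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    a = deaths_exposed                      # died, factor present
    b = n_exposed - deaths_exposed          # survived, factor present
    c = deaths_unexposed                    # died, factor absent
    d = n_unexposed - deaths_unexposed      # survived, factor absent
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from the text: pre-morbid ambulation impairment, 339/798 vs 112/809
ambulation = odds_ratio_with_ci(339, 798, 112, 809)
# Counts taken from the text: hospitalization in the prior 30 days, 94/209 vs 357/1398
recent_hosp = odds_ratio_with_ci(94, 209, 357, 1398)
```

Rounded to 2 decimal places, both calls agree with the reported estimates of 4.60 (3.60 to 5.87) and 2.38 (1.77 to 3.21).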
Table 2 summarizes the results of logistic regression. It shows that pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum were all independently associated with 30‐day mortality after adjustment for PSI class for the entire group. These associations remained statistically significant when CAP and HCAP were examined separately. Because none of those with CAP could have hospitalization in the prior 30 days, this factor was not included in the CAP model. The strength of association for the same patient factor varied across pneumonia subtypes. This was markedly so for pre‐morbid ambulation impairment, with the OR estimate being almost 3‐fold higher for CAP than for HCAP. Dementia, depression, and acute geriatric syndromes were not associated with 30‐day mortality. When the analyses were repeated after excluding hospital episodes with missing values for pre‐morbid ambulation impairment, the same 3 variables were significantly associated with 30‐day mortality, with trivial differences in strength of association compared to when imputation was performed. The OR estimates for pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum were 2.82 (95% CI 2.12 to 3.76), 1.83 (95% CI 1.42 to 2.83), and 1.47 (95% CI 1.14 to 1.91), respectively.
Adjusted Odds Ratio (95% Confidence Interval)

Baseline Patient Factors | All Patients (n = 1607) | Patients With CAP (n = 890) | Patients With HCAP (n = 717) |
---|---|---|---|
PSI class 4 (reference: PSI classes 2 and 3 combined) | 3.37* (2.20 to 5.17) | 4.02* (2.29 to 7.08) | 2.69* (1.38 to 5.26) |
PSI class 5 (reference: PSI classes 2 and 3 combined) | 11.19* (7.14 to 17.55) | 13.03* (7.00 to 24.24) | 9.73* (4.86 to 19.46) |
Pre‐morbid ambulation impairment | 2.61* (1.98 to 3.45) | 4.56* (3.06 to 6.78) | 1.60* (1.06 to 2.42) |
Hospitalization in the prior 30 days | 1.93* (1.38 to 2.71) | Not included | 2.13* (1.47 to 3.09) |
Dementia | 1.00 (0.74 to 1.37) | 0.82 (0.49 to 1.38) | 1.15 (0.78 to 1.69) |
Depression | 0.83 (0.56 to 1.23) | 1.03 (0.48 to 2.18) | 0.83 (0.53 to 1.31) |
Acute geriatric syndromes | 0.96 (0.72 to 1.26) | 1.26 (0.83 to 1.92) | 0.74 (0.50 to 1.08) |
Absence of cough and purulent sputum | 1.47* (1.14 to 1.90) | 1.64* (1.08 to 2.46) | 1.45* (1.04 to 2.03) |
Two‐level hierarchical modeling to account for clustering at the hospital level obtained negligible change in OR estimates of the patient factors and their 95% CI. There were no statistically significant interactions between PSI class and the 3 patient factors (results not shown).
The model‐predicted increase in mortality risk with presence of individual patient factors for the entire group is shown in Table 3. Across the 3 factors, 30‐day mortality increased by 1.9% to 6.1% for those with PSI class 2 and 3, and by 9.0% to 23.2% for those with PSI class 5. The upper end of these ranges represented the effect of pre‐morbid ambulation impairment, while the lower end was that for absence of cough and purulent sputum. With reference to the predicted mortality rates for PSI class which are listed in the footnotes of Table 3, the adverse prognosis conferred by individual patient factors amounted to relative risk inflation of 27% to 145% depending on the specific factor and PSI class.
Predicted Increase in 30‐Day Mortality With Presence of Single Baseline Patient Factors, % (95% Confidence Interval)

Baseline Patient Factor | PSI Classes 2 and 3 (n = 449) | PSI Class 4 (n = 700) | PSI Class 5 (n = 413) |
---|---|---|---|
Pre‐morbid ambulation impairment | 6.1 (3.2 to 9.0) | 15.0 (10.2 to 19.7) | 23.2 (16.8 to 29.7) |
Hospitalization in the prior 30 days | 3.6 (0.9 to 6.3) | 9.3 (3.6 to 15.1) | 15.7 (7.3 to 24.2) |
Absence of cough and purulent sputum | 1.9 (0.4 to 3.4) | 5.0 (1.4 to 8.6) | 9.0 (3.0 to 15.0) |
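One way to see why the same odds ratio produces larger absolute increases in the higher PSI classes is to convert a baseline risk to odds, apply the odds ratio, and convert back. The sketch below is a simplified illustration and not a reproduction of the marginal effects analysis: the baseline risks of 5%, 20%, and 50% are hypothetical values chosen for illustration, and only the odds ratio of 2.61 is taken from Table 2.

```python
def risk_with_factor(baseline_risk: float, odds_ratio: float) -> float:
    """Predicted risk after multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


ADJUSTED_OR_AMBULATION = 2.61  # all-patients estimate from Table 2

# Hypothetical baseline risks (not the study's predicted rates)
for baseline in (0.05, 0.20, 0.50):
    increase = risk_with_factor(baseline, ADJUSTED_OR_AMBULATION) - baseline
    print(f"baseline {baseline:.0%}: absolute increase {increase:.1%}")
```

Under these assumed baselines the absolute increase grows with baseline risk, which is the same pattern seen across the columns of Table 3.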
DISCUSSION
After accounting for PSI class, we found 3 additional patient factors that were independently associated with 30‐day mortality among older persons hospitalized for pneumonia. Firstly, our study confirms that impaired physical function reflected by pre‐morbid ambulation impairment increases mortality risk, as previously demonstrated by Torres et al.10 It is likely that impaired function reflects an underlying vulnerability for adverse outcomes that is seen across primary diagnoses.7 Secondly, recent hospitalization often indicates clinical, functional, and social complexities, as well as increased likelihood of infection by more virulent organisms commonly associated with healthcare‐related infections. Together, these 2 factors could increase mortality risk. Thirdly, atypical presentations may be associated with increased mortality, because these often occur in frail older persons who are vulnerable to adverse outcomes8 due to diseases suffered and treatment received. Atypical presentations may also result in delayed diagnosis and treatment of pneumonia.
Pilotto et al. found that a multidimensional index comprising functional status, comorbidity burden, mental status, and nutritional assessment, among others, had a higher predictive accuracy for 30‐day mortality than did PSI.20 While there was a previous attempt to combine PSI with independent predictors to identify low‐risk older patients with CAP,21 we could not find similar work on the range of patient factors examined in this study. Indeed, the most important contribution that our study brings to the growing body of literature on short‐term mortality, among older persons hospitalized for pneumonia, is the prognostic importance of these 3 additional patient factors over and above severity of illness measured by PSI. With reference to the baseline predicted risk for different PSI class categories shown in Table 3, we have demonstrated that the predicted increase in mortality risk with the presence of these 3 factors is often not trivial, particularly for those with more severe pneumonia.
These 3 patient factors retained prognostic significance after accounting for PSI class for HCAP. However, only 2 factors were associated with mortality for CAP because, by definition, recent hospitalization does not occur in CAP. A relevant discussion point is whether CAP and HCAP should be grouped together or classified separately. It is pertinent to reflect that the utility of making a distinction between CAP and HCAP appears to lie largely in the domain of therapeutics regarding the initial choice of antibiotics,18, 22–25 although there has been some debate on this point.26 Moreover, the major features of HCAP, namely recent hospitalization (albeit in the prior 30 days, rather than 90 days) and nursing home residence (an item in PSI), were included in our regression analyses. Therefore, it seems reasonable to consider CAP and HCAP as a single group for risk stratification at the clinical frontline. We also argue that combining CAP and HCAP for risk adjustment will result in larger sample sizes that can minimize uncertainty around treatment effect estimates, when comparing across different interventions or providers. The same approach of analyzing CAP and HCAP together was adopted in a recent study that compared US hospitals on their risk‐adjusted performance for pneumonia among Medicare beneficiaries.27
The 30‐day mortality rates in this study are higher than those in the original PSI studies, even when stratified according to PSI class. However, more recent studies also registered relatively high mortality rates ranging from 18% to 19%.12, 28 There are a number of possible reasons for the higher mortality rates observed in our study. Firstly, we included both CAP and HCAP, whereas some other studies focused only on CAP. Secondly, the original PSI studies excluded patients with previous hospitalization within 7 days of admission, while we included them. Thirdly, our study population was relatively old (median age: 80 years) and had a higher proportion from nursing homes (22%). Although age and nursing home residence are variables in the PSI, the weights assigned to these 2 items may not adequately reflect the magnitude of mortality risk they confer. Finally, our understanding is that the study population comprises a relatively high proportion of patients who have do‐not‐resuscitate (DNR) instructions, though this was not measured. All these patient characteristics are likely to be associated with higher mortality risk.
The major strength of this study relates to its real world setting, where there were no major exclusion criteria except for HIV/AIDS. In addition, the clinical data at our disposal allowed selection from a relatively wide range of patient factors, beyond that commonly available in administrative data alone.
However, a few important limitations need to be acknowledged. Firstly, the retrospective nature of the study restricted data to those routinely collected, rather than those specifically acquired for research. Important unmeasured factors include inflammatory markers such as C‐reactive protein (CRP) or procalcitonin levels, which have been shown to have prognostic value.29 Others include frailty, socioeconomic status, and social support.20 Secondly, the increased likelihood of measurement error associated with retrospectively collected data could result in bias with uncertain direction. Thirdly, our strategy of assuming no functional impairment in the absence of documentation raises the possibility of underidentification and consequent underestimation of the strength of association between pre‐morbid ambulation impairment and mortality. If so, the true association could be even stronger. Finally, we did not capture do‐not‐resuscitate (DNR) decisions because these were not consistently documented in the charts. We concede that DNR status is expected to be associated with short‐term mortality30 and therefore remains an unobserved factor that may explain a proportion of the mortality risk attributed to other factors in our study, such as pre‐morbid ambulation impairment.
Where do we proceed from here? Given our findings, further work that examines the unmeasured factors mentioned should be done. CRP and procalcitonin levels can be extracted from the laboratory results database when they are measured. However, specification of the other 3 factors is more challenging, given that these represent clinical or social constructs wherein optimal measurement is less certain. It would be important to estimate how much these factors improve the prediction of short‐term mortality beyond that achieved by PSI and the patient factors we have identified.
Pneumonia occurs more commonly among older persons.1 With advancing age, the frequency of hospitalizations and mortality for pneumonia are higher.2 Among the tools developed to predict short‐term mortality is the pneumonia severity index (PSI), which is the best known among severity of illness indices for pneumonia.3 Its ability to predict short‐term mortality for CAP, particularly in identifying those at low risk was previously demonstrated.4 More recently, the extension of its utility in predicting 30‐day mortality for healthcare‐associated pneumonia (HCAP) was demonstrated.5
Severity of illness is one of several risk factors for adverse outcomes among older persons with acute illness. Besides comorbidity, other factors include functional impairment and atypical presentation. Information on physical functioning had equal importance as laboratory data in prognostication of in‐hospital mortality.6 In addition, walking impairment was 1 of 5 components of a risk adjustment index developed to predict 1‐year mortality for hospitalized older persons.7 Atypical presentations of illness, such as delirium and falls, independently predicted poor outcomes among hospitalized older patients.8
Specifically for pneumonia, functional status has also been shown to be an independent predictor of short‐term mortality among older patients hospitalized with CAP.913 Among atypical presentations, only absence of chills was an independent prognostic factor for CAP.9 Bacteremia was an independent factor related to death among adults with CAP, albeit for severe disease resulting in intensive care unit admission.14 It was also included in a severity assessment score; its higher scores were associated with early mortality.15 However, blood culture results are only available 2 to 3 days into the hospital episode. Therefore, bacteremia is a potential risk factor for mortality that is not identifiable at the start of hospitalization.
While PSI is a comprehensive collection of demographic, clinical, and investigative measures, it does not include items on functional status or atypical presentation. Neither does it account for recent hospitalization or comorbid conditions of significance to older persons, such as dementia and depression. It is plausible that at least some of these factors hold added prognostic value.
With all these in mind, we conducted a study with the following objectives: 1) to determine whether functional impairment, recent hospitalization, comorbid conditions of particular significance with advancing age, and atypical presentation are significantly associated with short‐term mortality among older patients hospitalized for CAP and HCAP, after taking into account PSI; and 2) if so, to estimate the magnitude of increased mortality risk with these factors. We tested our null hypotheses that, after adjustment for PSI class, 1) recent hospitalization, 2) pre‐morbid functional impairment, 3) dementia and depression, and 4) atypical presentation of illness have no association with 30‐day mortality for older persons hospitalized for CAP and HCAP, both combined and alone.
PATIENTS AND METHODS
Design and Setting
This was a retrospective cohort study that employed secondary analyses of chart and administrative data. The setting was 3 acute care public hospitals of the National Healthcare Group (NHG) cluster in Singapore. We merged data from hospital charts, the NHG Operations Data Store administrative database, and the national death registry. The local Institution Review Board (IRB) approved waiver of consent, and all other study procedures were consistent with the principles of the Helsinki Declaration.
Patient Population
We included first hospital episodes of adults aged 65 years or older with the principal diagnosis of pneumonia in 2007. These episodes were identified by their primary International Classification of Diseases, 9th revision, Clinical Modification (ICD‐9‐CM) codes of 480 to 486 in the administrative data. Next, we applied our study definition of pneumonia, which required the presence of acute symptoms or signs of pneumonia at the point of hospital admission, and a chest radiograph with features consistent with pneumonia that was obtained during the period from 24 hours before, to 48 hours after, hospital admission. In doing so, we included patients with community‐acquired pneumonia (CAP)16 and healthcare‐associated pneumonia (HCAP),17 but not hospital‐acquired pneumonia (HAP). We excluded patients whose charts were not accessible for review because of human immunodeficiency virus/acquired immune deficiency syndrome (HIV/AIDS) and those whose charts were unavailable for other reasons. The study flow diagram is shown in Figure 1.

We assigned the diagnosis of HCAP to patients who were admitted to an acute care hospital for 2 or more days in the prior 90 days, resided in a nursing home or long‐term care facility, or received intravenous antibiotic therapy, chemotherapy, wound care, or hemodialysis in the prior 30 days.18 The remaining patients were assigned the diagnosis of CAP.
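The assignment rule above can be written as a simple decision function. The following is a minimal sketch only; the record type `AdmissionHistory`, its field names, and the function `classify_pneumonia` are hypothetical names introduced for illustration and are not part of the study's abstraction protocol.

```python
from dataclasses import dataclass


@dataclass
class AdmissionHistory:
    """Hypothetical record of the exposures named in the HCAP definition."""
    acute_hospital_days_prior_90d: int = 0      # days admitted to an acute care hospital
    nursing_home_or_ltc_resident: bool = False  # nursing home or long-term care facility
    iv_antibiotics_prior_30d: bool = False
    chemotherapy_prior_30d: bool = False
    wound_care_prior_30d: bool = False
    hemodialysis_prior_30d: bool = False


def classify_pneumonia(history: AdmissionHistory) -> str:
    """Return "HCAP" if any healthcare-exposure criterion is met, otherwise "CAP"."""
    meets_hcap = (
        history.acute_hospital_days_prior_90d >= 2
        or history.nursing_home_or_ltc_resident
        or history.iv_antibiotics_prior_30d
        or history.chemotherapy_prior_30d
        or history.wound_care_prior_30d
        or history.hemodialysis_prior_30d
    )
    return "HCAP" if meets_hcap else "CAP"


if __name__ == "__main__":
    print(classify_pneumonia(AdmissionHistory(nursing_home_or_ltc_resident=True)))
    print(classify_pneumonia(AdmissionHistory(acute_hospital_days_prior_90d=1)))
```

Because the criteria are joined by "or", a single qualifying exposure is enough for the HCAP label, and CAP is the residual category.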
Data Collection
Trained research nurses used an abstraction protocol to collect demographic and clinical information from the charts, and to extract laboratory results and chest radiograph reports from the computerized clinical records. Where radiological reports were equivocal with respect to features of pneumonia, we obtained the opinion of one of our respiratory physician investigators whose decision was final. A researcher with bio‐informatics expertise extracted admission‐related information from the administrative data. Chart, administrative, and mortality data were merged to assemble the study database.
Outcome and Explanatory Variables
The outcome (dependent) variable was 30‐day all‐cause mortality. The following explanatory (independent) variables were examined:
Pneumonia severity index (PSI): We used PSI class as specified in the original studies.4
Recent hospitalization: Hospitalization in the prior 90 days and 30 days were explored.
Atypical presentation of illness: Acute geriatric syndromes (falls or acute impairment of mobility), and absence of cough and purulent sputum were examined. Delirium was not one of the syndromes because PSI includes altered mental state as an item.4
Functional impairment: Pre‐morbid ambulation impairment and feeding impairment were examined. Impairment was defined as needing assistance or being totally dependent.
Additional comorbid conditions: We selected dementia and depression, as they may have impact on mortality in older persons but were not included in PSI.
We did not include bacteremia, because its presence cannot be determined at the time of illness presentation.
From previous experience, we anticipated missing values for functional status measures in up to 5% of charts. Where values were missing, we used the simple imputation strategy of assigning no ambulation or feeding impairment.
Sample Size Calculation
With a sample size of 1400 patients and a 30‐day mortality rate of 25%, 350 deaths were expected. Using the rule of thumb of at least 10 cases per independent variable,19 we were able to work with up to 35 candidate explanatory variables in logistic regression for the entire group. Assuming that the subpopulations of CAP and HCAP consist of 700 patients each, with mortality rates of 20% and 30%, respectively, 14 candidate variables could be explored for CAP and 21 for HCAP.
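The arithmetic behind these limits is straightforward and can be reproduced as follows. This is an illustrative sketch; the function name `max_candidate_variables` and its parameters are assumptions made for the example, and only the 10‐events‐per‐variable rule and the input figures come from the text.

```python
def max_candidate_variables(n_patients: int, mortality_rate: float,
                            events_per_variable: int = 10) -> int:
    """Number of candidate variables supported by the expected number of deaths."""
    expected_deaths = round(n_patients * mortality_rate)
    return expected_deaths // events_per_variable


if __name__ == "__main__":
    print(max_candidate_variables(1400, 0.25))  # entire group
    print(max_candidate_variables(700, 0.20))   # CAP
    print(max_candidate_variables(700, 0.30))   # HCAP
```

With the planning assumptions stated above, the three calls give 35, 14, and 21 candidate variables, respectively.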
Data Analyses
Pre‐morbid ambulation impairment and feeding impairment probably represent different points along the continuum of functional impairment. During preliminary analyses when both variables were adjusted for each other in logistic regression, pre‐morbid ambulation impairment (odds ratio [OR] 4.94, 95% confidence interval [CI] 3.79 to 6.43) was associated with 30‐day mortality, whereas pre‐morbid feeding impairment was not (OR 0.82, 95% CI 0.61 to 1.09). As such, pre‐morbid ambulation impairment was selected as the variable to represent functional impairment. Hospitalization in the prior 30 days was more strongly associated with 30‐day mortality (OR 2.38, 95% CI 1.77 to 3.21) than was hospitalization in the prior 90 days (OR 1.90, 95% CI 1.49 to 2.41). Therefore, hospitalization in the prior 30 days was selected as the variable to reflect recent hospitalization.
We used logistic regression analysis and regressed 30‐day mortality on PSI class and other explanatory variables. OR estimates and their 95% CI were used to quantify the strength of associations of the explanatory variables with mortality, and to test their statistical significance. In addition, we explored the possibility of interactions between PSI class and the patient factors. To this end, we constructed additional regression models that included appropriate interaction terms and tested their statistical significance. As a form of sensitivity analysis, we repeated the regression analyses only for hospital episodes with complete functional data and observed the extent to which OR estimates changed. Furthermore, we performed 2‐level hierarchical modeling to account for clustering at the hospital level and re‐examined the OR and 95% CI for the patient factors. We conducted these analyses for the entire group, and repeated them separately for CAP and HCAP. Finally, to estimate the extent to which the patient factors would increase predicted 30‐day mortality, we performed marginal effects analyses for the entire group to quantify the increased risk when individual factors were present.
We used STATA version 9.2 (Stata Corp, College Station, TX) for all statistical analyses. Hierarchical modeling was performed using the xtlogit command. STATA post‐estimation commands mfx and prvalue were employed to estimate marginal effects and predicted probabilities, respectively. The unit of analysis was patients. Statistical significance was defined by P values of less than 0.05.
RESULTS
Among 1607 patients included, 890 (55.4%) had CAP and 717 (44.6%) had HCAP. Baseline patient characteristics of patients with CAP and HCAP are shown in Table 1. The 30‐day mortality rate was 28.1% for the entire group, and 20.6% and 37.4% for patients with CAP and HCAP, respectively. When stratified according to PSI classes 2, 3, 4, and 5, this rate was 0%, 8.2%, 24.4%, and 56.0%, respectively. Because there were no deaths among those with PSI class 2, this category was merged with class 3 for the regression analyses. Missing data on pre‐morbid ambulation impairment and feeding impairment occurred for 39 (2.4%) and 69 (4.6%) patients, respectively.
Characteristic | Whole Study Population (n = 1607) | Those With CAP (n = 890) | Those With HCAP (n = 717) |
---|---|---|---|
Median age, years (IQR) | 80 (74–87) | 79 (73–85) | 82 (75–88) |
Male, n (%) | 876 (54.5) | 477 (53.6) | 399 (55.7) |
Median pneumonia severity index (PSI) score (IQR) | 109 (87–134) | 100 (82–121) | 120 (99–144) |
PSI class, n (%): | | | |
2 | 98 (6.1) | 84 (9.4) | 14 (2.0) |
3 | 353 (22.0) | 260 (29.2) | 93 (13.0) |
4 | 713 (44.4) | 386 (43.4) | 327 (45.6) |
5 | 443 (27.6) | 160 (18.0) | 283 (39.5) |
Pre‐morbid ambulation impairment, n (%) | 798 (49.7) | 287 (32.3) | 511 (71.3) |
Pre‐morbid feeding impairment, n (%) | 298 (18.5) | 74 (8.3) | 224 (31.2) |
Hospitalization in prior 30 days, n (%) | 209 (13.0) | 0 (0) | 209 (29.2) |
Nursing home residence, n (%) | 362 (22.5) | 0 (0) | 362 (50.5) |
Acute geriatric syndromes, n (%) | 442 (27.5) | 241 (27.1) | 201 (28.0) |
Absence of both cough and purulent sputum, n (%) | 559 (34.8) | 226 (25.4) | 333 (46.4) |
Dementia, n (%) | 307 (19.1) | 121 (13.6) | 178 (25.8) |
Depression, n (%) | 165 (10.3) | 53 (6.0) | 109 (15.8) |
Neoplastic disease, n (%) | 108 (6.7) | 33 (3.7) | 75 (10.5) |
Liver disease, n (%) | 48 (3.0) | 25 (2.8) | 23 (3.2) |
Congestive heart failure, n (%) | 257 (16.0) | 129 (14.5) | 128 (17.9) |
Stroke, n (%) | 490 (30.5) | 215 (24.2) | 275 (38.4) |
Renal failure, n (%) | 220 (13.7) | 97 (10.9) | 123 (17.2) |
Chronic lung disease, n (%) | 316 (19.7) | 177 (19.9) | 139 (19.4) |
Diabetes mellitus, n (%) | 515 (32.1) | 273 (30.7) | 242 (33.8) |
Emergency department diagnosis of pneumonia, n (%) | 857 (53.3) | 494 (55.5) | 363 (50.6) |
For CAP and HCAP together, pre‐morbid ambulation impairment was associated with increased 30‐day mortality (339/798 [42.5%] vs 112/809 [13.8%], unadjusted OR 4.60, 95% CI 3.60 to 5.87, P < 0.01), as was hospitalization in the prior 30 days (94/209 [45.0%] vs 357/1398 [25.5%], unadjusted OR 2.38, 95% CI 1.77 to 3.21, P = 0.02). This was also the case for dementia (118/307 [38.4%] vs 333/1300 [25.6%], unadjusted OR 1.81, 95% CI 1.40 to 2.35, P < 0.01), acute geriatric syndromes (163/442 [36.9%] vs 288/1165 [24.7%], unadjusted OR 1.78, 95% CI 1.41 to 2.25, P < 0.01), and absence of cough and purulent sputum (226/559 [40.4%] vs 225/1048 [21.5%], unadjusted OR 2.48, 95% CI 1.98 to 3.11, P < 0.01). However, depression was not significantly associated with 30‐day mortality (57/165 [34.6%] vs 394/1442 [27.3%], unadjusted OR 1.40, 95% CI 1.00 to 1.97, P = 0.05).
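The unadjusted odds ratios above can be checked directly from the reported counts. The sketch below uses the standard log odds ratio (Woolf) interval, which is an assumption about how the intervals were obtained; the text does not state the method, and the function name `odds_ratio_with_ci` is introduced here for illustration only.

```python
import math


def odds_ratio_with_ci(deaths_exposed: int, n_exposed: int,
                       deaths_unexposed: int, n_unexposed: int,
                       z: float = 1.96) -> tuple[float, float, float]:
    """Unadjusted odds ratio with an approximate 95% CI on the log scale."""
    a = deaths_exposed                   # exposed, died
    b = n_exposed - deaths_exposed       # exposed, survived
    c = deaths_unexposed                 # unexposed, died
    d = n_unexposed - deaths_unexposed   # unexposed, survived
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Pre-morbid ambulation impairment: 339/798 deaths vs 112/809 deaths
    or_, lo, hi = odds_ratio_with_ci(339, 798, 112, 809)
    print(f"OR {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")  # OR 4.60 (95% CI 3.60 to 5.87)
```

For pre‐morbid ambulation impairment, this reproduces the reported estimate of 4.60 (3.60 to 5.87); the other factors can be checked the same way by substituting their counts.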
Table 2 summarizes the results of logistic regression. It shows that pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum were all independently associated with 30‐day mortality after adjustment for PSI class for the entire group. These associations remained statistically significant when CAP and HCAP were examined separately. Because none of those with CAP could have hospitalization in the prior 30 days, this factor was not included in the CAP model. The strength of association for the same patient factor varied across the pneumonia sub‐type. This was markedly so for pre‐morbid ambulation impairment, with the OR estimate being almost 3‐fold higher for CAP than for HCAP. Dementia, depression, and acute geriatric syndromes were not associated with 30‐day mortality. When the analyses were repeated after excluding hospital episodes with missing values for pre‐morbid ambulation impairment, the same 3 variables were significantly associated with 30‐day mortality, with trivial differences in strength of association compared to when imputation was performed. The OR estimates for pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum were 2.82 (95% CI 2.12 to 3.76), 1.83 (95% CI 1.42 to 2.83), and 1.47 (95% CI 1.14 to 1.91), respectively.
Baseline Patient Factors | All Patients (n = 1607), Adjusted OR (95% CI) | Patients With CAP (n = 890), Adjusted OR (95% CI) | Patients With HCAP (n = 717), Adjusted OR (95% CI) |
---|---|---|---|
Pneumonia severity index (PSI) class (reference: PSI classes 2 and 3 combined): | | | |
4 | 3.37* (2.20 to 5.17) | 4.02* (2.29 to 7.08) | 2.69* (1.38 to 5.26) |
5 | 11.19* (7.14 to 17.55) | 13.03* (7.00 to 24.24) | 9.73* (4.86 to 19.46) |
Pre‐morbid ambulation impairment | 2.61* (1.98 to 3.45) | 4.56* (3.06 to 6.78) | 1.60* (1.06 to 2.42) |
Hospitalization in the prior 30 days | 1.93* (1.38 to 2.71) | Not included | 2.13* (1.47 to 3.09) |
Dementia | 1.00 (0.74 to 1.37) | 0.82 (0.49 to 1.38) | 1.15 (0.78 to 1.69) |
Depression | 0.83 (0.56 to 1.23) | 1.03 (0.48 to 2.18) | 0.83 (0.53 to 1.31) |
Acute geriatric syndromes | 0.96 (0.72 to 1.26) | 1.26 (0.83 to 1.92) | 0.74 (0.50 to 1.08) |
Absence of cough and purulent sputum | 1.47* (1.14 to 1.90) | 1.64* (1.08 to 2.46) | 1.45* (1.04 to 2.03) |
Two‐level hierarchical modeling to account for clustering at the hospital level obtained negligible change in OR estimates of the patient factors and their 95% CI. There were no statistically significant interactions between PSI class and the 3 patient factors (results not shown).
The model‐predicted increase in mortality risk with presence of individual patient factors for the entire group is shown in Table 3. Across the 3 factors, 30‐day mortality increased by 1.9% to 6.1% for those with PSI class 2 and 3, and by 9.0% to 23.2% for those with PSI class 5. The upper end of these ranges represented the effect of pre‐morbid ambulation impairment, while the lower end was that for absence of cough and purulent sputum. With reference to the predicted mortality rates for PSI class which are listed in the footnotes of Table 3, the adverse prognosis conferred by individual patient factors amounted to relative risk inflation of 27% to 145% depending on the specific factor and PSI class.
Predicted Increase in 30‐Day Mortality With Presence of Single Baseline Patient Factors, % (95% Confidence Interval)
Baseline Patient Factor | PSI Classes 2 and 3 (n = 449) | PSI Class 4 (n = 700) | PSI Class 5 (n = 413) |
---|---|---|---|
Pre‐morbid ambulation impairment | 6.1 (3.2 to 9.0) | 15.0 (10.2 to 19.7) | 23.2 (16.8 to 29.7) |
Hospitalization in the prior 30 days | 3.6 (0.9 to 6.3) | 9.3 (3.6 to 15.1) | 15.7 (7.3 to 24.2) |
Absence of cough and purulent sputum | 1.9 (0.4 to 3.4) | 5.0 (1.4 to 8.6) | 9.0 (3.0 to 15.0) |
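
One way to see why the same factor adds more absolute risk in higher PSI classes is to apply a fixed odds ratio to different baseline risks. The sketch below is a simplified illustration and not the marginal effects procedure used in the study; the baseline risks in the usage example are hypothetical values, because the baseline predicted rates from the table footnotes are not reproduced in the text, and the function name `absolute_risk_increase` is introduced here.

```python
def absolute_risk_increase(baseline_risk: float, odds_ratio: float) -> float:
    """Absolute change in risk when a baseline risk is multiplied by an odds ratio."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    new_risk = new_odds / (1.0 + new_odds)
    return new_risk - baseline_risk


if __name__ == "__main__":
    # Adjusted OR for pre-morbid ambulation impairment (entire group) from Table 2
    adjusted_or = 2.61
    for hypothetical_baseline in (0.05, 0.20, 0.50):
        increase = absolute_risk_increase(hypothetical_baseline, adjusted_or)
        print(f"baseline {hypothetical_baseline:.0%}: +{increase * 100:.1f} percentage points")
```

Under this simplification, the absolute increase grows as the baseline risk rises toward 50%, which matches the direction of the pattern across PSI classes in Table 3.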
DISCUSSION
After accounting for PSI class, we found 3 additional patient factors that were independently associated with 30‐day mortality among older persons hospitalized for pneumonia. Firstly, our study confirms that impaired physical function reflected by pre‐morbid ambulation impairment increases mortality risk, as previously demonstrated by Torres et al.10 It is likely that impaired function reflects an underlying vulnerability for adverse outcomes that is seen across primary diagnoses.7 Secondly, recent hospitalization often indicates clinical, functional, and social complexities, as well as increased likelihood of infection by more virulent organisms commonly associated with healthcare‐related infections. Together, these 2 factors could increase mortality risk. Thirdly, atypical presentations may be associated with increased mortality, because these often occur in frail older persons who are vulnerable to adverse outcomes8 due to diseases suffered and treatment received. Atypical presentations may also result in delayed diagnosis and treatment of pneumonia.
Pilotto et al. found that a multidimensional index comprising functional status, comorbidity burden, mental status, and nutritional assessment, among others, had a higher predictive accuracy for 30‐day mortality than did PSI.20 While there was a previous attempt to combine PSI with independent predictors to identify low‐risk older patients with CAP,21 we could not find similar work on the range of patient factors examined in this study. Indeed, the most important contribution that our study brings to the growing body of literature on short‐term mortality, among older persons hospitalized for pneumonia, is the prognostic importance of these 3 additional patient factors over and above severity of illness measured by PSI. With reference to the baseline predicted risk for different PSI class categories shown in Table 3, we have demonstrated that the predicted increase in mortality risk with the presence of these 3 factors is often not trivial, particularly for those with more severe pneumonia.
These 3 patient factors retained prognostic significance after accounting for PSI class for HCAP. However, only 2 factors were associated with mortality for CAP, because by definition recent hospitalization does not occur. A relevant discussion point is whether CAP and HCAP should be grouped together or classified separately. It is pertinent to reflect that the utility of making a distinction between CAP and HCAP appears to lie largely in the domain of therapeutics regarding the initial choice of antibiotics,18, 22–25 although there has been some debate on this point.26 Moreover, the major features of HCAP, namely recent hospitalization (albeit in the prior 30 days, rather than 90 days) and nursing home residence (an item in PSI) were included in our regression analyses. Therefore, it seems reasonable to consider CAP and HCAP as a single group for risk stratification at the clinical frontline. We also argue that combining CAP and HCAP for risk adjustment will result in larger sample sizes that can minimize uncertainty around treatment effect estimates, when comparing across different interventions or providers. The same approach of analyzing CAP and HCAP together was adopted in a recent study that compared US hospitals on their risk‐adjusted performance for pneumonia among Medicare beneficiaries.27
The 30‐day mortality rates in this study are higher than those in the original PSI studies, even when stratified according to PSI class. However, more recent studies also registered relatively high mortality rates ranging from 18% to 19%.12, 28 There are a number of possible reasons for the higher mortality rates observed in our study. Firstly, we included both CAP and HCAP, whereas some other studies focused only on CAP. Secondly, the original PSI studies excluded patients with previous hospitalization within 7 days of admission, while we included them. Thirdly, our study population was relatively old (median age: 80 years) and had a higher proportion from nursing homes (22%). Although age and nursing home residence are variables in the PSI, the weights assigned to these 2 items may not adequately reflect the magnitude of mortality risk they confer. Finally, our understanding is that the study population comprises a relatively high proportion of patients who have do‐not‐resuscitate (DNR) instructions, though this was not measured. All these patient characteristics are likely to be associated with higher mortality risk.
The major strength of this study relates to its real world setting, where there were no major exclusion criteria except for HIV/AIDS. In addition, the clinical data at our disposal allowed selection from a relatively wide range of patient factors, beyond that commonly available in administrative data alone.
However, a few important limitations need to be acknowledged. Firstly, the retrospective nature of the study restricted data to those routinely collected, rather than that specifically acquired for research. Important unmeasured factors include inflammatory markers such as C‐reactive protein (CRP) or procalcitonin levels which have been shown to have prognostic value.29 Others include frailty, socioeconomic status, and social support.20 Secondly, increased likelihood of measurement error associated with retrospectively collected data could result in bias with uncertain direction. Thirdly, our strategy of assuming no functional impairment in the absence of documentation raises the possibility of underidentification and consequent bias in the direction of underestimation of the strength of association between pre‐morbid ambulation impairment and mortality. If so, the true association could even be stronger. Finally, we did not capture do‐not‐resuscitate (DNR) decisions because these were not consistently documented in the charts. We concede that DNR status is expected to be associated with short‐term mortality30 and therefore remains an unobserved factor that may explain a proportion of the mortality risk attributed to other factors in our study, such as pre‐morbid ambulation impairment.
Where do we proceed from here? Given our findings, further work that examines the unmeasured factors mentioned should be done. CRP and procalcitonin levels can be extracted from the laboratory results database when they are measured. However, specification of the other 3 factors is more challenging, given that these represent clinical or social constructs wherein optimal measurement is less certain. It would be important to estimate how much these factors improve the prediction of short‐term mortality beyond that achieved by PSI and the patient factors we have identified.
Nonetheless, the clinical implications of our work are clear. While PSI class is a time‐tested tool, addition of pre‐morbid ambulation impairment, hospitalization in the prior 30 days, and absence of cough and purulent sputum can further improve risk stratification for short‐term mortality, when older persons present initially with clinical and radiological features of pneumonia. Information on these factors should be available in routine clinical care and, therefore, their use in risk stratification should be considered. For more valid and credible risk adjustment, these 3 factors could be considered in addition to severity of illness indices where data availability permits.
CONCLUSION
Recent hospitalization, pre‐morbid ambulation impairment, and atypical clinical presentation were independently associated with higher 30‐day mortality among older persons hospitalized for pneumonia, after adjusting for severity of illness with PSI class. These factors could be considered in addition to PSI, when performing risk stratification and adjustment in this setting.
Acknowledgements
The authors thank Clinical Associate Professor Sin Fai Lam for his assistance in the study, and the medical board chairmen of the 3 study hospitals for their support and encouragement.
- Community‐acquired pneumonia in the elderly. Clin Infect Dis. 2000;31:1066–1078.
- Hospitalized community‐acquired pneumonia in the elderly—age‐ and sex‐related patterns of care and outcome in the United States. Am J Respir Crit Care Med. 2002;165:766–772.
- Validation of a pneumonia prognostic index using the MedisGroups Comparative Hospital Database. Am J Med. 1993;94:153–159.
- A prediction rule to identify low‐risk patients with community‐acquired pneumonia. N Engl J Med. 1997;336:243–250.
- Application and comparison of scoring indices to predict outcomes in patients with healthcare‐associated pneumonia. Critical Care. 2011;15:R32.
- Predicting in‐hospital mortality: the importance of functional status information. Med Care. 1995;33:906–921.
- Burden of illness score for elderly persons: risk adjustment incorporating the cumulative impact of diseases, physiologic abnormalities, and functional impairments. Med Care. 2003;41:70–83.
- Illness presentation in elderly patients. Arch Intern Med. 1995;155:1060–1064.
- Community‐acquired pneumonia in the elderly: Spanish multicentre study. Eur Respir J. 2003;21:294–302.
- Outcome predictors of pneumonia in elderly patients: importance of functional assessment. J Am Geriatr Soc. 2004;52:1603–1609.
- Factors influencing in‐hospital mortality in community‐acquired pneumonia: a prospective study of patients not initially admitted to the ICU. Chest. 2005;127:1260–1270.
- Assessment of pneumonia in older adults: effect of functional status. J Am Geriatr Soc. 2006;54:1062–1067.
- Only severely limited, premorbid functional status is associated with short‐ and longterm mortality in patients with pneumonia who are critically ill: a prospective observational study. Chest. 2011;139:88–94.
- Severe community‐acquired pneumonia: assessment of microbial aetiology as mortality factor. Eur Respir J. 2004;24:779–785.
- PIRO score for community‐acquired pneumonia: a new prediction rule for assessment of severity in intensive care unit patients with community‐acquired pneumonia. Crit Care Med. 2009;37:456–462.
- Practice guidelines for the management of community‐acquired pneumonia. Clin Infect Dis. 2000;31:347–382.
- Epidemiology and outcomes of health‐care–associated pneumonia—results from a large US database of culture‐positive pneumonia. Chest. 2005;128:3854–3862.
- American Thoracic Society and Infectious Diseases Society of America. Guidelines for the management of adults with hospital‐acquired, ventilator‐associated, and healthcare‐associated pneumonia. Am J Respir Crit Care Med. 2005;171:388–416.
- Conceptual and practical issues in developing risk‐adjustment methods. In: Iezzoni LI, editor. Risk Adjustment for Measuring Health Care Outcomes. 3rd ed. Chicago, IL: Health Administration Press; 2003:179–205.
- The multidimensional prognostic index predicts short‐ and long‐term mortality in hospitalized geriatric patients with pneumonia. J Gerontol A Biol Sci Med Sci. 2009;64A:880–887.
- A validation and potential modification of the pneumonia severity index in elderly patients with community‐acquired pneumonia. J Am Geriatr Soc. 2006;54:1212–1219.
- Health care‐associated pneumonia—a new therapeutic paradigm. Chest. 2005;128:3784–3786.
- Health care‐associated pneumonia requiring hospital admission. Arch Intern Med. 2007;167:1393–1399.
- Health care‐associated pneumonia (HCAP): a critical appraisal to improve identification, management, and outcomes—Proceedings of the HCAP Summit. Clin Infect Dis. 2008;46(suppl 4):S296–S334.
- for the Study Group of the Italian Society of Internal Medicine. Outcomes of patients hospitalized with community‐acquired, health care‐associated, and hospital‐acquired pneumonia. Ann Intern Med. 2009;150:19–26.
- Healthcare‐associated pneumonia is a heterogeneous disease, and all patients do not need the same broad‐spectrum antibiotic therapy as complex nosocomial pneumonia. Curr Opin Infect Dis. 2009;22:316–325.
- The performance of US hospitals as reflected in risk‐standardized 30‐day mortality and readmission rates for Medicare beneficiaries with pneumonia. J Hosp Med. 2010;5:E12–E18.
- Temporal trends in outcomes of older patients with pneumonia. Arch Intern Med. 2000;160:3385–3391.
- Clinical review: the role of biomarkers in the diagnosis and management of community‐acquired pneumonia. Critical Care. 2010;14:203.
- Community‐acquired pneumonia and do‐not‐resuscitate orders. J Am Geriatr Soc. 2002;50:290–299.
Copyright © 2011 Society of Hospital Medicine
Observation Care in Children's Hospitals
Observation medicine has grown in recent decades out of changes in policies for hospital reimbursement, requirements for patients to meet admission criteria to qualify for inpatient admission, and efforts to avoid unnecessary or inappropriate admissions.1 Emergency physicians are frequently faced with patients who are too sick to be discharged home, but do not clearly meet criteria for an inpatient status admission. These patients often receive extended outpatient services (typically extending 24 to 48 hours) under the designation of observation status, in order to determine their response to treatment and need for hospitalization.
Observation care delivered to adult patients has increased substantially in recent years, and the confusion around the designation of observation versus inpatient care has received increasing attention in the lay press.2–7 According to the Centers for Medicare and Medicaid Services (CMS)8:
Observation care is a well‐defined set of specific, clinically appropriate services, which include ongoing short term treatment, assessment, and reassessment before a decision can be made regarding whether patients will require further treatment as hospital inpatients. Observation services are commonly ordered for patients who present to the emergency department and who then require a significant period of treatment or monitoring in order to make a decision concerning their admission or discharge.
Observation status is an administrative label that is applied to patients who do not meet inpatient level of care criteria, as defined by third parties such as InterQual. These criteria usually include a combination of the patient's clinical diagnoses, severity of illness, and expected needs for monitoring and interventions, in order to determine the admission status to which the patient may be assigned (eg, observation, inpatient, or intensive care). Observation services can be provided, in a variety of settings, to those patients who do not meet inpatient level of care but require a period of observation. Some hospitals provide observation care in discrete units in the emergency department (ED) or specific inpatient unit, and others have no designated unit but scatter observation patients throughout the institution, termed virtual observation units.9
For more than 30 years, observation unit (OU) admission has offered an alternative to traditional inpatient hospitalization for children with a variety of acute conditions.10, 11 Historically, the published literature on observation care for children in the United States has been largely based in dedicated emergency department OUs.12 Yet, in a 2001 survey of 21 pediatric EDs, just 6 reported the presence of a 23‐hour unit.13 There are single‐site examples of observation care delivered in other settings.14, 15 In 2 national surveys of US general hospitals, 25% provided observation services in beds adjacent to the ED, and the remainder provided observation services in hospital inpatient units.16, 17 However, we are not aware of any previous multi‐institution studies exploring hospital‐wide practices related to observation care for children.
Recognizing that observation status can be designated using various standards, and that observation care can be delivered in locations outside of dedicated OUs,9 we developed 2 web‐based surveys to examine the current models of pediatric observation medicine in US children's hospitals. We hypothesized that observation care is most commonly applied as a billing designation and does not necessarily represent care delivered in a structurally or functionally distinct OU, nor does it represent a difference in care provided to those patients with inpatient designation.
METHODS
Study Design
Two web‐based surveys were distributed, in April 2010, to the 42 freestanding, tertiary care children's hospitals affiliated with the Child Health Corporation of America (CHCA; Shawnee Mission, KS) which contribute data to the Pediatric Health Information System (PHIS) database. The PHIS is a national administrative database that contains resource utilization data from participating hospitals located in noncompeting markets of 27 states plus the District of Columbia. These hospitals account for 20% of all tertiary care children's hospitals in the United States.
Survey Content
Survey 1
A survey of hospital observation status practices had been developed by CHCA as a part of the PHIS data quality initiative (see Supporting Appendix: Survey 1 in the online version of this article). Hospitals that did not provide observation patient data to PHIS were excluded after an initial screening question. This survey obtained information regarding the designation of observation status within each hospital. Hospitals provided free‐text responses to questions related to the criteria used to define observation and to admit patients into observation status. Fixed‐choice response questions were used to determine the specific utilization criteria and clinical guidelines (eg, InterQual and Milliman) that hospitals used to assign observation status to patients.
Survey 2
We developed a detailed follow‐up survey in order to characterize the structures and processes of care associated with observation status (see Supporting Appendix: Survey 2 in the online version of this article). Within the follow‐up survey, an initial screening question was used to determine all types of patients to which observation status is assigned within the responding hospitals. All other questions in Survey 2 were focused specifically on those patients who required additional care following ED evaluation and treatment. Fixed‐choice response questions were used to explore differences in care for patients under observation and those admitted as inpatients. We also asked about hospital practices related to boarding of patients in the ED while awaiting admission to an inpatient bed.
Survey Distribution
Two web‐based surveys were distributed to all 42 CHCA hospitals that contribute data to PHIS. During the month of April 2010, each hospital's designated PHIS operational contact received e‐mail correspondence requesting their participation in each survey. Within hospitals participating in PHIS, Operational Contacts have been assigned to serve as the day‐to‐day PHIS contact person based upon their experience working with the PHIS data. The Operational Contacts are CHCA's primary contact for issues related to the hospital's data quality and reporting to PHIS. Non‐responders were contacted by e‐mail for additional requests to complete the surveys. Each e‐mail provided an introduction to the topic of the survey and a link to complete the survey. The e‐mail requesting participation in Survey 1 was distributed the first week of April 2010, and the survey was open for responses during the first 3 weeks of the month. The e‐mail requesting participation in Survey 2 was sent the third week of April 2010, and the survey was open for responses during the subsequent 2 weeks.
DATA ANALYSIS
Survey responses were collected and are presented as a descriptive summary of results. Hospital characteristics were summarized with medians and interquartile ranges for continuous variables, and with percents for categorical variables. Characteristics were compared between hospitals that responded and those that did not respond to Survey 2 using Wilcoxon rank‐sum tests and chi‐square tests as appropriate. All analyses were performed using SAS v.9.2 (SAS Institute, Cary, NC), and a P value <0.05 was considered statistically significant. The study was reviewed by the University of Michigan Institutional Review Board and considered exempt.
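The descriptive and comparative steps described above can be sketched in a few lines of Python. This is only an illustration, not the authors' SAS code: the function names `median_iqr` and `rank_sum_test` are hypothetical, the input values are toy placeholders rather than study data, the quartile rule (Python's "inclusive" method) is an assumption because the article does not state which definition was used, and the rank‐sum P value uses a normal approximation without continuity correction, which may differ from the SAS output.

```python
import math
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3); the 'inclusive' quartile rule is an assumption."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Returns (W, z, p), where W is the rank sum of group `a`. Ties receive
    average ranks and the variance is tie-corrected. No continuity
    correction is applied.
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n1, n2 = len(a), len(b)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:  # every pooled value is identical
        return w, 0.0, 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, z, p


# Toy placeholder values for illustration only; NOT data from the study.
group_a = [10, 12, 15, 18, 20]
group_b = [14, 17, 21, 25, 30]
print(median_iqr(group_a))
print(rank_sum_test(group_a, group_b))
```

With samples as small as those in this study, an exact rank‐sum test may be preferable to the normal approximation shown here.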
RESULTS
Responses to Survey 1 were available from 37 of 42 (88%) PHIS hospitals (Figure 1). For Survey 2, we received responses from 20 of 42 (48%) PHIS hospitals. Based on information available from Survey 1, we know that 20 of the 31 (65%) PHIS hospitals that report observation status patient data to PHIS responded to Survey 2. Characteristics of the hospitals responding and not responding to Survey 2 are presented in Table 1. Respondents provided hospital identifying information which allowed for the linkage of data, from Survey 1, to 17 of the 20 hospitals responding to Survey 2. We did not have information available to link responses from 3 hospitals.
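The response rates quoted in this paragraph follow directly from the reported counts. As a quick arithmetic check, the short sketch below recomputes them; the helper name `response_rate` and the dictionary labels are hypothetical, and rounding to a whole percent is assumed to match the article's presentation.

```python
def response_rate(responded, eligible):
    """Return the response rate as a whole-number percent."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    return round(100 * responded / eligible)


# Counts are taken from the text above.
rates = {
    "Survey 1, all PHIS hospitals": response_rate(37, 42),
    "Survey 2, all PHIS hospitals": response_rate(20, 42),
    "Survey 2, hospitals reporting observation data": response_rate(20, 31),
}

for label, pct in rates.items():
    print(f"{label}: {pct}%")
```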

Characteristic | Respondent (N = 20) | Non‐Respondent (N = 22) | P Value |
---|---|---|---|
No. of inpatient beds, median [IQR] (excluding obstetrics) | 245 [219–283] | 282 [250–381] | 0.076 |
Annual admissions, median [IQR] (excluding births) | 11,658 [8,642–13,213] | 13,522 [9,830–18,705] | 0.106 |
ED volume, median [IQR] | 60,528 [47,850–82,955] | 64,486 [47,386–84,450] | 0.640 |
Percent government payer, median [IQR] | 53% [46–62] | 49% [41–58] | 0.528 |
Region | | | |
Northeast | 37% | 0% | 0.021 |
Midwest | 21% | 33% | |
South | 21% | 50% | |
West | 21% | 17% | |
Reports observation status patients to PHIS | 85% | 90% | 0.555 |
Based on responses to the surveys and our knowledge of data reported to PHIS, our current understanding of patient flow from ED through observation to discharge home, and the application of observation status to the encounter, is presented in Figure 2. According to free‐text responses to Survey 1, various methods were applied to designate observation status (gray shaded boxes in Figure 2). Fixed‐choice responses to Survey 2 revealed that observation status patients were cared for in a variety of locations within hospitals, including ED beds, designated observation units, and inpatient beds (dashed boxes in Figure 2). Not every facility utilized all of the listed locations for observation care. Space constraints could dictate the location of care, regardless of patient status (eg, observation vs inpatient), in hospitals with more than one location of care available to observation patients. While patient status could change during a visit, only the final patient status at discharge enters the administrative record submitted to PHIS (black boxes in Figure 2). Facility charges for observation remained a part of the visit record and were reported to PHIS. Hospitals may or may not bill for all assigned charges depending on patient status, length of stay, or other specific criteria determined by contracts with individual payers.

Survey 1: Classification of Observation Patients and Presence of Observation Units in PHIS Hospitals
According to responses to Survey 1, designated OUs were not widespread, present in only 12 of the 31 hospitals. No hospital reported treating all observation status patients exclusively in a designated OU. Observation status was defined by both duration of treatment and either level of care criteria or clinical care guidelines in 21 of the 31 hospitals responding to Survey 1. Of the remaining 10 hospitals, 1 reported that treatment duration alone defines observation status, and the others relied on prespecified observation criteria. When considering duration of treatment, hospitals variably indicated that anticipated or actual lengths of stay were used to determine observation status. Regarding the maximum hours a patient can be observed, 12 hospitals limited observation to 24 hours or fewer, 12 hospitals observed patients for no more than 36 to 48 hours, and the remaining 7 hospitals allowed observation periods of 72 hours or longer.
When admitting patients to observation status, 30 of 31 hospitals specified the criteria that were used to determine observation admissions. InterQual criteria, the most common response, were used by 23 of the 30 hospitals reporting specified criteria; the remaining 7 hospitals had developed hospital‐specific criteria or modified existing criteria, such as InterQual or Milliman, to determine observation status admissions. In addition to these criteria, 11 hospitals required a physician order for admission to observation status. Twenty‐four hospitals indicated that policies were in place to change patient status from observation to inpatient, or inpatient to observation, typically through processes of utilization review and application of criteria listed above.
Most hospitals indicated that they faced substantial variation in the standards used from one payer to another when considering reimbursement for care delivered under observation status. Hospitals noted that duration‐of‐care–based reimbursement practices included hourly rates, per diem, and reimbursement for only the first 24 or 48 hours of observation care. Hospitals identified that payers variably determined reimbursement for observation based on InterQual level of care criteria and Milliman care guidelines. One hospital reported that it was not their practice to bill for the observation bed.
Survey 2: Understanding Observation Patient Type Administrative Data Following ED Care Within PHIS Hospitals
Of the 20 hospitals responding to Survey 2, there were 2 hospitals that did not apply observation status to patients after ED care and 2 hospitals that did not provide complete responses. The remaining 16 hospitals provided information regarding observation status as applied to patients after receiving treatment in the ED. The settings available for observation care and patient groups treated within each area are presented in Table 2. In addition to the patient groups listed in Table 2, there were 4 hospitals where patients could be admitted to observation status directly from an outpatient clinic. All responding hospitals provided virtual observation care (ie, observation status is assigned but the patient is cared for in the existing ED or inpatient ward). Nine hospitals also provided observation care within a dedicated ED or ward‐based OU (ie, a separate clinical area in which observation patients are treated).
Hospital No. | Available Observation Settings | Patient Group Under Observation: ED | Patient Group Under Observation: Post‐Op | Patient Group Under Observation: Test/Treat | UR to Assign Obs Status | When Obs Status Is Assigned |
---|---|---|---|---|---|---|
1 | Virtual inpatient | X | X | X | Yes | Discharge |
Ward‐based OU | X | X | No | |||
2 | Virtual inpatient | X | X | Yes | Admission | |
Ward‐based OU | X | X | X | No | ||
3 | Virtual inpatient | X | X | X | Yes | Discharge |
Ward‐based OU | X | X | X | Yes | ||
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
4 | Virtual inpatient | X | X | X | Yes | Discharge |
ED OU | X | No | ||||
Virtual ED | X | No | ||||
5 | Virtual inpatient | X | X | X | N/A | Discharge |
6 | Virtual inpatient | X | X | X | Yes | Discharge |
7 | Virtual inpatient | X | X | Yes | No response | |
Ward‐based OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
8 | Virtual inpatient | X | X | X | Yes | Admission |
9 | Virtual inpatient | X | X | Yes | Discharge | |
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
10 | Virtual inpatient | X | X | X | Yes | Admission |
ED OU | X | Yes | ||||
11 | Virtual inpatient | X | X | Yes | Discharge | |
Ward‐based OU | X | X | Yes | |||
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
12 | Virtual inpatient | X | X | X | Yes | Admission |
13 | Virtual inpatient | X | X | N/A | Discharge | |
Virtual ED | X | N/A | ||||
14 | Virtual inpatient | X | X | X | Yes | Both |
15 | Virtual inpatient | X | X | Yes | Admission | |
Ward‐based OU | X | X | Yes | |||
16 | Virtual inpatient | X | Yes | Admission |
When asked to identify differences between clinical care delivered to patients admitted under virtual observation and those admitted under inpatient status, 14 of 16 hospitals selected the option "There are no differences in the care delivery of these patients." The differences identified by 2 hospitals included patient care orders, treatment protocols, and physician documentation. Within the hospitals that reported utilization of virtual ED observation, 2 reported differences in care compared with other ED patients, including patient care orders, physician rounds, documentation, and discharge process. When admitted patients were boarded in the ED while awaiting an inpatient bed, 11 of 16 hospitals allowed for observation or inpatient level of care to be provided in the ED. Fourteen hospitals allowed an admitted patient to be discharged home from boarding in the ED without ever receiving care in an inpatient bed. The discharge decision was made by ED providers in 7 hospitals, and by inpatient providers in the other 7 hospitals.
Responses to questions providing detailed information on the process of utilization review were provided by 12 hospitals. Among this subset of hospitals, utilization review was consistently used to assign virtual inpatient observation status and was applied at admission (n = 6) or discharge (n = 8), depending on the hospital. One hospital applied observation status at both admission and discharge; 1 hospital did not provide a response. Responses to questions regarding utilization review are presented in Table 3.
Survey Question | Yes N (%) | No N (%) |
---|---|---|
Preadmission utilization review is conducted at my hospital. | 3 (25) | 9 (75) |
Utilization review occurs daily at my hospital. | 10 (83) | 2 (17) |
A nonclinician can initiate an order for observation status. | 4 (33) | 8 (67) |
Status can be changed after the patient has been discharged. | 10 (83) | 2 (17) |
Inpatient status would always be assigned to a patient who receives less than 24 hours of care and meets inpatient criteria. | 9 (75) | 3 (25) |
The same status would be assigned to different patients who received the same treatment of the same duration but have different payers. | 6 (50) | 6 (50) |
DISCUSSION
This is the largest descriptive study of pediatric observation status practices in US freestanding children's hospitals and, to our knowledge, the first to include information about both the ED and inpatient treatment environments. There are two important findings of this study. First, designated OUs were uncommon among the group of freestanding children's hospitals that reported observation patient data to PHIS in 2010. Second, despite the fact that hospitals reported observation care was delivered in a variety of settings, virtual inpatient observation status was nearly ubiquitous. Among the subset of hospitals that provided information about the clinical care delivered to patients admitted under virtual inpatient observation, hospitals frequently reported there were no differences in the care delivered to observation patients when compared with other inpatients.
The results of our survey indicate that designated OUs are not a commonly available model of observation care in the study hospitals. In fact, the vast majority of the hospitals used virtual inpatient observation care, which did not differ from the care delivered to a child admitted as an inpatient. ED‐based OUs, which often provide operationally and physically distinct care to observation patients, have been touted as cost‐effective alternatives to inpatient care,18–20 resulting in fewer admissions and reductions in length of stay19, 20 without a resultant increase in return ED visits or readmissions.21–23 Research is needed to determine the patient‐level outcomes for short‐stay patients in the variety of available treatment settings (eg, physically or operationally distinct OUs and virtual observation), and to evaluate these outcomes in comparison to results published from designated OUs. The operationally and physically distinct features of a designated OU may be required to realize the benefits of observation attributed to individual patients.
While observation care has been historically provided by emergency physicians, there is increasing interest in the role of inpatient providers in observation care.9 According to our survey, children were admitted to observation status directly from clinics, following surgical procedures, scheduled tests and treatment, or after evaluation and treatment in the ED. As many of these children undergo virtual observation in inpatient areas, the role of inpatient providers, such as pediatric hospitalists, in observation care may be an important area for future study, education, and professional development. Novel models of care, with hospitalists collaborating with emergency physicians, may be of benefit to the children who require observation following initial stabilization and treatment in the ED.24, 25
We identified variation between hospitals in the methods used to assign observation status to an episode of care, including a wide range of length of stay criteria and different approaches to utilization review. In addition, the criteria payers use to reimburse for observation varied between payers, even within individual hospitals. The results of our survey may be driven by issues of reimbursement and not based on a model of optimizing patient care outcomes using designated OUs. Variations in reimbursement may limit hospital efforts to refine models of observation care for children. Designated OUs have been suggested as a method for improving ED patient flow,26 increasing inpatient capacity,27 and reducing costs of care.28 Standardization of observation status criteria and consistent reimbursement for observation services may be necessary for hospitals to develop operationally and physically distinct OUs, which may be essential to achieving the proposed benefits of observation medicine on costs of care, patient flow, and hospital capacity.
LIMITATIONS
Our study results should be interpreted with the following limitations in mind. First, the surveys were distributed only to freestanding children's hospitals that participate in PHIS. As a result, our findings may not be generalizable to the experiences of other children's hospitals or general hospitals caring for children. Questions in Survey 2 were focused on understanding observation care delivered to patients following ED care, which may differ from observation practices related to a direct admission or following scheduled procedures, tests, or treatments. It is important to note that hospitals that do not report observation status patient data to PHIS are still providing care to children with acute conditions that respond to brief periods of hospital treatment, even though it is not labeled observation. However, it was beyond the scope of this study to characterize the care delivered to all patients who experience a short stay.
The second main limitation of our study is the lower response rate to Survey 2. In addition, several surveys contained incomplete responses which further limits our sample size for some questions, specifically those related to utilization review. The lower response to Survey 2 could be related to the timing of the distribution of the 2 surveys, or to the information contained in the introductory e‐mail describing Survey 2. Hospitals with designated observation units, or where observation status care has been receiving attention, may have been more likely to respond to our survey, which may bias our results to reflect the experiences of hospitals experiencing particular successes or challenges with observation status care. A comparison of known hospital characteristics revealed no differences between hospitals that did and did not provide responses to Survey 2, but other unmeasured differences may exist.
CONCLUSION
Observation status is assigned using duration of treatment, clinical care guidelines, and level of care criteria, and is defined differently by individual hospitals and payers. Currently, the most widely available setting for pediatric observation status is within a virtual inpatient unit. Our results suggest that the care delivered to observation patients in virtual inpatient units is consistent with care provided to other inpatients. As such, observation status is largely an administrative/billing designation, which does not appear to reflect differences in clinical care. A consistent approach to the assignment of patients to observation status, and treatment of patients under observation among hospitals and payers, may be necessary to compare quality outcomes. Studies of the clinical care delivery and processes of care for short‐stay patients are needed to optimize models of pediatric observation care.
- Observation medicine: the healthcare system's tincture of time. In: Graff LG, ed. Principles of Observation Medicine. Dallas, TX: American College of Emergency Physicians; 2010. Available at: http://www.acep.org/content.aspx?id=46142. Accessed February 18, 2011.
- Hospital ‘observation’ status a matter of billing. The Columbus Dispatch. February 14, 2011.
- Hospital payments downgraded. Philadelphia Business Journal. February 18, 2011.
- Medicare rules give full hospital benefits only to those with ‘inpatient’ status. The Washington Post. September 7, 2010.
- Hospitals caught between a rock and a hard place over observation. Health Leaders Media. September 15, 2010.
- AHA: observation status fears on the rise. Health Leaders Media. October 29, 2010.
- Put your hospital bill under a microscope. The New York Times. September 13, 2010.
- Medicare Hospital Manual Section 455. Washington, DC: Department of Health and Human Services, Centers for Medicare and Medicaid Services; 2001.
- The Observation Unit: An Operational Overview for the Hospitalist. Society of Hospital Medicine White Paper. May 21, 2009. Available at: http://www.hospitalmedicine.org/Content/NavigationMenu/Publications/White Papers/White_Papers.htm. Accessed May 21, 2009.
- Utilization and unexpected hospitalization rates of a pediatric emergency department 23‐hour observation unit. Pediatr Emerg Care. 2008;24(9):589–594.
- The pediatric hybrid observation unit: an analysis of 6477 consecutive patient encounters. Pediatrics. 2005;115(5):e535–e542.
- Pediatric observation units in the United States: a systematic review. J Hosp Med. 2010;5(3):172–182.
- Pediatric emergency department directors' benchmarking survey: fiscal year 2001. Pediatr Emerg Care. 2003;19(3):143–147.
- Pediatric observation status beds on an inpatient unit: an integrated care model. Pediatr Emerg Care. 2004;20(1):17–21.
- Impact of a short stay unit on asthma patients admitted to a tertiary pediatric hospital. Qual Manag Health Care. 1997;6(1):14–22.
- A national survey of observation units in the United States. Am J Emerg Med. 2003;21(7):529–533.
- A survey of observation units in the United States. Am J Emerg Med. 1989;7(6):576–580.
- When the patient requires observation not hospitalization. J Nurs Admin. 1988;18(10):20–23.
- A reduction in hospitalization, length of stay, and hospital charges for croup with the institution of a pediatric observation unit. Am J Emerg Med. 2006;24(7):818–821.
- Outpatient oral rehydration in the United States. Am J Dis Child. 1986;140(3):211–215.
- Pediatric closed head injuries treated in an observation unit. Pediatr Emerg Care. 2005;21(10):639–644.
- Use of pediatric observation unit for treatment of children with dehydration caused by gastroenteritis. Pediatr Emerg Care. 2006;22(1):1–6.
- Children with asthma admitted to a pediatric observation unit. Pediatr Emerg Care. 2005;21(10):645–649.
- Redefining the community pediatric hospitalist: the combined pediatric ED/inpatient unit. Pediatr Emerg Care. 2007;23(1):33–37.
- Program description: a hospitalist‐run, medical short‐stay unit in a teaching hospital. Can Med Assoc J. 2000;163(11):1477–1480.
- Impact of an observation unit and an emergency department‐admitted patient transfer mandate in decreasing overcrowding in a pediatric emergency department: a discrete event simulation exercise. Pediatr Emerg Care. 2009;25(3):160–163.
- Children's hospitals do not acutely respond to high occupancy. Pediatrics. 125(5):974–981.
- Trends in high‐turnover stays among children hospitalized in the United States, 1993‐2003. Pediatrics. 2009;123(3):996–1002.
Observation medicine has grown in recent decades out of changes in policies for hospital reimbursement, requirements for patients to meet admission criteria to qualify for inpatient admission, and efforts to avoid unnecessary or inappropriate admissions.1 Emergency physicians are frequently faced with patients who are too sick to be discharged home, but do not clearly meet criteria for an inpatient status admission. These patients often receive extended outpatient services (typically extending 24 to 48 hours) under the designation of observation status, in order to determine their response to treatment and need for hospitalization.
Observation care delivered to adult patients has increased substantially in recent years, and the confusion around the designation of observation versus inpatient care has received increasing attention in the lay press.27 According to the Centers for Medicare and Medicaid Services (CMS)8:
Observation care is a well‐defined set of specific, clinically appropriate services, which include ongoing short term treatment, assessment, and reassessment before a decision can be made regarding whether patients will require further treatment as hospital inpatients. Observation services are commonly ordered for patients who present to the emergency department and who then require a significant period of treatment or monitoring in order to make a decision concerning their admission or discharge.
Observation status is an administrative label that is applied to patients who do not meet inpatient level of care criteria, as defined by third parties such as InterQual. These criteria usually include a combination of the patient's clinical diagnoses, severity of illness, and expected needs for monitoring and interventions, in order to determine the admission status to which the patient may be assigned (eg, observation, inpatient, or intensive care). Observation services can be provided, in a variety of settings, to those patients who do not meet inpatient level of care but require a period of observation. Some hospitals provide observation care in discrete units in the emergency department (ED) or specific inpatient unit, and others have no designated unit but scatter observation patients throughout the institution, termed virtual observation units.9
For more than 30 years, observation unit (OU) admission has offered an alternative to traditional inpatient hospitalization for children with a variety of acute conditions.10, 11 Historically, the published literature on observation care for children in the United States has been largely based in dedicated emergency department OUs.12 Yet, in a 2001 survey of 21 pediatric EDs, just 6 reported the presence of a 23‐hour unit.13 There are single‐site examples of observation care delivered in other settings.14, 15 In 2 national surveys of US General Hospitals, 25% provided observation services in beds adjacent to the ED, and the remainder provided observation services in hospital inpatient units.16, 17 However, we are not aware of any previous multi‐institution studies exploring hospital‐wide practices related to observation care for children.
Recognizing that observation status can be designated using various standards, and that observation care can be delivered in locations outside of dedicated OUs,9 we developed 2 web‐based surveys to examine the current models of pediatric observation medicine in US children's hospitals. We hypothesized that observation care is most commonly applied as a billing designation and does not necessarily represent care delivered in a structurally or functionally distinct OU, nor does it represent a difference in care provided to those patients with inpatient designation.
METHODS
Study Design
Two web‐based surveys were distributed, in April 2010, to the 42 freestanding, tertiary care children's hospitals affiliated with the Child Health Corporation of America (CHCA; Shawnee Mission, KS) which contribute data to the Pediatric Health Information System (PHIS) database. The PHIS is a national administrative database that contains resource utilization data from participating hospitals located in noncompeting markets of 27 states plus the District of Columbia. These hospitals account for 20% of all tertiary care children's hospitals in the United States.
Survey Content
Survey 1
A survey of hospital observation status practices has been developed by CHCA as a part of the PHIS data quality initiative (see Supporting Appendix: Survey 1 in the online version of this article). Hospitals that did not provide observation patient data to PHIS were excluded after an initial screening question. This survey obtained information regarding the designation of observation status within each hospital. Hospitals provided free‐text responses to questions related to the criteria used to define observation, and to admit patients into observation status. Fixed‐choice response questions were used to determine specific observation status utilization criteria and clinical guidelines (eg, InterQual and Milliman) used by hospitals for the designation of observation status to patients.
Survey 2
We developed a detailed follow‐up survey in order to characterize the structures and processes of care associated with observation status (see Supporting Appendix: Survey 2 in the online version of this article). Within the follow‐up survey, an initial screening question was used to determine all types of patients to which observation status is assigned within the responding hospitals. All other questions in Survey 2 were focused specifically on those patients who required additional care following ED evaluation and treatment. Fixed‐choice response questions were used to explore differences in care for patients under observation and those admitted as inpatients. We also inquired of hospital practices related to boarding of patients in the ED while awaiting admission to an inpatient bed.
Survey Distribution
Two web‐based surveys were distributed to all 42 CHCA hospitals that contribute data to PHIS. During the month of April 2010, each hospital's designated PHIS operational contact received e‐mail correspondence requesting their participation in each survey. Within hospitals participating in PHIS, Operational Contacts have been assigned to serve as the day‐to‐day PHIS contact person based upon their experience working with the PHIS data. The Operational Contacts are CHCA's primary contact for issues related to the hospital's data quality and reporting to PHIS. Non‐responders were contacted by e‐mail for additional requests to complete the surveys. Each e‐mail provided an introduction to the topic of the survey and a link to complete the survey. The e‐mail requesting participation in Survey 1 was distributed the first week of April 2010, and the survey was open for responses during the first 3 weeks of the month. The e‐mail requesting participation in Survey 2 was sent the third week of April 2010, and the survey was open for responses during the subsequent 2 weeks.
DATA ANALYSIS
Survey responses were collected and are presented as a descriptive summary of results. Hospital characteristics were summarized with medians and interquartile ranges for continuous variables, and with percents for categorical variables. Characteristics were compared between hospitals that responded and those that did not respond to Survey 2 using Wilcoxon rank‐sum tests and chi‐square tests as appropriate. All analyses were performed using SAS v.9.2 (SAS Institute, Cary, NC), and a P value <0.05 was considered statistically significant. The study was reviewed by the University of Michigan Institutional Review Board and considered exempt.
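As an illustration of the two tests named above, the following Python sketch implements a Wilcoxon rank‐sum test (normal approximation, without tie or continuity correction) and a Pearson chi‐square test for a 2 × 2 table. It is not the SAS code used in the study; the function names and every input value are hypothetical and were chosen only to show the mechanics.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Tied values receive their average rank; no tie or continuity
    correction is applied to the variance.
    """
    # Pool both samples, remembering which group (0 or 1) each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of x
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w, z, p


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(stat / 2))  # chi-square survival function for 1 df
    return stat, p


# Hypothetical bed counts for two small groups of hospitals (not study data).
beds_a = [210, 219, 245, 260, 283]
beds_b = [250, 270, 282, 300, 381, 400]
w, z, p = rank_sum_test(beds_a, beds_b)
print(f"rank sum = {w}, z = {z:.3f}, p = {p:.4f}")

# Hypothetical yes/no counts for two groups of hospitals (not study data).
stat, p_chi = chi_square_2x2(12, 8, 9, 13)
print(f"chi-square = {stat:.3f}, p = {p_chi:.4f}")
```

With samples as small as those in this study, an exact rank‐sum test or Fisher's exact test may be preferable to these large‐sample approximations; the sketch is meant only to make the comparisons concrete.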
RESULTS
Responses to Survey 1 were available from 37 of 42 (88%) PHIS hospitals (Figure 1). For Survey 2, we received responses from 20 of 42 (48%) PHIS hospitals. Based on information available from Survey 1, 20 of the 31 (65%) hospitals that report observation status patient data to PHIS responded to Survey 2. Characteristics of the hospitals responding and not responding to Survey 2 are presented in Table 1. Respondents provided hospital‐identifying information, which allowed Survey 1 data to be linked to 17 of the 20 hospitals responding to Survey 2. We did not have information available to link responses from the remaining 3 hospitals.
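The response rates quoted above follow directly from the reported counts. A minimal Python sketch of that arithmetic is given here; the helper name `pct` and the dictionary labels are illustrative, not part of the study.

```python
def pct(numerator, denominator):
    """Return a proportion as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Counts as reported in the text: (responding hospitals, eligible hospitals).
response_counts = {
    "Survey 1, all PHIS hospitals": (37, 42),
    "Survey 2, all PHIS hospitals": (20, 42),
    "Survey 2, hospitals reporting observation data": (20, 31),
    "Survey 2 respondents linked to Survey 1": (17, 20),
}

response_rates = {label: pct(n, d) for label, (n, d) in response_counts.items()}

for label, rate in response_rates.items():
    n, d = response_counts[label]
    print(f"{label}: {n}/{d} = {rate}%")
```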

Characteristic | Respondent (N = 20) | Non‐Respondent (N = 22) | P Value |
---|---|---|---|
No. of inpatient beds, median [IQR] (excluding obstetrics) | 245 [219–283] | 282 [250–381] | 0.076 |
Annual admissions, median [IQR] (excluding births) | 11,658 [8,642–13,213] | 13,522 [9,830–18,705] | 0.106 |
ED volume, median [IQR] | 60,528 [47,850–82,955] | 64,486 [47,386–84,450] | 0.640 |
Percent government payer, median [IQR] | 53% [46–62] | 49% [41–58] | 0.528 |
Region | | | |
Northeast | 37% | 0% | 0.021 |
Midwest | 21% | 33% | |
South | 21% | 50% | |
West | 21% | 17% | |
Reports observation status patients to PHIS | 85% | 90% | 0.555 |
Based on responses to the surveys and our knowledge of data reported to PHIS, our current understanding of patient flow from ED through observation to discharge home, and the application of observation status to the encounter, is presented in Figure 2. According to free‐text responses to Survey 1, various methods were applied to designate observation status (gray shaded boxes in Figure 2). Fixed‐choice responses to Survey 2 revealed that observation status patients were cared for in a variety of locations within hospitals, including ED beds, designated observation units, and inpatient beds (dashed boxes in Figure 2). Not every facility utilized all of the listed locations for observation care. Space constraints could dictate the location of care, regardless of patient status (eg, observation vs inpatient), in hospitals with more than one location of care available to observation patients. While patient status could change during a visit, only the final patient status at discharge enters the administrative record submitted to PHIS (black boxes in Figure 2). Facility charges for observation remained a part of the visit record and were reported to PHIS. Hospitals may or may not bill for all assigned charges depending on patient status, length of stay, or other specific criteria determined by contracts with individual payers.

Survey 1: Classification of Observation Patients and Presence of Observation Units in PHIS Hospitals
According to responses to Survey 1, designated OUs were not widespread, present in only 12 of the 31 hospitals. No hospital reported treating all observation status patients exclusively in a designated OU. Observation status was defined by both duration of treatment and either level of care criteria or clinical care guidelines in 21 of the 31 hospitals responding to Survey 1. Of the remaining 10 hospitals, 1 reported that treatment duration alone defines observation status, and the others relied on prespecified observation criteria. When considering duration of treatment, hospitals variably indicated that anticipated or actual lengths of stay were used to determine observation status. Regarding the maximum hours a patient can be observed, 12 hospitals limited observation to 24 hours or fewer, 12 hospitals observed patients for no more than 36 to 48 hours, and the remaining 7 hospitals allowed observation periods of 72 hours or longer.
When admitting patients to observation status, 30 of 31 hospitals specified the criteria that were used to determine observation admissions. InterQual criteria, the most common response, were used by 23 of the 30 hospitals reporting specified criteria; the remaining 7 hospitals had developed hospital‐specific criteria or modified existing criteria, such as InterQual or Milliman, to determine observation status admissions. In addition to these criteria, 11 hospitals required a physician order for admission to observation status. Twenty‐four hospitals indicated that policies were in place to change patient status from observation to inpatient, or inpatient to observation, typically through processes of utilization review and application of criteria listed above.
Most hospitals indicated that they faced substantial variation in the standards used from one payer to another when considering reimbursement for care delivered under observation status. Hospitals noted that duration‐of‐care–based reimbursement practices included hourly rates, per diem rates, and reimbursement for only the first 24 or 48 hours of observation care. Hospitals reported that payers variably determined reimbursement for observation based on InterQual level of care criteria and Milliman care guidelines. One hospital reported that it was not its practice to bill for the observation bed.
Survey 2: Understanding Observation Patient Type Administrative Data Following ED Care Within PHIS Hospitals
Of the 20 hospitals responding to Survey 2, there were 2 hospitals that did not apply observation status to patients after ED care and 2 hospitals that did not provide complete responses. The remaining 16 hospitals provided information regarding observation status as applied to patients after receiving treatment in the ED. The settings available for observation care and patient groups treated within each area are presented in Table 2. In addition to the patient groups listed in Table 2, there were 4 hospitals where patients could be admitted to observation status directly from an outpatient clinic. All responding hospitals provided virtual observation care (ie, observation status is assigned but the patient is cared for in the existing ED or inpatient ward). Nine hospitals also provided observation care within a dedicated ED or ward‐based OU (ie, a separate clinical area in which observation patients are treated).
Hospital No. | Available Observation Settings | Patient Group Under Observation: ED | Patient Group Under Observation: Post‐Op | Patient Group Under Observation: Test/Treat | Utilization Review (UR) Used to Assign Observation Status | When Observation Status Is Assigned |
---|---|---|---|---|---|---|
1 | Virtual inpatient | X | X | X | Yes | Discharge |
Ward‐based OU | X | X | No | |||
2 | Virtual inpatient | X | X | Yes | Admission | |
Ward‐based OU | X | X | X | No | ||
3 | Virtual inpatient | X | X | X | Yes | Discharge |
Ward‐based OU | X | X | X | Yes | ||
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
4 | Virtual inpatient | X | X | X | Yes | Discharge |
ED OU | X | No | ||||
Virtual ED | X | No | ||||
5 | Virtual inpatient | X | X | X | N/A | Discharge |
6 | Virtual inpatient | X | X | X | Yes | Discharge |
7 | Virtual inpatient | X | X | Yes | No response | |
Ward‐based OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
8 | Virtual inpatient | X | X | X | Yes | Admission |
9 | Virtual inpatient | X | X | Yes | Discharge | |
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
10 | Virtual inpatient | X | X | X | Yes | Admission |
ED OU | X | Yes | ||||
11 | Virtual inpatient | X | X | Yes | Discharge | |
Ward‐based OU | X | X | Yes | |||
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
12 | Virtual inpatient | X | X | X | Yes | Admission |
13 | Virtual inpatient | X | X | N/A | Discharge | |
Virtual ED | X | N/A | ||||
14 | Virtual inpatient | X | X | X | Yes | Both |
15 | Virtual inpatient | X | X | Yes | Admission | |
Ward‐based OU | X | X | Yes | |||
16 | Virtual inpatient | X | Yes | Admission |
When asked to identify differences between clinical care delivered to patients admitted under virtual observation and those admitted under inpatient status, 14 of 16 hospitals selected the option "There are no differences in the care delivery of these patients." The differences identified by the other 2 hospitals included patient care orders, treatment protocols, and physician documentation. Among the hospitals that reported using virtual ED observation, 2 reported differences in care compared with other ED patients, including patient care orders, physician rounds, documentation, and discharge process. When admitted patients were boarded in the ED while awaiting an inpatient bed, 11 of 16 hospitals allowed observation or inpatient level of care to be provided in the ED. Fourteen hospitals allowed an admitted patient to be discharged home from boarding in the ED without ever receiving care in an inpatient bed. The discharge decision was made by ED providers in 7 of these hospitals and by inpatient providers in the other 7.
Responses to questions providing detailed information on the process of utilization review were provided by 12 hospitals. Among this subset of hospitals, utilization review was consistently used to assign virtual inpatient observation status and was applied at admission (n = 6) or discharge (n = 8), depending on the hospital. One hospital applied observation status at both admission and discharge; 1 hospital did not provide a response. Responses to questions regarding utilization review are presented in Table 3.
Survey Question | Yes N (%) | No N (%) |
---|---|---|
Preadmission utilization review is conducted at my hospital. | 3 (25) | 9 (75) |
Utilization review occurs daily at my hospital. | 10 (83) | 2 (17) |
A nonclinician can initiate an order for observation status. | 4 (33) | 8 (67) |
Status can be changed after the patient has been discharged. | 10 (83) | 2 (17) |
Inpatient status would always be assigned to a patient who receives less than 24 hours of care and meets inpatient criteria. | 9 (75) | 3 (25) |
The same status would be assigned to different patients who received the same treatment of the same duration but have different payers. | 6 (50) | 6 (50) |
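The counts reported in this section can be cross‐checked with a short tally. The Python sketch below uses a hand transcription of the final column of Table 2 (one entry per hospital, 1 through 16) and the "Yes" counts from Table 3; the variable names are illustrative, and the transcription assumes the table rows are read as printed.

```python
from collections import Counter

# "When Obs Status Is Assigned" for hospitals 1-16, transcribed from Table 2.
timing = [
    "Discharge", "Admission", "Discharge", "Discharge",
    "Discharge", "Discharge", "No response", "Admission",
    "Discharge", "Admission", "Discharge", "Admission",
    "Discharge", "Both", "Admission", "Admission",
]
timing_counts = Counter(timing)
print(dict(timing_counts))

# "Yes" counts for the six Table 3 questions, each out of 12 responding hospitals.
table3_yes = [3, 10, 4, 10, 9, 6]
respondents = 12
table3_yes_pct = [round(100 * n / respondents) for n in table3_yes]
table3_no_pct = [round(100 * (respondents - n) / respondents) for n in table3_yes]
print(table3_yes_pct, table3_no_pct)
```

The tally gives 6 hospitals assigning status at admission and 8 at discharge, with 1 doing both and 1 not responding, which matches the figures in the text and shows that those counts refer to all 16 hospitals in Table 2 rather than to the 12 that answered the utilization review questions.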
DISCUSSION
This is the largest descriptive study of pediatric observation status practices in US freestanding children's hospitals and, to our knowledge, the first to include information about both the ED and inpatient treatment environments. There are two important findings of this study. First, designated OUs were uncommon among the group of freestanding children's hospitals that reported observation patient data to PHIS in 2010. Second, despite the fact that hospitals reported observation care was delivered in a variety of settings, virtual inpatient observation status was nearly ubiquitous. Among the subset of hospitals that provided information about the clinical care delivered to patients admitted under virtual inpatient observation, hospitals frequently reported there were no differences in the care delivered to observation patients when compared with other inpatients.
The results of our survey indicate that designated OUs are not a commonly available model of observation care in the study hospitals. In fact, the vast majority of the hospitals used virtual inpatient observation care, which did not differ from the care delivered to a child admitted as an inpatient. ED‐based OUs, which often provide operationally and physically distinct care to observation patients, have been touted as cost‐effective alternatives to inpatient care,18–20 resulting in fewer admissions and reductions in length of stay19, 20 without a resultant increase in return ED visits or readmissions.21–23 Research is needed to determine the patient‐level outcomes for short‐stay patients in the variety of available treatment settings (eg, physically or operationally distinct OUs and virtual observation), and to evaluate these outcomes in comparison to results published from designated OUs. The operationally and physically distinct features of a designated OU may be required to realize the benefits of observation attributed to individual patients.
While observation care has been historically provided by emergency physicians, there is increasing interest in the role of inpatient providers in observation care.9 According to our survey, children were admitted to observation status directly from clinics, following surgical procedures, scheduled tests and treatment, or after evaluation and treatment in the ED. As many of these children undergo virtual observation in inpatient areas, the role of inpatient providers, such as pediatric hospitalists, in observation care may be an important area for future study, education, and professional development. Novel models of care, with hospitalists collaborating with emergency physicians, may be of benefit to the children who require observation following initial stabilization and treatment in the ED.24, 25
We identified variation between hospitals in the methods used to assign observation status to an episode of care, including a wide range of length of stay criteria and different approaches to utilization review. In addition, the criteria payers use to reimburse for observation varied between payers, even within individual hospitals. The results of our survey may be driven by issues of reimbursement and not based on a model of optimizing patient care outcomes using designated OUs. Variations in reimbursement may limit hospital efforts to refine models of observation care for children. Designated OUs have been suggested as a method for improving ED patient flow,26 increasing inpatient capacity,27 and reducing costs of care.28 Standardization of observation status criteria and consistent reimbursement for observation services may be necessary for hospitals to develop operationally and physically distinct OUs, which may be essential to achieving the proposed benefits of observation medicine on costs of care, patient flow, and hospital capacity.
LIMITATIONS
Our study results should be interpreted with the following limitations in mind. First, the surveys were distributed only to freestanding children's hospitals that participate in PHIS. As a result, our findings may not be generalizable to the experiences of other children's hospitals or general hospitals caring for children. Questions in Survey 2 focused on understanding observation care delivered to patients following ED care, which may differ from observation practices related to a direct admission or following scheduled procedures, tests, or treatments. It is important to note that hospitals that do not report observation status patient data to PHIS are still providing care to children with acute conditions that respond to brief periods of hospital treatment, even though that care is not labeled observation. However, it was beyond the scope of this study to characterize the care delivered to all patients who experience a short stay.
The second main limitation of our study is the lower response rate to Survey 2. In addition, several surveys contained incomplete responses, which further limited our sample size for some questions, specifically those related to utilization review. The lower response to Survey 2 could be related to the timing of the distribution of the 2 surveys, or to the information contained in the introductory e‐mail describing Survey 2. Hospitals with designated observation units, or where observation status care has been receiving attention, may have been more likely to respond to our survey, which may bias our results to reflect the experiences of hospitals experiencing particular successes or challenges with observation status care. A comparison of known hospital characteristics revealed no significant differences in size, volume, or payer mix between hospitals that did and did not provide responses to Survey 2, although regional distribution differed (P = 0.021; Table 1), and other unmeasured differences may exist.
CONCLUSION
Observation status is assigned using duration of treatment, clinical care guidelines, and level of care criteria, and is defined differently by individual hospitals and payers. Currently, the most widely available setting for pediatric observation status is within a virtual inpatient unit. Our results suggest that the care delivered to observation patients in virtual inpatient units is consistent with care provided to other inpatients. As such, observation status is largely an administrative/billing designation, which does not appear to reflect differences in clinical care. A consistent approach to the assignment of patients to observation status, and treatment of patients under observation among hospitals and payers, may be necessary to compare quality outcomes. Studies of the clinical care delivery and processes of care for short‐stay patients are needed to optimize models of pediatric observation care.
Observation medicine has grown in recent decades out of changes in policies for hospital reimbursement, requirements for patients to meet admission criteria to qualify for inpatient admission, and efforts to avoid unnecessary or inappropriate admissions.1 Emergency physicians are frequently faced with patients who are too sick to be discharged home, but do not clearly meet criteria for an inpatient status admission. These patients often receive extended outpatient services (typically extending 24 to 48 hours) under the designation of observation status, in order to determine their response to treatment and need for hospitalization.
Observation care delivered to adult patients has increased substantially in recent years, and the confusion around the designation of observation versus inpatient care has received increasing attention in the lay press.2–7 According to the Centers for Medicare and Medicaid Services (CMS)8:
"Observation care is a well‐defined set of specific, clinically appropriate services, which include ongoing short term treatment, assessment, and reassessment before a decision can be made regarding whether patients will require further treatment as hospital inpatients. Observation services are commonly ordered for patients who present to the emergency department and who then require a significant period of treatment or monitoring in order to make a decision concerning their admission or discharge."
Observation status is an administrative label that is applied to patients who do not meet inpatient level of care criteria, as defined by third parties such as InterQual. These criteria usually include a combination of the patient's clinical diagnoses, severity of illness, and expected needs for monitoring and interventions, in order to determine the admission status to which the patient may be assigned (eg, observation, inpatient, or intensive care). Observation services can be provided, in a variety of settings, to those patients who do not meet inpatient level of care but require a period of observation. Some hospitals provide observation care in discrete units in the emergency department (ED) or specific inpatient unit, and others have no designated unit but scatter observation patients throughout the institution, termed virtual observation units.9
For more than 30 years, observation unit (OU) admission has offered an alternative to traditional inpatient hospitalization for children with a variety of acute conditions.10, 11 Historically, the published literature on observation care for children in the United States has been largely based in dedicated emergency department OUs.12 Yet, in a 2001 survey of 21 pediatric EDs, just 6 reported the presence of a 23‐hour unit.13 There are single‐site examples of observation care delivered in other settings.14, 15 In 2 national surveys of US general hospitals, 25% provided observation services in beds adjacent to the ED, and the remainder provided observation services in hospital inpatient units.16, 17 However, we are not aware of any previous multi‐institution studies exploring hospital‐wide practices related to observation care for children.
Recognizing that observation status can be designated using various standards, and that observation care can be delivered in locations outside of dedicated OUs,9 we developed 2 web‐based surveys to examine the current models of pediatric observation medicine in US children's hospitals. We hypothesized that observation care is most commonly applied as a billing designation and does not necessarily represent care delivered in a structurally or functionally distinct OU, nor does it represent a difference in care provided to those patients with inpatient designation.
METHODS
Study Design
Two web‐based surveys were distributed, in April 2010, to the 42 freestanding, tertiary care children's hospitals affiliated with the Child Health Corporation of America (CHCA; Shawnee Mission, KS) which contribute data to the Pediatric Health Information System (PHIS) database. The PHIS is a national administrative database that contains resource utilization data from participating hospitals located in noncompeting markets of 27 states plus the District of Columbia. These hospitals account for 20% of all tertiary care children's hospitals in the United States.
Survey Content
Survey 1
A survey of hospital observation status practices has been developed by CHCA as a part of the PHIS data quality initiative (see Supporting Appendix: Survey 1 in the online version of this article). Hospitals that did not provide observation patient data to PHIS were excluded after an initial screening question. This survey obtained information regarding the designation of observation status within each hospital. Hospitals provided free‐text responses to questions related to the criteria used to define observation, and to admit patients into observation status. Fixed‐choice response questions were used to determine specific observation status utilization criteria and clinical guidelines (eg, InterQual and Milliman) used by hospitals for the designation of observation status to patients.
Survey 2
We developed a detailed follow‐up survey in order to characterize the structures and processes of care associated with observation status (see Supporting Appendix: Survey 2 in the online version of this article). Within the follow‐up survey, an initial screening question was used to determine all types of patients to which observation status is assigned within the responding hospitals. All other questions in Survey 2 were focused specifically on those patients who required additional care following ED evaluation and treatment. Fixed‐choice response questions were used to explore differences in care for patients under observation and those admitted as inpatients. We also inquired of hospital practices related to boarding of patients in the ED while awaiting admission to an inpatient bed.
Survey Distribution
Two web‐based surveys were distributed to all 42 CHCA hospitals that contribute data to PHIS. During the month of April 2010, each hospital's designated PHIS operational contact received e‐mail correspondence requesting their participation in each survey. Within hospitals participating in PHIS, Operational Contacts have been assigned to serve as the day‐to‐day PHIS contact person based upon their experience working with the PHIS data. The Operational Contacts are CHCA's primary contact for issues related to the hospital's data quality and reporting to PHIS. Non‐responders were contacted by e‐mail for additional requests to complete the surveys. Each e‐mail provided an introduction to the topic of the survey and a link to complete the survey. The e‐mail requesting participation in Survey 1 was distributed the first week of April 2010, and the survey was open for responses during the first 3 weeks of the month. The e‐mail requesting participation in Survey 2 was sent the third week of April 2010, and the survey was open for responses during the subsequent 2 weeks.
DATA ANALYSIS
Survey responses were collected and are presented as a descriptive summary of results. Hospital characteristics were summarized with medians and interquartile ranges for continuous variables, and with percents for categorical variables. Characteristics were compared between hospitals that responded and those that did not respond to Survey 2 using Wilcoxon rank‐sum tests and chi‐square tests as appropriate. All analyses were performed using SAS v.9.2 (SAS Institute, Cary, NC), and a P value <0.05 was considered statistically significant. The study was reviewed by the University of Michigan Institutional Review Board and considered exempt.
RESULTS
Responses to Survey 1 were available from 37 of 42 (88%) of PHIS hospitals (Figure 1). For Survey 2, we received responses from 20 of 42 (48%) of PHIS hospitals. Based on information available from Survey 1, we know that 20 of the 31 (65%) PHIS hospitals that report observation status patient data to PHIS responded to Survey 2. Characteristics of the hospitals responding and not responding to Survey 2 are presented in Table 1. Respondents provided hospital identifying information which allowed for the linkage of data, from Survey 1, to 17 of the 20 hospitals responding to Survey 2. We did not have information available to link responses from 3 hospitals.

Respondent N = 20 | Non‐Respondent N = 22 | P Value | |
---|---|---|---|
| |||
No. of inpatient beds Median [IQR] (excluding Obstetrics) | 245 [219283] | 282 [250381] | 0.076 |
Annual admissions Median [IQR] (excluding births) | 11,658 [8,64213,213] | 13,522 [9,83018,705] | 0.106 |
ED volume Median [IQR] | 60,528 [47,85082,955] | 64,486 [47,38684,450] | 0.640 |
Percent government payer Median [IQR] | 53% [4662] | 49% [4158] | 0.528 |
Region | |||
Northeast | 37% | 0% | 0.021 |
Midwest | 21% | 33% | |
South | 21% | 50% | |
West | 21% | 17% | |
Reports observation status patients to PHIS | 85% | 90% | 0.555 |
Based on responses to the surveys and our knowledge of data reported to PHIS, our current understanding of patient flow from ED through observation to discharge home, and the application of observation status to the encounter, is presented in Figure 2. According to free‐text responses to Survey 1, various methods were applied to designate observation status (gray shaded boxes in Figure 2). Fixed‐choice responses to Survey 2 revealed that observation status patients were cared for in a variety of locations within hospitals, including ED beds, designated observation units, and inpatient beds (dashed boxes in Figure 2). Not every facility utilized all of the listed locations for observation care. Space constraints could dictate the location of care, regardless of patient status (eg, observation vs inpatient), in hospitals with more than one location of care available to observation patients. While patient status could change during a visit, only the final patient status at discharge enters the administrative record submitted to PHIS (black boxes in Figure 2). Facility charges for observation remained a part of the visit record and were reported to PHIS. Hospitals may or may not bill for all assigned charges depending on patient status, length of stay, or other specific criteria determined by contracts with individual payers.

Survey 1: Classification of Observation Patients and Presence of Observation Units in PHIS Hospitals
According to responses to Survey 1, designated OUs were not widespread, present in only 12 of the 31 hospitals. No hospital reported treating all observation status patients exclusively in a designated OU. Observation status was defined by both duration of treatment and either level of care criteria or clinical care guidelines in 21 of the 31 hospitals responding to Survey 1. Of the remaining 10 hospitals, 1 reported that treatment duration alone defines observation status, and the others relied on prespecified observation criteria. When considering duration of treatment, hospitals variably indicated that anticipated or actual lengths of stay were used to determine observation status. Regarding the maximum hours a patient can be observed, 12 hospitals limited observation to 24 hours or fewer, 12 hospitals observed patients for no more than 36 to 48 hours, and the remaining 7 hospitals allowed observation periods of 72 hours or longer.
When admitting patients to observation status, 30 of 31 hospitals specified the criteria that were used to determine observation admissions. InterQual criteria, the most common response, were used by 23 of the 30 hospitals reporting specified criteria; the remaining 7 hospitals had developed hospital‐specific criteria or modified existing criteria, such as InterQual or Milliman, to determine observation status admissions. In addition to these criteria, 11 hospitals required a physician order for admission to observation status. Twenty‐four hospitals indicated that policies were in place to change patient status from observation to inpatient, or inpatient to observation, typically through processes of utilization review and application of criteria listed above.
Most hospitals indicated that they faced substantial variation in the standards used from one payer to another when considering reimbursement for care delivered under observation status. Hospitals noted that duration‐of‐carebased reimbursement practices included hourly rates, per diem, and reimbursement for only the first 24 or 48 hours of observation care. Hospitals identified that payers variably determined reimbursement for observation based on InterQual level of care criteria and Milliman care guidelines. One hospital reported that it was not their practice to bill for the observation bed.
Survey 2: Understanding Observation Patient Type Administrative Data Following ED Care Within PHIS Hospitals
Of the 20 hospitals responding to Survey 2, there were 2 hospitals that did not apply observation status to patients after ED care and 2 hospitals that did not provide complete responses. The remaining 16 hospitals provided information regarding observation status as applied to patients after receiving treatment in the ED. The settings available for observation care and patient groups treated within each area are presented in Table 2. In addition to the patient groups listed in Table 2, there were 4 hospitals where patients could be admitted to observation status directly from an outpatient clinic. All responding hospitals provided virtual observation care (ie, observation status is assigned but the patient is cared for in the existing ED or inpatient ward). Nine hospitals also provided observation care within a dedicated ED or ward‐based OU (ie, a separate clinical area in which observation patients are treated).
Hospital No. | Available Observation Settings | Patient Groups Under Observation: ED | Patient Groups Under Observation: Post‐Op | Patient Groups Under Observation: Test/Treat | UR to Assign Obs Status | When Obs Status Is Assigned |
---|---|---|---|---|---|---|
1 | Virtual inpatient | X | X | X | Yes | Discharge |
Ward‐based OU | X | X | No | |||
2 | Virtual inpatient | X | X | Yes | Admission | |
Ward‐based OU | X | X | X | No | ||
3 | Virtual inpatient | X | X | X | Yes | Discharge |
Ward‐based OU | X | X | X | Yes | ||
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
4 | Virtual inpatient | X | X | X | Yes | Discharge |
ED OU | X | No | ||||
Virtual ED | X | No | ||||
5 | Virtual inpatient | X | X | X | N/A | Discharge |
6 | Virtual inpatient | X | X | X | Yes | Discharge |
7 | Virtual inpatient | X | X | Yes | No response | |
Ward‐based OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
8 | Virtual inpatient | X | X | X | Yes | Admission |
9 | Virtual inpatient | X | X | Yes | Discharge | |
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
10 | Virtual inpatient | X | X | X | Yes | Admission |
ED OU | X | Yes | ||||
11 | Virtual inpatient | X | X | Yes | Discharge | |
Ward‐based OU | X | X | Yes | |||
ED OU | X | Yes | ||||
Virtual ED | X | Yes | ||||
12 | Virtual inpatient | X | X | X | Yes | Admission |
13 | Virtual inpatient | X | X | N/A | Discharge | |
Virtual ED | X | N/A | ||||
14 | Virtual inpatient | X | X | X | Yes | Both |
15 | Virtual inpatient | X | X | Yes | Admission | |
Ward‐based OU | X | X | Yes | |||
16 | Virtual inpatient | X | Yes | Admission |
When asked to identify differences between clinical care delivered to patients admitted under virtual observation and those admitted under inpatient status, 14 of 16 hospitals selected the option "There are no differences in the care delivery of these patients." The differences identified by 2 hospitals included patient care orders, treatment protocols, and physician documentation. Within the hospitals that reported utilization of virtual ED observation, 2 reported differences in care compared with other ED patients, including patient care orders, physician rounds, documentation, and discharge process. When admitted patients were boarded in the ED while awaiting an inpatient bed, 11 of 16 hospitals allowed for observation or inpatient level of care to be provided in the ED. Fourteen hospitals allowed an admitted patient to be discharged home from boarding in the ED without ever receiving care in an inpatient bed. The discharge decision was made by ED providers in 7 hospitals, and by inpatient providers in the other 7 hospitals.
Responses to questions providing detailed information on the process of utilization review were provided by 12 hospitals. Among this subset of hospitals, utilization review was consistently used to assign virtual inpatient observation status and was applied at admission (n = 6) or discharge (n = 8), depending on the hospital. One hospital applied observation status at both admission and discharge; 1 hospital did not provide a response. Responses to questions regarding utilization review are presented in Table 3.
Survey Question | Yes N (%) | No N (%) |
---|---|---|
Preadmission utilization review is conducted at my hospital. | 3 (25) | 9 (75) |
Utilization review occurs daily at my hospital. | 10 (83) | 2 (17) |
A nonclinician can initiate an order for observation status. | 4 (33) | 8 (67) |
Status can be changed after the patient has been discharged. | 10 (83) | 2 (17) |
Inpatient status would always be assigned to a patient who receives less than 24 hours of care and meets inpatient criteria. | 9 (75) | 3 (25) |
The same status would be assigned to different patients who received the same treatment of the same duration but have different payers. | 6 (50) | 6 (50) |
DISCUSSION
This is the largest descriptive study of pediatric observation status practices in US freestanding children's hospitals and, to our knowledge, the first to include information about both the ED and inpatient treatment environments. There are two important findings of this study. First, designated OUs were uncommon among the group of freestanding children's hospitals that reported observation patient data to PHIS in 2010. Second, despite the fact that hospitals reported observation care was delivered in a variety of settings, virtual inpatient observation status was nearly ubiquitous. Among the subset of hospitals that provided information about the clinical care delivered to patients admitted under virtual inpatient observation, hospitals frequently reported there were no differences in the care delivered to observation patients when compared with other inpatients.
The results of our survey indicate that designated OUs are not a commonly available model of observation care in the study hospitals. In fact, the vast majority of the hospitals used virtual inpatient observation care, which did not differ from the care delivered to a child admitted as an inpatient. ED‐based OUs, which often provide operationally and physically distinct care to observation patients, have been touted as cost‐effective alternatives to inpatient care,18–20 resulting in fewer admissions and reductions in length of stay19, 20 without a resultant increase in return ED visits or readmissions.21–23 Research is needed to determine the patient‐level outcomes for short‐stay patients in the variety of available treatment settings (eg, physically or operationally distinct OUs and virtual observation), and to evaluate these outcomes in comparison to results published from designated OUs. The operationally and physically distinct features of a designated OU may be required to realize the benefits of observation attributed to individual patients.
While observation care has been historically provided by emergency physicians, there is increasing interest in the role of inpatient providers in observation care.9 According to our survey, children were admitted to observation status directly from clinics, following surgical procedures, scheduled tests and treatment, or after evaluation and treatment in the ED. As many of these children undergo virtual observation in inpatient areas, the role of inpatient providers, such as pediatric hospitalists, in observation care may be an important area for future study, education, and professional development. Novel models of care, with hospitalists collaborating with emergency physicians, may be of benefit to the children who require observation following initial stabilization and treatment in the ED.24, 25
We identified variation between hospitals in the methods used to assign observation status to an episode of care, including a wide range of length of stay criteria and different approaches to utilization review. In addition, the criteria used to reimburse for observation varied among payers, even within individual hospitals. The practices reported in our survey may be driven by issues of reimbursement rather than by a model of optimizing patient care outcomes using designated OUs. Variations in reimbursement may limit hospital efforts to refine models of observation care for children. Designated OUs have been suggested as a method for improving ED patient flow,26 increasing inpatient capacity,27 and reducing costs of care.28 Standardization of observation status criteria and consistent reimbursement for observation services may be necessary for hospitals to develop operationally and physically distinct OUs, which may be essential to achieving the proposed benefits of observation medicine on costs of care, patient flow, and hospital capacity.
LIMITATIONS
Our study results should be interpreted with the following limitations in mind. First, the surveys were distributed only to freestanding children's hospitals that participate in PHIS. As a result, our findings may not be generalizable to the experiences of other children's hospitals or general hospitals caring for children. Questions in Survey 2 were focused on understanding observation care delivered to patients following ED care, which may differ from observation practices related to a direct admission or following scheduled procedures, tests, or treatments. It is important to note that hospitals that do not report observation status patient data to PHIS are still providing care to children with acute conditions that respond to brief periods of hospital treatment, even though it is not labeled observation. However, it was beyond the scope of this study to characterize the care delivered to all patients who experience a short stay.
The second main limitation of our study is the lower response rate to Survey 2. In addition, several surveys contained incomplete responses, which further limit our sample size for some questions, specifically those related to utilization review. The lower response to Survey 2 could be related to the timing of the distribution of the 2 surveys, or to the information contained in the introductory e‐mail describing Survey 2. Hospitals with designated observation units, or where observation status care has been receiving attention, may have been more likely to respond to our survey, which may bias our results to reflect the experiences of hospitals experiencing particular successes or challenges with observation status care. A comparison of known hospital characteristics revealed no differences between hospitals that did and did not provide responses to Survey 2, but other unmeasured differences may exist.
CONCLUSION
Observation status is assigned using duration of treatment, clinical care guidelines, and level of care criteria, and is defined differently by individual hospitals and payers. Currently, the most widely available setting for pediatric observation status is within a virtual inpatient unit. Our results suggest that the care delivered to observation patients in virtual inpatient units is consistent with care provided to other inpatients. As such, observation status is largely an administrative/billing designation, which does not appear to reflect differences in clinical care. A consistent approach to the assignment of patients to observation status, and treatment of patients under observation among hospitals and payers, may be necessary to compare quality outcomes. Studies of the clinical care delivery and processes of care for short‐stay patients are needed to optimize models of pediatric observation care.
- Observation medicine: the healthcare system's tincture of time. In: Graff LG, ed. Principles of Observation Medicine. Dallas, TX: American College of Emergency Physicians; 2010. Available at: http://www.acep.org/content.aspx?id=46142. Accessed February 18, 2011.
- Hospital ‘observation’ status a matter of billing. The Columbus Dispatch. February 14, 2011.
- Hospital payments downgraded. Philadelphia Business Journal. February 18, 2011.
- Medicare rules give full hospital benefits only to those with ‘inpatient’ status. The Washington Post. September 7, 2010.
- Hospitals caught between a rock and a hard place over observation. Health Leaders Media. September 15, 2010.
- AHA: observation status fears on the rise. Health Leaders Media. October 29, 2010.
- Put your hospital bill under a microscope. The New York Times. September 13, 2010.
- Medicare Hospital Manual Section 455. Washington, DC: Department of Health and Human Services, Centers for Medicare and Medicaid Services; 2001.
- The Observation Unit: An Operational Overview for the Hospitalist. Society of Hospital Medicine White Paper. May 21, 2009. Available at: http://www.hospitalmedicine.org/Content/NavigationMenu/Publications/White Papers/White_Papers.htm. Accessed May 21, 2009.
- Utilization and unexpected hospitalization rates of a pediatric emergency department 23‐hour observation unit. Pediatr Emerg Care. 2008;24(9):589–594.
- The pediatric hybrid observation unit: an analysis of 6477 consecutive patient encounters. Pediatrics. 2005;115(5):e535–e542.
- Pediatric observation units in the United States: a systematic review. J Hosp Med. 2010;5(3):172–182.
- Pediatric emergency department directors' benchmarking survey: fiscal year 2001. Pediatr Emerg Care. 2003;19(3):143–147.
- Pediatric observation status beds on an inpatient unit: an integrated care model. Pediatr Emerg Care. 2004;20(1):17–21.
- Impact of a short stay unit on asthma patients admitted to a tertiary pediatric hospital. Qual Manag Health Care. 1997;6(1):14–22.
- A national survey of observation units in the United States. Am J Emerg Med. 2003;21(7):529–533.
- A survey of observation units in the United States. Am J Emerg Med. 1989;7(6):576–580.
- When the patient requires observation not hospitalization. J Nurs Admin. 1988;18(10):20–23.
- A reduction in hospitalization, length of stay, and hospital charges for croup with the institution of a pediatric observation unit. Am J Emerg Med. 2006;24(7):818–821.
- Outpatient oral rehydration in the United States. Am J Dis Child. 1986;140(3):211–215.
- Pediatric closed head injuries treated in an observation unit. Pediatr Emerg Care. 2005;21(10):639–644.
- Use of pediatric observation unit for treatment of children with dehydration caused by gastroenteritis. Pediatr Emerg Care. 2006;22(1):1–6.
- Children with asthma admitted to a pediatric observation unit. Pediatr Emerg Care. 2005;21(10):645–649.
- Redefining the community pediatric hospitalist: the combined pediatric ED/inpatient unit. Pediatr Emerg Care. 2007;23(1):33–37.
- Program description: a hospitalist‐run, medical short‐stay unit in a teaching hospital. Can Med Assoc J. 2000;163(11):1477–1480.
- Impact of an observation unit and an emergency department‐admitted patient transfer mandate in decreasing overcrowding in a pediatric emergency department: a discrete event simulation exercise. Pediatr Emerg Care. 2009;25(3):160–163.
- Children's hospitals do not acutely respond to high occupancy. Pediatrics. 125(5):974–981.
- Trends in high‐turnover stays among children hospitalized in the United States, 1993‐2003. Pediatrics. 2009;123(3):996–1002.
Copyright © 2011 Society of Hospital Medicine
Refined Risk Stratification Guides Leukemia Transplant Decisions
SAN FRANCISCO – Risk stratification is becoming progressively more refined in adults with acute leukemia, helping to identify patients most likely to benefit from transplantation, according to Dr. Robert S. Negrin.
"What has become clear is that there is important prognostic information that one can gain from the patient at the time of diagnosis that can really help guide therapy," Dr. Negrin, a professor of medicine and chief of the division of blood and marrow transplantation at Stanford (Calif.) University, said at the annual Oncology Congress.
"Clearly, one can identify patients who are at higher risk for [poor outcome]. They can be split, cytogenetics being the first pass and then molecular markers being the second pass," he told attendees.
AML Status Predicts Outcome
Typically, three groups of adults with acute myeloid leukemia (AML) are offered transplantation, he said: those having a failure of induction chemotherapy, those in a first complete remission but having an intermediate or high risk for relapse, and those beyond first complete remission.
"The No. 1 predictor of outcome is the status of the disease at the time of transplant consideration, by far and away," noted Dr. Negrin. With transplantation, the 10-year overall survival rate is only 17% for the patients with induction failure or relapsed disease, in the Stanford experience. But patients in first complete remission fare better, at 57%.
Outcomes among patients in first complete remission are varied, however, with cytogenetics identifying distinct subgroups: better risk (10%-15% of these patients), poor risk (20%-30%), and intermediate risk (all the rest).
The better-risk subgroup does fairly well with chemotherapy alone, according to Dr. Negrin. "Those are patients that we generally would recommend not to consider transplant in first complete remission. One would only consider transplant at time of relapse or second remission." At the other extreme, the poor-risk subgroup "should clearly be considered for transplant up front."
Then there is the large subgroup having intermediate risk, many of whom have normal cytogenetics. Molecular markers have shown these cytogenetically normal AMLs to be highly heterogeneous (Blood 2010;115:453-74) – information that is now being used to guide transplant decisions.
For example, patients with mutation of the nucleophosmin (NPM1) gene have a favorable prognosis and are generally managed with chemotherapy alone. In contrast, their counterparts with a mutation of the FMS-like tyrosine kinase 3 (FLT3) gene have an unfavorable prognosis with chemotherapy and may fare better with transplantation.
"So this [molecular analysis] is very helpful because it helps split those patients with cytogenetically normal AML into favorable and unfavorable groups of patients," he commented. And he predicted that such molecular risk stratification will likely be even further refined in the future.
Research is also showing that molecular prognostic information may modify cytogenetic prognostic information. For instance, in the better-risk subgroup in first remission, among patients having the favorable inversion 16 cytogenetic profile, those with a KIT mutation have poorer survival with chemotherapy than do their counterparts with wild-type KIT (J. Clin. Oncol. 2006;24:3904-11).
"By and large, unfortunately, negative markers overcome the positive ones. That’s obviously a gross generalization, but unfortunately, it is reasonably accurate," Dr. Negrin commented. "So just finding a favorable cytogenetic abnormality does not tell the whole story. One needs to do the molecular studies as well."
And doing them early is key.
"Cytogenetic and molecular studies should be done on all leukemic patients," he stressed. "When we see patients in referral, a lot of patients still are not having these molecular studies done on a routine basis, and that’s unfortunate because it’s very important that we do the best we can to try to [evaluate] patients with the most advanced technologies we have. ... It’s very important that we identify these patients up front to treat them as appropriately as we can."
Know bcr-abl Status in ALL
Risk stratification is also improving among adults with acute lymphoblastic leukemia (ALL). In these cases as well, three groups are typically offered transplantation: those having a failure of induction chemotherapy, those in first complete remission having high-risk disease, and those in either a second complete remission or first relapse.
Disease status at the time of transplantation is also the best predictor of outcome in ALL. In the Stanford experience, the 10-year rate of overall survival is 62% for patients who undergo transplantation in first complete remission, compared with 43% for patients having relapsed or refractory disease at transplantation.
In terms of cytogenetics, the bcr-abl translocation (Philadelphia chromosome) is "a very ominous" finding among patients with B cell–lineage ALL, according to Dr. Negrin. These patients are not cured by intensive chemotherapy and derive only short-term benefit from tyrosine kinase inhibitors. Transplantation can achieve cure, however, although less often than in other ALL subtypes.
At Stanford, the 10-year rate of overall survival for patients having this cytogenetic abnormality is about 55% among those in first complete remission at transplantation, and 20% among those beyond first complete remission.
"Clearly, patients with Philadelphia chromosome–positive ALL are at extraordinary risk and are those who do benefit from transplant," he said.
Dr. Negrin reported that he sits on the data safety monitoring boards for Abbott Pharmaceuticals and Ziopharm, and is a consultant to Genzyme and Baxter. The Oncology Congress is presented by Reed Medical Education. Reed Medical Education and this news organization are owned by Reed Elsevier Inc.
Know bcr-abl Status in ALL
Risk stratification is also improving among adults with acute lymphoblastic leukemia (ALL). In these cases as well, three groups are typically offered transplantation: those having a failure of induction chemotherapy, those in first complete remission having high-risk disease, and those in either a second complete remission or first relapse.
Disease status at the time of transplantation is also the best predictor of outcome in ALL. In the Stanford experience, the 10-year rate of overall survival is 62% for patients who undergo transplantation in first complete remission, compared with 43% for patients having relapsed or refractory disease at transplantation.
In terms of cytogenetics, the bcr-abl translocation (Philadelphia chromosome) is "a very ominous" finding among patients with B cell–lineage ALL, according to Dr. Negrin. These patients are not cured by intensive chemotherapy and derive only short-term benefit from tyrosine kinase inhibitors. Transplantation can achieve cure, however, although less often than in other ALL subtypes.
At Stanford, the 10-year rate of overall survival for patients having this cytogenetic abnormality is about 55% among those in first complete remission at transplantation, and 20% among those beyond first complete remission.
"Clearly, patients with Philadelphia chromosome–positive ALL are at extraordinary risk and are those who do benefit from transplant," he said.
Dr. Negrin reported that he sits on the data safety monitoring boards for Abbott Pharmaceuticals and Ziopharm, and is a consultant to Genzyme and Baxter. The Oncology Congress is presented by Reed Medical Education. Reed Medical Education and this news organization are owned by Reed Elsevier Inc.
EXPERT ANALYSIS FROM THE ANNUAL ONCOLOGY CONGRESS
Measuring Quality of Care
The measurement of quality of care has been the mantra of health care policy for the past decade, and has become as American as apple pie and Chevrolet. Yet there have been few data showing that the institution of quality of care guidelines has had any impact on mortality or morbidity.
Despite this lack of data, hospitals are being financially rewarded or penalized based on their ability to meet guidelines established by the Centers for Medicare and Medicaid Services in conjunction with the American College of Cardiology and the American Heart Association. Two recent reports provide insight into the progress we have achieved with guidelines in heart failure and in shortening the door-to-balloon (D2B) time for percutaneous coronary intervention (PCI) in ST-segment elevation MI (STEMI).
Decreasing heart failure readmission within 30 days, which occurs in approximately one-third of hospitalized patients, has become a target for the quality improvement process. Using the "Get With the Guidelines Heart Failure" registry, a recent analysis indicates that there is a very poor correlation between the achievement of those standards and the 30-day mortality and readmission rates (Circulation 2011;124:712-9).
The guidelines include measurement of cardiac function, application of the usual heart failure medications, and discharge instructions. Data were collected in almost 20,000 patients in 153 hospitals during 2005. Adherence to these guidelines was quite good and was achieved in more than 75% of the hospitals, yet it was unrelated to 30-day mortality or hospital readmission.
The authors emphasized that the factors that affect survival and readmission are very heterogeneous. Basing pay-for-performance standards on a single measure (such as readmission rates) may penalize institutions that face impediments that are unrelated to performance measurements. Penalizing hospitals that have high readmission rates as a result of a large population of vulnerable patients may harm institutions that actually could benefit from more resources in order to achieve better outcomes.
The effectiveness of PCI, when it is performed in less than 90 minutes in STEMI patients, has been supported by clinical data from selected cardiac centers. The application to the larger patient population of the guideline to shorten D2B time to less than 90 minutes has been championed by the ACC, which launched the D2B Alliance in 2006 and by the AHA in 2007 with its Mission: Lifeline program.
The success of these efforts was reported in August (Circulation 2011;124:1038-45) and indicates that in a selected group of CMS-reporting hospitals, D2B time decreased from 96 minutes in 2005 to 64 minutes in 2010. In addition, the percentage of patients with a D2B time of less than 90 minutes increased from 44% to 91%, and that of patients with D2B of less than 75 minutes rose from 27% to 70%. The success of this effort is to be applauded, but the report is striking for its absence of any information regarding outcomes of the shortened D2B time. Unfortunately, there is little outcome information available, with the exception of data from Michigan on all Medicare providers in that state, which indicates that although D2B time decreased by 90 minutes, there was no significant benefit.
Measurement of quality remains elusive, in spite of the good intentions of physicians and health planners to use a variety of seemingly beneficial criteria for its definition.
As consumers, we know that quality is not easy to measure. Most of us can compare the quality of American automobiles vs. their foreign competitors by "kicking the tires," that is, by doing a little research. But even with this knowledge, we are not always sure that the particular car we buy will be better or last longer. Health care faces the same problem. Establishing quality care measurements will require a great deal of further research before we can reward or penalize hospitals and physicians for their performance.
It is possible that in our zeal to measure what we can, we are confusing process with content. How to put a number on the performance that leads to quality remains uncertain using our current methodology.
Dr. Sidney Goldstein is professor of medicine at Wayne State University and division head emeritus of cardiovascular medicine at Henry Ford Hospital, both in Detroit. He is on data safety monitoring committees for the National Institutes of Health and several pharmaceutical companies.