The Effect of Hospital Safety Net Status on the Association Between Bundled Payment Participation and Changes in Medical Episode Outcomes


Bundled payments represent one of the most prominent value-based payment arrangements nationwide. Under this payment approach, hospitals assume responsibility for quality and costs across discrete episodes of care. Hospitals that maintain quality while achieving cost reductions are eligible for financial incentives, whereas those that do not are subject to financial penalties.

To date, the largest completed bundled payment program nationwide is Medicare’s Bundled Payments for Care Improvement (BPCI) initiative. Among four different participation models in BPCI, hospital enrollment was greatest in Model 2, in which episodes spanned from hospitalization through 90 days of post–acute care. The overall results from BPCI Model 2 have been positive: hospitals participating in both common surgical episodes, such as joint replacement surgery, and medical episodes, such as acute myocardial infarction (AMI) and congestive heart failure (CHF), have demonstrated long-term financial savings with stable quality performance.1,2

Safety net hospitals that disproportionately serve low-income patients may fare differently than other hospitals under bundled payment models. At baseline, these hospitals typically have fewer financial resources, which may limit their ability to implement measures to standardize care during hospitalization (eg, clinical pathways) or after discharge (eg, postdischarge programs and other strategies to reduce readmissions).3 Efforts to redesign care may be further complicated by greater clinical complexity and social and structural determinants of health among patients seeking care at safety net hospitals. Given the well-known interactions between social determinants and health conditions, these factors are highly relevant for patients hospitalized at safety net hospitals for acute medical events or exacerbations of chronic conditions.

Existing evidence has shown that safety net hospitals have not performed as well as other hospitals in other value-based reforms.4-8 In the context of bundled payments for joint replacement surgery, safety net hospitals have been less likely to achieve financial savings but more likely to receive penalties.9-11 Moreover, the savings achieved by safety net hospitals have been smaller than those achieved by non–safety net hospitals.12

Despite these concerning findings, there are few data about how safety net hospitals have fared under bundled payments for common medical conditions. To address this critical knowledge gap, we evaluated the effect of hospital safety net status on the association between BPCI Model 2 participation and changes in outcomes for medical condition episodes.

METHODS

This study was approved by the University of Pennsylvania Institutional Review Board with a waiver of informed consent.

Data

We used 100% Medicare claims data from 2011 to 2016 for patients receiving care at hospitals participating in BPCI Model 2 for one of four common medical condition episodes: AMI, pneumonia, CHF, and chronic obstructive pulmonary disease (COPD). A 20% random national sample was used for patients hospitalized at nonparticipant hospitals. Publicly available data from the Centers for Medicare & Medicaid Services (CMS) were used to identify hospital enrollment in BPCI Model 2, while data from the 2017 CMS Impact File were used to quantify each hospital’s disproportionate patient percentage (DPP), which reflects the proportion of Medicaid and low-income Medicare beneficiaries served and determines a hospital’s eligibility to earn disproportionate share hospital payments.

Data from the 2011 American Hospital Association Annual Survey were used to capture hospital characteristics, such as number of beds, teaching status, and profit status, while data from the Medicare provider of service, beneficiary summary, and accountable care organization files were used to capture additional hospital characteristics and market characteristics, such as population size and Medicare Advantage penetration. The Medicare Provider Enrollment, Chain, and Ownership System file was used to identify and remove BPCI episodes from physician group practices. State-level data about area deprivation index—a census tract–based measure that incorporates factors such as income, education, employment, and housing quality to describe socioeconomic disadvantage among neighborhoods—were used to define socioeconomically disadvantaged areas as those in the top 20% of area deprivation index statewide.13 Markets were defined using hospital referral regions.14

Study Periods and Hospital Groups

Our analysis spanned the period between January 1, 2011, and December 31, 2016. We separated this period into a baseline period (January 2011–September 2013) prior to the start of BPCI and a subsequent BPCI period (October 2013–December 2016).

We defined any hospitals participating in BPCI Model 2 across this period for any of the four included medical condition episodes as BPCI hospitals. Because hospitals were able to enter or exit BPCI over time, and enrollment data were provided by CMS as quarterly participation files, we were able to identify dates of entry into or exit from BPCI over time by hospital-condition pairs. Hospitals were considered BPCI hospitals until the end of the study period, regardless of subsequent exit.

We defined non-BPCI hospitals as those that never participated in the program and had 10 or more admissions in the BPCI period for the included medical condition episodes. We used this approach to minimize potential bias arising from BPCI entry and exit over time.

Across both BPCI and non-BPCI hospital groups, we followed prior methods and defined safety net hospitals based on a hospital’s DPP.15 Specifically, safety net hospitals were those in the top quartile of DPP among all hospitals nationwide, and hospitals in the other three quartiles were defined as non–safety net hospitals.9,12
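The top-quartile rule described above can be illustrated with a minimal Python sketch. The hospital identifiers and DPP values are hypothetical and are not study data; the choice of quantile method and the treatment of a hospital sitting exactly at the 75th percentile (here, not flagged) are assumptions, since the text does not specify them.

```python
from statistics import quantiles

def flag_safety_net(dpp_by_hospital):
    """Return {hospital_id: bool}; True when the hospital's DPP is in the top quartile."""
    values = sorted(dpp_by_hospital.values())
    # quantiles(..., n=4) returns three cut points; index 2 is the 75th percentile.
    q3 = quantiles(values, n=4, method="inclusive")[2]
    # Assumption: strictly above the cut point counts as top quartile.
    return {hosp: dpp > q3 for hosp, dpp in dpp_by_hospital.items()}

# Hypothetical DPP values for eight hospitals (illustration only).
example_dpp = {
    "A": 0.05, "B": 0.10, "C": 0.15, "D": 0.20,
    "E": 0.25, "F": 0.30, "G": 0.45, "H": 0.60,
}
flags = flag_safety_net(example_dpp)
safety_net_ids = sorted(h for h, is_sn in flags.items() if is_sn)
print(safety_net_ids)
```

With these made-up values, the two hospitals with the highest DPP are flagged as safety net and the remaining six as non–safety net.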

Study Sample and Episode Construction

Our study sample included Medicare fee-for-service beneficiaries admitted to BPCI and non-BPCI hospitals for any of the four medical conditions of interest. We adhered to BPCI program rules, which defined each episode type based on a set of Medicare Severity Diagnosis Related Group (MS-DRG) codes (eg, myocardial infarction episodes were defined as MS-DRGs 280-282). From this sample, we excluded beneficiaries with end-stage renal disease or insurance coverage through Medicare Advantage, as well as beneficiaries who died during the index hospital admission, had any non–Inpatient Prospective Payment System claims, or lacked continuous primary Medicare fee-for-service coverage either during the episode or in the 12 months preceding it.

We constructed 90-day medical condition episodes that began with hospital admission and spanned 90 days after hospital discharge. To avoid bias arising from CMS rules related to precedence (rules for handling how overlapping episodes are assigned to hospitals), we followed prior methods and constructed naturally occurring episodes by assigning overlapping ones to the earlier hospital admission.2,16 From this set of episodes, we identified those for AMI, CHF, COPD, and pneumonia.
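One way to read the "naturally occurring episode" rule is sketched below in Python. This is an interpretation, not the authors' code: it assumes that an admission falling inside an earlier episode's window (index stay plus 90 days after discharge) stays with the earlier admission rather than starting a new episode. The tuple layout, patient and hospital identifiers, and dates are all hypothetical.

```python
from datetime import date, timedelta

EPISODE_DAYS = 90  # post-discharge window stated in the text

def build_episodes(admissions):
    """admissions: list of (patient_id, admit_date, discharge_date, hospital_id).

    Returns (patient_id, admit_date, hospital_id, window_end) for each admission
    that starts an episode. Admissions inside an earlier window are not returned.
    """
    episodes = []
    window_end = {}  # patient_id -> end date of that patient's most recent episode
    for pid, admit, discharge, hosp in sorted(admissions, key=lambda a: (a[0], a[1])):
        if pid in window_end and admit <= window_end[pid]:
            # Overlaps an earlier episode: assigned to the earlier admission.
            continue
        end = discharge + timedelta(days=EPISODE_DAYS)
        window_end[pid] = end
        episodes.append((pid, admit, hosp, end))
    return episodes

# Hypothetical admissions for one patient (illustration only).
admissions = [
    ("p1", date(2014, 1, 1), date(2014, 1, 5), "H1"),
    ("p1", date(2014, 2, 1), date(2014, 2, 3), "H2"),  # inside the first window
    ("p1", date(2014, 6, 1), date(2014, 6, 4), "H2"),  # after the first window ends
]
episodes = build_episodes(admissions)
```

In this example the February admission does not start its own episode, so the patient contributes two episodes, one attributed to each hospital.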

Exposure and Covariate Variables

Our study exposure was the interaction between hospital safety net status and hospital BPCI participation, which captured whether the association between BPCI participation and outcomes varied by safety net status (eg, whether differential changes in an outcome related to BPCI participation were different for safety net and non–safety net hospitals in the program). BPCI participation was defined using a time-varying indicator of BPCI participation to distinguish between episodes occurring under the program (ie, after a hospital began participating) or before participation in it. Covariates were chosen based on prior studies and included patient variables such as age, sex, Elixhauser comorbidities, frailty, and Medicare/Medicaid dual-eligibility status.17-23 Additionally, our analysis included market variables such as population size and Medicare Advantage penetration.
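The exposure coding can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the entry dates, hospital identifiers, and function names are hypothetical, and timing is keyed to the episode start date, which the text does not specify. Because the Methods state that hospitals were treated as BPCI hospitals through the end of the study period regardless of exit, only the entry date is used.

```python
from datetime import date

def bpci_indicator(episode_start, hospital_id, condition, entry_dates):
    """1 if the episode began on or after the hospital's BPCI entry for that condition.

    entry_dates maps (hospital_id, condition) pairs to the first participation
    date; hospitals that never participated have no entry and always get 0.
    """
    entry = entry_dates.get((hospital_id, condition))
    return int(entry is not None and episode_start >= entry)

def exposure_terms(episode_start, hospital_id, condition, entry_dates, safety_net):
    """Return (BPCI indicator, BPCI x safety net interaction) for one episode."""
    bpci = bpci_indicator(episode_start, hospital_id, condition, entry_dates)
    return bpci, bpci * int(safety_net[hospital_id])

# Hypothetical entry dates and safety net flags (illustration only).
entry_dates = {("H1", "CHF"): date(2014, 1, 1), ("H2", "CHF"): date(2015, 4, 1)}
safety_net = {"H1": True, "H2": False, "H3": True}

before_entry = exposure_terms(date(2013, 6, 1), "H1", "CHF", entry_dates, safety_net)
after_entry_sn = exposure_terms(date(2014, 6, 1), "H1", "CHF", entry_dates, safety_net)
after_entry_non_sn = exposure_terms(date(2016, 1, 1), "H2", "CHF", entry_dates, safety_net)
never_bpci = exposure_terms(date(2016, 1, 1), "H3", "CHF", entry_dates, safety_net)
```

Only episodes at a participating safety net hospital after its entry date receive a value of 1 on both terms; episodes at nonparticipating hospitals receive 0 on both regardless of safety net status.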

Outcome Variables

The prespecified primary study outcome was standardized 90-day postdischarge spending. This outcome was chosen owing to the lack of variation in standardized index hospitalization spending given the MS-DRG system and prior work suggesting that bundled payment participants instead targeted changes to postdischarge utilization and spending.2 Secondary outcomes included 90-day unplanned readmission rates, 90-day postdischarge mortality rates, discharge to institutional post–acute care providers (defined as either skilled nursing facilities [SNFs] or inpatient rehabilitation facilities), discharge home with home health agency services, and—among patients discharged to SNFs—SNF length of stay (LOS), measured in number of days.

Statistical Analysis

We described the characteristics of patients and hospitals in our samples. In adjusted analyses, we used a series of difference-in-differences (DID) generalized linear models to conduct a heterogeneity analysis evaluating whether the relationship between hospital BPCI participation and medical condition episode outcomes varied based on hospital safety net status.

In these models, the DID estimator was a time-varying indicator of hospital BPCI participation (equal to 1 for episodes occurring during the BPCI period at BPCI hospitals after they initiated participation; 0 otherwise) together with hospital and quarter-time fixed effects. To examine differences in the association between BPCI and episode outcomes by hospital safety net status—that is, whether there was heterogeneity in the outcome changes between safety net and non–safety net hospitals participating in BPCI—our models also included an interaction term between hospital safety net status and the time-varying BPCI participation term (Appendix Methods). In this approach, BPCI safety net and BPCI non–safety net hospitals were compared with non-BPCI hospitals as the comparison group. The comparisons were chosen to yield the most policy-salient findings, since Medicare evaluated hospitals in BPCI, whether safety net or not, by comparing their performance to nonparticipating hospitals, whether safety net or not.
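For orientation, a model of the kind described here can be written as below. This is a sketch assembled from the description in this section, not the specification from the Appendix Methods; the symbols are illustrative choices.

```latex
Y_{iht} = \beta_1\,\mathrm{BPCI}_{ht}
        + \beta_2\,\bigl(\mathrm{BPCI}_{ht} \times \mathrm{SafetyNet}_{h}\bigr)
        + X_{iht}^{\top}\gamma
        + \alpha_h + \tau_t + \delta_{d(i)} + \varepsilon_{iht}
```

Here $Y_{iht}$ is the outcome for episode $i$ at hospital $h$ in quarter $t$; $\mathrm{BPCI}_{ht}$ is the time-varying participation indicator; $\mathrm{SafetyNet}_{h}$ is the hospital's safety net status; $X_{iht}$ collects patient and time-varying market covariates; and $\alpha_h$, $\tau_t$, and $\delta_{d(i)}$ are hospital, quarter, and MS-DRG fixed effects. Under this sketch, $\beta_2$ would correspond to the differential change for safety net relative to non–safety net BPCI hospitals. If safety net status is fixed for each hospital over the study period, its main effect is absorbed by $\alpha_h$ and does not appear as a separate term.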

All models controlled for patient and time-varying market characteristics and included hospital fixed effects (to account for time-invariant hospital market characteristics) and MS-DRG fixed effects. All outcomes were evaluated using models with identity links and normal distributions (ie, ordinary least squares). These variables and models were applied to data from the baseline period to examine consistency with the parallel trends assumption. Overall, Wald tests did not indicate divergent baseline period trends in outcomes between BPCI and non-BPCI hospitals (Appendix Figure 1) or BPCI safety net versus BPCI non–safety net hospitals (Appendix Figure 2).

We conducted sensitivity analyses to evaluate the robustness of our results. First, instead of comparing differential changes at BPCI safety net vs BPCI non–safety net hospitals (ie, evaluating safety net status among BPCI hospitals), we evaluated changes at BPCI safety net vs non-BPCI safety net hospitals compared with changes at BPCI non–safety net vs non-BPCI non–safety net hospitals (ie, marginal differences in the changes associated with BPCI participation among safety net vs non–safety net hospitals). Because safety net hospitals in BPCI were compared with nonparticipating safety net hospitals, and non–safety net hospitals in BPCI were compared with nonparticipating non–safety net hospitals, this set of analyses helped address potential concerns about unobservable differences between safety net and non–safety net organizations and their potential impact on our findings.

Second, we used an alternative, BPCI-specific definition for safety net hospitals: instead of defining safety net status based on all hospitals nationwide, we defined it separately among BPCI hospitals (safety net hospitals defined as those in the top quartile of DPP among all BPCI hospitals) and among non-BPCI hospitals (safety net hospitals defined as those in the top quartile of DPP among all non-BPCI hospitals). Third, we repeated our main analyses using models with standard errors clustered at the hospital level and without hospital fixed effects. Fourth, we repeated our main analyses using models with alternative nonlinear link functions and outcome distributions and without hospital fixed effects.

Statistical tests were two-tailed and considered significant at α = .05 for the primary outcome. Statistical analyses were conducted using SAS 9.4 (SAS Institute, Inc.).

RESULTS

Our sample consisted of 3066 hospitals nationwide that collectively provided medical condition episode care to a total of 1,611,848 Medicare fee-for-service beneficiaries. This sample included 238 BPCI hospitals and 2769 non-BPCI hospitals (Table 1, Appendix Table 1).

Among BPCI hospitals, 63 were safety net and 175 were non–safety net hospitals. Compared with non–safety net hospitals, safety net hospitals tended to be larger and were more likely to be urban teaching hospitals. Safety net hospitals also tended to be located in areas with larger populations, more low-income individuals, and greater Medicare Advantage penetration.

In both the baseline and BPCI periods, there were differences in several characteristics for patients admitted to safety net vs non–safety net hospitals (Table 2; Appendix Table 2). Among BPCI hospitals, in both periods, patients admitted at safety net hospitals were younger and more likely to be Black, be Medicare/Medicaid dual eligible, and report having a disability than patients admitted to non–safety net hospitals. Patients admitted to safety net hospitals were also more likely to reside in socioeconomically disadvantaged areas.

Safety Net Status Among BPCI Hospitals

In the baseline period (Appendix Table 3), postdischarge spending was slightly greater among patients admitted to BPCI safety net hospitals ($18,817) than those admitted to BPCI non–safety net hospitals ($18,335). There were also small differences in secondary outcomes between the BPCI safety net and non–safety net groups.

In adjusted analyses evaluating heterogeneity in the effect of BPCI participation between safety net and non–safety net hospitals (Figure 1), differential changes in postdischarge spending between baseline and BPCI participation periods did not differ between safety net and non–safety net hospitals participating in BPCI (aDID, $40; 95% CI, –$254 to $335; P = .79).
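As a reading aid, the reported P value can be approximately recovered from the point estimate and confidence interval. The Python sketch below assumes a symmetric, normal-approximation (Wald) interval, which is an assumption about how the interval was computed rather than something stated in the text; the function name is hypothetical.

```python
from statistics import NormalDist

def p_from_ci(estimate, lower, upper, level=0.95):
    """Two-sided P value implied by an estimate and its Wald-type confidence interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    se = (upper - lower) / (2 * z_crit)             # implied standard error
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Primary outcome: aDID $40; 95% CI, -$254 to $335.
p = p_from_ci(40, -254, 335)
print(round(p, 2))
```

The implied standard error is roughly $150, so the estimate is about 0.27 standard errors from zero, which agrees with the reported P = .79.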

With respect to secondary outcomes (Figure 2; Appendix Figure 3), changes between baseline and BPCI participation periods for BPCI safety net vs BPCI non–safety net hospitals were differentially greater for rates of discharge to institutional post–acute care providers (aDID, 1.06 percentage points; 95% CI, 0.37-1.76; P = .003) and differentially lower for rates of discharge home with home health agency services (aDID, –1.15 percentage points; 95% CI, –1.73 to –0.58; P < .001). Among BPCI hospitals, safety net status was not associated with differential changes from baseline to BPCI periods in other secondary outcomes, including SNF LOS (aDID, 0.32 days; 95% CI, –0.04 to 0.67 days; P = .08).

Sensitivity Analysis

Analyses of BPCI participation among safety net vs non–safety net hospitals nationwide yielded results that were similar to those from our main analyses (Appendix Figures 4, 5, and 6). Compared with BPCI participation among non–safety net hospitals, participation among safety net hospitals was associated with a differential increase from baseline to BPCI periods in discharge to institutional post–acute care providers (aDID, 1.07 percentage points; 95% CI, 0.47-1.67 percentage points; P < .001), but no differential changes between baseline and BPCI periods in postdischarge spending (aDID, –$199; 95% CI, –$461 to $63; P = .14), SNF LOS (aDID, –0.22 days; 95% CI, –0.54 to 0.09 days; P = .16), or other secondary outcomes.

Replicating our main analyses using an alternative, BPCI-specific definition of safety net hospitals yielded similar results overall (Appendix Table 4; Appendix Figures 7, 8, and 9). There were no differential changes between baseline and BPCI periods in postdischarge spending between BPCI safety net and BPCI non–safety net hospitals (aDID, $111; 95% CI, –$189 to $411; P = .47). Results for secondary outcomes were also qualitatively similar to results from main analyses, with the exception that among BPCI hospitals, safety net hospitals had a differentially higher SNF LOS than non–safety net hospitals between baseline and BPCI periods (aDID, 0.38 days; 95% CI, 0.02-0.74 days; P = .04).

Compared with results from our main analysis, findings were qualitatively similar overall in analyses using models with hospital-clustered standard errors and without hospital fixed effects (Appendix Figures 10, 11, and 12) as well as models with alternative link functions and outcome distributions and without hospital fixed effects (Appendix Figures 13, 14, and 15).

DISCUSSION

This analysis builds on prior work by evaluating how hospital safety net status affected the known association between bundled payment participation and decreased spending and stable quality for medical condition episodes. Although safety net status did not appear to affect those relationships, it did affect the relationship between participation and post–acute care utilization. These results have three main implications.

First, our results suggest that policymakers should continue engaging safety net hospitals in medical condition bundled payments while monitoring for unintended consequences. Our findings with regard to spending provide some reassurance that safety net hospitals can potentially achieve savings while maintaining quality under bundled payments, similar to other types of hospitals. However, the differences in patient populations and post–acute care utilization patterns suggest that policymakers should continue to carefully monitor for disparities based on hospital safety net status and consider implementing measures that have been used in other payment reforms to support safety net organizations. Such measures could involve providing customized technical assistance or evaluating performance using “peer groups” that compare performance among safety net hospitals alone rather than among all hospitals.24,25

Second, our findings underscore potential challenges that safety net hospitals may face when attempting to redesign care. For instance, among hospitals accepting bundled payments for medical conditions, successful strategies in BPCI have often included maintaining the proportion of patients discharged to institutional post–acute care providers while reducing SNF LOS.2 However, in our study, discharge to institutional post–acute care providers actually increased among safety net hospitals relative to other hospitals while SNF LOS did not decrease. Additionally, while other hospitals in bundled payments have exhibited differentially greater discharge home with home health services, we found that safety net hospitals did not. These represent areas for future work, particularly because little is known about how safety net hospitals coordinate post–acute care (eg, the extent to which safety net hospitals integrate with post–acute care providers or coordinate home-based care for vulnerable patient populations).

Third, study results offer insight into potential challenges to practice changes. Compared with other hospitals, safety net hospitals in our analysis provided medical condition episode care to more Black, Medicare/Medicaid dual-eligible, and disabled patients, as well as individuals living in socioeconomically disadvantaged areas. Collectively, these groups may face more challenging socioeconomic circumstances or existing disparities. The combination of these factors and limited financial resources at safety net hospitals could complicate their ability to manage transitions of care after hospitalization by shifting discharge away from high-intensity institutional post–acute care facilities.

Our analysis has limitations. First, given the observational study design, findings are subject to residual confounding and selection bias. For instance, findings related to post–acute care utilization could have been influenced by unobservable changes in market supply and other factors. However, we mitigated these risks using a quasi-experimental methodology that directly accounted for multiple patient, hospital, and market characteristics and used fixed effects to account for unobserved heterogeneity. Second, in studying BPCI Model 2, we evaluated one model within one bundled payment program. However, BPCI Model 2 encompassed a wide range of medical conditions, and both this scope and program design have served as the direct basis for subsequent bundled payment models, such as the ongoing BPCI Advanced and other forthcoming programs.26 Third, while our analysis evaluated multiple aspects of patient complexity, individuals may be “high risk” owing to several clinical and social determinants. Future work should evaluate different features of patient risk and how they affect outcomes under payment models such as bundled payments.

CONCLUSION

Safety net status appeared to affect the relationship between bundled payment participation and post–acute care utilization, but not episode spending. These findings suggest that policymakers could support safety net hospitals within bundled payment programs and consider safety net status when evaluating them.

References

1. Navathe AS, Emanuel EJ, Venkataramani AS, et al. Spending and quality after three years of Medicare’s voluntary bundled payment for joint replacement surgery. Health Aff (Millwood). 2020;39(1):58-66. https://doi.org/10.1377/hlthaff.2019.00466
2. Rolnick JA, Liao JM, Emanuel EJ, et al. Spending and quality after three years of Medicare’s bundled payments for medical conditions: quasi-experimental difference-in-differences study. BMJ. 2020;369:m1780. https://doi.org/10.1136/bmj.m1780
3. Figueroa JF, Joynt KE, Zhou X, Orav EJ, Jha AK. Safety-net hospitals face more barriers yet use fewer strategies to reduce readmissions. Med Care. 2017;55(3):229-235. https://doi.org/10.1097/MLR.0000000000000687
4. Werner RM, Goldman LE, Dudley RA. Comparison of change in quality of care between safety-net and non–safety-net hospitals. JAMA. 2008;299(18):2180-2187. https://doi.org/10.1001/jama.299.18.2180
5. Ross JS, Bernheim SM, Lin Z, et al. Based on key measures, care quality for Medicare enrollees at safety-net and non–safety-net hospitals was almost equal. Health Aff (Millwood). 2012;31(8):1739-1748. https://doi.org/10.1377/hlthaff.2011.1028
6. Gilman M, Adams EK, Hockenberry JM, Milstein AS, Wilson IB, Becker ER. Safety-net hospitals more likely than other hospitals to fare poorly under Medicare’s value-based purchasing. Health Aff (Millwood). 2015;34(3):398-405. https://doi.org/10.1377/hlthaff.2014.1059
7. Joynt KE, Jha AK. Characteristics of hospitals receiving penalties under the Hospital Readmissions Reduction Program. JAMA. 2013;309(4):342-343. https://doi.org/10.1001/jama.2012.94856
8. Rajaram R, Chung JW, Kinnier CV, et al. Hospital characteristics associated with penalties in the Centers for Medicare & Medicaid Services Hospital-Acquired Condition Reduction Program. JAMA. 2015;314(4):375-383. https://doi.org/10.1001/jama.2015.8609
9. Navathe AS, Liao JM, Shah Y, et al. Characteristics of hospitals earning savings in the first year of mandatory bundled payment for hip and knee surgery. JAMA. 2018;319(9):930-932. https://doi.org/10.1001/jama.2018.0678
10. Thirukumaran CP, Glance LG, Cai X, Balkissoon R, Mesfin A, Li Y. Performance of safety-net hospitals in year 1 of the Comprehensive Care for Joint Replacement Model. Health Aff (Millwood). 2019;38(2):190-196. https://doi.org/10.1377/hlthaff.2018.05264
11. Thirukumaran CP, Glance LG, Cai X, Kim Y, Li Y. Penalties and rewards for safety net vs non–safety net hospitals in the first 2 years of the Comprehensive Care for Joint Replacement Model. JAMA. 2019;321(20):2027-2030. https://doi.org/10.1001/jama.2019.5118
12. Kim H, Grunditz JI, Meath THA, Quiñones AR, Ibrahim SA, McConnell KJ. Level of reconciliation payments by safety-net hospital status under the first year of the Comprehensive Care for Joint Replacement Program. JAMA Surg. 2019;154(2):178-179. https://doi.org/10.1001/jamasurg.2018.3098
13. Department of Medicine, University of Wisconsin School of Medicine and Public Health. Neighborhood Atlas. Accessed March 1, 2021. https://www.neighborhoodatlas.medicine.wisc.edu/
14. Dartmouth Atlas Project. The Dartmouth Atlas of Health Care. Accessed March 1, 2021. https://www.dartmouthatlas.org/
15. Chatterjee P, Joynt KE, Orav EJ, Jha AK. Patient experience in safety-net hospitals: implications for improving care and value-based purchasing. Arch Intern Med. 2012;172(16):1204-1210. https://doi.org/10.1001/archinternmed.2012.3158
16. Rolnick JA, Liao JM, Navathe AS. Programme design matters—lessons from bundled payments in the US. June 17, 2020. Accessed March 1, 2021. https://blogs.bmj.com/bmj/2020/06/17/programme-design-matters-lessons-from-bundled-payments-in-the-us
17. Dummit LA, Kahvecioglu D, Marrufo G, et al. Association between hospital participation in a Medicare bundled payment initiative and payments and quality outcomes for lower extremity joint replacement episodes. JAMA. 2016;316(12):1267-1278. https://doi.org/10.1001/jama.2016.12717
18. Navathe AS, Liao JM, Dykstra SE, et al. Association of hospital participation in a Medicare bundled payment program with volume and case mix of lower extremity joint replacement episodes. JAMA. 2018;320(9):901-910. https://doi.org/10.1001/jama.2018.12345
19. Joynt Maddox KE, Orav EJ, Zheng J, Epstein AM. Evaluation of Medicare’s bundled payments initiative for medical conditions. N Engl J Med. 2018;379(3):260-269. https://doi.org/10.1056/NEJMsa1801569
20. Navathe AS, Emanuel EJ, Venkataramani AS, et al. Spending and quality after three years of Medicare’s voluntary bundled payment for joint replacement surgery. Health Aff (Millwood). 2020;39(1):58-66. https://doi.org/10.1377/hlthaff.2019.00466
21. Liao JM, Emanuel EJ, Venkataramani AS, et al. Association of bundled payments for joint replacement surgery and patient outcomes with simultaneous hospital participation in accountable care organizations. JAMA Netw Open. 2019;2(9):e1912270. https://doi.org/10.1001/jamanetworkopen.2019.12270
22. Kim DH, Schneeweiss S. Measuring frailty using claims data for pharmacoepidemiologic studies of mortality in older adults: evidence and recommendations. Pharmacoepidemiol Drug Saf. 2014;23(9):891-901. https://doi.org/10.1002/pds.3674
23. Joynt KE, Figueroa JF, Beaulieu N, Wild RC, Orav EJ, Jha AK. Segmenting high-cost Medicare patients into potentially actionable cohorts. Healthc (Amst). 2017;5(1-2):62-67. https://doi.org/10.1016/j.hjdsi.2016.11.002
24. Quality Payment Program. Small, underserved, and rural practices. Accessed March 1, 2021. https://qpp.cms.gov/about/small-underserved-rural-practices
25. McCarthy CP, Vaduganathan M, Patel KV, et al. Association of the new peer group–stratified method with the reclassification of penalty status in the Hospital Readmission Reduction Program. JAMA Netw Open. 2019;2(4):e192987. https://doi.org/10.1001/jamanetworkopen.2019.2987
26. Centers for Medicare & Medicaid Services. BPCI Advanced. Updated September 16, 2021. Accessed October 18, 2021. https://innovation.cms.gov/innovation-models/bpci-advanced

Author and Disclosure Information

1Department of Medicine, University of Washington School of Medicine, Seattle, Washington; 2Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania; 3Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 4Department of Medical Ethics and Health Policy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 5Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 6Corporal Michael J Crescenz VA Medical Center, Philadelphia, Pennsylvania.

Disclosures
Dr Liao reports personal fees from Kaiser Permanente Washington Health Research Institute, textbook royalties from Wolters Kluwer, and honoraria from Wolters Kluwer, the Journal of Clinical Pathways, and the American College of Physicians, all outside the submitted work. Dr Navathe reports grants from Hawaii Medical Service Association, Anthem Public Policy Institute, Commonwealth Fund, Oscar Health, Cigna Corporation, Robert Wood Johnson Foundation, Donaghue Foundation, Pennsylvania Department of Health, Ochsner Health System, United Healthcare, Blue Cross Blue Shield of North Carolina, Blue Shield of California, and Humana; personal fees from Navvis Healthcare, Agathos, Inc., YNHHSC/CORE, MaineHealth Accountable Care Organization, Maine Department of Health and Human Services, National University Health System—Singapore, Ministry of Health—Singapore, Elsevier, Medicare Payment Advisory Commission, Cleveland Clinic, Analysis Group, VBID Health, Federal Trade Commission, and Advocate Physician Partners; personal fees and equity from NavaHealth; equity from Embedded Healthcare; and noncompensated board membership from Integrated Services, Inc., outside the submitted work. This article does not necessarily represent the views of the US government or the Department of Veterans Affairs or the Pennsylvania Department of Health.

Funding
This study was funded in part by the National Institute on Minority Health and Health Disparities (R01MD013859) and the Agency for Healthcare Research and Quality (R01HS027595). The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Journal of Hospital Medicine 16(12):716-723. Published Online First November 17, 2021
Files
Files
Author and Disclosure Information

1Department of Medicine, University of Washington School of Medicine, Seattle, Washington; 2Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania; 3Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 4Department of Medical Ethics and Health Policy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 5Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 6Corporal Michael J Crescenz VA Medical Center, Philadelphia, Pennsylvania.

Disclosures
Dr Liao reports personal fees from Kaiser Permanente Washington Health Research Institute, textbook royalties from Wolters Kluwer, and honoraria from Wolters Kluwer, the Journal of Clinical Pathways, and the American College of Physicians, all outside the submitted work. Dr Navathe reports grants from Hawaii Medical Service Association, Anthem Public Policy Institute, Commonwealth Fund, Oscar Health, Cigna Corporation, Robert Wood Johnson Foundation, Donaghue Foundation, Pennsylvania Department of Health, Ochsner Health System, United Healthcare, Blue Cross Blue Shield of North Carolina, Blue Shield of California, and Humana; personal fees from Navvis Healthcare, Agathos, Inc., YNHHSC/CORE, MaineHealth Accountable Care Organization, Maine Department of Health and Human Services, National University Health System—Singapore, Ministry of Health—Singapore, Elsevier, Medicare Payment Advisory Commission, Cleveland Clinic, Analysis Group, VBID Health, Federal Trade Commission, and Advocate Physician Partners; personal fees and equity from NavaHealth; equity from Embedded Healthcare; and noncompensated board membership from Integrated Services, Inc., outside the submitted work. This article does not necessarily represent the views of the US government or the Department of Veterans Affairs or the Pennsylvania Department of Health.

Funding
This study was funded in part by the National Institute on Minority Health and Health Disparities (R01MD013859) and the Agency for Healthcare Research and Quality (R01HS027595). The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.


Bundled payments represent one of the most prominent value-based payment arrangements nationwide. Under this payment approach, hospitals assume responsibility for quality and costs across discrete episodes of care. Hospitals that maintain quality while achieving cost reductions are eligible for financial incentives, whereas those that do not are subject to financial penalties.

To date, the largest completed bundled payment program nationwide is Medicare’s Bundled Payments for Care Improvement (BPCI) initiative. Among four different participation models in BPCI, hospital enrollment was greatest in Model 2, in which episodes spanned from hospitalization through 90 days of post–acute care. The overall results from BPCI Model 2 have been positive: hospitals participating in both common surgical episodes, such as joint replacement surgery, and medical episodes, such as acute myocardial infarction (AMI) and congestive heart failure (CHF), have demonstrated long-term financial savings with stable quality performance.1,2

Safety net hospitals that disproportionately serve low-income patients may fare differently than other hospitals under bundled payment models. At baseline, these hospitals typically have fewer financial resources, which may limit their ability to implement measures to standardize care during hospitalization (eg, clinical pathways) or after discharge (eg, postdischarge programs and other strategies to reduce readmissions).3 Efforts to redesign care may be further complicated by greater clinical complexity and social and structural determinants of health among patients seeking care at safety net hospitals. Given the well-known interactions between social determinants and health conditions, these factors are highly relevant for patients hospitalized at safety net hospitals for acute medical events or exacerbations of chronic conditions.

Existing evidence has shown that safety net hospitals have not performed as well as other hospitals in other value-based reforms.4-8 In the context of bundled payments for joint replacement surgery, safety net hospitals have been less likely to achieve financial savings but more likely to receive penalties.9-11 Moreover, the savings achieved by safety net hospitals have been smaller than those achieved by non–safety net hospitals.12

Despite these concerning findings, there are few data about how safety net hospitals have fared under bundled payments for common medical conditions. To address this critical knowledge gap, we evaluated the effect of hospital safety net status on the association between BPCI Model 2 participation and changes in outcomes for medical condition episodes.

METHODS

This study was approved by the University of Pennsylvania Institutional Review Board with a waiver of informed consent.

Data

We used 100% Medicare claims data from 2011 to 2016 for patients receiving care at hospitals participating in BPCI Model 2 for one of four common medical condition episodes: AMI, pneumonia, CHF, and chronic obstructive pulmonary disease (COPD). A 20% random national sample was used for patients hospitalized at nonparticipant hospitals. Publicly available data from the Centers for Medicare & Medicaid Services (CMS) were used to identify hospital enrollment in BPCI Model 2, while data from the 2017 CMS Impact File were used to quantify each hospital’s disproportionate patient percentage (DPP), which reflects the proportion of Medicaid and low-income Medicare beneficiaries served and determines a hospital’s eligibility to earn disproportionate share hospital payments.

Data from the 2011 American Hospital Association Annual Survey were used to capture hospital characteristics, such as number of beds, teaching status, and profit status, while data from the Medicare provider of service, beneficiary summary, and accountable care organization files were used to capture additional hospital characteristics and market characteristics, such as population size and Medicare Advantage penetration. The Medicare Provider Enrollment, Chain, and Ownership System file was used to identify and remove BPCI episodes from physician group practices. State-level data about area deprivation index—a census tract–based measure that incorporates factors such as income, education, employment, and housing quality to describe socioeconomic disadvantage among neighborhoods—were used to define socioeconomically disadvantaged areas as those in the top 20% of area deprivation index statewide.13 Markets were defined using hospital referral regions.14

Study Periods and Hospital Groups

Our analysis spanned the period between January 1, 2011, and December 31, 2016. We separated this period into a baseline period (January 2011–September 2013) prior to the start of BPCI and a subsequent BPCI period (October 2013–December 2016).

We defined any hospitals participating in BPCI Model 2 across this period for any of the four included medical condition episodes as BPCI hospitals. Because hospitals were able to enter or exit BPCI over time, and enrollment data were provided by CMS as quarterly participation files, we were able to identify dates of entry into or exit from BPCI over time by hospital-condition pairs. Hospitals were considered BPCI hospitals until the end of the study period, regardless of subsequent exit.

We defined non-BPCI hospitals as those that never participated in the program and had 10 or more admissions in the BPCI period for the included medical condition episodes. We used this approach to minimize potential bias arising from BPCI entry and exit over time.

Across both BPCI and non-BPCI hospital groups, we followed prior methods and defined safety net hospitals based on a hospital’s DPP.15 Specifically, safety net hospitals were those in the top quartile of DPP among all hospitals nationwide, and hospitals in the other three quartiles were defined as non–safety net hospitals.9,12
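The top-quartile rule above can be illustrated with a minimal sketch. The hospital identifiers and DPP values below are hypothetical, and the choice of a strict inequality at the 75th percentile cut point, as well as the interpolation method, are assumptions for illustration rather than details reported in the study.

```python
from statistics import quantiles

def classify_safety_net(dpp_by_hospital):
    """Flag hospitals whose DPP falls in the top quartile of the supplied set.

    dpp_by_hospital: dict mapping a hospital identifier to its DPP.
    Returns a dict mapping each identifier to True (safety net) or False.
    """
    values = sorted(dpp_by_hospital.values())
    # quantiles(n=4) returns three cut points; the last is the 75th percentile.
    q3 = quantiles(values, n=4, method="inclusive")[2]
    # Assumption: hospitals strictly above the cut point count as top quartile.
    return {hosp: dpp > q3 for hosp, dpp in dpp_by_hospital.items()}

# Hypothetical DPP values for eight hospitals (not study data).
dpp = {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40,
       "E": 0.50, "F": 0.60, "G": 0.70, "H": 0.80}
flags = classify_safety_net(dpp)
```

In this toy example, only hospitals G and H are flagged as safety net hospitals. Passing all hospitals nationwide corresponds to the main definition; passing BPCI and non-BPCI hospitals separately would correspond to the alternative, BPCI-specific definition used in the sensitivity analysis.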

Study Sample and Episode Construction

Our study sample included Medicare fee-for-service beneficiaries admitted to BPCI and non-BPCI hospitals for any of the four medical conditions of interest. We adhered to BPCI program rules, which defined each episode type based on a set of Medicare Severity Diagnosis Related Group (MS-DRG) codes (eg, myocardial infarction episodes were defined as MS-DRGs 280-282). From this sample, we excluded beneficiaries with end-stage renal disease or insurance coverage through Medicare Advantage, as well as beneficiaries who died during the index hospital admission, had any non–Inpatient Prospective Payment System claims, or lacked continuous primary Medicare fee-for-service coverage either during the episode or in the 12 months preceding it.

We constructed 90-day medical condition episodes that began with hospital admission and spanned 90 days after hospital discharge. To avoid bias arising from CMS rules related to precedence (rules for handling how overlapping episodes are assigned to hospitals), we followed prior methods and constructed naturally occurring episodes by assigning overlapping ones to the earlier hospital admission.2,16 From this set of episodes, we identified those for AMI, CHF, COPD, and pneumonia.
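The assignment of overlapping episodes to the earlier admission can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: the record layout and function name are hypothetical, MS-DRG filtering and exclusion criteria are omitted, and a later admission that falls inside an open episode is simply counted as a readmission within that episode.

```python
from datetime import date, timedelta

EPISODE_WINDOW = timedelta(days=90)  # postdischarge window described in the text

def build_episodes(admissions):
    """Build naturally occurring episodes for one beneficiary.

    admissions: list of (admit_date, discharge_date, hospital_id) tuples.
    An admission that starts while an earlier episode is still open is
    attributed to that earlier episode instead of starting a new one.
    """
    episodes = []
    for admit, discharge, hospital in sorted(admissions):
        if episodes and admit <= episodes[-1]["end"]:
            # Overlap: keep the earlier episode and record a readmission.
            episodes[-1]["readmissions"] += 1
            continue
        episodes.append({
            "hospital": hospital,
            "start": admit,
            "end": discharge + EPISODE_WINDOW,
            "readmissions": 0,
        })
    return episodes

# Hypothetical admissions for a single beneficiary (not study data).
admissions = [
    (date(2015, 1, 1), date(2015, 1, 5), "H1"),
    (date(2015, 2, 1), date(2015, 2, 3), "H2"),  # inside the H1 episode window
    (date(2015, 7, 1), date(2015, 7, 4), "H2"),  # after the H1 episode has closed
]
episodes = build_episodes(admissions)
```

Here the February admission is folded into the episode that began at hospital H1, while the July admission starts a new episode at hospital H2.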

Exposure and Covariate Variables

Our study exposure was the interaction between hospital safety net status and hospital BPCI participation, which captured whether the association between BPCI participation and outcomes varied by safety net status (eg, whether differential changes in an outcome related to BPCI participation were different for safety net and non–safety net hospitals in the program). BPCI participation was defined using a time-varying indicator of BPCI participation to distinguish between episodes occurring under the program (ie, after a hospital began participating) or before participation in it. Covariates were chosen based on prior studies and included patient variables such as age, sex, Elixhauser comorbidities, frailty, and Medicare/Medicaid dual-eligibility status.17-23 Additionally, our analysis included market variables such as population size and Medicare Advantage penetration.
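A minimal sketch of the time-varying participation indicator follows. The lookup table, identifiers, and entry date are hypothetical, and timing episodes by admission date is an assumption. Consistent with the rule that hospitals were considered BPCI hospitals through the end of the study period regardless of exit, the sketch tracks entry dates only.

```python
from datetime import date

def bpci_indicator(hospital_id, condition, admit_date, entry_dates):
    """Return 1 if the episode began on or after the hospital's BPCI entry
    date for that condition, and 0 otherwise.

    entry_dates: dict mapping (hospital_id, condition) to an entry date.
    Hospital-condition pairs that never participated are absent from the
    dict, so the indicator is always 0 for them.
    """
    entry = entry_dates.get((hospital_id, condition))
    return int(entry is not None and admit_date >= entry)

# Hypothetical entry date for one hospital-condition pair (not study data).
entry_dates = {("H1", "CHF"): date(2014, 1, 1)}

before = bpci_indicator("H1", "CHF", date(2013, 6, 1), entry_dates)
after = bpci_indicator("H1", "CHF", date(2015, 3, 1), entry_dates)
never = bpci_indicator("H9", "CHF", date(2015, 3, 1), entry_dates)
```

Because the indicator is defined per hospital-condition pair, the same hospital can contribute participating episodes for one condition and nonparticipating episodes for another.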

Outcome Variables

The prespecified primary study outcome was standardized 90-day postdischarge spending. This outcome was chosen owing to the lack of variation in standardized index hospitalization spending given the MS-DRG system and prior work suggesting that bundled payment participants instead targeted changes to postdischarge utilization and spending.2 Secondary outcomes included 90-day unplanned readmission rates, 90-day postdischarge mortality rates, discharge to institutional post–acute care providers (defined as either skilled nursing facilities [SNFs] or inpatient rehabilitation facilities), discharge home with home health agency services, and—among patients discharged to SNFs—SNF length of stay (LOS), measured in number of days.

Statistical Analysis

We described the characteristics of patients and hospitals in our samples. In adjusted analyses, we used a series of difference-in-differences (DID) generalized linear models to conduct a heterogeneity analysis evaluating whether the relationship between hospital BPCI participation and medical condition episode outcomes varied based on hospital safety net status.

In these models, the DID estimator was a time-varying indicator of hospital BPCI participation (equal to 1 for episodes occurring during the BPCI period at BPCI hospitals after they initiated participation; 0 otherwise) together with hospital and quarter-time fixed effects. To examine differences in the association between BPCI and episode outcomes by hospital safety net status—that is, whether there was heterogeneity in the outcome changes between safety net and non–safety net hospitals participating in BPCI—our models also included an interaction term between hospital safety net status and the time-varying BPCI participation term (Appendix Methods). In this approach, BPCI safety net and BPCI non–safety net hospitals were compared with non-BPCI hospitals as the comparison group. The comparisons were chosen to yield the most policy-salient findings, since Medicare evaluated hospitals in BPCI, whether safety net or not, by comparing their performance to nonparticipating hospitals, whether safety net or not.

All models controlled for patient and time-varying market characteristics and included hospital fixed effects (to account for time-invariant hospital market characteristics) and MS-DRG fixed effects. All outcomes were evaluated using models with identity links and normal distributions (ie, ordinary least squares). These variables and models were applied to data from the baseline period to examine consistency with the parallel trends assumption. Overall, Wald tests did not indicate divergent baseline period trends in outcomes between BPCI and non-BPCI hospitals (Appendix Figure 1) or BPCI safety net versus BPCI non–safety net hospitals (Appendix Figure 2).
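One way to write the model described above is sketched below. The notation is assumed for illustration and is not taken from the Appendix Methods: i indexes episodes, h hospitals, m markets, t quarters, and d MS-DRGs; X denotes patient covariates and M denotes time-varying market covariates.

```latex
Y_{ihmt} = \beta_1 \,\mathrm{BPCI}_{ht}
         + \beta_2 \left(\mathrm{BPCI}_{ht} \times \mathrm{SafetyNet}_{h}\right)
         + X_{i}^{\top}\gamma
         + M_{mt}^{\top}\delta
         + \alpha_{h} + \tau_{t} + \theta_{d}
         + \varepsilon_{ihmt}
```

Under this reading, \beta_1 is the differential change associated with BPCI participation at non–safety net hospitals, and \beta_2 is the additional differential change at safety net hospitals, which corresponds to the heterogeneity estimate of interest. Because safety net status does not vary over time in this setup, its main effect would be absorbed by the hospital fixed effects \alpha_h and does not appear as a separate term.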

We conducted sensitivity analyses to evaluate the robustness of our results. First, instead of comparing differential changes at BPCI safety net vs BPCI non–safety net hospitals (ie, evaluating safety net status among BPCI hospitals), we evaluated changes at BPCI safety net vs non-BPCI safety net hospitals compared with changes at BPCI non–safety net vs non-BPCI non–safety net hospitals (ie, marginal differences in the changes associated with BPCI participation among safety net vs non–safety net hospitals). Because safety net hospitals in BPCI were compared with nonparticipating safety net hospitals, and non–safety net hospitals in BPCI were compared with nonparticipating non–safety net hospitals, this set of analyses helped address potential concerns about unobservable differences between safety net and non–safety net organizations and their potential impact on our findings.

Second, we used an alternative, BPCI-specific definition for safety net hospitals: instead of defining safety net status based on all hospitals nationwide, we defined it only among BPCI hospitals (safety net hospitals defined as those in the top quartile of DPP among all BPCI hospitals) and non-BPCI hospitals (safety net hospitals defined as those in the top quartile of DPP among all non-BPCI hospitals). Third, we repeated our main analyses using models with standard errors clustered at the hospital level and without hospital fixed effects. Fourth, we repeated our main analyses using models with alternative nonlinear link functions and outcome distributions and without hospital fixed effects.

Statistical tests were two-tailed and considered significant at α = .05 for the primary outcome. Statistical analyses were conducted using SAS 9.4 (SAS Institute, Inc.).

RESULTS

Our sample consisted of 3066 hospitals nationwide that collectively provided medical condition episode care to a total of 1,611,848 Medicare fee-for-service beneficiaries. This sample included 238 BPCI hospitals and 2769 non-BPCI hospitals (Table 1, Appendix Table 1).

Among BPCI hospitals, 63 were safety net and 175 were non–safety net hospitals. Compared with non–safety net hospitals, safety net hospitals tended to be larger and were more likely to be urban teaching hospitals. Safety net hospitals also tended to be located in areas with larger populations, more low-income individuals, and greater Medicare Advantage penetration.

In both the baseline and BPCI periods, there were differences in several characteristics for patients admitted to safety net vs non–safety net hospitals (Table 2; Appendix Table 2). Among BPCI hospitals, in both periods, patients admitted to safety net hospitals were younger and more likely to be Black, be Medicare/Medicaid dual eligible, and report having a disability than patients admitted to non–safety net hospitals. Patients admitted to safety net hospitals were also more likely to reside in socioeconomically disadvantaged areas.

Safety Net Status Among BPCI Hospitals

In the baseline period (Appendix Table 3), postdischarge spending was slightly greater among patients admitted to BPCI safety net hospitals ($18,817) than those admitted to BPCI non–safety net hospitals ($18,335). There were also small differences in secondary outcomes between the BPCI safety net and non–safety net groups.

In adjusted analyses evaluating heterogeneity in the effect of BPCI participation between safety net and non–safety net hospitals (Figure 1), differential changes in postdischarge spending between baseline and BPCI participation periods did not differ between safety net and non–safety net hospitals participating in BPCI (aDID, $40; 95% CI, –$254 to $335; P = .79).
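As a reading aid, the reported estimate, confidence interval, and P value can be checked for internal consistency. The sketch below assumes a symmetric, normal-approximation (Wald) interval, which is an assumption for illustration; the function name is hypothetical, and the study's models were fitted in SAS rather than with this code.

```python
from math import erfc, sqrt

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Back out a two-sided P value from an estimate and its 95% CI,
    assuming the interval is estimate +/- z_crit * standard error."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided tail probability of a standard normal variable.
    return erfc(abs(z) / sqrt(2))

# Primary outcome as reported: aDID $40; 95% CI, -$254 to $335.
p = p_from_ci(40, -254, 335)
```

Under these assumptions the implied P value is approximately .79, in line with the reported value.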

With respect to secondary outcomes (Figure 2; Appendix Figure 3), changes between baseline and BPCI participation periods for BPCI safety net vs BPCI non–safety net hospitals were differentially greater for rates of discharge to institutional post–acute care providers (aDID, 1.06 percentage points; 95% CI, 0.37-1.76; P = .003) and differentially lower for rates of discharge home with home health agency services (aDID, –1.15 percentage points; 95% CI, –1.73 to –0.58; P < .001). Among BPCI hospitals, safety net status was not associated with differential changes from baseline to BPCI periods in other secondary outcomes, including SNF LOS (aDID, 0.32 days; 95% CI, –0.04 to 0.67 days; P = .08).

Sensitivity Analysis

Analyses of BPCI participation among safety net vs non–safety net hospitals nationwide yielded results that were similar to those from our main analyses (Appendix Figures 4, 5, and 6). Compared with BPCI participation among non–safety net hospitals, participation among safety net hospitals was associated with a differential increase from baseline to BPCI periods in discharge to institutional post–acute care providers (aDID, 1.07 percentage points; 95% CI, 0.47-1.67 percentage points; P < .001), but no differential changes between baseline and BPCI periods in postdischarge spending (aDID, –$199; 95% CI, –$461 to $63; P = .14), SNF LOS (aDID, –0.22 days; 95% CI, –0.54 to 0.09 days; P = .16), or other secondary outcomes.

Replicating our main analyses using an alternative, BPCI-specific definition of safety net hospitals yielded similar results overall (Appendix Table 4; Appendix Figures 7, 8, and 9). There were no differential changes between baseline and BPCI periods in postdischarge spending between BPCI safety net and BPCI non–safety net hospitals (aDID, $111; 95% CI, –$189 to $411; P = .47). Results for secondary outcomes were also qualitatively similar to results from main analyses, with the exception that among BPCI hospitals, safety net hospitals had a differentially higher SNF LOS than non–safety net hospitals between baseline and BPCI periods (aDID, 0.38 days; 95% CI, 0.02-0.74 days; P = .04).

Compared with results from our main analysis, findings were qualitatively similar overall in analyses using models with hospital-clustered standard errors and without hospital fixed effects (Appendix Figures 10, 11, and 12) as well as models with alternative link functions and outcome distributions and without hospital fixed effects (Appendix Figures 13, 14, and 15).

DISCUSSION

This analysis builds on prior work by evaluating how hospital safety net status affected the known association between bundled payment participation and decreased spending and stable quality for medical condition episodes. Although safety net status did not appear to affect those relationships, it did affect the relationship between participation and post–acute care utilization. These results have three main implications.

First, our results suggest that policymakers should continue engaging safety net hospitals in medical condition bundled payments while monitoring for unintended consequences. Our findings with regard to spending provide some reassurance that safety net hospitals can potentially achieve savings while maintaining quality under bundled payments, similar to other types of hospitals. However, the differences in patient populations and post–acute care utilization patterns suggest that policymakers should continue to carefully monitor for disparities based on hospital safety net status and consider implementing measures that have been used in other payment reforms to support safety net organizations. Such measures could involve providing customized technical assistance or evaluating performance using “peer groups” that compare performance among safety net hospitals alone rather than among all hospitals.24,25

Second, our findings underscore potential challenges that safety net hospitals may face when attempting to redesign care. For instance, among hospitals accepting bundled payments for medical conditions, successful strategies in BPCI have often included maintaining the proportion of patients discharged to institutional post–acute care providers while reducing SNF LOS.2 However, in our study, discharge to institutional post–acute care providers actually increased among safety net hospitals relative to other hospitals while SNF LOS did not decrease. Additionally, while other hospitals in bundled payments have exhibited differentially greater discharge home with home health services, we found that safety net hospitals did not. These represent areas for future work, particularly because little is known about how safety net hospitals coordinate post–acute care (eg, the extent to which safety net hospitals integrate with post–acute care providers or coordinate home-based care for vulnerable patient populations).

Third, study results offer insight into potential challenges to practice changes. Compared with other hospitals, safety net hospitals in our analysis provided medical condition episode care to more Black, Medicare/Medicaid dual-eligible, and disabled patients, as well as individuals living in socioeconomically disadvantaged areas. Collectively, these groups may face more challenging socioeconomic circumstances or existing disparities. The combination of these factors and limited financial resources at safety net hospitals could complicate their ability to manage transitions of care after hospitalization by shifting discharge away from high-intensity institutional post–acute care facilities.

Our analysis has limitations. First, given the observational study design, findings are subject to residual confounding and selection bias. For instance, findings related to post–acute care utilization could have been influenced by unobservable changes in market supply and other factors. However, we mitigated these risks using a quasi-experimental methodology that directly accounted for multiple patient, hospital, and market characteristics and used fixed effects to account for unobserved heterogeneity. Second, in studying BPCI Model 2, we evaluated one model within one bundled payment program. However, BPCI Model 2 encompassed a wide range of medical conditions, and both this scope and program design have served as the direct basis for subsequent bundled payment models, such as the ongoing BPCI Advanced and other forthcoming programs.26 Third, while our analysis evaluated multiple aspects of patient complexity, individuals may be “high risk” owing to several clinical and social determinants. Future work should evaluate different features of patient risk and how they affect outcomes under payment models such as bundled payments.

CONCLUSION

Safety net status appeared to affect the relationship between bundled payment participation and post–acute care utilization, but not episode spending. These findings suggest that policymakers could support safety net hospitals within bundled payment programs and consider safety net status when evaluating them.

Bundled payments represent one of the most prominent value-based payment arrangements nationwide. Under this payment approach, hospitals assume responsibility for quality and costs across discrete episodes of care. Hospitals that maintain quality while achieving cost reductions are eligible for financial incentives, whereas those that do not are subject to financial penalties.

To date, the largest completed bundled payment program nationwide is Medicare’s Bundled Payments for Care Improvement (BPCI) initiative. Among four different participation models in BPCI, hospital enrollment was greatest in Model 2, in which episodes spanned from hospitalization through 90 days of post–acute care. The overall results from BPCI Model 2 have been positive: hospitals participating in both common surgical episodes, such as joint replacement surgery, and medical episodes, such as acute myocardial infarction (AMI) and congestive heart failure (CHF), have demonstrated long-term financial savings with stable quality performance.1,2

Safety net hospitals that disproportionately serve low-income patients may fare differently than other hospitals under bundled payment models. At baseline, these hospitals typically have fewer financial resources, which may limit their ability to implement measures to standardize care during hospitalization (eg, clinical pathways) or after discharge (eg, postdischarge programs and other strategies to reduce readmissions).3 Efforts to redesign care may be further complicated by greater clinical complexity and social and structural determinants of health among patients seeking care at safety net hospitals. Given the well-known interactions between social determinants and health conditions, these factors are highly relevant for patients hospitalized at safety net hospitals for acute medical events or exacerbations of chronic conditions.

Existing evidence has shown that safety net hospitals have not performed as well as other hospitals in other value-based reforms.4-8 In the context of bundled payments for joint replacement surgery, safety net hospitals have been less likely to achieve financial savings but more likely to receive penalties.9-11 Moreover, the savings achieved by safety net hospitals have been smaller than those achieved by non–safety net hospitals.12

Despite these concerning findings, there are few data about how safety net hospitals have fared under bundled payments for common medical conditions. To address this critical knowledge gap, we evaluated the effect of hospital safety net status on the association between BPCI Model 2 participation and changes in outcomes for medical condition episodes.

METHODS

This study was approved by the University of Pennsylvania Institutional Review Board with a waiver of informed consent.

Data

We used 100% Medicare claims data from 2011 to 2016 for patients receiving care at hospitals participating in BPCI Model 2 for one of four common medical condition episodes: AMI, pneumonia, CHF, and chronic obstructive pulmonary disease (COPD). A 20% random national sample was used for patients hospitalized at nonparticipant hospitals. Publicly available data from the Centers for Medicare & Medicaid Services (CMS) were used to identify hospital enrollment in BPCI Model 2, while data from the 2017 CMS Impact File were used to quantify each hospital’s disproportionate patient percentage (DPP), which reflects the proportion of Medicaid and low-income Medicare beneficiaries served and determines a hospital’s eligibility to earn disproportionate share hospital payments.

Data from the 2011 American Hospital Association Annual Survey were used to capture hospital characteristics, such as number of beds, teaching status, and profit status, while data from the Medicare provider of service, beneficiary summary, and accountable care organization files were used to capture additional hospital characteristics and market characteristics, such as population size and Medicare Advantage penetration. The Medicare Provider Enrollment, Chain, and Ownership System file was used to identify and remove BPCI episodes from physician group practices. State-level data about area deprivation index—a census tract–based measure that incorporates factors such as income, education, employment, and housing quality to describe socioeconomic disadvantage among neighborhoods—were used to define socioeconomically disadvantaged areas as those in the top 20% of area deprivation index statewide.13 Markets were defined using hospital referral regions.14

Study Periods and Hospital Groups

Our analysis spanned the period between January 1, 2011, and December 31, 2016. We separated this period into a baseline period (January 2011–September 2013) prior to the start of BPCI and a subsequent BPCI period (October 2013–December 2016).

We defined any hospitals participating in BPCI Model 2 across this period for any of the four included medical condition episodes as BPCI hospitals. Because hospitals were able to enter or exit BPCI over time, and enrollment data were provided by CMS as quarterly participation files, we were able to identify dates of entry into or exit from BPCI over time by hospital-condition pairs. Hospitals were considered BPCI hospitals until the end of the study period, regardless of subsequent exit.

We defined non-BPCI hospitals as those that never participated in the program and had 10 or more admissions in the BPCI period for the included medical condition episodes. We used this approach to minimize potential bias arising from BPCI entry and exit over time.

Across both BPCI and non-BPCI hospital groups, we followed prior methods and defined safety net hospitals based on a hospital’s DPP.15 Specifically, safety net hospitals were those in the top quartile of DPP among all hospitals nationwide, and hospitals in the other three quartiles were defined as non–safety net hospitals.9,12

Study Sample and Episode Construction

Our study sample included Medicare fee-for-service beneficiaries admitted to BPCI and non-BPCI hospitals for any of the four medical conditions of interest. We adhered to BPCI program rules, which defined each episode type based on a set of Medicare Severity Diagnosis Related Group (MS-DRG) codes (eg, myocardial infarction episodes were defined as MS-DRGs 280-282). From this sample, we excluded beneficiaries with end-stage renal disease or insurance coverage through Medicare Advantage, as well as beneficiaries who died during the index hospital admission, had any non–Inpatient Prospective Payment System claims, or lacked continuous primary Medicare fee-for-service coverage either during the episode or in the 12 months preceding it.

We constructed 90-day medical condition episodes that began with hospital admission and spanned 90 days after hospital discharge. To avoid bias arising from CMS rules related to precedence (rules for handling how overlapping episodes are assigned to hospitals), we followed prior methods and constructed naturally occurring episodes by assigning overlapping ones to the earlier hospital admission.2,16 From this set of episodes, we identified those for AMI, CHF, COPD, and pneumonia.

Exposure and Covariate Variables

Our study exposure was the interaction between hospital safety net status and hospital BPCI participation, which captured whether the association between BPCI participation and outcomes varied by safety net status (eg, whether differential changes in an outcome related to BPCI participation were different for safety net and non–safety net hospitals in the program). BPCI participation was defined using a time-varying indicator of BPCI participation to distinguish between episodes occurring under the program (ie, after a hospital began participating) or before participation in it. Covariates were chosen based on prior studies and included patient variables such as age, sex, Elixhauser comorbidities, frailty, and Medicare/Medicaid dual-eligibility status.17-23 Additionally, our analysis included market variables such as population size and Medicare Advantage penetration.

Outcome Variables

The prespecified primary study outcome was standardized 90-day postdischarge spending. This outcome was chosen owing to the lack of variation in standardized index hospitalization spending given the MS-DRG system and prior work suggesting that bundled payment participants instead targeted changes to postdischarge utilization and spending.2 Secondary outcomes included 90-day unplanned readmission rates, 90-day postdischarge mortality rates, discharge to institutional post–acute care providers (defined as either skilled nursing facilities [SNFs] or inpatient rehabilitation facilities), discharge home with home health agency services, and—among patients discharged to SNFs—SNF length of stay (LOS), measured in number of days.

Statistical Analysis

We described the characteristics of patients and hospitals in our samples. In adjusted analyses, we used a series of difference-in-differences (DID) generalized linear models to conduct a heterogeneity analysis evaluating whether the relationship between hospital BPCI participation and medical condition episode outcomes varied based on hospital safety net status.

In these models, the DID estimator was a time-varying indicator of hospital BPCI participation (equal to 1 for episodes occurring during the BPCI period at BPCI hospitals after they initiated participation; 0 otherwise) together with hospital and quarter-time fixed effects. To examine differences in the association between BPCI and episode outcomes by hospital safety net status—that is, whether there was heterogeneity in the outcome changes between safety net and non–safety net hospitals participating in BPCI—our models also included an interaction term between hospital safety net status and the time-varying BPCI participation term (Appendix Methods). In this approach, BPCI safety net and BPCI non–safety net hospitals were compared with non-BPCI hospitals as the comparison group. The comparisons were chosen to yield the most policy-salient findings, since Medicare evaluated hospitals in BPCI, whether safety net or not, by comparing their performance to nonparticipating hospitals, whether safety net or not.
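One way to write the specification described above is shown below. It is a sketch under stated assumptions rather than the authors' exact model, which is given in the Appendix Methods; all symbols are introduced here for illustration.

```latex
Y_{iht} = \beta_1\,\mathrm{BPCI}_{ht}
        + \beta_2\,\bigl(\mathrm{BPCI}_{ht} \times \mathrm{SafetyNet}_{h}\bigr)
        + X_{iht}^{\top}\gamma
        + \alpha_h + \tau_t + \varepsilon_{iht}
```

Here $Y_{iht}$ is the outcome for episode $i$ at hospital $h$ in quarter $t$, $\mathrm{BPCI}_{ht}$ is the time-varying participation indicator, $X_{iht}$ collects patient and market covariates, and $\alpha_h$ and $\tau_t$ are hospital and quarter fixed effects. Under this form, $\beta_1$ would capture the differential change at non–safety net BPCI hospitals relative to non-BPCI hospitals, $\beta_1 + \beta_2$ the corresponding change at safety net BPCI hospitals, and $\beta_2$ the heterogeneity by safety net status. A time-invariant safety net main effect would be absorbed by $\alpha_h$.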

All models controlled for patient and time-varying market characteristics and included hospital fixed effects (to account for time-invariant hospital market characteristics) and MS-DRG fixed effects. All outcomes were evaluated using models with identity links and normal distributions (ie, ordinary least squares). These variables and models were applied to data from the baseline period to examine consistency with the parallel trends assumption. Overall, Wald tests did not indicate divergent baseline period trends in outcomes between BPCI and non-BPCI hospitals (Appendix Figure 1) or BPCI safety net versus BPCI non–safety net hospitals (Appendix Figure 2).

We conducted sensitivity analyses to evaluate the robustness of our results. First, instead of comparing differential changes at BPCI safety net vs BPCI non–safety net hospitals (ie, evaluating safety net status among BPCI hospitals), we evaluated changes at BPCI safety net vs non-BPCI safety net hospitals compared with changes at BPCI non–safety net vs non-BPCI non–safety net hospitals (ie, marginal differences in the changes associated with BPCI participation among safety net vs non–safety net hospitals). Because safety net hospitals in BPCI were compared with nonparticipating safety net hospitals, and non–safety net hospitals in BPCI were compared with nonparticipating non–safety net hospitals, this set of analyses helped address potential concerns about unobservable differences between safety net and non–safety net organizations and their potential impact on our findings.
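The logic of this first sensitivity analysis can be illustrated with unadjusted group means. The numbers below are hypothetical and are not study results, and the study itself used regression-adjusted models rather than raw means; the helper names are assumptions made for the example.

```python
def did(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: change among treated minus change among controls."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical mean 90-day postdischarge spending in USD as (baseline, BPCI period).
means = {
    ("safety_net", "bpci"): (18800, 18300),
    ("safety_net", "non_bpci"): (18600, 18500),
    ("non_safety_net", "bpci"): (18300, 17900),
    ("non_safety_net", "non_bpci"): (18200, 18100),
}


def stratum_did(stratum):
    """DID within one stratum: BPCI hospitals vs non-BPCI hospitals of the same type."""
    pre_t, post_t = means[(stratum, "bpci")]
    pre_c, post_c = means[(stratum, "non_bpci")]
    return did(pre_t, post_t, pre_c, post_c)


did_safety_net = stratum_did("safety_net")
did_non_safety_net = stratum_did("non_safety_net")

# Marginal difference in the change associated with BPCI participation.
marginal_difference = did_safety_net - did_non_safety_net
```

With these illustrative inputs, the within-stratum estimates are –$400 and –$300, so the marginal difference is –$100.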

Second, we used an alternative, BPCI-specific definition for safety net hospitals: instead of defining safety net status based on all hospitals nationwide, we defined it separately among BPCI hospitals (safety net hospitals defined as those in the top quartile of DPP among all BPCI hospitals) and among non-BPCI hospitals (safety net hospitals defined as those in the top quartile of DPP among all non-BPCI hospitals). Third, we repeated our main analyses using models with standard errors clustered at the hospital level and without hospital fixed effects. Fourth, we repeated our analyses using models with alternative nonlinear link functions and outcome distributions and without hospital fixed effects.
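The group-specific definition in the second sensitivity analysis amounts to computing the top-quartile cutoff within each group rather than nationwide. The sketch below assumes DPP values expressed in percent, an inclusive quantile method, and a strict "above the 75th percentile" rule; none of these details are specified in the text, and the hospital identifiers are hypothetical.

```python
from statistics import quantiles


def flag_safety_net(dpp_by_hospital):
    """Return the hospitals whose DPP lies above the group's 75th percentile."""
    cut_points = quantiles(dpp_by_hospital.values(), n=4, method="inclusive")
    threshold = cut_points[2]  # third quartile of this group only
    return {h for h, dpp in dpp_by_hospital.items() if dpp > threshold}


# Hypothetical DPP values (percent) for each group.
bpci = {"B1": 10, "B2": 20, "B3": 30, "B4": 40, "B5": 50, "B6": 60, "B7": 70, "B8": 80}
non_bpci = {"N1": 5, "N2": 15, "N3": 25, "N4": 35}

# Each group gets its own cutoff, so the same DPP can be classified differently.
safety_net_bpci = flag_safety_net(bpci)
safety_net_non_bpci = flag_safety_net(non_bpci)
```

Because each group has its own cutoff, a DPP of 35 qualifies as safety net among the hypothetical non-BPCI hospitals here but would not among the BPCI hospitals.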

Statistical tests were two-tailed and considered significant at α = .05 for the primary outcome. Statistical analyses were conducted using SAS 9.4 (SAS Institute, Inc.).

RESULTS

Our sample consisted of 3066 hospitals nationwide that collectively provided medical condition episode care to a total of 1,611,848 Medicare fee-for-service beneficiaries. This sample included 238 BPCI hospitals and 2828 non-BPCI hospitals (Table 1, Appendix Table 1).

Among BPCI hospitals, 63 were safety net and 175 were non–safety net hospitals. Compared with non–safety net hospitals, safety net hospitals tended to be larger and were more likely to be urban teaching hospitals. Safety net hospitals also tended to be located in areas with larger populations, more low-income individuals, and greater Medicare Advantage penetration.

In both the baseline and BPCI periods, there were differences in several characteristics for patients admitted to safety net vs non–safety net hospitals (Table 2; Appendix Table 2). Among BPCI hospitals, in both periods, patients admitted at safety net hospitals were younger and more likely to be Black, be Medicare/Medicaid dual eligible, and report having a disability than patients admitted to non–safety net hospitals. Patients admitted to safety net hospitals were also more likely to reside in socioeconomically disadvantaged areas.

Safety Net Status Among BPCI Hospitals

In the baseline period (Appendix Table 3), postdischarge spending was slightly greater among patients admitted to BPCI safety net hospitals ($18,817) than among those admitted to BPCI non–safety net hospitals ($18,335). There were also small differences in secondary outcomes between the BPCI safety net and non–safety net groups.

In adjusted analyses evaluating heterogeneity in the effect of BPCI participation between safety net and non–safety net hospitals (Figure 1), differential changes in postdischarge spending between baseline and BPCI participation periods did not differ between safety net and non–safety net hospitals participating in BPCI (aDID, $40; 95% CI, –$254 to $335; P = .79).
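As a reading aid, the reported estimate, confidence interval, and P value can be checked for internal consistency. The sketch below assumes a symmetric 95% CI and a normal approximation, which may not match the exact test used in the study; `p_from_ci` is a hypothetical helper.

```python
from math import erfc, sqrt


def p_from_ci(estimate, ci_low, ci_high, z_crit=1.959964):
    """Two-sided P value implied by a symmetric 95% CI under a normal approximation."""
    se = (ci_high - ci_low) / (2 * z_crit)  # back out the standard error from the CI width
    z = estimate / se
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Primary outcome as reported: aDID $40; 95% CI, -$254 to $335.
p_spending = p_from_ci(40, -254, 335)
```

Under these assumptions, the implied P value rounds to .79, in line with the reported value.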

With respect to secondary outcomes (Figure 2; Appendix Figure 3), changes between baseline and BPCI participation periods for BPCI safety net vs BPCI non–safety net hospitals were differentially greater for rates of discharge to institutional post–acute care providers (aDID, 1.06 percentage points; 95% CI, 0.37-1.76; P = .003) and differentially lower for rates of discharge home with home health agency services (aDID, –1.15 percentage points; 95% CI, –1.73 to –0.58; P < .001). Among BPCI hospitals, safety net status was not associated with differential changes from baseline to BPCI periods in other secondary outcomes, including SNF LOS (aDID, 0.32 days; 95% CI, –0.04 to 0.67 days; P = .08).

Sensitivity Analysis

Analyses of BPCI participation among safety net vs non–safety net hospitals nationwide yielded results that were similar to those from our main analyses (Appendix Figures 4, 5, and 6). Compared with BPCI participation among non–safety net hospitals, participation among safety net hospitals was associated with a differential increase from baseline to BPCI periods in discharge to institutional post–acute care providers (aDID, 1.07 percentage points; 95% CI, 0.47-1.67 percentage points; P < .001), but no differential changes between baseline and BPCI periods in postdischarge spending (aDID, –$199; 95% CI, –$461 to $63; P = .14), SNF LOS (aDID, –0.22 days; 95% CI, –0.54 to 0.09 days; P = .16), or other secondary outcomes.

Replicating our main analyses using an alternative, BPCI-specific definition of safety net hospitals yielded similar results overall (Appendix Table 4; Appendix Figures 7, 8, and 9). There were no differential changes between baseline and BPCI periods in postdischarge spending between BPCI safety net and BPCI non–safety net hospitals (aDID, $111; 95% CI, –$189 to $411; P = .47). Results for secondary outcomes were also qualitatively similar to results from main analyses, with the exception that among BPCI hospitals, safety net hospitals had a differentially higher SNF LOS than non–safety net hospitals between baseline and BPCI periods (aDID, 0.38 days; 95% CI, 0.02-0.74 days; P = .04).

Compared with results from our main analysis, findings were qualitatively similar overall in analyses using models with hospital-clustered standard errors and without hospital fixed effects (Appendix Figures 10, 11, and 12) as well as models with alternative link functions and outcome distributions and without hospital fixed effects (Appendix Figures 13, 14, and 15).

DISCUSSION

This analysis builds on prior work by evaluating how hospital safety net status affected the known association between bundled payment participation and decreased spending and stable quality for medical condition episodes. Although safety net status did not appear to affect those relationships, it did affect the relationship between participation and post–acute care utilization. These results have three main implications.

First, our results suggest that policymakers should continue engaging safety net hospitals in medical condition bundled payments while monitoring for unintended consequences. Our findings with regard to spending provide some reassurance that safety net hospitals can potentially achieve savings while maintaining quality under bundled payments, similar to other types of hospitals. However, the differences in patient populations and post–acute care utilization patterns suggest that policymakers should continue to carefully monitor for disparities based on hospital safety net status and consider implementing measures that have been used in other payment reforms to support safety net organizations. Such measures could involve providing customized technical assistance or evaluating performance using “peer groups” that compare performance among safety net hospitals alone rather than among all hospitals.24,25

Second, our findings underscore potential challenges that safety net hospitals may face when attempting to redesign care. For instance, among hospitals accepting bundled payments for medical conditions, successful strategies in BPCI have often included maintaining the proportion of patients discharged to institutional post–acute care providers while reducing SNF LOS.2 However, in our study, discharge to institutional post–acute care providers actually increased among safety net hospitals relative to other hospitals while SNF LOS did not decrease. Additionally, while other hospitals in bundled payments have exhibited differentially greater discharge home with home health services, we found that safety net hospitals did not. These represent areas for future work, particularly because little is known about how safety net hospitals coordinate post–acute care (eg, the extent to which safety net hospitals integrate with post–acute care providers or coordinate home-based care for vulnerable patient populations).

Third, study results offer insight into potential challenges to practice changes. Compared with other hospitals, safety net hospitals in our analysis provided medical condition episode care to more Black, Medicare/Medicaid dual-eligible, and disabled patients, as well as individuals living in socioeconomically disadvantaged areas. Collectively, these groups may face more challenging socioeconomic circumstances or existing disparities. The combination of these factors and limited financial resources at safety net hospitals could complicate their ability to manage transitions of care after hospitalization by shifting discharge away from high-intensity institutional post–acute care facilities.

Our analysis has limitations. First, given the observational study design, findings are subject to residual confounding and selection bias. For instance, findings related to post–acute care utilization could have been influenced by unobservable changes in market supply and other factors. However, we mitigated these risks using a quasi-experimental methodology that directly accounted for multiple patient, hospital, and market characteristics and used fixed effects to account for unobserved heterogeneity. Second, in studying BPCI Model 2, we evaluated one model within one bundled payment program. However, BPCI Model 2 encompassed a wide range of medical conditions, and both this scope and program design have served as the direct basis for subsequent bundled payment models, such as the ongoing BPCI Advanced and other forthcoming programs.26 Third, while our analysis evaluated multiple aspects of patient complexity, individuals may be “high risk” owing to several clinical and social determinants. Future work should evaluate different features of patient risk and how they affect outcomes under payment models such as bundled payments.

CONCLUSION

Safety net status appeared to affect the relationship between bundled payment participation and post–acute care utilization, but not episode spending. These findings suggest that policymakers could support safety net hospitals within bundled payment programs and consider safety net status when evaluating them.

References

1. Navathe AS, Emanuel EJ, Venkataramani AS, et al. Spending and quality after three years of Medicare’s voluntary bundled payment for joint replacement surgery. Health Aff (Millwood). 2020;39(1):58-66. https://doi.org/10.1377/hlthaff.2019.00466
2. Rolnick JA, Liao JM, Emanuel EJ, et al. Spending and quality after three years of Medicare’s bundled payments for medical conditions: quasi-experimental difference-in-differences study. BMJ. 2020;369:m1780. https://doi.org/10.1136/bmj.m1780
3. Figueroa JF, Joynt KE, Zhou X, Orav EJ, Jha AK. Safety-net hospitals face more barriers yet use fewer strategies to reduce readmissions. Med Care. 2017;55(3):229-235. https://doi.org/10.1097/MLR.0000000000000687
4. Werner RM, Goldman LE, Dudley RA. Comparison of change in quality of care between safety-net and non–safety-net hospitals. JAMA. 2008;299(18):2180-2187. https://doi.org/10.1001/jama.299.18.2180
5. Ross JS, Bernheim SM, Lin Z, et al. Based on key measures, care quality for Medicare enrollees at safety-net and non–safety-net hospitals was almost equal. Health Aff (Millwood). 2012;31(8):1739-1748. https://doi.org/10.1377/hlthaff.2011.1028
6. Gilman M, Adams EK, Hockenberry JM, Milstein AS, Wilson IB, Becker ER. Safety-net hospitals more likely than other hospitals to fare poorly under Medicare’s value-based purchasing. Health Aff (Millwood). 2015;34(3):398-405. https://doi.org/10.1377/hlthaff.2014.1059
7. Joynt KE, Jha AK. Characteristics of hospitals receiving penalties under the Hospital Readmissions Reduction Program. JAMA. 2013;309(4):342-343. https://doi.org/10.1001/jama.2012.94856
8. Rajaram R, Chung JW, Kinnier CV, et al. Hospital characteristics associated with penalties in the Centers for Medicare & Medicaid Services Hospital-Acquired Condition Reduction Program. JAMA. 2015;314(4):375-383. https://doi.org/10.1001/jama.2015.8609
9. Navathe AS, Liao JM, Shah Y, et al. Characteristics of hospitals earning savings in the first year of mandatory bundled payment for hip and knee surgery. JAMA. 2018;319(9):930-932. https://doi.org/10.1001/jama.2018.0678
10. Thirukumaran CP, Glance LG, Cai X, Balkissoon R, Mesfin A, Li Y. Performance of safety-net hospitals in year 1 of the Comprehensive Care for Joint Replacement Model. Health Aff (Millwood). 2019;38(2):190-196. https://doi.org/10.1377/hlthaff.2018.05264
11. Thirukumaran CP, Glance LG, Cai X, Kim Y, Li Y. Penalties and rewards for safety net vs non–safety net hospitals in the first 2 years of the Comprehensive Care for Joint Replacement Model. JAMA. 2019;321(20):2027-2030. https://doi.org/10.1001/jama.2019.5118
12. Kim H, Grunditz JI, Meath THA, Quiñones AR, Ibrahim SA, McConnell KJ. Level of reconciliation payments by safety-net hospital status under the first year of the Comprehensive Care for Joint Replacement Program. JAMA Surg. 2019;154(2):178-179. https://doi.org/10.1001/jamasurg.2018.3098
13. Department of Medicine, University of Wisconsin School of Medicine and Public Health. Neighborhood Atlas. Accessed March 1, 2021. https://www.neighborhoodatlas.medicine.wisc.edu/
14. Dartmouth Atlas Project. The Dartmouth Atlas of Health Care. Accessed March 1, 2021. https://www.dartmouthatlas.org/
15. Chatterjee P, Joynt KE, Orav EJ, Jha AK. Patient experience in safety-net hospitals: implications for improving care and value-based purchasing. Arch Intern Med. 2012;172(16):1204-1210. https://doi.org/10.1001/archinternmed.2012.3158
16. Rolnick JA, Liao JM, Navathe AS. Programme design matters—lessons from bundled payments in the US. June 17, 2020. Accessed March 1, 2021. https://blogs.bmj.com/bmj/2020/06/17/programme-design-matters-lessons-from-bundled-payments-in-the-us
17. Dummit LA, Kahvecioglu D, Marrufo G, et al. Association between hospital participation in a Medicare bundled payment initiative and payments and quality outcomes for lower extremity joint replacement episodes. JAMA. 2016;316(12):1267-1278. https://doi.org/10.1001/jama.2016.12717
18. Navathe AS, Liao JM, Dykstra SE, et al. Association of hospital participation in a Medicare bundled payment program with volume and case mix of lower extremity joint replacement episodes. JAMA. 2018;320(9):901-910. https://doi.org/10.1001/jama.2018.12345
19. Joynt Maddox KE, Orav EJ, Zheng J, Epstein AM. Evaluation of Medicare’s bundled payments initiative for medical conditions. N Engl J Med. 2018;379(3):260-269. https://doi.org/10.1056/NEJMsa1801569
20. Navathe AS, Emanuel EJ, Venkataramani AS, et al. Spending and quality after three years of Medicare’s voluntary bundled payment for joint replacement surgery. Health Aff (Millwood). 2020;39(1):58-66. https://doi.org/10.1377/hlthaff.2019.00466
21. Liao JM, Emanuel EJ, Venkataramani AS, et al. Association of bundled payments for joint replacement surgery and patient outcomes with simultaneous hospital participation in accountable care organizations. JAMA Netw Open. 2019;2(9):e1912270. https://doi.org/10.1001/jamanetworkopen.2019.12270
22. Kim DH, Schneeweiss S. Measuring frailty using claims data for pharmacoepidemiologic studies of mortality in older adults: evidence and recommendations. Pharmacoepidemiol Drug Saf. 2014;23(9):891-901. https://doi.org/10.1002/pds.3674
23. Joynt KE, Figueroa JF, Beaulieu N, Wild RC, Orav EJ, Jha AK. Segmenting high-cost Medicare patients into potentially actionable cohorts. Healthc (Amst). 2017;5(1-2):62-67. https://doi.org/10.1016/j.hjdsi.2016.11.002
24. Quality Payment Program. Small, underserved, and rural practices. Accessed March 1, 2021. https://qpp.cms.gov/about/small-underserved-rural-practices
25. McCarthy CP, Vaduganathan M, Patel KV, et al. Association of the new peer group–stratified method with the reclassification of penalty status in the Hospital Readmission Reduction Program. JAMA Netw Open. 2019;2(4):e192987. https://doi.org/10.1001/jamanetworkopen.2019.2987
26. Centers for Medicare & Medicaid Services. BPCI Advanced. Updated September 16, 2021. Accessed October 18, 2021. https://innovation.cms.gov/innovation-models/bpci-advanced

Issue
Journal of Hospital Medicine 16(12)
Page Number
716-723. Published Online First November 17, 2021

© 2021 Society of Hospital Medicine

Correspondence Location
Joshua M Liao, MD, MSc; Email: [email protected]; Telephone: 206-616-6934. Twitter: @JoshuaLiaoMD.

Improving Identification of Patients at Low Risk for Major Cardiac Events After Noncardiac Surgery Using Intraoperative Data

Article Type
Changed
Wed, 09/30/2020 - 08:55

Annually, more than 40 million noncardiac surgeries take place in the US,1 with 1%-3% of patients experiencing a major adverse cardiovascular event (MACE) such as acute myocardial infarction (AMI) or cardiac arrest postoperatively.2 Such patients are at markedly increased risk of both perioperative and long-term death.2-5

Over the past 40 years, efforts to model the risk of cardiac complications after noncardiac surgery have examined relationships between preoperative risk factors and postoperative cardiovascular events. The resulting risk-stratification tools, such as the Lee Revised Cardiac Risk Index (RCRI), have been used to inform perioperative care, including strategies for risk factor management prior to surgery, testing for cardiac events after surgery, and decisions regarding postoperative disposition.6 However, tools used in practice have not incorporated intraoperative data on hemodynamics or medication administration in the transition to postoperative care, which is often provided by nonsurgical clinicians such as hospitalists. Presently, there is active debate about the optimal approach to postoperative evaluation and management of MACE, particularly with regard to indications for cardiac biomarker testing after surgery in patients without signs or symptoms of acute cardiac syndromes. The lack of consensus is reflected in differences among guidelines for postoperative cardiac biomarker testing across professional societies in Europe, Canada, and the United States.7-9

In this study, we examined whether the addition of intraoperative data to preoperative data (together, perioperative data) improved prediction of MACE after noncardiac surgery when compared with RCRI. Additionally, to investigate how such a model could be applied in practice, we compared risk stratification based on our model to a published risk factor–based guideline algorithm for postoperative cardiac biomarker testing.7 In particular, we evaluated to what extent patients recommended for postoperative cardiac biomarkers under the risk factor–based guideline algorithm would be reclassified as low risk by the model using perioperative data. Conducting biomarker tests on these patients would potentially represent low-value care. We hypothesized that adding intraoperative data would (a) lead to improved prediction of MACE complications when compared with RCRI and (b) more effectively identify, compared with a risk factor–based guideline algorithm, patients for whom cardiac biomarker testing would or would not be clinically meaningful.

METHODS

We followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.10

Study Data

Baseline, preoperative, and intraoperative data were collected for patients undergoing surgery between January 2014 and April 2018 within the University of Pennsylvania Health System (UPHS) electronic health record (EHR), and these data were then integrated into a comprehensive perioperative dataset (data containing administrative, preoperative, intraoperative, and postoperative information related to surgeries) created through a collaboration with the Multicenter Perioperative Outcomes Group.11 The University of Pennsylvania Institutional Review Board approved this study.

Study Population

Patients aged 18 years or older who underwent inpatient major noncardiac surgery across four tertiary academic medical centers within UPHS in Pennsylvania during the study period were included in the cohort (see Appendix for inclusion/exclusion criteria).12,13 Noncardiac surgery was identified using primary Current Procedural Terminology (CPT) code specification ranges for noncardiac surgeries 10021-32999 and 34001-69990. The study sample was divided randomly into a training set (60%), validation set (20%), and test set (20%),14 with similar rates of MACE in the resulting sets. We used a holdout test set for all final analyses to avoid overfitting during model selection.
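A random 60/20/20 partition of this kind can be sketched as follows. The seed, the `split_cohort` helper, and the use of a simple (unstratified) shuffle are assumptions for illustration; the text does not state how the split was implemented.

```python
import random


def split_cohort(patient_ids, seed=0, fractions=(0.6, 0.2, 0.2)):
    """Shuffle IDs reproducibly and cut them into training, validation, and test sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # a local generator keeps global random state untouched
    n_train = round(len(ids) * fractions[0])
    n_valid = round(len(ids) * fractions[1])
    train = ids[:n_train]
    valid = ids[n_train:n_train + n_valid]
    test = ids[n_train + n_valid:]  # remainder, so every patient lands in exactly one set
    return train, valid, test


# Hypothetical cohort of 1000 patient identifiers.
train, valid, test = split_cohort(range(1000))
```

After such a split, outcome rates in the three sets would still need to be compared, as the authors report doing; a stratified split is one alternative if the rates diverge.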

Outcomes

The composite outcome used to develop the risk-stratification models was in-hospital MACE after major noncardiac surgery. Following prior literature, MACE was defined using billing codes for ST-elevation/non–ST-elevation myocardial infarction (STEMI/NSTEMI, ICD-9-CM 410.xx, ICD-10-CM I21.xx), cardiac arrest (ICD-9-CM 427.5, ICD-10-CM I46.x, I97.121), or all-cause in-hospital death.2,15-17
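The composite outcome definition above maps naturally onto a prefix match over billing codes. The sketch below assumes dotted code formatting and a hypothetical `(code_system, code)` input layout; it illustrates the definition and is not the study's extraction code.

```python
# Code prefixes taken from the outcome definition in the text.
MACE_PREFIXES = {
    "ICD-9-CM": ("410", "427.5"),            # STEMI/NSTEMI, cardiac arrest
    "ICD-10-CM": ("I21", "I46", "I97.121"),  # STEMI/NSTEMI, cardiac arrest
}


def has_mace(diagnoses, died_in_hospital):
    """Return True if any billing code matches the MACE definition or the patient died.

    `diagnoses` is an iterable of (code_system, code) pairs, eg ("ICD-10-CM", "I21.4").
    """
    if died_in_hospital:
        return True  # all-cause in-hospital death is part of the composite
    for system, code in diagnoses:
        prefixes = MACE_PREFIXES.get(system, ())
        if code.upper().startswith(prefixes):
            return True
    return False
```

For example, `has_mace([("ICD-10-CM", "I21.4")], False)` is true, whereas a heart failure code such as `("ICD-10-CM", "I50.9")` alone is not.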

Variables

Variables were selected from baseline administrative, preoperative clinical, and intraoperative clinical data sources (full list in Appendix). Baseline variables included demographics, insurance type, and Elixhauser comorbidities.18,19 Preoperative variables included surgery type, laboratory results, and American Society of Anesthesiologists (ASA) Physical Status classification.20 Intraoperative variables included vital signs, estimated blood loss, fluid administration, and vasopressor use. We winsorized outlier values and used multiple imputation to address missingness. Rates of missing data can be found in Appendix Table 1.
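Winsorizing replaces extreme values with a chosen percentile rather than dropping them. The sketch below uses nearest-rank percentiles and leaves missing values untouched for later imputation; the percentile cutoffs and the `winsorize` helper are assumptions, since the text does not report which limits were used.

```python
import math


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given nearest-rank percentiles; None entries pass through."""
    observed = sorted(v for v in values if v is not None)
    if not observed:
        return list(values)

    def nearest_rank(pct):
        # Smallest observed value with at least pct% of the data at or below it.
        rank = max(1, math.ceil(pct * len(observed) / 100))
        return observed[rank - 1]

    low, high = nearest_rank(lower_pct), nearest_rank(upper_pct)
    return [v if v is None else min(max(v, low), high) for v in values]


# Hypothetical intraoperative heart rates with one implausible value and one missing value.
heart_rates = [55, 60, 62, 64, 65, 66, 68, 70, 72, 300, None]
cleaned = winsorize(heart_rates, lower_pct=10, upper_pct=90)
```

In this example the implausible value of 300 is pulled down to 72, and the missing entry is preserved so that it can be handled by imputation.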

Risk-Stratification Models Used as Comparisons

Briefly, RCRI variables include the presence of high-risk surgery,21 comorbid cardiovascular diseases (ie, ischemic heart disease, congestive heart failure, and cerebrovascular disease), preoperative use of insulin, and elevated preoperative serum creatinine.6 RCRI uses the inputs to calculate a point score that equates to different risk strata and is based on a stepwise logistic regression model with postoperative cardiovascular complications as the dependent outcome variable. For this study, we implemented the weighted version of the RCRI algorithm and computed the point scores (Appendix).6,7,22
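For orientation, the simple unweighted form of the index can be sketched as one point per risk factor. This is not the weighted version used in this study, whose weights are given only in the Appendix. The creatinine cutoff of greater than 2.0 mg/dL follows the original index; the class and function names are hypothetical.

```python
from dataclasses import dataclass

CREATININE_CUTOFF_MG_DL = 2.0  # elevated preoperative creatinine in the original index


@dataclass
class RcriInputs:
    high_risk_surgery: bool
    ischemic_heart_disease: bool
    congestive_heart_failure: bool
    cerebrovascular_disease: bool
    preoperative_insulin: bool
    creatinine_mg_dl: float


def rcri_score(p):
    """Unweighted RCRI: one point for each of the six risk factors present."""
    return sum([
        p.high_risk_surgery,
        p.ischemic_heart_disease,
        p.congestive_heart_failure,
        p.cerebrovascular_disease,
        p.preoperative_insulin,
        p.creatinine_mg_dl > CREATININE_CUTOFF_MG_DL,
    ])


def rcri_risk_group(score):
    """Score of 1 or greater is high risk; score of 0 is low risk."""
    return "high" if score >= 1 else "low"


patient = RcriInputs(False, False, False, False, True, 2.4)
print(rcri_score(patient), rcri_risk_group(rcri_score(patient)))
```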

We also applied a risk factor–based algorithm for postoperative cardiac biomarker testing, published in 2017 in the Canadian Cardiovascular Society (CCS) guidelines, to each patient in the study sample.7 Specifically, this algorithm recommends daily troponin surveillance for 48 to 72 hours after surgery among patients who (1) have an elevated NT-proBNP/BNP measurement or no NT-proBNP/BNP measurement before surgery, (2) have an RCRI score of 1 or greater, (3) are aged 65 years or older, (4) are aged 45 to 64 years with significant cardiovascular disease and undergoing elective surgery, or (5) are aged 18 to 64 years with significant cardiovascular disease and undergoing semiurgent, urgent, or emergent surgery.
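The five criteria can be read as a rule that returns whether troponin surveillance would be recommended. The sketch below encodes the criteria exactly as worded in this section. In particular, it treats a missing NT-proBNP/BNP measurement as satisfying criterion 1 for any patient, which is a literal reading and may not match how the study operationalized the guideline. The function name, argument names, and urgency labels are hypothetical.

```python
NONELECTIVE = {"semiurgent", "urgent", "emergent"}


def ccs_recommends_troponin(age, rcri_score, bnp_elevated, significant_cvd, urgency):
    """Return True if the algorithm, as worded in the text, recommends surveillance.

    bnp_elevated is True or False when a preoperative NT-proBNP/BNP value
    exists and None when no measurement was taken (literal reading of
    criterion 1; an assumption of this sketch).
    """
    if bnp_elevated is None or bnp_elevated:      # criterion 1
        return True
    if rcri_score >= 1:                           # criterion 2
        return True
    if age >= 65:                                 # criterion 3
        return True
    if significant_cvd:
        if urgency == "elective" and 45 <= age <= 64:   # criterion 4
            return True
        if urgency in NONELECTIVE and 18 <= age <= 64:  # criterion 5
            return True
    return False


print(ccs_recommends_troponin(50, 0, False, True, "elective"))
print(ccs_recommends_troponin(40, 0, False, False, "elective"))
```

Patients for whom the function returns True correspond to the CCS high-risk group used in the comparisons; all others form the low-risk group.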

Statistical Analysis

We compared patient characteristics and outcomes between those who did and those who did not experience MACE during hospitalization. Chi-square tests were used to compare categorical variables, and Mann-Whitney tests were used to compare continuous variables.

To create the perioperative risk-stratification model based on baseline, preoperative, and intraoperative data, we used a logistic regression with elastic net selection using a dichotomous dependent variable indicating MACE and independent variables described earlier. This perioperative model was fit on the training set and the model coefficients were then applied to the patients in the test set. The area under the receiver operating characteristic curve (AUC) was reported and the outcomes were reported by predicted risk decile, with higher deciles indicating higher risk (ie, higher numbers of patients with MACE outcomes in higher deciles implied better risk stratification). Because predicted risk of postoperative MACE may not have been distributed evenly across deciles, we also examined the distribution of the predicted probability of MACE and examined the number of patients below thresholds of risk corresponding to 0.1% or less, 0.25% or less, 0.5% or less, and 1% or less. These thresholds were chosen because they were close to the overall rate of MACE within our cohort.
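The three summaries described here (AUC, outcomes by predicted risk decile, and the share of patients below a risk threshold) can be computed from predicted probabilities alone. The sketch below uses only the standard library and toy data; the function names are hypothetical, and the pairwise AUC loop is written for clarity rather than speed.

```python
def auc(y_true, y_score):
    """Rank-based AUC: the chance a random case scores above a random non-case.

    Ties count as half a win. Quadratic in sample size, so illustrative only.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            wins += 1.0 if c > n else 0.5 if c == n else 0.0
    return wins / (len(cases) * len(controls))


def risk_deciles(y_score):
    """Assign each patient a decile from 1 (lowest risk) to 10 (highest risk)."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    deciles = [0] * len(y_score)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(y_score) + 1
    return deciles


def share_at_or_below(y_score, threshold):
    """Fraction of patients whose predicted risk is at or below the threshold."""
    return sum(s <= threshold for s in y_score) / len(y_score)


outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(auc(outcomes, predicted))
print(share_at_or_below([0.0005, 0.002, 0.004, 0.02], 0.005))
```

Counting observed events within each decile returned by risk_deciles gives the kind of decile table described above, with better stratification showing up as events concentrated in the highest deciles.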

We tested for differences in predictive performance between the RCRI logistic regression model AUC and the perioperative model AUC using DeLong’s test.23 Additionally, we illustrated differences between the perioperative and RCRI models’ performance in two ways by stratifying patients into deciles based on predicted risk. First, we compared rates of MACE and MACE component events by predicted decile of the perioperative and RCRI models. Second, we further classified patients as RCRI high or low risk (per RCRI score classification in which RCRI score of 1 or greater is high risk and RCRI score of 0 is low risk) and examined numbers of surgical cases and MACE complications within these categories stratified by perioperative model predicted decile.

To compare the perioperative model’s performance with that of a risk factor–based guideline algorithm, we classified patients according to CCS guidelines as high risk (those for whom the CCS guidelines algorithm would recommend postoperative troponin surveillance testing) and low risk (those for whom the CCS guidelines algorithm would not recommend surveillance testing). We also used a logistic regression to examine whether the predicted risk from our model was independently associated with MACE above and beyond the testing recommendation of the CCS guidelines algorithm. This model used MACE as the dependent variable and model-predicted risk and a CCS guidelines–defined high-risk indicator as predictors. We computed the association between a 10 percentage-point increase in predicted risk and observed MACE outcome rates.24

In sensitivity analyses, we used a random forest machine learning classifier to test an alternate model specification, used complete case analysis, varied RCRI thresholds, and limited the sample to patients aged 50 years or older. We also varied the penalty parameter in the elastic net model and plotted AUC versus the number of variables included to examine parsimonious models. SAS v9.4 (SAS Institute Inc) was used for main analyses. Data preparation and sensitivity analyses were done in Python v3.6 with Pandas v0.24.2 and Scikit-learn v0.19.1.

Baseline Characteristics of Patients Who Underwent Noncardiac Surgery, 2014 to 2018

RESULTS

Study Sample

Patients who underwent major noncardiac surgery in our sample (n = 72,909) had a mean age of approximately 56 years; 58% were female, 66% were of White race, and 26% were of Black race. The most common procedures were orthopedic surgery (33%) and general surgery (20%). Those who experienced MACE (n = 558; 0.77%) differed along several characteristics (Table 1). For example, those with MACE were older (mean age, 65.4 vs 55.4 years; P < .001) and less likely to be female (41.9% vs 58.3%; P < .001).

Comparison of Perioperative and Revised Cardiac Risk Index Models’ Performance for Predicting Major Adverse Cardiovascular Events

Model Performance After Intraoperative Data Inclusion

In the perioperative model combining preoperative and intraoperative data, 26 variables were included after elastic net selection (Appendix Table 2). In the test set, the model achieved an AUC of 0.88 (95% CI, 0.85-0.92; Figure). When examining outcomes by predicted decile, rates of in-hospital MACE complications were higher in the highest decile than in the lowest decile; notably, 58 of 92 (63%) cases with MACE complications fell within the top decile of predicted risk (Table 2). The majority of patients had low predicted risk of MACE, with 5,309 (36.1%), 8,796 (59.7%), 11,335 (77.0%), and 12,972 (88.1%) below the risk thresholds of 0.1%, 0.25%, 0.5%, and 1.0%, respectively. The associated MACE rates were 0.04%, 0.10%, 0.17%, and 0.25% (average rate in sample was 0.63%) (Appendix Table 3).

Perioperative Model Performance for Predicting Major Adverse Cardiac Events and Components by Risk Decile in Test Set

Model Performance Comparisons

The perioperative model’s AUC of 0.88 was higher than the RCRI model’s AUC of 0.79 (95% CI, 0.74-0.84; P < .001). MACE complications were more concentrated in the top decile of predicted risk of the perioperative model than in that of the RCRI model (58 vs 43 of 92 events, respectively; 63% vs 47%; Table 2). Furthermore, there were fewer cases with MACE complications in the low-risk deciles (ie, deciles 1 to 5) of the perioperative model than in those of the RCRI model. These relative differences were also consistent for the MACE component outcomes of STEMI/NSTEMI, cardiac arrest, and in-hospital death.

There was substantial heterogeneity in the perioperative model’s predicted risk among patients classified as either RCRI low risk or RCRI high risk (ie, each category included patients with low and high predicted risk) (Table 3). Patients in the bottom (low-risk) five deciles of the perioperative model’s predicted risk who were in the RCRI model’s high-risk group were very unlikely to experience MACE complications (3 out of 722 cases; 0.42%). Furthermore, among those who were classified as low risk by the RCRI model but were in the top decile of the perioperative model’s predicted risk, the MACE complication rate was 3.5% (8 out of 229), which was approximately 6 times the sample mean MACE complication rate.

Comparison of Perioperative Model Results by Risk Factor–Based Recommendations

The perioperative model identified more patients as low risk than did the CCS guidelines’ risk factor–based algorithm (Table 3). For example, 2,341 of the patients whom the CCS guidelines algorithm identified as high risk were in the bottom 50% of the perioperative model’s predicted risk for experiencing MACE (below a 0.18% chance of a MACE complication); only four of these patients (0.17%) actually experienced MACE. In other words, 2,341 of the 7,597 (31%) patients classified as high risk by the CCS guidelines would have been recommended for postoperative troponin testing based on preoperative risk factors alone, yet the perioperative model identified them as low risk, and all but four did not go on to experience MACE. Regression results indicated that both the CCS guidelines risk-factor classification and the perioperative model’s predicted risk were predictive of MACE outcomes. A 10 percentage-point increase in the perioperative model’s predicted risk was associated with an increase in the probability of a MACE outcome of 0.45 percentage points (95% CI, 0.35-0.55 percentage points; P < .001), and moving from the CCS guidelines’ low-risk to high-risk category was associated with an increase in the probability of MACE of 0.96 percentage points (95% CI, 0.75-1.16 percentage points; P < .001).

Results were consistent with the main analysis across all sensitivity analyses (Appendix Tables 4-7). Parsimonious models with as few as eight variables retained strong predictive power (AUC, 0.870; Appendix Figure 1 and Table 8).

DISCUSSION

In this study, the addition of intraoperative data improved risk stratification for MACE complications when compared with standard risk tools such as RCRI. This approach also outperformed a guidelines-based approach and identified additional patients at low risk of cardiovascular complications. This study has three main implications.

First, this study demonstrated the additional value of combining intraoperative data with preoperative data in risk prediction for postoperative cardiovascular events. The intraoperative data most strongly associated with MACE, which likely were responsible for the performance improvement, included administration of medications (eg, sodium bicarbonate or calcium chloride) and blood products (eg, platelets and packed red blood cells), vitals (ie, heart rate), and intraoperative procedures (ie, arterial line placement); all model variables and coefficients are reported in Appendix Table 9. The risk-stratification model using intraoperative clinical data outperformed validated standard models such as RCRI. While this model should not be used in causal inference and cannot be used to inform decisions about risk-benefit tradeoffs of undergoing surgery, its improved performance relative to prior models highlights the potential in using real-time data. Preliminary illustrative analysis demonstrated that parsimonious models with as few as eight variables perform well, and implementing such models as risk scores in EHRs is likely straightforward (Appendix Table 8). This is particularly important for longitudinal care in the hospital, in which patients frequently are cared for by multiple clinical services and experience handoffs. For example, many orthopedic surgery patients with significant medical comorbidity are managed postoperatively by hospitalist physicians after initial surgical care.

Second, our study aligns well with the cardiac risk-stratification literature more broadly. For example, the patient characteristics and clinical variables most associated with cardiovascular complications were age, history of ischemic heart disease, American Society of Anesthesiologists physical status, use of intraoperative sodium bicarbonate or vasopressors, lowest intraoperative heart rate measured, and lowest intraoperative mean arterial pressure measured. While many of these variables overlap with those included in the RCRI model, others (such as American Society of Anesthesiologists physical status) are not included in RCRI but have been shown to be important in risk prediction in other studies using different data variables.6,25,26

Third, we illustrated a clinical application of this model in identifying patients at low risk of cardiovascular complications, although benefit may extend to other patients as well. This is particularly germane to clinicians who frequently manage patients in the postsurgical or postprocedural setting. Moreover, the clinical relevance to these clinicians is underscored by the lack of consensus among professional societies across Europe, Canada, and the United States about which subgroups of patients undergoing noncardiac surgery should receive postoperative cardiac biomarker surveillance testing in the 48 to 72 hours after surgery.6-9 This may be caused in part by differences in clinical objectives. For example, the CCS guidelines in part aim to detect myocardial injury after noncardiac surgery (MINS) up to 30 days after surgery, which may be more sensitive to myocardial injury but less strongly associated with outcomes like MACE. The results of this study suggest that adopting such risk factor–based testing would likely lead to additional testing of low-risk patients, which may represent low-value surveillance tests. For example, there were 2,257 patients without postoperative cardiac biomarker testing in our data who would have been categorized as high risk by risk factor guidelines, and therefore recommended to receive at least one postoperative cardiac biomarker surveillance test, but who were classified as low risk using a predicted probability of MACE less than 0.18% per our perioperative risk-stratification model (Appendix Table 4). If each of these patients received one troponin biomarker test, the associated cost increase would be $372,405 (using the $165 cost per test reported at our institution). These costs would multiply if daily surveillance troponin biomarker tests were ordered for 48 to 72 hours after surgery, as recommended by the risk factor–based testing guidelines. This would be a departure from testing at clinician discretion, which may avoid low-value testing.
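The cost estimate follows from simple multiplication, which the sketch below reproduces. The single-test figure uses the patient count and per-test cost reported in the text. The two-test and three-test scenarios are assumptions that interpret "daily for 48 to 72 hours" as two to three tests per patient, and the function name is hypothetical.

```python
PATIENTS_RECLASSIFIED = 2257   # patients reported in the text
COST_PER_TEST_USD = 165        # institutional cost per troponin test from the text


def added_testing_cost(n_patients, tests_per_patient, cost_per_test=COST_PER_TEST_USD):
    """Total added cost if each patient receives the given number of tests."""
    return n_patients * tests_per_patient * cost_per_test


single_test = added_testing_cost(PATIENTS_RECLASSIFIED, 1)
two_tests = added_testing_cost(PATIENTS_RECLASSIFIED, 2)     # assumed: daily for 48 hours
three_tests = added_testing_cost(PATIENTS_RECLASSIFIED, 3)   # assumed: daily for 72 hours

print(f"${single_test:,}")   # $372,405
print(f"${two_tests:,}")     # $744,810
print(f"${three_tests:,}")   # $1,117,215
```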

Applying the perioperative model developed in this paper to clinical practice still requires several steps. The technical aspects of finding a parsimonious model that can be implemented in the EHR are likely quite straightforward. Our preliminary analysis illustrates that doing so will not require accessing large numbers of intraoperative variables. Perhaps more important steps include prospective validation of the safety, usability, and clinical benefit of such an algorithm-based risk score.27

The study has several limitations. First, it was an observational study using EHR data subject to missingness and data quality issues that may have persisted despite our methods. Furthermore, EHR data are not generated randomly, and unmeasured variables observed by clinicians but not by researchers could confound the results. However, our approach used the statistical model to examine risk, not causal inference. Second, this was a single-institution study, and the availability of EHR data, as well as practice patterns, may vary at other institutions. Furthermore, it is possible that performance of the RCRI score, the model fitting RCRI classification of high vs low risk on the sample data, and our model’s performance may not generalize to other clinical settings. However, we utilized data from multiple hospitals within a health system with different surgery and anesthesia groups and providers, and a similar AUC was reported for RCRI in the original validation study.6 Third, our follow-up period was limited to the hospital setting, and we did not capture longitudinal outcomes, such as 30-day MACE. This may impact the ability to risk stratify for other important longer-term outcomes, limit clinical utility, and hinder comparability to other studies. Fourth, results may vary for other important cardiovascular outcomes that may be more sensitive to myocardial injury, such as MINS. Fifth, we used a limited number of modeling strategies.

CONCLUSION

Addition of intraoperative data to preoperative data improves prediction of cardiovascular complications after noncardiac surgery. Improving the identification of patients at low risk for such complications could potentially be applied to reduce unnecessary postoperative cardiac biomarker testing after noncardiac surgery, but it will require further validation in prospective clinical settings.

Disclosures

Dr Navathe reports grants from the following entities: Hawaii Medical Service Association, Anthem Public Policy Institute, Commonwealth Fund, Oscar Health, Cigna Corporation, Robert Wood Johnson Foundation, Donaghue Foundation, Pennsylvania Department of Health, Ochsner Health System, United Healthcare, Blue Cross Blue Shield of NC, Blue Shield of CA; personal fees from the following: Navvis Healthcare, Agathos, Inc, Navahealth, YNHHSC/CORE, Maine Health Accountable Care Organization, Maine Department of Health and Human Services, National University Health System - Singapore, Ministry of Health - Singapore, Social Security Administration - France, Elsevier Press, Medicare Payment Advisory Commission, Cleveland Clinic, Embedded Healthcare; and other support from Integrated Services, Inc, outside of the submitted work. Dr Volpp reports grants from Humana during the conduct of the study; grants from Hawaii Medical Services Agency, Discovery (South Africa), Merck, Weight Watchers, and CVS outside of the submitted work; he has received consulting income from CVS and VALHealth and is a principal in VALHealth, a behavioral economics consulting firm. Dr Holmes receives funding from the Pennsylvania Department of Health, US Public Health Service, and the Cardiovascular Medicine Research and Education Foundation. All other authors declare no conflicts of interest.

Prior Presentations

2019 Academy Health Annual Research Meeting, Poster Abstract Presentation, June 2 to June 4, 2019, Washington, DC.

Funding

This project was funded, in part, under a grant with the Pennsylvania Department of Health. This research was independent from the funder. The funder had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The department specifically disclaims responsibility for any analyses, interpretations, or conclusions.

References

1. National Center for Health Statistics. National Hospital Discharge Survey: 2010 Table, Number of all-listed procedures for discharges from short-stay hospitals, by procedure category and age: United States, 2010. Centers for Disease Control and Prevention; 2010. Accessed November 11, 2018. https://www.cdc.gov/nchs/data/nhds/4procedures/2010pro4_numberprocedureage.pdf
2. Devereaux PJ, Goldman L, Cook DJ, Gilbert K, Leslie K, Guyatt GH. Perioperative cardiac events in patients undergoing noncardiac surgery: a review of the magnitude of the problem, the pathophysiology of the events and methods to estimate and communicate risk. CMAJ. 2005;173(6):627-634. https://doi.org/10.1503/cmaj.050011
3. Charlson M, Peterson J, Szatrowski TP, MacKenzie R, Gold J. Long-term prognosis after peri-operative cardiac complications. J Clin Epidemiol. 1994;47(12):1389-1400. https://doi.org/10.1016/0895-4356(94)90083-3
4. Devereaux PJ, Sessler DI. Cardiac complications in patients undergoing major noncardiac surgery. N Engl J Med. 2015;373(23):2258-2269. https://doi.org/10.1056/nejmra1502824
5. Sprung J, Warner ME, Contreras MG, et al. Predictors of survival following cardiac arrest in patients undergoing noncardiac surgery: a study of 518,294 patients at a tertiary referral center. Anesthesiology. 2003;99(2):259-269. https://doi.org/10.1097/00000542-200308000-00006
6. Lee TH, Marcantonio ER, Mangione CM, et al. Derivation and prospective validation of a simple index for prediction of cardiac risk of major noncardiac surgery. Circulation. 1999;100(10):1043-1049. https://doi.org/10.1161/01.cir.100.10.1043
7. Duceppe E, Parlow J, MacDonald P, et al. Canadian Cardiovascular Society guidelines on perioperative cardiac risk assessment and management for patients who undergo noncardiac surgery. Can J Cardiol. 2017;33(1):17-32. https://doi.org/10.1016/j.cjca.2016.09.008
8. Fleisher LA, Fleischmann KE, Auerbach AD, et al. 2014 ACC/AHA guideline on perioperative cardiovascular evaluation and management of patients undergoing noncardiac surgery: a report of the American College of Cardiology/American Heart Association Task Force on practice guidelines. J Am Coll Cardiol. 2014;64(22):e77-e137. https://doi.org/10.1016/j.jacc.2014.07.944
9. Kristensen SD, Knuuti J, Saraste A, et al. 2014 ESC/ESA guidelines on non-cardiac surgery: cardiovascular assessment and management: The Joint Task Force on non-cardiac surgery: cardiovascular assessment and management of the European Society of Cardiology (ESC) and the European Society of Anaesthesiology (ESA). Euro Heart J. 2014;35(35):2383-2431. https://doi.org/10.1093/eurheartj/ehu282
10. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55-63. https://doi.org/10.7326/m14-0697
11. Freundlich RE, Kheterpal S. Perioperative effectiveness research using large databases. Best Pract Res Clin Anaesthesiol. 2011;25(4):489-498. https://doi.org/10.1016/j.bpa.2011.08.008
12. CPT® (Current Procedural Terminology). American Medical Association. 2018. Accessed November 11, 2018. https://www.ama-assn.org/practice-management/cpt-current-procedural-terminology
13. Surgery Flag Software for ICD-9-CM. AHRQ Healthcare Cost and Utilization Project; 2017. Accessed November 11, 2018. https://www.hcup-us.ahrq.gov/toolssoftware/surgflags/surgeryflags.jsp
14. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer; 2009. https://www.springer.com/gp/book/9780387848570
15. Bucy R, Hanisko KA, Ewing LA, et al. Abstract 281: Validity of in-hospital cardiac arrest ICD-9-CM codes in veterans. Circ Cardiovasc Qual Outcomes. 2015;8(suppl_2):A281-A281.
16. Institute of Medicine; Board on Health Sciences Policy; Committee on the Treatment of Cardiac Arrest: Current Status and Future Directions. Graham R, McCoy MA, Schultz AM, eds. Strategies to Improve Cardiac Arrest Survival: A Time to Act. The National Academies Press; 2015. https://doi.org/10.17226/21723
17. Pladevall M, Goff DC, Nichaman MZ, et al. An assessment of the validity of ICD Code 410 to identify hospital admissions for myocardial infarction: The Corpus Christi Heart Project. Int J Epidemiol. 1996;25(5):948-952. https://doi.org/10.1093/ije/25.5.948
18. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8-27. https://doi.org/10.1097/00005650-199801000-00004
19. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130-1139. https://doi.org/10.1097/01.mlr.0000182534.19832.83
20. Keats AS. The ASA classification of physical status--a recapitulation. Anesthesiology. 1978;49(4):233-236. https://doi.org/10.1097/00000542-197810000-00001
21. Schwarze ML, Barnato AE, Rathouz PJ, et al. Development of a list of high-risk operations for patients 65 years and older. JAMA Surg. 2015;150(4):325-331. https://doi.org/10.1001/jamasurg.2014.1819
22. VISION Pilot Study Investigators, Devereaux PJ, Bradley D, et al. An international prospective cohort study evaluating major vascular complications among patients undergoing noncardiac surgery: the VISION Pilot Study. Open Med. 2011;5(4):e193-e200.
23. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
24. Norton EC, Dowd BE, Maciejewski ML. Marginal effects-quantifying the effect of changes in risk factors in logistic regression models. JAMA. 2019;321(13):1304‐1305. https://doi.org/10.1001/jama.2019.1954
25. Bilimoria KY, Liu Y, Paruch JL, et al. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. J Am Coll Surg. 2013;217(5):833-842. https://doi.org/10.1016/j.jamcollsurg.2013.07.385
26. Gawande AA, Kwaan MR, Regenbogen SE, Lipsitz SA, Zinner MJ. An Apgar score for surgery. J Am Coll Surg. 2007;204(2):201-208. https://doi.org/10.1016/j.jamcollsurg.2006.11.011
27. Parikh RB, Obermeyer Z, Navathe AS. Regulation of predictive analytics in medicine. Science. 2019;363(6429):810-812. https://doi.org/10.1126/science.aaw0029

Issue
Journal of Hospital Medicine 15(10)
Page Number
581-587. Published Online First September 23, 2020

Annually, more than 40 million noncardiac surgeries take place in the US,1 with 1%-3% of patients experiencing a major adverse cardiovascular event (MACE) such as acute myocardial infarction (AMI) or cardiac arrest postoperatively.2 Such patients are at markedly increased risk of both perioperative and long-term death.2-5

Over the past 40 years, efforts to model the risk of cardiac complications after noncardiac surgery have examined relationships between preoperative risk factors and postoperative cardiovascular events. The resulting risk-stratification tools, such as the Lee Revised Cardiac Risk Index (RCRI), have been used to inform perioperative care, including strategies for risk factor management prior to surgery, testing for cardiac events after surgery, and decisions regarding postoperative disposition.6 However, tools used in practice have not incorporated intraoperative data on hemodynamics or medication administration in the transition to postoperative care, which is often provided by nonsurgical clinicians such as hospitalists. Presently, there is active debate about the optimal approach to postoperative evaluation and management of MACE, particularly with regard to indications for cardiac biomarker testing after surgery in patients without signs or symptoms of acute cardiac syndromes. The lack of consensus is reflected in differences among guidelines for postoperative cardiac biomarker testing across professional societies in Europe, Canada, and the United States.7-9

In this study, we examined whether the addition of intraoperative data to preoperative data (together, perioperative data) improved prediction of MACE after noncardiac surgery when compared with RCRI. Additionally, to investigate how such a model could be applied in practice, we compared risk stratification based on our model to a published risk factor–based guideline algorithm for postoperative cardiac biomarker testing.7 In particular, we evaluated to what extent patients recommended for postoperative cardiac biomarkers under the risk factor–based guideline algorithm would be reclassified as low risk by the model using perioperative data. Conducting biomarker tests on these patients would potentially represent low-value care. We hypothesized that adding intraoperative data would (a) lead to improved prediction of MACE complications when compared with RCRI and (b) more effectively identify, compared with a risk factor–based guideline algorithm, patients for whom cardiac biomarker testing would or would not be clinically meaningful.

METHODS

We followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.10

Study Data

Baseline, preoperative, and intraoperative data were collected for patients undergoing surgery between January 2014 and April 2018 within the University of Pennsylvania Health System (UPHS) electronic health record (EHR), and these data were then integrated into a comprehensive perioperative dataset (data containing administrative, preoperative, intraoperative, and postoperative information related to surgeries) created through a collaboration with the Multicenter Perioperative Outcomes Group.11 The University of Pennsylvania Institutional Review Board approved this study.

Study Population

Patients aged 18 years or older who underwent inpatient major noncardiac surgery across four tertiary academic medical centers within UPHS in Pennsylvania during the study period were included in the cohort (see Appendix for inclusion/exclusion criteria).12,13 Noncardiac surgery was identified using primary Current Procedural Terminology (CPT) code specification ranges for noncardiac surgeries 10021-32999 and 34001-69990. The study sample was divided randomly into a training set (60%), validation (20%), and test set (20%),14 with similar rates of MACE in the resulting sets. We used a holdout test set for all final analyses to avoid overfitting during model selection.

Outcomes

The composite outcome used to develop the risk-stratification models was in-hospital MACE after major noncardiac surgery. Following prior literature, MACE was defined using billing codes for ST-elevation/non–ST-elevation myocardial infarction (STEMI/NSTEMI, ICD-9-CM 410.xx, ICD-10-CM I21.xx), cardiac arrest (ICD-9-CM 427.5, ICD-10-CM I46.x, I97.121), or all-cause in-hospital death.2,15-17

Variables

Variables were selected from baseline administrative, preoperative clinical, and intraoperative clinical data sources (full list in Appendix). Baseline variables included demographics, insurance type, and Elixhauser comorbidities.18,19 Preoperative variables included surgery type, laboratory results, and American Society of Anesthesiologists (ASA) Physical Status classification.20 Intraoperative variables included vital signs, estimated blood loss, fluid administration, and vasopressor use. We winsorized outlier values and used multiple imputation to address missingness. Rates of missing data can be found in Appendix Table 1.

Risk-Stratification Models Used as Comparisons

Briefly, RCRI variables include the presence of high-risk surgery,21 comorbid cardiovascular diseases (ie, ischemic heart disease, congestive heart failure, and cerebrovascular disease), preoperative use of insulin, and elevated preoperative serum creatinine.6 RCRI uses the inputs to calculate a point score that equates to different risk strata and is based on a stepwise logistic regression model with postoperative cardiovascular complications as the dependent outcome variable. For this study, we implemented the weighted version of the RCRI algorithm and computed the point scores (Appendix).6,7,22

We also applied a risk factor–based algorithm for postoperative cardiac biomarker testing published in 2017 by the Canadian Cardiovascular Society (CCS) guidelines to each patient in the study sample.7 Specifically, this algorithm recommends daily troponin surveillance for 48 to 72 hours after surgery among patients who have (1) an elevated NT-proBNP/BNP measurement or no NT-proBNP/BNP measurement before surgery, (2) have a Revised Cardiac Risk Index score of 1 or greater, (3) are aged 65 years and older, (4) are aged 45 to 64 years with significant cardiovascular disease undergoing elective surgery, or (5) are aged 18 to 64 years with significant cardiovascular disease undergoing semiurgent, urgent, or emergent surgery.

Statistical Analysis

We compared patient characteristics and outcomes between those who did and those who did not experience MACE during hospitalization. Chi-square tests were used to compare categorical variables and Mann Whitney tests were used to compare continuous variables.

To create the perioperative risk-stratification model based on baseline, preoperative, and intraoperative data, we used a logistic regression with elastic net selection using a dichotomous dependent variable indicating MACE and independent variables described earlier. This perioperative model was fit on the training set and the model coefficients were then applied to the patients in the test set. The area under the receiver operating characteristic curve (AUC) was reported and the outcomes were reported by predicted risk decile, with higher deciles indicating higher risk (ie, higher numbers of patients with MACE outcomes in higher deciles implied better risk stratification). Because predicted risk of postoperative MACE may not have been distributed evenly across deciles, we also examined the distribution of the predicted probability of MACE and examined the number of patients below thresholds of risk corresponding to 0.1% or less, 0.25% or less, 0.5% or less, and 1% or less. These thresholds were chosen because they were close to the overall rate of MACE within our cohort.

We tested for differences in predictive performance between the RCRI logistic regression model AUC and the perioperative model AUC using DeLong’s test.23 Additionally, we illustrated differences between the perioperative and RCRI models’ performance in two ways by stratifying patients into deciles based on predicted risk. First, we compared rates of MACE and MACE component events by predicted decile of the perioperative and RCRI models. Second, we further classified patients as RCRI high or low risk (per RCRI score classification in which RCRI score of 1 or greater is high risk and RCRI score of 0 is low risk) and examined numbers of surgical cases and MACE complications within these categories stratified by perioperative model predicted decile.

To compare the perioperative model’s performance with that of a risk factor–based guideline algorithm, we classified patients according to CCS guidelines as high risk (those for whom the CCS guidelines algorithm would recommend postoperative troponin surveillance testing) and low risk (those for whom the CCS guidelines algorithm would not recommend surveillance testing). We also used a logistic regression to examine whether the predicted risk from our model was independently associated with MACE above and beyond the testing recommendation of the CCS guidelines algorithm. This model used MACE as the dependent variable and model-predicted risk and a CCS guidelines–defined high-risk indicator as predictors. We computed the association between a 10 percentage–point increase in predicted risk and observed MACE outcome rates.24
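One way to picture the comparison with the CCS classification is as a cross-tabulation: among patients the guideline flags as high risk, how many fall in the bottom half of model-predicted risk, and how many of those experience MACE? The sketch below assumes the low-risk cutoff is the median predicted risk; the function name and input layout are hypothetical, and the regression step is not shown.

```python
from statistics import median


def reclassified_low_risk(ccs_high, predicted_risk, mace):
    """Summarize CCS high-risk patients who fall below the median predicted risk.

    ccs_high: booleans, True if the guideline algorithm recommends testing
    predicted_risk: model-predicted probabilities of MACE
    mace: 0/1 indicators of observed in-hospital MACE
    """
    cutoff = median(predicted_risk)  # assumed cutoff: bottom 50% of predicted risk
    group = [m for h, r, m in zip(ccs_high, predicted_risk, mace)
             if h and r < cutoff]
    n = len(group)
    events = sum(group)
    return {"cutoff": cutoff, "n": n, "events": events,
            "rate": events / n if n else None}
```

A large `n` with a low `rate` would indicate that the guideline recommends surveillance for many patients whom the model places at low risk.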

In sensitivity analyses, we used a random forest machine learning classifier to test an alternate model specification, used complete case analysis, varied RCRI thresholds, and limited the sample to patients aged 50 years or older. We also varied the penalty parameter in the elastic net model and plotted AUC versus the number of variables included to examine parsimonious models. SAS v9.4 (SAS Institute Inc) was used for main analyses. Data preparation and sensitivity analyses were done in Python v3.6 with Pandas v0.24.2 and Scikit-learn v0.19.1.

Baseline Characteristics of Patients Who Underwent Noncardiac Surgery, 2014 to 2018

RESULTS

Study Sample

Patients who underwent major noncardiac surgery in our sample (n = 72,909) had a mean age of approximately 56 years; 58% were female, 66% were of White race, and 26% were of Black race. The most common surgery types were orthopedic surgery (33%) and general surgery (20%). Those who experienced MACE (n = 558; 0.77%) differed along several characteristics (Table 1). For example, those with MACE were older (mean age, 65.4 vs 55.4 years; P < .001) and less likely to be female (41.9% vs 58.3%; P < .001).

Comparison of Perioperative and Revised Cardiac Risk Index Models’ Performance for Predicting Major Adverse Cardiovascular Events

Model Performance After Intraoperative Data Inclusion

In the perioperative model combining preoperative and intraoperative data, 26 variables were included after elastic net selection (Appendix Table 2). Model discrimination in the test set of patients demonstrated an AUC of 0.88 (95% CI, 0.85-0.92; Figure). When examining outcome rates by predicted decile, the rates of in-hospital MACE complications were higher in the highest decile than in the lowest decile; notably, 58 of 92 (63%) cases with MACE complications were within the top decile of predicted risk (Table 2). The majority of patients had low predicted risk of MACE, with 5,309 (36.1%), 8,796 (59.7%), 11,335 (77.0%), and 12,972 (88.1%) below the risk thresholds of 0.1%, 0.25%, 0.5%, and 1.0%, respectively. The associated MACE rates were 0.04%, 0.10%, 0.17%, and 0.25% (average rate in sample was 0.63%) (Appendix Table 3).

Perioperative Model Performance for Predicting Major Adverse Cardiac Events and Components by Risk Decile in Test Set

Model Performance Comparisons

The perioperative model AUC of 0.88 was higher than RCRI’s AUC of 0.79 (95% CI, 0.74-0.84; P < .001). MACE complications were more concentrated in the top decile of predicted risk of the perioperative model than in that of the RCRI model (58 vs 43 of 92 events, respectively; 63% vs 47%; Table 2). Furthermore, there were fewer cases with MACE complications in the low-risk deciles (ie, deciles 1 to 5) of the perioperative model than in those of the RCRI model. These relative differences were also consistent for the MACE component outcomes of STEMI/NSTEMI, cardiac arrest, and in-hospital death.

There was substantial heterogeneity in the perioperative model’s predicted risk among patients classified as either RCRI low risk or RCRI high risk (ie, each category included patients with low and high predicted risk) (Table 3). Patients in the bottom (low-risk) five deciles of the perioperative model’s predicted risk who were in the RCRI model’s high-risk group were very unlikely to experience MACE complications (3 out of 722 cases; 0.42%). Furthermore, among those who were classified as low risk by the RCRI model but were in the top decile of the perioperative model’s predicted risk, the MACE complication rate was 3.5% (8 out of 229), which was 6 times the sample mean MACE complication rate.

Comparison of Perioperative Model Results by Risk Factor–Based Recommendations

The perioperative model identified more patients as low risk than did the CCS guidelines’ risk factor–based algorithm (Table 3). For example, 2,341 of the patients the CCS guidelines algorithm identified as high risk were in the bottom 50% of the perioperative model’s predicted risk for experiencing MACE (below a 0.18% chance of a MACE complication); only four of these patients (0.17%) actually experienced MACE. This indicates that 2,341 of the 7,597 (31%) patients classified as high risk by the CCS guidelines, and therefore recommended for postoperative troponin testing based on preoperative risk factors alone, were identified as low risk by the perioperative model, and nearly all of them did not go on to experience MACE. Regression results indicated that both CCS guidelines risk-factor classification and the perioperative model’s predicted risk were predictive of MACE outcomes. A change in the perioperative model’s predicted risk of 10 percentage points was associated with an increase in the probability of a MACE outcome of 0.45 percentage points (95% CI, 0.35-0.55 percentage points; P < .001), and moving from CCS guidelines’ low- to high-risk categories was associated with an increase in the probability of MACE of 0.96 percentage points (95% CI, 0.75-1.16 percentage points; P < .001).

Results were consistent with the main analysis across all sensitivity analyses (Appendix Tables 4-7). Parsimonious models with as few as eight variables retained strong predictive power (AUC, 0.870; Appendix Figure 1 and Table 8).

DISCUSSION

In this study, the addition of intraoperative data improved risk stratification for MACE complications when compared with standard risk tools such as RCRI. This approach also outperformed a guidelines-based approach and identified additional patients at low risk of cardiovascular complications. This study has three main implications.

First, this study demonstrated the additional value of combining intraoperative data with preoperative data in risk prediction for postoperative cardiovascular events. The intraoperative data most strongly associated with MACE, which likely were responsible for the performance improvement, included administration of medications (eg, sodium bicarbonate or calcium chloride) and blood products (eg, platelets and packed red blood cells), vitals (ie, heart rate), and intraoperative procedures (ie, arterial line placement); all model variables and coefficients are reported in Appendix Table 9. The risk-stratification model using intraoperative clinical data outperformed validated standard models such as RCRI. While this model should not be used in causal inference and cannot be used to inform decisions about risk-benefit tradeoffs of undergoing surgery, its improved performance relative to prior models highlights the potential in using real-time data. Preliminary illustrative analysis demonstrated that parsimonious models with as few as eight variables perform well, and implementing such models as risk scores in EHRs is likely straightforward (Appendix Table 8). This is particularly important for longitudinal care in the hospital, in which patients frequently are cared for by multiple clinical services and experience handoffs. For example, many orthopedic surgery patients with significant medical comorbidity are managed postoperatively by hospitalist physicians after initial surgical care.

Second, our study aligns well with the cardiac risk-stratification literature more broadly. For example, the patient characteristics and clinical variables most associated with cardiovascular complications were age, history of ischemic heart disease, American Society of Anesthesiologists physical status, use of intraoperative sodium bicarbonate or vasopressors, lowest intraoperative heart rate measured, and lowest intraoperative mean arterial pressure measured. While many of these variables overlap with those included in the RCRI model, others (such as American Society of Anesthesiologists physical status) are not included in RCRI but have been shown to be important in risk prediction in other studies using different data variables.6,25,26

Third, we illustrated a clinical application of this model in identifying patients at low risk of cardiovascular complications, although benefit may extend to other patients as well. This is particularly germane to clinicians who frequently manage patients in the postsurgical or postprocedural setting. Moreover, the clinical relevance to these clinicians is underscored by the lack of consensus among professional societies across Europe, Canada, and the United States about which subgroups of patients undergoing noncardiac surgery should receive postoperative cardiac biomarker surveillance testing in the 48 to 72 hours after surgery.6-9 This may be caused in part by differences in clinical objectives. For example, the CCS guidelines in part aim to detect myocardial injury after noncardiac surgery (MINS) up to 30 days after surgery, which may be more sensitive to myocardial injury but less strongly associated with outcomes like MACE. The results of this study suggest that adopting such risk factor–based testing would likely lead to additional testing of low-risk patients, which may represent low-value surveillance tests. For example, there were 2,257 patients without postoperative cardiac biomarker testing in our data who would have been categorized as high risk by risk factor guidelines, and therefore recommended to receive at least one postoperative cardiac biomarker surveillance test, but who were classified as low risk (predicted probability of MACE less than 0.18%) by our perioperative risk-stratification model (Appendix Table 4). If each of these patients received one troponin biomarker test, the associated cost increase would be $372,405 (using the $165 cost per test reported at our institution). These costs would multiply if daily surveillance troponin biomarker tests were ordered for 48 to 72 hours after surgery, as recommended by the risk factor–based testing guidelines. This would be a departure from the current practice of testing at clinician discretion, which may avoid low-value testing.
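The cost figure above follows directly from the patient count and the per-test cost. The short calculation below reproduces it and, as an assumption not stated in the text, treats daily testing over 48 to 72 hours as two to three tests per patient.

```python
PATIENTS_RECLASSIFIED = 2257   # CCS high risk, model low risk, no test in our data
COST_PER_TEST_USD = 165        # institutional cost per troponin test


def added_cost(tests_per_patient: int) -> int:
    """Total added cost in US dollars for a given number of tests per patient."""
    return PATIENTS_RECLASSIFIED * COST_PER_TEST_USD * tests_per_patient


single_test = added_cost(1)    # one test per patient, as in the text
# Assumption: daily surveillance for 48 to 72 hours means 2 to 3 tests each.
daily_low, daily_high = added_cost(2), added_cost(3)

print(f"One test each: ${single_test:,}")
print(f"Daily for 48 to 72 hours: ${daily_low:,} to ${daily_high:,}")
```

Under that assumption, daily surveillance would cost between roughly $0.74 million and $1.12 million for this group.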

Applying the perioperative model developed in this paper to clinical practice still requires several steps. The technical aspects of finding a parsimonious model that can be implemented in the EHR are likely quite straightforward. Our preliminary analysis illustrates that doing so will not require accessing large numbers of intraoperative variables. Perhaps more important steps include prospective validation of the safety, usability, and clinical benefit of such an algorithm-based risk score.27

The study has several limitations. First, it was an observational study using EHR data subject to missingness and data quality issues that may have persisted despite our methods. Furthermore, EHR data are not generated randomly, and unmeasured variables observed by clinicians but not by researchers could confound the results. However, our approach used the statistical model to examine risk, not causal inference. Second, this is a single-institution study, and the availability of EHR data, as well as practice patterns, may vary at other institutions. Furthermore, it is possible that the performance of the RCRI score, the model fitting RCRI classification of high vs low risk on the sample data, and our model’s performance may not generalize to other clinical settings. However, we utilized data from multiple hospitals within a health system with different surgery and anesthesia groups and providers, and a similar AUC was reported for RCRI in the original validation study.6 Third, our follow-up period was limited to the hospital setting, and we did not capture longitudinal outcomes, such as 30-day MACE. This may impact the ability to risk stratify for other important longer-term outcomes, limit clinical utility, and hinder comparability to other studies. Fourth, results may vary for other important cardiovascular outcomes that may be more sensitive to myocardial injury, such as MINS. Fifth, we used a limited number of modeling strategies.

CONCLUSION

Addition of intraoperative data to preoperative data improves prediction of cardiovascular complications after noncardiac surgery. Improving the identification of patients at low risk for such complications could potentially be applied to reduce unnecessary postoperative cardiac biomarker testing after noncardiac surgery, but it will require further validation in prospective clinical settings.

Disclosures

Dr Navathe reports grants from the following entities: Hawaii Medical Service Association, Anthem Public Policy Institute, Commonwealth Fund, Oscar Health, Cigna Corporation, Robert Wood Johnson Foundation, Donaghue Foundation, Pennsylvania Department of Health, Ochsner Health System, United Healthcare, Blue Cross Blue Shield of NC, Blue Shield of CA; personal fees from the following: Navvis Healthcare, Agathos, Inc, Navahealth, YNHHSC/CORE, Maine Health Accountable Care Organization, Maine Department of Health and Human Services, National University Health System - Singapore, Ministry of Health - Singapore, Social Security Administration - France, Elsevier Press, Medicare Payment Advisory Commission, Cleveland Clinic, Embedded Healthcare; and other support from Integrated Services, Inc, outside of the submitted work. Dr Volpp reports grants from Humana during the conduct of the study; grants from Hawaii Medical Services Agency, Discovery (South Africa), Merck, Weight Watchers, and CVS outside of the submitted work; he has received consulting income from CVS and VALHealth and is a principal in VALHealth, a behavioral economics consulting firm. Dr Holmes receives funding from the Pennsylvania Department of Health, US Public Health Service, and the Cardiovascular Medicine Research and Education Foundation. All other authors declare no conflicts of interest.

Prior Presentations

2019 Academy Health Annual Research Meeting, Poster Abstract Presentation, June 2 to June 4, 2019, Washington, DC.

Funding

This project was funded, in part, under a grant with the Pennsylvania Department of Health. This research was independent from the funder. The funder had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The department specifically disclaims responsibility for any analyses, interpretations, or conclusions.

Annually, more than 40 million noncardiac surgeries take place in the US,1 with 1%-3% of patients experiencing a major adverse cardiovascular event (MACE) such as acute myocardial infarction (AMI) or cardiac arrest postoperatively.2 Such patients are at markedly increased risk of both perioperative and long-term death.2-5

Over the past 40 years, efforts to model the risk of cardiac complications after noncardiac surgery have examined relationships between preoperative risk factors and postoperative cardiovascular events. The resulting risk-stratification tools, such as the Lee Revised Cardiac Risk Index (RCRI), have been used to inform perioperative care, including strategies for risk factor management prior to surgery, testing for cardiac events after surgery, and decisions regarding postoperative disposition.6 However, tools used in practice have not incorporated intraoperative data on hemodynamics or medication administration in the transition to postoperative care, which is often provided by nonsurgical clinicians such as hospitalists. Presently, there is active debate about the optimal approach to postoperative evaluation and management of MACE, particularly with regard to indications for cardiac biomarker testing after surgery in patients without signs or symptoms of acute cardiac syndromes. The lack of consensus is reflected in differences among guidelines for postoperative cardiac biomarker testing across professional societies in Europe, Canada, and the United States.7-9

In this study, we examined whether the addition of intraoperative data to preoperative data (together, perioperative data) improved prediction of MACE after noncardiac surgery when compared with RCRI. Additionally, to investigate how such a model could be applied in practice, we compared risk stratification based on our model to a published risk factor–based guideline algorithm for postoperative cardiac biomarker testing.7 In particular, we evaluated to what extent patients recommended for postoperative cardiac biomarkers under the risk factor–based guideline algorithm would be reclassified as low risk by the model using perioperative data. Conducting biomarker tests on these patients would potentially represent low-value care. We hypothesized that adding intraoperative data would (a) lead to improved prediction of MACE complications when compared with RCRI and (b) more effectively identify, compared with a risk factor–based guideline algorithm, patients for whom cardiac biomarker testing would or would not be clinically meaningful.

METHODS

We followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.10

Study Data

Baseline, preoperative, and intraoperative data were collected for patients undergoing surgery between January 2014 and April 2018 within the University of Pennsylvania Health System (UPHS) electronic health record (EHR), and these data were then integrated into a comprehensive perioperative dataset (data containing administrative, preoperative, intraoperative, and postoperative information related to surgeries) created through a collaboration with the Multicenter Perioperative Outcomes Group.11 The University of Pennsylvania Institutional Review Board approved this study.

Study Population

Patients aged 18 years or older who underwent inpatient major noncardiac surgery across four tertiary academic medical centers within UPHS in Pennsylvania during the study period were included in the cohort (see Appendix for inclusion/exclusion criteria).12,13 Noncardiac surgery was identified using primary Current Procedural Terminology (CPT) code specification ranges for noncardiac surgeries 10021-32999 and 34001-69990. The study sample was divided randomly into a training set (60%), a validation set (20%), and a test set (20%),14 with similar rates of MACE in the resulting sets. We used a holdout test set for all final analyses to avoid overfitting during model selection.
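A 60/20/20 split of this kind can be sketched as follows. The version below stratifies by outcome so that event rates are similar across sets; whether the original split was stratified or simply random is not stated, so the stratification, the seed, and the function name are assumptions made for illustration.

```python
import random


def split_60_20_20(ids, outcomes, seed=0):
    """Randomly split patient ids into training, validation, and test sets.

    Shuffling is done separately for events and non-events (an assumption)
    so that each set receives a similar share of MACE cases.
    """
    rng = random.Random(seed)
    sets = {"train": [], "validation": [], "test": []}
    for outcome_value in (0, 1):
        stratum = [i for i, y in zip(ids, outcomes) if y == outcome_value]
        rng.shuffle(stratum)
        n_train = len(stratum) * 6 // 10
        n_val = len(stratum) * 2 // 10
        sets["train"] += stratum[:n_train]
        sets["validation"] += stratum[n_train:n_train + n_val]
        sets["test"] += stratum[n_train + n_val:]
    return sets
```

Keeping the test set untouched until the final analysis is what guards against overfitting during model selection.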

Outcomes

The composite outcome used to develop the risk-stratification models was in-hospital MACE after major noncardiac surgery. Following prior literature, MACE was defined using billing codes for ST-elevation/non–ST-elevation myocardial infarction (STEMI/NSTEMI, ICD-9-CM 410.xx, ICD-10-CM I21.xx), cardiac arrest (ICD-9-CM 427.5, ICD-10-CM I46.x, I97.121), or all-cause in-hospital death.2,15-17

Variables

Variables were selected from baseline administrative, preoperative clinical, and intraoperative clinical data sources (full list in Appendix). Baseline variables included demographics, insurance type, and Elixhauser comorbidities.18,19 Preoperative variables included surgery type, laboratory results, and American Society of Anesthesiologists (ASA) Physical Status classification.20 Intraoperative variables included vital signs, estimated blood loss, fluid administration, and vasopressor use. We winsorized outlier values and used multiple imputation to address missingness. Rates of missing data can be found in Appendix Table 1.
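Winsorizing replaces extreme values with a chosen percentile value rather than dropping them. The sketch below assumes the 1st and 99th percentiles as limits, which the text does not specify, and uses a simple index-based percentile; multiple imputation is not shown.

```python
def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clamp values to the assumed lower and upper percentile limits.

    Percentiles are taken by index from the sorted values, which is one of
    several common conventions.
    """
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int((n - 1) * lower_pct / 100)]
    high = ordered[int((n - 1) * upper_pct / 100)]
    return [min(max(v, low), high) for v in values]
```

For example, an implausible intraoperative heart rate recorded as 400 would be pulled down to the 99th-percentile value instead of distorting the model fit.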

Risk-Stratification Models Used as Comparisons

Briefly, RCRI variables include the presence of high-risk surgery,21 comorbid cardiovascular diseases (ie, ischemic heart disease, congestive heart failure, and cerebrovascular disease), preoperative use of insulin, and elevated preoperative serum creatinine.6 RCRI uses the inputs to calculate a point score that equates to different risk strata and is based on a stepwise logistic regression model with postoperative cardiovascular complications as the dependent outcome variable. For this study, we implemented the weighted version of the RCRI algorithm and computed the point scores (Appendix).6,7,22
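For orientation, the basic form of RCRI assigns one point for each of six risk factors. The sketch below shows that simple count, not the weighted version implemented in this study; the creatinine cutoff of 2.0 mg/dL is taken from the original index, and the function and argument names are hypothetical.

```python
def rcri_score(high_risk_surgery: bool,
               ischemic_heart_disease: bool,
               congestive_heart_failure: bool,
               cerebrovascular_disease: bool,
               preoperative_insulin: bool,
               creatinine_mg_dl: float) -> int:
    """Unweighted RCRI: one point per risk factor present (range 0 to 6)."""
    factors = [
        high_risk_surgery,
        ischemic_heart_disease,
        congestive_heart_failure,
        cerebrovascular_disease,
        preoperative_insulin,
        creatinine_mg_dl > 2.0,  # elevated preoperative serum creatinine
    ]
    return sum(factors)


def rcri_high_risk(score: int) -> bool:
    """High risk is a score of 1 or greater; a score of 0 is low risk."""
    return score >= 1
```

A patient taking insulin before surgery with no other risk factors would score 1 and fall into the high-risk category.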

We also applied a risk factor–based algorithm for postoperative cardiac biomarker testing published in 2017 by the Canadian Cardiovascular Society (CCS) guidelines to each patient in the study sample.7 Specifically, this algorithm recommends daily troponin surveillance for 48 to 72 hours after surgery among patients who have (1) an elevated NT-proBNP/BNP measurement or no NT-proBNP/BNP measurement before surgery, (2) have a Revised Cardiac Risk Index score of 1 or greater, (3) are aged 65 years and older, (4) are aged 45 to 64 years with significant cardiovascular disease undergoing elective surgery, or (5) are aged 18 to 64 years with significant cardiovascular disease undergoing semiurgent, urgent, or emergent surgery.

Statistical Analysis

We compared patient characteristics and outcomes between those who did and those who did not experience MACE during hospitalization. Chi-square tests were used to compare categorical variables and Mann Whitney tests were used to compare continuous variables.

To create the perioperative risk-stratification model based on baseline, preoperative, and intraoperative data, we used a logistic regression with elastic net selection using a dichotomous dependent variable indicating MACE and independent variables described earlier. This perioperative model was fit on the training set and the model coefficients were then applied to the patients in the test set. The area under the receiver operating characteristic curve (AUC) was reported and the outcomes were reported by predicted risk decile, with higher deciles indicating higher risk (ie, higher numbers of patients with MACE outcomes in higher deciles implied better risk stratification). Because predicted risk of postoperative MACE may not have been distributed evenly across deciles, we also examined the distribution of the predicted probability of MACE and examined the number of patients below thresholds of risk corresponding to 0.1% or less, 0.25% or less, 0.5% or less, and 1% or less. These thresholds were chosen because they were close to the overall rate of MACE within our cohort.

We tested for differences in predictive performance between the RCRI logistic regression model AUC and the perioperative model AUC using DeLong’s test.23 Additionally, we illustrated differences between the perioperative and RCRI models’ performance in two ways by stratifying patients into deciles based on predicted risk. First, we compared rates of MACE and MACE component events by predicted decile of the perioperative and RCRI models. Second, we further classified patients as RCRI high or low risk (per RCRI score classification in which RCRI score of 1 or greater is high risk and RCRI score of 0 is low risk) and examined numbers of surgical cases and MACE complications within these categories stratified by perioperative model predicted decile.

To compare the perioperative model’s performance with that of a risk factor–based guideline algorithm, we classified patients according to CCS guidelines as high risk (those for whom the CCS guidelines algorithm would recommend postoperative troponin surveillance testing) and low risk (those for whom the CCS guidelines algorithm would not recommend surveillance testing). We also used a logistic regression to examine if the predicted risk from our model was independently associated with MACE above and beyond the testing recommendation of the CCS guidelines algorithm. This model used MACE as the dependent variable and model-predicted risk and a CCS guidelines–defined high-risk indicator as predictors. We computed the association between a 10 percentage–point increase in predicted risk on observed MACE outcome rates.24

In sensitivity analyses, we used a random forest machine learning classifier to test an alternate model specification, used complete case analysis, varied RCRI thresholds, and limited to patients aged 50 years or older. We also varied the penalty parameter in the elastic net model and plotted AUC versus the number of variables included to examine parsimonious models. SAS v9.4 (SAS Institute Inc) was used for main analyses. Data preparations and sensitivity analysis were done in Python v3.6 with Pandas v0.24.2 and Scikit-learn v0.19.1.

Baseline Characteristics of Patients Who Underwent Noncardiac Surgery, 2014 to 2018

RESULTS

Study Sample

Patients who underwent major noncardiac surgery in our sample (n = 72,909) were approximately a mean age of 56 years, 58% female, 66% of White race and 26% of Black race, and most likely to have received orthopedic surgery (33%) or general surgery (20%). Those who experienced MACE (n = 558; 0.77%) differed along several characteristics (Table 1). For example, those with MACE were older (mean age, 65.4 vs 55.4 years; P < .001) and less likely to be female (41.9% vs 58.3%; P < .001).

Comparison of Perioperative and Revised Cardiac Risk Index Models’ Performance for Predicting Major Adverse Cardiovascular Events

Model Performance After Intraoperative Data Inclusion

In the perioperative model combining preoperative and intraoperative data, 26 variables were included after elastic net selection (Appendix Table 2). Model discrimination in the test set of patients demonstrated an AUC of 0.88 (95% CI, 0.85-0.92; Figure). When examining outcome rates by predicted decile, the outcome rates of in-hospital MACE complications were higher in the highest decile than in the lowest decile, notably with 58 of 92 (63%) cases with MACE complications within the top decile of predicted risk (Table 2). The majority of patients had low predicted risk of MACE, with 5,309 (36.1%), 8,796 (59.7%), 11,335 (77.0%), and 12,972 (88.1%) below the risk thresholds of to 0.1%, 0.25%, 0.5%, and 1.0% respectively. The associated MACE rates were 0.04%, 0.10%, 0.17%, and 0.25% (average rate in sample was 0.63%) (Appendix Table 3).

Perioperative Model Performance for Predicting Major Adverse Cardiac Events and Components by Risk Decile in Test Set

Model Performance Comparisons

The perioperative model AUC of 0.88 was higher when compared with RCRI’s AUC of 0.79 (95% CI, 0.74-0.84; P < .001). The number of MACE complications was more concentrated in the top decile of predicted risk of the perioperative model than it was in that of the RCRI model (58 vs 43 of 92 events, respectively; 63% vs 47%; Table 2). Furthermore, there were fewer cases with MACE complications in the low-risk deciles (ie, deciles 1 to 5) of the perioperative model than in the those of the RCRI model. These relative differences were consistent for MACE component outcomes of STEMI/NSTEMI, cardiac arrest, and in-hospital death, as well.

There was substantial heterogeneity in the perioperative model predicted risk of patients classified as either RCRI low risk or high risk (ie, each category included patients with low and high predicted risk) categories (Table 3). Patients in the bottom (low-risk) five deciles of the perioperative model’s predicted risk who were in the RCRI model’s high-risk group were very unlikely to experience MACE complications (3 out of 722 cases; 0.42%). Furthermore, among those classified as low risk by the RCRI model but were in the top decile of the perioperative model’s predicted risk, the MACE complication rate was 3.5% (8 out of 229), which was 6 times the sample mean MACE complication rate.

Comparison of Perioperative Model Results by Risk Factor–Based Recommendations

The perioperative model identified more patients as low risk than did the CCS guidelines’ risk factor–based algorithm (Table 3). For example, 2,341 of the patients the CCS guidelines algorithm identified as high risk were in the bottom 50% of the perioperative model’s predicted risk for experiencing MACE (below a 0.18% chance of a MACE complication); only four of these patients (0.17%) actually experienced MACE. This indicates that the 2,341 of 7,597 (31%) high-risk patients identified as low risk in the perioperative model would have been recommended for postoperative troponin testing by CCS guidelines based on preoperative risk factors alone—but did not go on to experience a MACE. Regression results indicated that both CCS guidelines risk-factor classification and the perioperative model’s predicted risk were predictive of MACE outcomes. A change in the perioperative model’s predicted risk of 10 percentage points was associated with an increase in the probability of a MACE outcomes of 0.45 percentage points (95% CI, 0.35-0.55 percentage points; P < .001) and moving from CCS guidelines’ low- to high-risk categories was associated with an increased probability of MACE by 0.96 percentage points (95% CI, 0.75-1.16 percentage points; P < .001).

Results were consistent with the main analysis across all sensitivity analyses (Appendix Tables 4-7). Parsimonious models with variables as few as eight variables retained strong predictive power (AUC, 0.870; Appendix Figure 1 and Table 8).

DISCUSSION

In this study, the addition of intraoperative data improved risk stratification for MACE complications when compared with standard risk tools such as RCRI. This approach also outperformed a guidelines-based approach and identified additional patients at low risk of cardiovascular complications. This study has three main implications.

First, this study demonstrated the additional value of combining intraoperative data with preoperative data in risk prediction for postoperative cardiovascular events. The intraoperative data most strongly associated with MACE, which likely were responsible for the performance improvement, included administration of medications (eg, sodium bicarbonate or calcium chloride) and blood products (eg, platelets and packed red blood cells), vitals (ie, heart rate), and intraoperative procedures (ie, arterial line placement); all model variables and coefficients are reported in Appendix Table 9. The risk-stratification model using intraoperative clinical data outperformed validated standard models such as RCRI. While this model should not be used in causal inference and cannot be used to inform decisions about risk-benefit tradeoffs of undergoing surgery, its improved performance relative to prior models highlights the potential in using real-time data. Preliminary illustrative analysis demonstrated that parsimonious models with as few as eight variables perform well, whose implementation as risk scores in EHRs is likely straightforward (Appendix Table 8). This is particularly important for longitudinal care in the hospital, in which patients frequently are cared for by multiple clinical services and experience handoffs. For example, many orthopedic surgery patients with significant medical comorbidity are managed postoperatively by hospitalist physicians after initial surgical care.

Second, our study aligns well with the cardiac risk-stratification literature more broadly. For example, the patient characteristics and clinical variables most associated with cardiovascular complications were age, history of ischemic heart disease, American Society of Anesthesiologists physical status, use of intraoperative sodium bicarbonate or vasopressors, lowest intraoperative heart rate measured, and lowest intraoperative mean arterial pressure measured. While many of these variables overlap with those included in the RCRI model, others (such as American Society of Anesthesiologists physical status) are not included in RCRI but have been shown to be important in risk prediction in other studies using different data variables.6,25,26

Third, we illustrated a clinical application of this model in identifying patients at low risk of cardiovascular complications, although benefit may extend to other patients as well. This is particularly germane to clinicians who frequently manage patients in the postsurgical or postprocedural setting. Moreover, the clinical relevance to these clinicians is underscored by the lack of consensus among professional societies across Europe, Canada, and the United States about which subgroups of patients undergoing noncardiac surgery should receive postoperative cardiac biomarker surveillance testing in the 48 to 72 hours after surgery.6-9 This may be caused in part by differences in clinical objectives. For example, the CCS guidelines in part aim to detect myocardial injury after noncardiac surgery (MINS) up to 30 days after surgery, which may be more sensitive to myocardial injury but less strongly associated with outcomes like MACE. The results of this study suggest that adopting such risk factor–based testing would likely lead to additional testing of low-risk patients, which may represent low-value surveillance testing. For example, our data included 2,257 patients without postoperative cardiac biomarker testing who would have been categorized as high risk by risk factor guidelines, and therefore recommended to receive at least one postoperative cardiac biomarker surveillance test, but who were classified as low risk by our perioperative risk-stratification model (predicted probability of MACE less than 0.18%; Appendix Table 4). If each of these patients received one troponin biomarker test, the associated cost increase would be $372,405 (using the $165 cost per test reported at our institution). These costs would multiply if daily surveillance troponin biomarker tests were ordered for 48 to 72 hours after surgery, as recommended by the risk factor–based testing guidelines. This would be a departure from testing based on clinician discretion, which may avoid low-value testing.
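The cost estimate in the preceding paragraph can be reproduced with a short calculation. The sketch below uses only the patient count and per-test cost quoted in the text. The function name and the reading of "daily testing for 48 to 72 hours" as two or three tests per patient are assumptions made for illustration, not figures from the study.

```python
# Figures quoted in the Discussion.
PATIENTS_RECLASSIFIED_LOW_RISK = 2_257  # high risk per guidelines, low risk per the model
COST_PER_TROPONIN_TEST_USD = 165        # per-test cost reported at the study institution


def surveillance_cost(patients: int, cost_per_test: int, tests_per_patient: int = 1) -> int:
    """Total cost in USD of ordering `tests_per_patient` tests for every patient."""
    return patients * cost_per_test * tests_per_patient


single_test_cost = surveillance_cost(
    PATIENTS_RECLASSIFIED_LOW_RISK, COST_PER_TROPONIN_TEST_USD
)
print(f"One test per patient: ${single_test_cost:,}")

# Assumption: daily testing over 48 to 72 hours means 2 or 3 tests per patient.
for n_tests in (2, 3):
    total = surveillance_cost(
        PATIENTS_RECLASSIFIED_LOW_RISK, COST_PER_TROPONIN_TEST_USD, n_tests
    )
    print(f"{n_tests} tests per patient: ${total:,}")
```

Under these assumptions, the single-test figure matches the $372,405 reported above, and the multi-day totals scale linearly with the number of tests ordered.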

Applying the perioperative model developed in this paper to clinical practice still requires several steps. The technical aspects of finding a parsimonious model that can be implemented in the EHR are likely quite straightforward. Our preliminary analysis illustrates that doing so will not require accessing large numbers of intraoperative variables. Perhaps more important steps include prospective validation of the safety, usability, and clinical benefit of such an algorithm-based risk score.27

The study has several limitations. First, it was an observational study using EHR data subject to missingness and data quality issues that may have persisted despite our methods. Furthermore, EHR data are not generated randomly, and unmeasured variables observed by clinicians but not by researchers could confound the results. However, our approach used the statistical model to examine risk, not causal inference. Second, this is a single-institution study, and the availability of EHR data, as well as practice patterns, may vary at other institutions. Furthermore, it is possible that performance of the RCRI score, the model fitting RCRI classification of high vs low risk on the sample data, and our model’s performance may not generalize to other clinical settings. However, we utilized data from multiple hospitals within a health system with different surgery and anesthesia groups and providers, and a similar AUC was reported for RCRI in the original validation study.6 Third, our follow-up period was limited to the hospital setting, and we did not capture longitudinal outcomes, such as 30-day MACE. This may impact the ability to risk stratify for other important longer-term outcomes, limit clinical utility, and hinder comparability to other studies. Fourth, results may vary for other important cardiovascular outcomes that may be more sensitive to myocardial injury, such as MINS. Fifth, we used a limited number of modeling strategies.

CONCLUSION

Addition of intraoperative data to preoperative data improves prediction of cardiovascular complications after noncardiac surgery. Improving the identification of patients at low risk for such complications could potentially be applied to reduce unnecessary postoperative cardiac biomarker testing after noncardiac surgery, but it will require further validation in prospective clinical settings.

Disclosures

Dr Navathe reports grants from the following entities: Hawaii Medical Service Association, Anthem Public Policy Institute, Commonwealth Fund, Oscar Health, Cigna Corporation, Robert Wood Johnson Foundation, Donaghue Foundation, Pennsylvania Department of Health, Ochsner Health System, United Healthcare, Blue Cross Blue Shield of NC, Blue Shield of CA; personal fees from the following: Navvis Healthcare, Agathos, Inc, Navahealth, YNHHSC/CORE, Maine Health Accountable Care Organization, Maine Department of Health and Human Services, National University Health System - Singapore, Ministry of Health - Singapore, Social Security Administration - France, Elsevier Press, Medicare Payment Advisory Commission, Cleveland Clinic, Embedded Healthcare; and other support from Integrated Services, Inc, outside of the submitted work. Dr Volpp reports grants from Humana during the conduct of the study; grants from Hawaii Medical Services Agency, Discovery (South Africa), Merck, Weight Watchers, and CVS outside of the submitted work; he has received consulting income from CVS and VALHealth and is a principal in VALHealth, a behavioral economics consulting firm. Dr Holmes receives funding from the Pennsylvania Department of Health, US Public Health Service, and the Cardiovascular Medicine Research and Education Foundation. All other authors declare no conflicts of interest.

Prior Presentations

2019 Academy Health Annual Research Meeting, Poster Abstract Presentation, June 2 to June 4, 2019, Washington, DC.

Funding

This project was funded, in part, under a grant with the Pennsylvania Department of Health. This research was independent from the funder. The funder had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The department specifically disclaims responsibility for any analyses, interpretations, or conclusions.

References

1. National Center for Health Statistics. National Hospital Discharge Survey: 2010 Table, Number of all-listed procedures for discharges from short-stay hospitals, by procedure category and age: United States, 2010. Centers for Disease Control and Prevention; 2010. Accessed November 11, 2018. https://www.cdc.gov/nchs/data/nhds/4procedures/2010pro4_numberprocedureage.pdf
2. Devereaux PJ, Goldman L, Cook DJ, Gilbert K, Leslie K, Guyatt GH. Perioperative cardiac events in patients undergoing noncardiac surgery: a review of the magnitude of the problem, the pathophysiology of the events and methods to estimate and communicate risk. CMAJ. 2005;173(6):627-634. https://doi.org/10.1503/cmaj.050011
3. Charlson M, Peterson J, Szatrowski TP, MacKenzie R, Gold J. Long-term prognosis after peri-operative cardiac complications. J Clin Epidemiol. 1994;47(12):1389-1400. https://doi.org/10.1016/0895-4356(94)90083-3
4. Devereaux PJ, Sessler DI. Cardiac complications in patients undergoing major noncardiac surgery. N Engl J Med. 2015;373(23):2258-2269. https://doi.org/10.1056/nejmra1502824
5. Sprung J, Warner ME, Contreras MG, et al. Predictors of survival following cardiac arrest in patients undergoing noncardiac surgery: a study of 518,294 patients at a tertiary referral center. Anesthesiology. 2003;99(2):259-269. https://doi.org/10.1097/00000542-200308000-00006
6. Lee TH, Marcantonio ER, Mangione CM, et al. Derivation and prospective validation of a simple index for prediction of cardiac risk of major noncardiac surgery. Circulation. 1999;100(10):1043-1049. https://doi.org/10.1161/01.cir.100.10.1043
7. Duceppe E, Parlow J, MacDonald P, et al. Canadian Cardiovascular Society guidelines on perioperative cardiac risk assessment and management for patients who undergo noncardiac surgery. Can J Cardiol. 2017;33(1):17-32. https://doi.org/10.1016/j.cjca.2016.09.008
8. Fleisher LA, Fleischmann KE, Auerbach AD, et al. 2014 ACC/AHA guideline on perioperative cardiovascular evaluation and management of patients undergoing noncardiac surgery: a report of the American College of Cardiology/American Heart Association Task Force on practice guidelines. J Am Coll Cardiol. 2014;64(22):e77-e137. https://doi.org/10.1016/j.jacc.2014.07.944
9. Kristensen SD, Knuuti J, Saraste A, et al. 2014 ESC/ESA guidelines on non-cardiac surgery: cardiovascular assessment and management: The Joint Task Force on non-cardiac surgery: cardiovascular assessment and management of the European Society of Cardiology (ESC) and the European Society of Anaesthesiology (ESA). Euro Heart J. 2014;35(35):2383-2431. https://doi.org/10.1093/eurheartj/ehu282
10. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55-63. https://doi.org/10.7326/m14-0697
11. Freundlich RE, Kheterpal S. Perioperative effectiveness research using large databases. Best Pract Res Clin Anaesthesiol. 2011;25(4):489-498. https://doi.org/10.1016/j.bpa.2011.08.008
12. CPT® (Current Procedural Terminology). American Medical Association. 2018. Accessed November 11, 2018. https://www.ama-assn.org/practice-management/cpt-current-procedural-terminology
13. Surgery Flag Software for ICD-9-CM. AHRQ Healthcare Cost and Utilization Project; 2017. Accessed November 11, 2018. https://www.hcup-us.ahrq.gov/toolssoftware/surgflags/surgeryflags.jsp
14. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer; 2009. https://www.springer.com/gp/book/9780387848570
15. Bucy R, Hanisko KA, Ewing LA, et al. Abstract 281: Validity of in-hospital cardiac arrest ICD-9-CM codes in veterans. Circ Cardiovasc Qual Outcomes. 2015;8(suppl_2):A281-A281.
16. Institute of Medicine; Board on Health Sciences Policy; Committee on the Treatment of Cardiac Arrest: Current Status and Future Directions. Graham R, McCoy MA, Schultz AM, eds. Strategies to Improve Cardiac Arrest Survival: A Time to Act. The National Academies Press; 2015. https://doi.org/10.17226/21723
17. Pladevall M, Goff DC, Nichaman MZ, et al. An assessment of the validity of ICD Code 410 to identify hospital admissions for myocardial infarction: The Corpus Christi Heart Project. Int J Epidemiol. 1996;25(5):948-952. https://doi.org/10.1093/ije/25.5.948
18. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8-27. https://doi.org/10.1097/00005650-199801000-00004
19. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130-1139. https://doi.org/10.1097/01.mlr.0000182534.19832.83
20. Keats AS. The ASA classification of physical status--a recapitulation. Anesthesiology. 1978;49(4):233-236. https://doi.org/10.1097/00000542-197810000-00001
21. Schwarze ML, Barnato AE, Rathouz PJ, et al. Development of a list of high-risk operations for patients 65 years and older. JAMA Surg. 2015;150(4):325-331. https://doi.org/10.1001/jamasurg.2014.1819
22. VISION Pilot Study Investigators, Devereaux PJ, Bradley D, et al. An international prospective cohort study evaluating major vascular complications among patients undergoing noncardiac surgery: the VISION Pilot Study. Open Med. 2011;5(4):e193-e200.
23. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
24. Norton EC, Dowd BE, Maciejewski ML. Marginal effects-quantifying the effect of changes in risk factors in logistic regression models. JAMA. 2019;321(13):1304‐1305. https://doi.org/10.1001/jama.2019.1954
25. Bilimoria KY, Liu Y, Paruch JL, et al. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. J Am Coll Surg. 2013;217(5):833-842. https://doi.org/10.1016/j.jamcollsurg.2013.07.385
26. Gawande AA, Kwaan MR, Regenbogen SE, Lipsitz SA, Zinner MJ. An Apgar score for surgery. J Am Coll Surg. 2007;204(2):201-208. https://doi.org/10.1016/j.jamcollsurg.2006.11.011
27. Parikh RB, Obermeyer Z, Navathe AS. Regulation of predictive analytics in medicine. Science. 2019;363(6429):810-812. https://doi.org/10.1126/science.aaw0029

Issue
Journal of Hospital Medicine 15(10)
Page Number
581-587. Published Online First September 23, 2020

© 2020 Society of Hospital Medicine

Correspondence Location
Amol S Navathe, MD, PhD; Email: [email protected]; Telephone: 215-573-4047; Twitter: @AmolNavathe.

Overlap between Medicare’s Voluntary Bundled Payment and Accountable Care Organization Programs

Article Type
Changed
Thu, 04/01/2021 - 11:49

Voluntary accountable care organizations (ACOs) and bundled payments have concurrently become cornerstone strategies in Medicare’s shift from volume-based fee-for-service toward value-based payment.

Physician practice and hospital participation in Medicare’s largest ACO model, the Medicare Shared Savings Program (MSSP),1 grew to include 561 organizations in 2018. Under MSSP, participants assume financial accountability for the global quality and costs of care for defined populations of Medicare fee-for-service patients. ACOs that manage to maintain or improve quality while achieving savings (ie, containing costs below a predefined population-wide spending benchmark) are eligible to receive a portion of the difference back from Medicare in the form of “shared savings.”

Similarly, hospital participation in Medicare’s bundled payment programs has grown over time. Most notably, more than 700 participants enrolled in the recently concluded Bundled Payments for Care Improvement (BPCI) initiative,2 Medicare’s largest bundled payment program over the past five years.3 Under BPCI, participants assumed financial accountability for the quality and costs of care for all Medicare patients triggering a qualifying “episode of care.” Participants that limited episode spending below a predefined benchmark without compromising quality were eligible for financial incentives.

As both ACOs and bundled payments grow in prominence and scale, they may increasingly overlap if patients attributed to ACOs receive care at bundled payment hospitals. Overlap could create synergies by increasing incentives to address shared processes (eg, discharge planning) or outcomes (eg, readmissions).4 An ACO focus on reducing hospital admissions could complement bundled payment efforts to increase hospital efficiency.

Conversely, Medicare’s approach to allocating savings and losses can penalize ACOs or bundled payment participants.3 For example, when a patient included in an MSSP ACO population receives episodic care at a hospital participating in BPCI, the historical costs of care for the hospital and the episode type, not the actual costs of care for that specific patient and his/her episode, are counted in the performance of the ACO. In other words, in these cases, the performance of the MSSP ACO is dependent on the historical spending at BPCI hospitals (despite this being outside the ACO’s control and having little to do with the actual care its patients receive at BPCI hospitals), and MSSP ACOs cannot benefit from improvements over time. Therefore, MSSP ACOs may be functionally penalized if patients receive care at historically high-cost BPCI hospitals, regardless of whether those hospitals have considerably improved the value of care delivered. As a corollary, Medicare rules involve a “claw back” stipulation in which savings are recouped from hospitals that participate in both BPCI and MSSP, effectively discouraging participation in both payment models.
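The accounting rule described above can be illustrated with a deliberately simplified sketch. The `Episode` class, the function name, and all dollar amounts are hypothetical and chosen only to show the mechanism. The sketch is not Medicare's actual reconciliation methodology.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    actual_cost: float       # what this patient's episode actually cost
    historical_cost: float   # historical cost for the hospital and episode type
    at_bpci_hospital: bool   # True if the episode occurred at a BPCI hospital


def cost_counted_toward_aco(episode: Episode) -> float:
    """Cost attributed to the ACO under the simplified rule described in the text."""
    if episode.at_bpci_hospital:
        # Historical, not actual, spending is counted for BPCI episodes.
        return episode.historical_cost
    return episode.actual_cost


# Hypothetical example: a historically high-cost BPCI hospital that has since improved.
improved_hospital_episode = Episode(
    actual_cost=18_000.0, historical_cost=25_000.0, at_bpci_hospital=True
)
counted = cost_counted_toward_aco(improved_hospital_episode)
print(f"Counted toward ACO: ${counted:,.0f} (actual cost was "
      f"${improved_hospital_episode.actual_cost:,.0f})")
```

In this hypothetical case the ACO is charged the higher historical amount, so it gains nothing from the hospital's lower actual spending.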

Although these dynamics are complex, they highlight an intuitive point that has gained increasing awareness,5 ie, policymakers must understand the magnitude of overlap to evaluate the urgency in coordinating between the payment models. Our objective was to describe the extent of overlap and the characteristics of patients affected by it.

METHODS

We used 100% institutional Medicare claims, MSSP beneficiary attribution, and BPCI hospital data to identify fee-for-service beneficiaries attributed to MSSP and/or receiving care at BPCI hospitals for its 48 included episodes from the start of BPCI in 2013 quarter 4 through 2016 quarter 4.

We examined the trends in the number of episodes across the following three groups: MSSP-attributed patients hospitalized at BPCI hospitals for an episode included in BPCI (Overlap), MSSP-attributed patients hospitalized for that episode at non-BPCI hospitals (MSSP-only), and non-MSSP-attributed patients hospitalized at BPCI hospitals for a BPCI episode (BPCI-only). We used Medicare and United States Census Bureau data to compare groups with respect to sociodemographic (eg, age, sex, residence in a low-income area),6 clinical (eg, Elixhauser comorbidity index),7 and prior utilization (eg, skilled nursing facility discharge) characteristics.

Categorical and continuous variables were compared using logistic regression and one-way analysis of variance, respectively. Analyses were performed using Stata (StataCorp, College Station, Texas), version 15.0. Statistical tests were 2-tailed and significant at α = 0.05. This study was approved by the institutional review board at the University of Pennsylvania.

RESULTS

The number of MSSP ACOs increased from 220 in 2013 to 432 in 2016. The number of BPCI hospitals increased from 9 to 389 over this period, peaking at 413 hospitals in 2015. Over our study period, a total of 243,392, 2,824,898, and 702,864 episodes occurred in the Overlap, MSSP-only, and BPCI-only groups, respectively (Table). Among episodes, patients in the Overlap group generally showed lower severity than those in other groups, although the differences were small. The BPCI-only, MSSP-only, and Overlap groups also exhibited small differences with respect to other characteristics such as the proportion of patients with Medicare/Medicaid dual-eligibility (15% vs 16% vs 12%, respectively) and prior use of skilled nursing facilities (33% vs 34% vs 31%, respectively) and acute care hospitals (45% vs 41% vs 39%, respectively) (P < .001 for all).

The overall overlap facing MSSP patients (overlap as a proportion of all MSSP patients) increased from 0.3% at the end of 2013 to 10% at the end of 2016, whereas over the same period, overlap facing bundled payment patients (overlap as a proportion of all bundled payment patients) increased from 11.9% to 27% (Appendix Figure). Overlap facing MSSP ACOs varied according to episode type, ranging from 3% for both acute myocardial infarction and chronic obstructive pulmonary disease episodes to 18% for automatic implantable cardiac defibrillator episodes at the end of 2016. Similarly, overlap facing bundled payment patients varied from 21% for spinal fusion episodes to 32% for lower extremity joint replacement and automatic implantable cardiac defibrillator episodes.
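The two overlap measures defined above are simple proportions, sketched below with the episode totals reported in the Results. These totals are pooled across the whole study period and count episodes rather than patients, so the output is only a rough analogue of the end-of-2016 percentages quoted above and is not expected to match them. The function and variable names are illustrative.

```python
# Episode totals over the study period, as reported in the Results.
OVERLAP_EPISODES = 243_392      # MSSP-attributed patients at BPCI hospitals
MSSP_ONLY_EPISODES = 2_824_898  # MSSP-attributed patients at non-BPCI hospitals
BPCI_ONLY_EPISODES = 702_864    # non-MSSP-attributed patients at BPCI hospitals


def overlap_share(overlap: int, program_only: int) -> float:
    """Overlap as a proportion of all episodes in one program."""
    return overlap / (overlap + program_only)


share_of_mssp = overlap_share(OVERLAP_EPISODES, MSSP_ONLY_EPISODES)
share_of_bpci = overlap_share(OVERLAP_EPISODES, BPCI_ONLY_EPISODES)

print(f"Overlap facing MSSP (pooled):            {share_of_mssp:.1%}")
print(f"Overlap facing bundled payment (pooled): {share_of_bpci:.1%}")
```

The same numerator is divided by very different denominators, which is why overlap looms much larger from the bundled payment side than from the ACO side.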

DISCUSSION

To our knowledge, this is the first study to describe the sizable and growing overlap facing ACOs with attributed patients who receive care at bundled payment hospitals, as well as bundled payment hospitals that treat patients attributed to ACOs.

The major implication of our findings is that policymakers must address and anticipate forthcoming payment model overlap as a key policy priority. Given the emphasis on ACOs and bundled payments as payment models—for example, Medicare continues to implement both nationwide via the Next Generation ACO model8 and the recently launched BPCI-Advanced program9—policymakers urgently need insights about the extent of payment model overlap. In that context, it is notable that although we have evaluated MSSP and BPCI as flagship programs, true overlap may actually be greater once other programs are considered.

Several factors may underlie the differences in the magnitude of overlap facing bundled payment versus ACO patients. The models differ in how they identify relevant patient populations, with patients falling under bundled payments via hospitalization for certain episode types but patients falling under ACOs via attribution based on the plurality of primary care services. Furthermore, BPCI participation lagged behind MSSP participation in time, while also occurring disproportionately in areas with existing MSSP ACOs.

Given these findings, understanding the implications of overlap should be a priority for future research and policy strategies. Potential policy considerations should include revising cost accounting processes so that when ACO-attributed patients receive episodic care at bundled payment hospitals, actual rather than historical hospital costs are counted toward ACO cost performance. To encourage hospitals to assume more accountability over outcomes—the ostensible overarching goal of value-based payment reform—Medicare could elect not to recoup savings from hospitals in both payment models. Although such changes require careful accounting to protect Medicare from financial losses as it forgoes some savings achieved through payment reforms, this may be worthwhile if hospital engagement in both models yields synergies.

Importantly, any policy changes made to address program overlap would need to accommodate ongoing changes in ACO, bundled payments, and other payment programs. For example, Medicare overhauled MSSP in December 2018. Compared to the earlier rules, in which ACOs could avoid downside financial risk altogether via “upside only” arrangements for up to six years, new MSSP rules require all participants to assume downside risk after several years of participation. Separately, forthcoming payment reforms such as direct contracting10 may draw clinicians and hospitals previously not participating in either Medicare fee-for-service or value-based payment models into payment reform. These factors may affect overlap in unpredictable ways (eg, they may increase the overlap by increasing the number of patients whose care is covered by different payment models or they may decrease overlap by raising the financial stakes of payment reforms to a degree that organizations drop out altogether).

This study has limitations. First, generalizability is limited by the fact that our analysis did not include bundled payment episodes assigned to physician group participants in BPCI or hospitals in mandatory joint replacement bundles under the Medicare Comprehensive Care for Joint Replacement model.11 Second, although this study provides the first description of overlap between ACO and bundled payment programs, it was descriptive in nature. Future research is needed to evaluate the impact of overlap on clinical, quality, and cost outcomes. This is particularly important because although we observed only small differences in patient characteristics among MSSP-only, BPCI-only, and Overlap groups, characteristics could change differentially over time. Payment reforms must be carefully monitored for potentially unintended consequences that could arise from differential changes in patient characteristics (eg, cherry-picking behavior that is disadvantageous to vulnerable individuals).

Nonetheless, this study underscores the importance and extent of overlap and the urgency to consider policy measures to coordinate between the payment models.

Acknowledgments

The authors thank Sandra Vanderslice for research assistance; she did not receive any compensation for her work. This research was supported in part by The Commonwealth Fund. Rachel Werner was supported in part by K24-AG047908 from the NIA.

References

1. Centers for Medicare and Medicaid Services. Shared Savings Program. https://www.cms.gov/Medicare/Medicare-Fee-For-Service-Payment/sharedsavingsprogram/index.html. Accessed July 22, 2019.
2. Centers for Medicare and Medicaid Services. Bundled Payments for Care Improvement (BPCI) Initiative: General Information. https://innovation.cms.gov/initiatives/bundled-payments/. Accessed July 22, 2019.
3. Mechanic RE. When new Medicare payment systems collide. N Engl J Med. 2016;374(18):1706-1709. https://doi.org/10.1056/NEJMp1601464.
4. Ryan AM, Krinsky S, Adler-Milstein J, Damberg CL, Maurer KA, Hollingsworth JM. Association between hospitals’ engagement in value-based reforms and readmission reduction in the Hospital Readmission Reduction Program. JAMA Intern Med. 2017;177(6):863-868. https://doi.org/10.1001/jamainternmed.2017.0518.
5. Liao JM, Dykstra SE, Werner RM, Navathe AS. BPCI Advanced will further emphasize the need to address overlap between bundled payments and accountable care organizations. https://www.healthaffairs.org/do/10.1377/hblog20180409.159181/full/. Accessed May 14, 2019.
6. Census Bureau. United States Census Bureau. https://www.census.gov/. Accessed May 14, 2018.
7. van Walraven C, Austin PC, Jennings A, Quan H, Forster AJ. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care. 2009;47(6):626-633. https://doi.org/10.1097/MLR.0b013e31819432e5.
8. Centers for Medicare and Medicaid Services. Next Generation ACO Model. https://innovation.cms.gov/initiatives/next-generation-aco-model/. Accessed July 22, 2019.
9. Centers for Medicare and Medicaid Services. BPCI Advanced. https://innovation.cms.gov/initiatives/bpci-advanced. Accessed July 22, 2019.
10. Centers for Medicare and Medicaid Services. Direct Contracting. https://www.cms.gov/newsroom/fact-sheets/direct-contracting. Accessed July 22, 2019.
11. Centers for Medicare and Medicaid Services. Comprehensive Care for Joint Replacement Model. https://innovation.cms.gov/initiatives/CJR. Accessed July 22, 2019.

Author and Disclosure Information

1Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania; 2Department of Medical Ethics and Health Policy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 3Center for Health Incentives and Behavioral Economics, University of Pennsylvania, Philadelphia, Pennsylvania; 4Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania; 5The Wharton School of Business, University of Pennsylvania, Philadelphia, Pennsylvania; 6Division of General Internal Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; 7Department of Medicine, University of Washington School of Medicine, Seattle, Washington; 8Value and Systems Science Lab, University of Washington School of Medicine, Seattle, Washington.

Disclosures

Dr. Navathe reported receiving grants from Hawaii Medical Service Association, Anthem Public Policy Institute, Cigna, Healthcare Research and Education Trust, and Oscar Health; personal fees from Navvis and Company, Navigant Inc., National University Health System of Singapore, and Agathos, Inc.; personal fees and equity from NavaHealth; equity from Embedded Healthcare; speaking fees from the Cleveland Clinic; serving as a board member of Integrated Services Inc. without compensation, and an honorarium from Elsevier Press, none of which are related to this manuscript. Dr. Dinh has nothing to disclose. Ms. Dykstra reports no conflicts. Dr. Werner reports personal fees from CarePort Health. Dr. Liao reports textbook royalties from Wolters Kluwer and personal fees from Kaiser Permanente Washington Research Institute, none of which are related to this manuscript.

Issue
Journal of Hospital Medicine 15(6)
Page Number
356-359. Published online first August 21, 2019
Issue
Journal of Hospital Medicine 15(6)
Page Number
356-359. Published online first August 21, 2019

© 2019 Society of Hospital Medicine


Nationwide Hospital Performance on Publicly Reported Episode Spending Measures

Article Type
Changed
Wed, 03/31/2021 - 15:07

Amid the continued shift from fee-for-service toward value-based payment, policymakers such as the Centers for Medicare & Medicaid Services have initiated strategies to contain spending on episodes of care. This episode focus has led to nationwide implementation of payment models such as bundled payments, which hold hospitals accountable for quality and costs across procedure-based (eg, coronary artery bypass surgery) and condition-based (eg, congestive heart failure) episodes that begin with hospitalization and encompass subsequent hospital and postdischarge care.

Simultaneously, Medicare has increased its emphasis on similarly designed episodes of care (eg, those spanning hospitalization and postdischarge care) using other strategies, such as public reporting and use of episode-based measures to evaluate hospital cost performance. In 2017, Medicare trialed the implementation of six Clinical Episode-Based Payment (CEBP) measures in the national Hospital Inpatient Quality Reporting Program in order to assess hospital and clinician spending on procedure and condition episodes.1,2

CEBP measures reflect episode-specific spending, conveying “how expensive a hospital is” by capturing facility and professional payments for a given episode spanning between 3 days prior to hospitalization and 30 days following discharge. Given standard payment rates used in Medicare, the variation in episode spending reflects differences in quantity and type of services utilized within an episode. Medicare has specified episode-related services and designed CEBP measures via logic and definition rules informed by a combination of claims and procedures-based grouping, as well as by physician input. For example, the CEBP measure for cellulitis encompasses services related to diagnosing and treating the infection within the episode window, but not unrelated services such as eye exams for coexisting glaucoma. To increase clinical salience, CEBP measures are subdivided to reflect differing complexity when possible. For instance, cellulitis measures are divided into episodes with or without major complications or comorbidities and further subdivided into subtypes for episodes reflecting cellulitis in patients with diabetes, patients with decubitus ulcers, or neither.

CEBPs are similar to other spending measures used in payment programs, such as the Medicare Spending Per Beneficiary, but are more clinically relevant because their focus on episodes more closely reflects clinical practice. CEBPs and Medicare Spending Per Beneficiary have similar designs (eg, same episode windows) and purpose (eg, to capture the cost efficiency of hospital care).3 However, unlike CEBPs, Medicare Spending Per Beneficiary is a “global” measure that summarizes a hospital’s cost efficiency aggregated across all inpatient episodes rather than representing it based on specific conditions or procedures.4 The limitations of publicly reported global hospital measures (for instance, the poor correlation between hospital performance on distinct publicly reported quality measures5) highlight the potential utility of episode-specific spending measures such as CEBP.

Compared with episode-based payment models, initiatives such as CEBP measures have gone largely unstudied. However, they represent signals of Medicare’s growing commitment to addressing care episodes, tested without potentially tedious rulemaking required to change payment. In fact, publicly reported episode spending measures offer policymakers several interrelated benefits: the ability to rapidly evaluate performance at a large number of hospitals (eg, Medicare scaling up CEBP measures among all eligible hospitals nationwide), the option of leveraging publicly reported feedback to prompt clinical improvements (eg, by including CEBP measures in the Hospital Inpatient Quality Reporting Program), and the platform for developing and testing promising spending measures for subsequent use in formal payment models (eg, by using CEBP measures that possess large variation or cost-reduction opportunities in future bundled payment programs).

Despite these benefits, little is known about hospital performance on publicly reported episode-specific spending measures. We addressed this gap by providing what is, to our knowledge, the first nationwide description of hospital performance on such measures. We also evaluated which episode components accounted for spending variation in procedural vs condition episodes, examined whether CEBP measures can be used to effectively identify high- vs low-cost hospitals, and compared spending performance on CEBPs vs Medicare Spending Per Beneficiary.

METHODS

Data and Study Sample

We utilized publicly available data from Hospital Compare, which include information about hospital-level CEBP and Medicare Spending Per Beneficiary performance for Medicare-certified acute care hospitals nationwide.5 Our analysis evaluated the six CEBP measures tested by Medicare in 2017: three conditions (cellulitis, kidney/urinary tract infection [UTI], gastrointestinal hemorrhage) and three procedures (spinal fusion, cholecystectomy and common duct exploration, and aortic aneurysm repair). Per Medicare rules, CEBP measures are calculated only for hospitals with requisite volume for targeted conditions (minimum of 40 episodes) and procedures (minimum of 25 episodes) and are reported on Hospital Compare in risk-adjusted (eg, for age, hierarchical condition categories in alignment with existing Medicare methodology) and payment-standardized form (ie, accounts for wage index, medical education, disproportionate share hospital payments). Each CEBP encompasses episodes with or without major complications/comorbidities.

For each hospital, CEBP spending is reported as average total episode spending, as well as average spending on specific components. We grouped components into three groups: hospitalization, skilled nursing facility (SNF) use, and other (encompassing postdischarge readmissions, emergency department visits, and home health agency use), with a focus on SNF given existing evidence from episode-based payment models about the opportunity for savings from reduced SNF care. Hospital Compare also provides information about the national CEBP measure performance (ie, average spending for a given episode type among all eligible hospitals nationwide).

Hospital Groups

To evaluate hospitals’ CEBP performance for specific episode types, we categorized hospitals as either “below average spending” if their average episode spending was below the national average or “above average spending” if spending was above the national average. According to this approach, a hospital could have below average spending for some episodes but above average spending for others.

To compare hospitals across episode types simultaneously, we categorized hospitals as “low cost” if episode spending was below the national average for all applicable measures, “high cost” if episode spending was above the national average for all applicable measures, or “mixed cost” if episode spending was above the national average for some measures and below for others.
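As an illustration of the cross-episode grouping rule described above, the following Python sketch labels a hospital from its measure-level averages. The function name `classify_cost_group`, the measure keys, and all dollar values are hypothetical and are not drawn from Hospital Compare. The handling of a hospital exactly at the national average is an assumption, because the rule as stated does not address ties.

```python
from typing import Dict


def classify_cost_group(hospital_spending: Dict[str, float],
                        national_average: Dict[str, float]) -> str:
    """Label a hospital low, high, or mixed cost across its applicable measures."""
    if not hospital_spending:
        raise ValueError("hospital has no applicable CEBP measures")
    below = [hospital_spending[m] < national_average[m] for m in hospital_spending]
    above = [hospital_spending[m] > national_average[m] for m in hospital_spending]
    if all(below):
        return "low cost"
    if all(above):
        return "high cost"
    # Assumption: a hospital exactly at the national average on any measure lands here.
    return "mixed cost"


# Hypothetical national averages and hospital values (illustrative only).
national = {"cellulitis": 10000.0, "kidney_uti": 10500.0, "spinal_fusion": 37000.0}
hospital_a = {"cellulitis": 9200.0, "kidney_uti": 9900.0}       # below on both
hospital_b = {"cellulitis": 10800.0, "spinal_fusion": 39500.0}  # above on both
hospital_c = {"cellulitis": 9200.0, "kidney_uti": 11200.0}      # one below, one above

groups = {
    name: classify_cost_group(values, national)
    for name, values in [("A", hospital_a), ("B", hospital_b), ("C", hospital_c)]
}
print(groups)
```

Only measures for which a hospital met the volume threshold appear in its dictionary, which mirrors the "all applicable measures" wording of the grouping rule.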

We also conducted sensitivity analyses using alternative hospital group definitions. For comparisons of specific episode types, we categorized hospitals as “high spending” (top quartile of average episode spending among eligible hospitals) or “other spending” (all others). For comparisons across all episode types, we focused on SNF care and categorized hospitals as “high SNF cost” (top quartile of episode spending attributed to SNF care) and “other SNF cost” (all others). We applied a similar approach to Medicare Spending Per Beneficiary, categorizing hospitals as either “low MSPB cost” if their episode spending was below the national average for Medicare Spending Per Beneficiary or “high MSPB cost” if not.
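The top-quartile definition used in these sensitivity analyses can be sketched in Python as follows. This is an illustration only: the function name `flag_top_quartile` and the spending values are hypothetical, and the quartile cut point relies on the default ("exclusive") method of Python's `statistics.quantiles`, which may differ slightly from the percentile definition applied in the original SAS analysis.

```python
import statistics
from typing import List


def flag_top_quartile(values: List[float]) -> List[str]:
    """Label each value 'high spending' if it exceeds the 75th percentile."""
    # quantiles(n=4) returns the three quartile cut points; the last is the 75th percentile.
    _, _, q3 = statistics.quantiles(values, n=4)
    return ["high spending" if v > q3 else "other spending" for v in values]


# Hypothetical average episode spending for eight eligible hospitals.
spending = [9000.0, 9500.0, 9800.0, 10100.0, 10400.0, 10900.0, 11500.0, 12800.0]
labels = flag_top_quartile(spending)
print(list(zip(spending, labels)))
```

The same function could be applied to the share of episode spending attributed to SNF care to produce the "high SNF cost" and "other SNF cost" groups.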

Statistical Analysis

We assessed variation by describing the distribution of total episode spending across eligible hospitals for each individual episode type, as well as the proportion of spending attributed to SNF care across all episode types. We reported the difference between the 10th and 90th percentile for each distribution to quantify variation. To evaluate how individual episode components contributed to overall spending variation, we used linear regression and applied analysis of variance to each episode component. Specifically, we regressed episode spending on each episode component (hospital, SNF, other) separately and used these results to generate predicted episode spending values for each hospital based on its value for each spending component. We then calculated the differences (ie, residuals) between predicted and actual total episode spending values. We plotted residuals for each component, with lower residual plot variation (ie, a flatter curve) representing larger contribution of a spending component to overall spending variation.
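The component-wise residual comparison described above can be approximated with the short Python sketch below. It is a simplified illustration under stated assumptions: the data are hypothetical, the helper names `fit_simple_ols` and `residual_spread` are invented for this example, and the standard deviation of the residuals stands in for the visual spread of a residual plot, which is not necessarily the summary used in the original analysis.

```python
import statistics
from typing import List, Tuple


def fit_simple_ols(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_spread(component: List[float], total: List[float]) -> float:
    """Spread of residuals after regressing total spending on one component."""
    intercept, slope = fit_simple_ols(component, total)
    residuals = [t - (intercept + slope * c) for c, t in zip(component, total)]
    return statistics.pstdev(residuals)


# Hypothetical per-hospital average spending by component (illustrative only).
hospitalization = [30000.0, 30500.0, 29800.0, 30200.0, 30100.0]
snf = [2000.0, 6000.0, 3500.0, 8000.0, 5000.0]
other = [1500.0, 1800.0, 1200.0, 1600.0, 2100.0]
total = [h + s + o for h, s, o in zip(hospitalization, snf, other)]

spreads = {
    "hospitalization": residual_spread(hospitalization, total),
    "snf": residual_spread(snf, total),
    "other": residual_spread(other, total),
}
# The component with the smallest residual spread explains the most variation.
best = min(spreads, key=spreads.get)
print(spreads, best)
```

In this made-up example SNF spending varies far more than the other components, so regressing total spending on SNF leaves the smallest residual spread.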


Pearson correlation coefficients were used to assess within-hospital CEBP correlation (ie, the extent to which performance was hospital specific). We evaluated if and how components of spending varied across hospitals by comparing spending groups (for individual episode types) and cost groups (for all episode types). To test the robustness of these categories, we conducted sensitivity analyses using high spending vs other spending groups (for individual episode types) and high SNF cost vs other SNF cost groups (for all episode types).
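A within-hospital correlation between two episode measures can be sketched in Python as below. The hospital identifiers and spending values are hypothetical, and the restriction to hospitals reporting both measures is an assumption about how pairs were formed, since a correlation requires paired observations.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlation(measure_a: Dict[str, float],
                         measure_b: Dict[str, float]) -> float:
    """Correlate two episode measures over hospitals that report both."""
    shared = sorted(set(measure_a) & set(measure_b))
    return pearson([measure_a[h] for h in shared], [measure_b[h] for h in shared])


# Hypothetical hospital-level average episode spending keyed by hospital ID.
cellulitis = {"H1": 9000.0, "H2": 10000.0, "H3": 11000.0, "H4": 9500.0}
kidney_uti = {"H1": 9500.0, "H2": 10500.0, "H3": 11000.0, "H5": 12000.0}

r = pairwise_correlation(cellulitis, kidney_uti)
print(round(r, 3))
```

Hospitals H4 and H5 are dropped because each reports only one of the two measures.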

To assess concordance between CEBP and Medicare Spending Per Beneficiary, we cross tabulated hospital CEBP performance (high vs low vs mixed cost) and Medicare Spending Per Beneficiary performance (high vs low MSPB cost). This approach allowed us to quantify the number of hospitals that have concordant performance for both types of spending measures (ie, high cost or low cost on both) and the number with discordant performance (eg, high cost on one spending measure but low cost on the other). We used Pearson correlation coefficients to assess correlation between CEBP and Medicare Spending Per Beneficiary, with evaluation of CEBP performance in aggregate form (ie, hospitals’ average CEBP performance across all eligible episode types) and by individual episode types.
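The cross tabulation described above can be illustrated with a minimal Python sketch. The hospital records are hypothetical, and treating every hospital that is not low cost on both measures or high cost on both measures (including mixed-cost hospitals) as discordant is an assumption made for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple


def concordance_table(hospitals: List[Dict[str, str]]) -> Tuple[Counter, Dict[str, float]]:
    """Cross tabulate CEBP and MSPB groups and summarize concordance shares."""
    table = Counter((h["cebp"], h["mspb"]) for h in hospitals)
    n = len(hospitals)
    both_low = table[("low cost", "low MSPB cost")]
    both_high = table[("high cost", "high MSPB cost")]
    summary = {
        "both low": both_low / n,
        "both high": both_high / n,
        # Assumption: every remaining hospital counts as discordant.
        "discordant": (n - both_low - both_high) / n,
    }
    return table, summary


# Hypothetical hospital-level group assignments (illustrative only).
hospitals = [
    {"cebp": "low cost", "mspb": "low MSPB cost"},
    {"cebp": "high cost", "mspb": "high MSPB cost"},
    {"cebp": "mixed cost", "mspb": "low MSPB cost"},
    {"cebp": "high cost", "mspb": "low MSPB cost"},
]

table, summary = concordance_table(hospitals)
print(dict(table))
print(summary)
```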

Chi-square and Kruskal-Wallis tests were used to compare categorical and continuous variables, respectively. To compare spending amounts, we evaluated the distribution of total episode spending (Appendix Figure 1) and used ordinary least squares regression with spending as the dependent variable and hospital group, episode components, and their interaction as independent variables. Because CEBP dollar amounts are reported through Hospital Compare on a risk-adjusted and payment-standardized basis, no additional adjustments were applied. Analyses were performed using SAS version 9.4 (SAS Institute; Cary, NC) and all tests of significance were two-tailed at alpha=0.05.

RESULTS

Of 3,129 hospitals, 1,778 achieved minimum thresholds and had CEBPs calculated for at least one of the six CEBP episode types.

Variation in CEBP Performance

For each episode type, spending varied across eligible hospitals (Appendix Figure 2). In particular, the differences between the 10th and 90th percentile values for cellulitis, kidney/UTI, and gastrointestinal hemorrhage were $2,873, $3,514, and $2,982, respectively. Differences were greater for procedural episodes of aortic aneurysm ($17,860), spinal fusion ($11,893), and cholecystectomy ($3,689). Evaluated across all episode types, the proportion of episode spending attributed to SNF care also varied across hospitals (Appendix Figure 3), with a difference of 24.7 percentage points between the 10th (4.5%) and 90th (29.2%) percentile values.

Residual plots demonstrated differences in which episode components accounted for variation in overall spending. For aortic aneurysm episodes, variation in the SNF episode component best explained variation in episode spending and thus had the lowest residual plot variation, followed by other and hospital components (Figure). Similar patterns were observed for spinal fusion and cholecystectomy episodes. In contrast, for cellulitis episodes, all three components had comparable residual-plot variation, which indicates that the variation in the components explained episode spending variation similarly (Figure)—a pattern reflected in kidney/UTI and gastrointestinal hemorrhage episodes.

Residual Plots for Episode Components

Correlation in Performance on CEBP Measures

Across hospitals in our sample, within-hospital correlations were generally low (Appendix Table 1). In particular, correlations ranged from –0.079 (between performance on aortic aneurysm and kidney/UTI episodes) to 0.42 (between performance on kidney/UTI and cellulitis episodes), with a median correlation coefficient of 0.13. Within-hospital correlations ranged from 0.037 to 0.28 when considered between procedural episodes and from 0.33 to 0.42 when considered between condition episodes. When assessed among the subset of 1,294 hospitals eligible for at least two CEBP measures, correlations were very similar (ranging from –0.080 to 0.42). Additional analyses among hospitals with more CEBPs (eg, all six measures) yielded correlations that were similar in magnitude.

CEBP Performance by Hospital Groups

Overall spending on specific episode types varied across hospital groups (Table). Spending for aortic aneurysm episodes was $42,633 at hospitals with above average spending and $37,730 at those with below average spending, while spending for spinal fusion episodes was $39,231 at those with above average spending and $34,832 at those with below average spending. In comparison, spending at hospitals deemed above and below average spending for cellulitis episodes was $10,763 and $9,064, respectively, and $11,223 and $9,161 at hospitals deemed above and below average spending for kidney/UTI episodes, respectively.

Episode Spending by Components

Spending on specific episode components also differed by hospital group (Table). Though the magnitude of absolute spending amounts and differences varied by specific episode, hospitals with above average spending tended to spend more on SNF than did those with below average spending. For example, hospitals with above average spending for cellulitis episodes spent an average of $2,564 on SNF (24% of overall episode spending) vs $1,293 (14% of episode spending) among those with below average spending. Similarly, hospitals with above and below average spending for kidney/UTI episodes spent $4,068 (36% of episode spending) and $2,232 (24% of episode spending) on SNF, respectively (P < .001 for both episode types). Findings were qualitatively similar in sensitivity analyses (Appendix Table 2).

Among hospitals in our sample, we categorized 481 as high cost (27%), 452 as low cost (25%), and 845 as mixed cost (48%), with hospital groups distributed broadly nationwide (Appendix Figure 4). Evaluated on performance across all six episode types, hospital groups also demonstrated differences in spending by cost components (Table). In particular, spending in SNF ranged from 18.1% of overall episode spending among high-cost hospitals to 10.7% among mixed-cost hospitals and 9.2% among low-cost hospitals. Additionally, spending on hospitalization accounted for 83.3% of overall episode spending among low-cost hospitals, compared with 81.2% and 73.4% among mixed-cost and high-cost hospitals, respectively (P < .001). Comparisons were qualitatively similar in sensitivity analyses (Appendix Table 3).

Comparison of CEBP and Medicare Spending Per Beneficiary Performance

Correlation between Medicare Spending Per Beneficiary and aggregated CEBPs was 0.42 and, for individual episode types, ranged between 0.14 and 0.36 (Appendix Table 2). There was low concordance between hospital performance on CEBP and Medicare Spending Per Beneficiary. Across all eligible hospitals, only 16.3% (290/1778) had positive concordance between performance on the two measure types (ie, low cost for both), while 16.5% (293/1778) had negative concordance (ie, high cost for both). There was discordant performance in most instances (67.2%; 1195/1778), reflecting favorable performance on one measure type but not the other.

DISCUSSION

To our knowledge, this study is the first to describe hospitals’ episode-specific spending performance nationwide. It demonstrated significant variation across hospitals driven by different episode components for different episode types. It also showed low correlation between individual episode spending measures and poor concordance between episode-specific and global hospital spending measures. Two practice and policy implications are noteworthy.

First, our findings corroborate and build upon evidence from bundled payment programs about the opportunity for hospitals to improve their cost efficiency. Findings from bundled payment evaluations of surgical episodes suggest that the major area for cost savings is in the reduction of institutional post-acute care use such as that of SNFs.7-9 We demonstrated similar opportunity in a national sample of hospitals, finding that, for the three evaluated procedural CEBPs, SNF care accounted for more variation in overall episode spending than did other components. While variation may imply opportunity for greater efficiency and standardization, it is important to note that variation itself is not inherently problematic. Additional studies are needed to distinguish between warranted and unwarranted variation in procedural episodes, as well as identify strategies for reducing the latter.

Though bundled payment evaluations have predominantly emphasized procedural episodes, existing evidence suggests that participation in medical condition bundles has not been associated with cost savings or utilization changes.7-15 Findings from our analysis of variance, namely that there appear to be smaller variation-reduction opportunities for condition episodes than for procedural episodes, offer insight into this issue. Existing episodes are initiated by hospitalization and extend into the post-acute period, a design that may not afford substantial post-acute care savings opportunities for condition episodes. This is an important insight as policymakers consider how to best design condition-based episodes in the future (eg, whether to use non-hospital-based episode triggers). Future work should evaluate whether our findings reflect inherent differences between condition and procedural episodes16 or whether interventions can still optimize SNF care for these episodes despite smaller variation.

Second, our results highlight the potential limitations of global performance measures such as Medicare Spending Per Beneficiary. As a general measure of hospital spending, Medicare Spending Per Beneficiary is based on the premise that hospitals can be categorized as high or low cost with consideration of all inpatient episodic care. However, our analyses suggest that hospitals may be high cost for certain episodes and low cost for others, a fact highlighted by the low correlation and high discordance observed between hospital CEBP and Medicare Spending Per Beneficiary performance. Because overarching measures may miss spending differences related to underlying clinical scenarios, episode-specific spending measures would provide an important perspective that complements global measures for assessing hospital cost performance, particularly in an era of value-based payments. Policymakers should consider prioritizing the development and implementation of such measures.

Our study has limitations. First, it is descriptive in nature, and future work should evaluate the association between episode-specific spending measure performance and clinical and quality outcomes. Second, we evaluated all CEBP-eligible hospitals nationwide to provide a broad view of episode-specific spending. However, future studies should assess performance among hospital subtypes, such as vertically integrated or safety-net organizations, because they may be more or less able to perform on these spending measures. Third, though findings may not be generalizable to other clinical episodes, our results were qualitatively consistent across episode types and broadly consistent with evidence from episode-based payment models. Fourth, we analyzed cost from the perspective of utilization and did not incorporate price considerations, which may be more relevant for commercial insurers than for Medicare.

Nonetheless, the emergence of CEBPs reflects the ongoing shift in policymaker attention toward episode-specific spending. In particular, though further scale or use of CEBP measures has been put on hold amid other payment reform changes, their nationwide implementation in 2017 signals Medicare’s broad interest in evaluating all hospitals on episode-specific spending efficiency, in addition to other facets of spending, quality, safety, and patient experience. Importantly, such efforts complement other ongoing nationwide initiatives for emphasizing episode spending, such as use of episode-based cost measures within the Merit-Based Incentive Payment System17 to score clinicians and groups in part based on their episode-specific spending efficiency. Insight about episode spending performance could help hospitals prepare for environments with increasing focus on episode spending as policymakers incorporate this perspective into quality and value-based payment policies.

References

1. Centers for Medicare & Medicaid Services. Fiscal Year 2019 Clinical Episode-Based Payment Measures Overview. https://www.qualityreportingcenter.com/globalassets/migrated-pdf/cepb_slides_npc-6.17.2018_5.22.18_vfinal508.pdf. Accessed November 26, 2019.
2. Centers for Medicare & Medicaid Services. Hospital Inpatient Quality Reporting Program. https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/HospitalRHQDAPU.html. Accessed November 23, 2019.
3. Centers for Medicare & Medicaid Services. Medicare Spending Per Beneficiary (MSPB) Spending Breakdown by Claim Type. https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/hospital-value-based-purchasing/Downloads/Fact-Sheet-MSPB-Spending-Breakdowns-by-Claim-Type-Dec-2014.pdf. Accessed November 25, 2019.
4. Hu J, Jordan J, Rubinfeld I, Schreiber M, Waterman B, Nerenz D. Correlations among hospital quality measures: What “Hospital Compare” data tell us. Am J Med Qual. 2017;32(6):605-610. https://doi.org/10.1177/1062860616684012.
5. Centers for Medicare & Medicaid Services. Hospital Compare datasets. https://data.medicare.gov/data/hospital-compare. Accessed November 26, 2019.
6. American Hospital Association. AHA Data Products. https://www.aha.org/data-insights/aha-data-products. Accessed November 25, 2019.
7. Dummit LA, Kahvecioglu D, Marrufo G, et al. Bundled payment initiative and payments and quality outcomes for lower extremity joint replacement episodes. JAMA. 2016;316(12):1267-1278. https://doi.org/10.1001/jama.2016.12717.
8. Finkelstein A, Ji Y, Mahoney N, Skinner J. Mandatory Medicare bundled payment program for lower extremity joint replacement and discharge to institutional postacute care: Interim analysis of the first year of a 5-year randomized trial. JAMA. 2018;320(9):892-900. https://doi.org/10.1001/jama.2018.12346.
9. Navathe AS, Troxel AB, Liao JM, et al. Cost of joint replacement using bundled payment models. JAMA Intern Med. 2017;177(2):214-222. https://doi.org/10.1001/jamainternmed.2016.8263.
10. Liao JM, Emanuel EJ, Polsky DE, et al. National representativeness of hospitals and markets in Medicare’s mandatory bundled payment program. Health Aff. 2019;38(1):44-53.
11. Barnett ML, Wilcock A, McWilliams JM, et al. Two-year evaluation of mandatory bundled payments for joint replacement. N Engl J Med. 2019;380(3):252-262. https://doi.org/10.1056/NEJMsa1809010.
12. Navathe AS, Liao JM, Polsky D, et al. Comparison of hospitals participating in Medicare’s voluntary and mandatory orthopedic bundle programs. Health Aff. 2018;37(6):854-863. https://www.doi.org/10.1377/hlthaff.2017.1358.
13. Joynt Maddox KE, Orav EJ, Zheng J, Epstein AM. Participation and Dropout in the Bundled Payments for Care Improvement Initiative. JAMA. 2018;319(2):191-193. https://doi.org/10.1001/jama.2017.14771.
14. Navathe AS, Liao JM, Dykstra SE, et al. Association of hospital participation in a Medicare bundled payment program with volume and case mix of lower extremity joint replacement episodes. JAMA. 2018;320(9):901-910. https://doi.org/10.1001/jama.2018.12345.
15. Joynt Maddox KE, Orav EJ, Epstein AM. Medicare’s bundled payments initiative for medical conditions. N Engl J Med. 2018;379(18):e33. https://doi.org/10.1056/NEJMc1811049.
16. Navathe AS, Shan E, Liao JM. What have we learned about bundling medical conditions? Health Affairs Blog. https://www.healthaffairs.org/do/10.1377/hblog20180828.844613/full/. Accessed November 25, 2019.
17. Centers for Medicare & Medicaid Services. MACRA. https://www.cms.gov/medicare/quality-initiatives-patient-assessment-instruments/value-based-programs/macra-mips-and-apms/macra-mips-and-apms.html. Accessed November 26, 2019.

Author and Disclosure Information

1Department of Medicine, University of Washington School of Medicine, Seattle, Washington; 2Value & Systems Science Lab, University of Washington School of Medicine, Seattle, Washington; 3Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania; 4Corporal Michael J. Crescenz VA Medical Center, Philadelphia, Pennsylvania; 5Department of Medical Ethics and Health Policy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.

Disclosures

Dr. Liao reports textbook royalties from Wolters Kluwer and personal fees from Kaiser Permanente Washington Research Institute, none of which are related to this manuscript. Dr. Zhou has nothing to disclose. Dr. Navathe reported receiving grants from Hawaii Medical Service Association, Anthem Public Policy Institute, Healthcare Research and Education Trust, Cigna, and Oscar Health; personal fees from Navvis Healthcare, and Agathos, Inc.; personal fees and equity from NavaHealth; equity from Embedded Healthcare; speaking fees from the Cleveland Clinic; personal fees from the Medicare Payment Advisory Commission; and an honorarium from Elsevier Press, as well as serving as a board member of Integrated Services Inc. without compensation, none of which are related to this manuscript.

Issue
Journal of Hospital Medicine 16(4)
Page Number
204-210. Published Online First March 18, 2020

Amid the continued shift from fee-for-service toward value-based payment, policymakers such as the Centers for Medicare & Medicaid Services have initiated strategies to contain spending on episodes of care. This episode focus has led to nationwide implementation of payment models such as bundled payments, which hold hospitals accountable for quality and costs across procedure-­based (eg, coronary artery bypass surgery) and condition-­based (eg, congestive heart failure) episodes, which begin with hospitalization and encompass subsequent hospital and postdischarge care.

Simultaneously, Medicare has increased its emphasis on similarly designed episodes of care (eg, those spanning hospitalization and postdischarge care) using other strategies, such as public reporting and use of episode-based measures to evaluate hospital cost performance. In 2017, Medicare trialed the implementation of six Clinical Episode-Based Payment (CEBP) measures in the national Hospital Inpatient Quality Reporting Program in order to assess hospital and clinician spending on procedure and condition episodes.1,2

CEBP measures reflect episode-specific spending, conveying “how expensive a hospital is” by capturing facility and professional payments for a given episode spanning between 3 days prior to hospitalization and 30 days following discharge. Given standard payment rates used in Medicare, the variation in episode spending reflects differences in quantity and type of services utilized within an episode. Medicare has specified episode-related services and designed CEBP measures via logic and definition rules informed by a combination of claims and procedures-based grouping, as well as by physician input. For example, the CEBP measure for cellulitis encompasses services related to diagnosing and treating the infection within the episode window, but not unrelated services such as eye exams for coexisting glaucoma. To increase clinical salience, CEBP measures are subdivided to reflect differing complexity when possible. For instance, cellulitis measures are divided into episodes with or without major complications or comorbidities and further subdivided into subtypes for episodes reflecting cellulitis in patients with diabetes, patients with decubitus ulcers, or neither.

CEBPs are similar to other spending measures used in payment programs, such as the Medicare Spending Per Beneficiary, but are more clinically relevant because their focus on episodes more closely reflects clinical practice. CEBPs and Medicare Spending Per Beneficiary have similar designs (eg, same episode windows) and purpose (eg, to capture the cost efficiency of hospital care).3 However, unlike CEBPs, Medicare Spending Per Beneficiary is a “global” measure that summarizes a hospital’s cost efficiency aggregated across all inpatient episodes rather than representing it based on specific conditions or procedures.4 The limitations of publicly reported global hospital measures (for instance, the poor correlation between hospital performance on distinct publicly reported quality measures5) highlight the potential utility of episode-specific spending measures such as CEBP.

Compared with episode-based payment models, initiatives such as CEBP measures have gone largely unstudied. However, they represent signals of Medicare’s growing commitment to addressing care episodes, tested without potentially tedious rulemaking required to change payment. In fact, publicly reported episode spending measures offer policymakers several interrelated benefits: the ability to rapidly evaluate performance at a large number of hospitals (eg, Medicare scaling up CEBP measures among all eligible hospitals nationwide), the option of leveraging publicly reported feedback to prompt clinical improvements (eg, by including CEBP measures in the Hospital Inpatient Quality Reporting Program), and the platform for developing and testing promising spending measures for subsequent use in formal payment models (eg, by using CEBP measures that possess large variation or cost-reduction opportunities in future bundled payment programs).

Despite these benefits, little is known about hospital performance on publicly reported episode-specific spending measures. We addressed this knowledge gap by providing what is, to our knowledge, the first nationwide description of hospital performance on such measures. We also evaluated which episode components accounted for spending variation in procedural vs condition episodes, examined whether CEBP measures can be used to effectively identify high- vs low-cost hospitals, and compared spending performance on CEBPs vs Medicare Spending Per Beneficiary.

METHODS

Data and Study Sample

We utilized publicly available data from Hospital Compare, which include information about hospital-level CEBP and Medicare Spending Per Beneficiary performance for Medicare-certified acute care hospitals nationwide.5 Our analysis evaluated the six CEBP measures tested by Medicare in 2017: three conditions (cellulitis, kidney/urinary tract infection [UTI], gastrointestinal hemorrhage) and three procedures (spinal fusion, cholecystectomy and common duct exploration, and aortic aneurysm repair). Per Medicare rules, CEBP measures are calculated only for hospitals with requisite volume for targeted conditions (minimum of 40 episodes) and procedures (minimum of 25 episodes) and are reported on Hospital Compare in risk-adjusted (eg, for age, hierarchical condition categories in alignment with existing Medicare methodology) and payment-standardized form (ie, accounts for wage index, medical education, disproportionate share hospital payments). Each CEBP encompasses episodes with or without major complications/comorbidities.

For each hospital, CEBP spending is reported as average total episode spending, as well as average spending on specific components. We grouped components into three groups: hospitalization, skilled nursing facility (SNF) use, and other (encompassing postdischarge readmissions, emergency department visits, and home health agency use), with a focus on SNF given existing evidence from episode-based payment models about the opportunity for savings from reduced SNF care. Hospital Compare also provides information about the national CEBP measure performance (ie, average spending for a given episode type among all eligible hospitals nationwide).

Hospital Groups

To evaluate hospitals’ CEBP performance for specific episode types, we categorized hospitals as either “below average spending” if their average episode spending was below the national average or “above average spending” if spending was above the national average. According to this approach, a hospital could have below average spending for some episodes but above average spending for others.

To compare hospitals across episode types simultaneously, we categorized hospitals as “low cost” if episode spending was below the national average for all applicable measures, “high cost” if episode spending was above the national average for all applicable measures, or “mixed cost” if episode spending was above the national average for some measures and below for others.
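
The cross-episode grouping described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's SAS code: the function name, the dictionary layout, and the dollar amounts are hypothetical, and the handling of a hospital whose spending exactly equals the national average (treated here as mixed cost) is an assumption, since the text does not address ties.

```python
from typing import Mapping


def classify_hospital(spending: Mapping[str, float],
                      national_avg: Mapping[str, float]) -> str:
    """Return 'low cost', 'high cost', or 'mixed cost' for one hospital.

    `spending` maps episode type -> the hospital's average episode spending
    and contains only the measures the hospital is eligible for.
    `national_avg` maps episode type -> national average episode spending.
    """
    if not spending:
        raise ValueError("hospital has no applicable CEBP measures")
    below = [spending[e] < national_avg[e] for e in spending]
    above = [spending[e] > national_avg[e] for e in spending]
    if all(below):
        return "low cost"    # below the national average on every measure
    if all(above):
        return "high cost"   # above the national average on every measure
    # Assumption: ties with the national average also land here.
    return "mixed cost"


# Hypothetical dollar amounts, for illustration only (not study data).
example_national_avg = {"cellulitis": 10000.0, "spinal_fusion": 37000.0}
example_hospital = {"cellulitis": 9200.0, "spinal_fusion": 39500.0}
print(classify_hospital(example_hospital, example_national_avg))  # mixed cost
```

Because the classification uses only the measures a hospital is eligible for, a hospital with a single CEBP measure can never be mixed cost under this rule unless it ties the national average.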

We also conducted sensitivity analyses using alternative hospital group definitions. For comparisons of specific episode types, we categorized hospitals as “high spending” (top quartile of average episode spending among eligible hospitals) or “other spending” (all others). For comparisons across all episode types, we focused on SNF care and categorized hospitals as “high SNF cost” (top quartile of episode spending attributed to SNF care) and “other SNF cost” (all others). We applied a similar approach to Medicare Spending Per Beneficiary, categorizing hospitals as either “low MSPB cost” if their episode spending was below the national average for Medicare Spending Per Beneficiary or “high MSPB cost” if not.

Statistical Analysis

We assessed variation by describing the distribution of total episode spending across eligible hospitals for each individual episode type, as well as the proportion of spending attributed to SNF care across all episode types. We reported the difference between the 10th and 90th percentile for each distribution to quantify variation. To evaluate how individual episode components contributed to overall spending variation, we used linear regression and applied analysis of variance to each episode component. Specifically, we regressed episode spending on each episode component (hospital, SNF, other) separately and used these results to generate predicted episode spending values for each hospital based on its value for each spending component. We then calculated the differences (ie, residuals) between predicted and actual total episode spending values. We plotted residuals for each component, with lower residual plot variation (ie, a flatter curve) representing larger contribution of a spending component to overall spending variation.
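
As a rough illustration of the two calculations in this paragraph, the Python sketch below computes a 10th-to-90th percentile spread and the residuals from regressing total episode spending on a single component. It is not the SAS code used in the study. The function names and the sample values are hypothetical, the percentile interpolation method is an assumption, and the residual is taken as actual minus predicted (the text does not specify a sign convention, and the spread of the residuals is the same either way).

```python
from statistics import quantiles


def p10_p90_spread(values):
    """Difference between the 90th and 10th percentile of `values`."""
    deciles = quantiles(values, n=10, method="inclusive")  # 9 cut points
    return deciles[-1] - deciles[0]


def component_residuals(component, total):
    """Regress total episode spending on one component (simple OLS)
    and return actual-minus-predicted residuals, one per hospital."""
    n = len(total)
    mean_x = sum(component) / n
    mean_y = sum(total) / n
    sxx = sum((x - mean_x) ** 2 for x in component)
    if sxx == 0:
        raise ValueError("component spending does not vary across hospitals")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(component, total))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(component, total)]


# Hypothetical per-hospital averages in dollars (not study data).
snf = [1000.0, 2500.0, 4000.0, 5500.0, 7000.0]
total = [21000.0, 22400.0, 24100.0, 25500.0, 27000.0]
resid = component_residuals(snf, total)
# A narrower residual range means the component tracks total spending closely.
print(max(resid) - min(resid))
```

Repeating `component_residuals` for the hospitalization, SNF, and other components and comparing the residual ranges mirrors the comparison of residual plot variation described above.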

Pearson correlation coefficients were used to assess within-hospital CEBP correlation (ie, the extent to which performance was hospital specific). We evaluated if and how components of spending varied across hospitals by comparing spending groups (for individual episode types) and cost groups (for all episode types). To test the robustness of these categories, we conducted sensitivity analyses using high spending vs other spending groups (for individual episode types) and high SNF cost vs other SNF cost groups (for all episode types).

To assess concordance between CEBP and Medicare Spending Per Beneficiary, we cross-tabulated hospital CEBP performance (high vs low vs mixed cost) and Medicare Spending Per Beneficiary performance (high vs low MSPB cost). This approach allowed us to quantify the number of hospitals that have concordant performance for both types of spending measures (ie, high cost or low cost on both) and the number with discordant performance (eg, high cost on one spending measure but low cost on the other). We used Pearson correlation coefficients to assess correlation between CEBP and Medicare Spending Per Beneficiary, with evaluation of CEBP performance in aggregate form (ie, hospitals’ average CEBP performance across all eligible episode types) and by individual episode types.
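
The cross-tabulation can be sketched in Python as follows. This is an illustrative outline, not the study's SAS code: the function names, the group labels, and the sample pairs are hypothetical. The sketch assumes that every hospital that is neither low cost on both measures nor high cost on both is counted as discordant, which matches the reported concordant and discordant counts summing to all eligible hospitals.

```python
from collections import Counter


def cross_tabulate(pairs):
    """Count hospitals in each (CEBP group, MSPB group) cell.

    `pairs` is an iterable of (cebp_group, mspb_group) tuples, where
    cebp_group is 'low', 'high', or 'mixed' and mspb_group is 'low' or 'high'.
    """
    return Counter(pairs)


def concordance_shares(table):
    """Share of hospitals that are low on both, high on both, or discordant."""
    total = sum(table.values())
    if total == 0:
        raise ValueError("no hospitals to summarize")
    both_low = table.get(("low", "low"), 0)
    both_high = table.get(("high", "high"), 0)
    # Assumption: all remaining cells, including mixed-cost hospitals,
    # are treated as discordant.
    discordant = total - both_low - both_high
    return {
        "low on both": both_low / total,
        "high on both": both_high / total,
        "discordant": discordant / total,
    }


# Hypothetical hospitals, for illustration only (not study data).
example_pairs = [("low", "low"), ("high", "high"), ("mixed", "low"),
                 ("low", "high"), ("mixed", "high")]
print(concordance_shares(cross_tabulate(example_pairs)))
```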

Chi-square and Kruskal-Wallis tests were used to compare categorical and continuous variables, respectively. To compare spending amounts, we evaluated the distribution of total episode spending (Appendix Figure 1) and used ordinary least squares regression with spending as the dependent variable and hospital group, episode components, and their interaction as independent variables. Because CEBP dollar amounts are reported through Hospital Compare on a risk-adjusted and payment-standardized basis, no additional adjustments were applied. Analyses were performed using SAS version 9.4 (SAS Institute; Cary, NC) and all tests of significance were two-tailed at alpha=0.05.

RESULTS

Of 3,129 hospitals, 1,778 achieved minimum thresholds and had CEBPs calculated for at least one of the six CEBP episode types.

Variation in CEBP Performance

For each episode type, spending varied across eligible hospitals (Appendix Figure 2). In particular, the differences between the 10th and 90th percentile values for cellulitis, kidney/UTI, and gastrointestinal hemorrhage were $2,873, $3,514, and $2,982, respectively. Differences were greater for procedural episodes of aortic aneurysm ($17,860), spinal fusion ($11,893), and cholecystectomy ($3,689). Evaluated across all episode types, the proportion of episode spending attributed to SNF care also varied across hospitals (Appendix Figure 3), with a difference of 24.7% between the 10th (4.5%) and 90th (29.2%) percentile values.

Residual plots demonstrated differences in which episode components accounted for variation in overall spending. For aortic aneurysm episodes, variation in the SNF episode component best explained variation in episode spending and thus had the lowest residual plot variation, followed by other and hospital components (Figure). Similar patterns were observed for spinal fusion and cholecystectomy episodes. In contrast, for cellulitis episodes, all three components had comparable residual-plot variation, which indicates that the variation in the components explained episode spending variation similarly (Figure)—a pattern reflected in kidney/UTI and gastrointestinal hemorrhage episodes.

Figure. Residual Plots for Episode Components

Correlation in Performance on CEBP Measures

Across hospitals in our sample, within-hospital correlations were generally low (Appendix Table 1). In particular, correlations ranged from –0.079 (between performance on aortic aneurysm and kidney/UTI episodes) to 0.42 (between performance on kidney/UTI and cellulitis episodes), with a median correlation coefficient of 0.13. Within-hospital correlations ranged from 0.037 to 0.28 when considered between procedural episodes and from 0.33 to 0.42 when considered between condition episodes. When assessed among the subset of 1,294 hospitals eligible for at least two CEBP measures, correlations were very similar (ranging from –0.080 to 0.42). Additional analyses among hospitals with more CEBPs (eg, all six measures) yielded correlations that were similar in magnitude.

CEBP Performance by Hospital Groups

Overall spending on specific episode types varied across hospital groups (Table). Spending for aortic aneurysm episodes was $42,633 at hospitals with above average spending and $37,730 at those with below average spending, while spending for spinal fusion episodes was $39,231 at those with above average spending and $34,832 at those with below average spending. In comparison, spending at hospitals deemed above and below average spending for cellulitis episodes was $10,763 and $9,064, respectively, and $11,223 and $9,161 at hospitals deemed above and below average spending for kidney/UTI episodes, respectively.

Episode Spending by Components

Spending on specific episode components also differed by hospital group (Table). Though the magnitude of absolute spending amounts and differences varied by specific episode, hospitals with above average spending tended to spend more on SNF than did those with below average spending. For example, hospitals with above average spending for cellulitis episodes spent an average of $2,564 on SNF (24% of overall episode spending) vs $1,293 (14% of episode spending) among those with below average spending. Similarly, hospitals with above and below average spending for kidney/UTI episodes spent $4,068 (36% of episode spending) and $2,232 (24% of episode spending) on SNF, respectively (P < .001 for both episode types). Findings were qualitatively similar in sensitivity analyses (Appendix Table 2).

Among hospitals in our sample, we categorized 481 as high cost (27%), 452 as low cost (25%), and 845 as mixed cost (48%), with hospital groups distributed broadly nationwide (Appendix Figure 4). Evaluated on performance across all six episode types, hospital groups also demonstrated differences in spending by cost components (Table). In particular, spending in SNF ranged from 18.1% of overall episode spending among high-cost hospitals to 10.7% among mixed-cost hospitals and 9.2% among low-cost hospitals. Additionally, spending on hospitalization accounted for 83.3% of overall episode spending among low-cost hospitals, compared with 81.2% and 73.4% among mixed-cost and high-cost hospitals, respectively (P < .001). Comparisons were qualitatively similar in sensitivity analyses (Appendix Table 3).

Comparison of CEBP and Medicare Spending Per Beneficiary Performance

Correlation between Medicare Spending Per Beneficiary and aggregated CEBPs was 0.42 and, for individual episode types, ranged between 0.14 and 0.36 (Appendix Table 2). There was low concordance between hospital performance on CEBP and Medicare Spending Per Beneficiary. Across all eligible hospitals, only 16.3% (290/1778) had positive concordance between performance on the two measure types (ie, low cost for both), while 16.5% (293/1778) had negative concordance (ie, high cost for both). There was discordant performance in most instances (67.2%; 1195/1778), reflecting favorable performance on one measure type but not the other.

DISCUSSION

To our knowledge, this study is the first to describe hospitals’ episode-specific spending performance nationwide. It demonstrated significant variation across hospitals driven by different episode components for different episode types. It also showed low correlation between individual episode spending measures and poor concordance between episode-specific and global hospital spending measures. Two practice and policy implications are noteworthy.

First, our findings corroborate and build upon evidence from bundled payment programs about the opportunity for hospitals to improve their cost efficiency. Findings from bundled payment evaluations of surgical episodes suggest that the major area for cost savings is in the reduction of institutional post-acute care use such as that of SNFs.7-9 We demonstrated similar opportunity in a national sample of hospitals, finding that, for the three evaluated procedural CEBPs, SNF care accounted for more variation in overall episode spending than did other components. While variation may imply opportunity for greater efficiency and standardization, it is important to note that variation itself is not inherently problematic. Additional studies are needed to distinguish between warranted and unwarranted variation in procedural episodes, as well as identify strategies for reducing the latter.

Though bundled payment evaluations have predominantly emphasized procedural episodes, existing evidence suggests that participation in medical condition bundles has not been associated with cost savings or utilization changes.7-15 Findings from our analysis of variance (that there appear to be smaller variation-reduction opportunities for condition episodes than for procedural episodes) offer insight into this issue. Existing episodes are initiated by hospitalization and extend into the post-acute period, a design that may not afford substantial post-acute care savings opportunities for condition episodes. This is an important insight as policymakers consider how to best design condition-based episodes in the future (eg, whether to use non–hospital based episode triggers). Future work should evaluate whether our findings reflect inherent differences between condition and procedural episodes16 or whether interventions can still optimize SNF care for these episodes despite smaller variation.

Second, our results highlight the potential limitations of global performance measures such as Medicare Spending Per Beneficiary. As a general measure of hospital spending, Medicare Spending Per Beneficiary is based on the premise that hospitals can be categorized as high or low cost with consideration of all inpatient episodic care. However, our analyses suggest that hospitals may be high cost for certain episodes and low cost for others, a fact highlighted by the low correlation and high discordance observed between hospital CEBP and Medicare Spending Per Beneficiary performance. Because overarching measures may miss spending differences related to underlying clinical scenarios, episode-specific spending measures would provide an important perspective on, and complement to, global measures for assessing hospital cost performance, particularly in an era of value-based payments. Policymakers should consider prioritizing the development and implementation of such measures.

Our study has limitations. First, it is descriptive in nature, and future work should evaluate the association between episode-specific spending measure performance and clinical and quality outcomes. Second, we evaluated all CEBP-eligible hospitals nationwide to provide a broad view of episode-specific spending. However, future studies should assess performance among hospital subtypes, such as vertically integrated or safety-net organizations, because they may be more or less able to perform on these spending measures. Third, though findings may not be generalizable to other clinical episodes, our results were qualitatively consistent across episode types and broadly consistent with evidence from episode-based payment models. Fourth, we analyzed cost from the perspective of utilization and did not incorporate price considerations, which may be more relevant for commercial insurers than they are for Medicare.

Nonetheless, the emergence of CEBPs reflects the ongoing shift in policymaker attention toward episode-specific spending. In particular, though further scale or use of CEBP measures has been put on hold amid other payment reform changes, their nationwide implementation in 2017 signals Medicare’s broad interest in evaluating all hospitals on episode-specific spending efficiency, in addition to other facets of spending, quality, safety, and patient experience. Importantly, such efforts complement other ongoing nationwide initiatives for emphasizing episode spending, such as use of episode-based cost measures within the Merit-Based Incentive Payment System17 to score clinicians and groups in part based on their episode-specific spending efficiency. Insight about episode spending performance could help hospitals prepare for environments with increasing focus on episode spending and help policymakers as they incorporate this perspective into quality and value-based payment policies.

Amid the continued shift from fee-for-service toward value-based payment, policymakers such as the Centers for Medicare & Medicaid Services have initiated strategies to contain spending on episodes of care. This episode focus has led to nationwide implementation of payment models such as bundled payments, which hold hospitals accountable for quality and costs across procedure-­based (eg, coronary artery bypass surgery) and condition-­based (eg, congestive heart failure) episodes, which begin with hospitalization and encompass subsequent hospital and postdischarge care.

Simultaneously, Medicare has increased its emphasis on similarly designed episodes of care (eg, those spanning hospitalization and postdischarge care) using other strategies, such as public reporting and use of episode-based measures to evaluate hospital cost performance. In 2017, Medicare trialed the implementation of six Clinical Episode-Based Payment (CEBP) measures in the national Hospital Inpatient Quality Reporting Program in order to assess hospital and clinician spending on procedure and condition episodes.1,2

CEBP measures reflect episode-specific spending, conveying “how expensive a hospital is” by capturing facility and professional payments for a given episode spanning between 3 days prior to hospitalization and 30 days following discharge. Given standard payment rates used in Medicare, the variation in episode spending reflects differences in quantity and type of services utilized within an episode. Medicare has specified episode-related services and designed CEBP measures via logic and definition rules informed by a combination of claims and procedures-based grouping, as well as by physician input. For example, the CEBP measure for cellulitis encompasses services related to diagnosing and treating the infection within the episode window, but not unrelated services such as eye exams for coexisting glaucoma. To increase clinical salience, CEBP measures are subdivided to reflect differing complexity when possible. For instance, cellulitis measures are divided into episodes with or without major complications or comorbidities and further subdivided into subtypes for episodes reflecting cellulitis in patients with diabetes, patients with decubitus ulcers, or neither.

CEBPs are similar to other spending measures used in payment programs, such as the Medicare Spending Per Beneficiary, but are more clinically relevant because their focus on episodes more closely reflects clinical practice. CEBPs and Medicare Spending Per Beneficiary have similar designs (eg, same episode windows) and purpose (eg, to capture the cost efficiency of hospital care).3 However, unlike CEBPs, Medicare Spending Per Beneficiary is a “global” measure that summarizes a hospital’s cost efficiency aggregated across all inpatient episodes rather than represent it based on specific conditions or procedures.4 The limitations of publicly reported global hospital measures—for instance, the poor correlation between hospital performance on distinct publicly reported quality measures5—highlight the potential utility of episode-specific spending measures such as CEBP.

Compared with episode-based payment models, initiatives such as CEBP measures have gone largely unstudied. However, they represent signals of Medicare’s growing commitment to addressing care episodes, tested without potentially tedious rulemaking required to change payment. In fact, publicly reported episode spending measures offer policymakers several interrelated benefits: the ability to rapidly evaluate performance at a large number of hospitals (eg, Medicare scaling up CEBP measures among all eligible hospitals nationwide), the option of leveraging publicly reported feedback to prompt clinical improvements (eg, by including CEBP measures in the Hospital Inpatient Quality Reporting Program), and the platform for developing and testing promising spending measures for subsequent use in formal payment models (eg, by using CEBP measures that possess large variation or cost-reduction opportunities in future bundled payment programs).

Despite these benefits, little is known about hospital performance on publicly reported episode-specific spending measures. We addressed this knowledge gap by providing what is, to our knowledge, the first nationwide description of hospital performance on such measures. We also evaluated which episode components accounted for spending variation in procedural vs condition episodes, examined whether CEBP measures can be used to effectively identify high- vs low-cost hospitals, and compared spending performance on CEBPs vs Medicare Spending Per Beneficiary.

 

 

METHODS

Data and Study Sample

We utilized publicly available data from Hospital Compare, which include information about hospital-level CEBP and Medicare Spending Per Beneficiary performance for Medicare-­certified acute care hospitals nationwide.5 Our analysis evaluated the six CEBP measures tested by Medicare in 2017: three conditions (cellulitis, kidney/urinary tract infection [UTI], gastrointestinal hemorrhage) and three procedures (spinal fusion, cholecystectomy and common duct exploration, and aortic aneurysm repair). Per Medicare rules, CEBP measures are calculated only for hospitals with requisite volume for targeted conditions (minimum of 40 episodes) and procedures (minimum of 25 episodes) and are reported on Hospital Compare in risk-adjusted (eg, for age, hierarchical condition categories in alignment with existing Medicare methodology) and payment-­standardized form (ie, accounts for wage index, medical education, disproportionate share hospital payments) . Each CEBP encompasses episodes with or without major complications/comorbidities.

For each hospital, CEBP spending is reported as average total episode spending, as well as average spending on specific components. We grouped components into three groups: hospitalization, skilled nursing facility (SNF) use, and other (encompassing postdischarge readmissions, emergency department visits, and home health agency use), with a focus on SNF given existing evidence from episode-based payment models about the opportunity for savings from reduced SNF care. Hospital Compare also provides information about the national CEBP measure performance (ie, average spending for a given episode type among all eligible hospitals nationwide).

Hospital Groups

To evaluate hospitals’ CEBP performance for specific episode types, we categorized hospitals as either “below average spending” if their average episode spending was below the national average or “above average spending” if spending was above the national average. According to this approach, a hospital could have below average spending for some episodes but above average spending for others.

To compare hospitals across episode types simultaneously, we categorized hospitals as “low cost” if episode spending was below the national average for all applicable measures, “high cost” if episode spending was above the national average for all applicable measures, or “mixed cost” if episode spending was above the national average for some measures and below for others.

We also conducted sensitivity analyses using alternative hospital group definitions. For comparisons of specific episode types, we categorized hospitals as “high spending” (top quartile of average episode spending among eligible hospitals) or “other spending” (all others). For comparisons across all episode types, we focused on SNF care and categorized hospitals as “high SNF cost” (top quartile of episode spending attributed to SNF care) and “other SNF cost” (all others). We applied a similar approach to Medicare Spending Per Beneficiary, categorizing hospitals as either “low MSPB cost” if their episode spending was below the national average for Medicare Spending Per Beneficiary or “high MSPB cost” if not.

Statistical Analysis

We assessed variation by describing the distribution of total episode spending across eligible hospitals for each individual episode type, as well as the proportion of spending attributed to SNF care across all episode types. We reported the difference between the 10th and 90th percentile for each distribution to quantify variation. To evaluate how individual episode components contributed to overall spending variation, we used linear regression and applied analysis of variance to each episode component. Specifically, we regressed episode spending on each episode component (hospital, SNF, other) separately and used these results to generate predicted episode spending values for each hospital based on its value for each spending component. We then calculated the differen-ces (ie, residuals) between predicted and actual total episode spending values. We plotted residuals for each component, with lower residual plot variation (ie, a flatter curve) representing larger contribution of a spending component to overall spending variation.

 

 

Pearson correlation coefficients were used to assess within-­hospital CEBP correlation (ie, the extent to which performance was hospital specific). We evaluated if and how components of spending varied across hospitals by comparing spending groups (for individual episode types) and cost groups (for all episode types). To test the robustness of these categories, we conducted sensitivity analyses using high spending vs other spending groups (for individual episode types) and high SNF cost vs low SNF cost groups (for all episode types).

To assess concordance between CEBP and Medicare Spending Per Beneficiary, we cross tabulated hospital CEBP performance (high vs low vs mixed cost) and Medicare Spending Per Beneficiary performance (high vs low MSPB cost). This approached allowed us to quantify the number of hospitals that have concordant performance for both types of spending measures (ie, high cost or low cost on both) and the number with discordant performance (eg, high cost on one spending measure but low cost on the other). We used Pearson correlation coefficients to assess correlation between CEBP and Medicare Spending Per Beneficiary, with evaluation of CEBP performance in aggregate form (ie, hospitals’ average CEBP performance across all eligible episode types) and by individual episode types.

Chi-square and Kruskal-Wallis tests were used to compare categorical and continuous variables, respectively. To compare spending amounts, we evaluated the distribution of total episode spending (Appendix Figure 1) and used ordinary least squares regression with spending as the dependent variable and hospital group, episode components, and their interaction as independent variables. Because CEBP dollar amounts are reported through Hospital Compare on a risk-adjusted and payment-standardized basis, no additional adjustments were applied. Analyses were performed using SAS version 9.4 (SAS Institute; Cary, NC) and all tests of significance were two-tailed at alpha=0.05.

RESULTS

Of 3,129 hospitals, 1,778 achieved minimum thresholds and had CEBPs calculated for at least one of the six CEBP episode types.

Variation in CEBP Performance

For each episode type, spending varied across eligible hospitals (Appendix Figure 2). In particular, the differences between the 10th and 90th percentile values for cellulitis, kidney/UTI, and gastrointestinal hemorrhage were $2,873, $3,514, and $2,982, respectively. Differences were greater for procedural episodes of aortic aneurysm ($17,860), spinal fusion ($11,893), and cholecystectomy ($3,689). Evaluated across all episode types, the proportion of episode spending attributed to SNF care also varied across hospitals (Appendix Figure 3), with a difference of 24.7% between the 10th (4.5%) and 90th (29.2%) percentile values.
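The gap between the 10th and 90th percentile values reported above is a simple summary of spread. A minimal sketch follows, using Python's standard library and hypothetical values. The `inclusive` interpolation method is an assumption of this sketch; the percentile definition used in the original SAS analysis is not stated and may differ.

```python
from statistics import quantiles


def interdecile_gap(values):
    """Return the 10th percentile, 90th percentile, and their difference."""
    # quantiles(..., n=10) returns the nine cut points between deciles.
    deciles = quantiles(values, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    return p10, p90, p90 - p10


# Hypothetical per-hospital values: 0, 1, ..., 100.
hypothetical_values = [float(v) for v in range(0, 101)]
p10, p90, gap = interdecile_gap(hypothetical_values)
```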

Residual plots demonstrated differences in which episode components accounted for variation in overall spending. For aortic aneurysm episodes, variation in the SNF episode component best explained variation in episode spending and thus had the lowest residual plot variation, followed by other and hospital components (Figure). Similar patterns were observed for spinal fusion and cholecystectomy episodes. In contrast, for cellulitis episodes, all three components had comparable residual-plot variation, which indicates that the variation in the components explained episode spending variation similarly (Figure)—a pattern reflected in kidney/UTI and gastrointestinal hemorrhage episodes.

Residual Plots for Episode Components

Correlation in Performance on CEBP Measures

 

 

Across hospitals in our sample, within-hospital correlations were generally low (Appendix Table 1). In particular, correlations ranged from –0.079 (between performance on aortic aneurysm and kidney/UTI episodes) to 0.42 (between performance on kidney/UTI and cellulitis episodes), with a median correlation coefficient of 0.13. Within-hospital correlations ranged from 0.037 to 0.28 when considered between procedural episodes and from 0.33 to 0.42 when considered between condition episodes. When assessed among the subset of 1,294 hospitals eligible for at least two CEBP measures, correlations were very similar (ranging from –0.080 to 0.42). Additional analyses among hospitals with more CEBPs (eg, all six measures) yielded correlations that were similar in magnitude.

CEBP Performance by Hospital Groups

Overall spending on specific episode types varied across hospital groups (Table). Spending for aortic aneurysm episodes was $42,633 at hospitals with above average spending and $37,730 at those with below average spending, while spending for spinal fusion episodes was $39,231 at those with above average spending and $34,832 at those with below average spending. In comparison, spending at hospitals deemed above and below average spending for cellulitis episodes was $10,763 and $9,064, respectively, and $11,223 and $9,161 at hospitals deemed above and below average spending for kidney/UTI episodes, respectively.

Episode Spending by Components

Spending on specific episode components also differed by hospital group (Table). Though the magnitude of absolute spending amounts and differences varied by specific episode, hospitals with above average spending tended to spend more on SNF than did those with below average spending. For example, hospitals with above average spending for cellulitis episodes spent an average of $2,564 on SNF (24% of overall episode spending) vs $1,293 (14% of episode spending) among those with below average spending. Similarly, hospitals with above and below average spending for kidney/UTI episodes spent $4,068 (36% of episode spending) and $2,232 (24% of episode spending) on SNF, respectively (P < .001 for both episode types). Findings were qualitatively similar in sensitivity analyses (Appendix Table 2).

Among hospitals in our sample, we categorized 481 as high cost (27%), 452 as low cost (25%), and 845 as mixed cost (48%), with hospital groups distributed broadly nationwide (Appendix Figure 4). Evaluated on performance across all six episode types, hospital groups also demonstrated differences in spending by cost components (Table). In particular, spending in SNF ranged from 18.1% of overall episode spending among high-cost hospitals to 10.7% among mixed-cost hospitals and 9.2% among low-cost hospitals. Additionally, spending on hospitalization accounted for 83.3% of overall episode spending among low-cost hospitals, compared with 81.2% and 73.4% among mixed-cost and high-cost hospitals, respectively (P < .001). Comparisons were qualitatively similar in sensitivity analyses (Appendix Table 3).

Comparison of CEBP and Medicare Spending Per Beneficiary Performance

Correlation between Medicare Spending Per Beneficiary and aggregated CEBPs was 0.42 and, for individual episode types, ranged between 0.14 and 0.36 (Appendix Table 2). There was low concordance between hospital performance on CEBP and Medicare Spending Per Beneficiary. Across all eligible hospitals, only 16.3% (290/1778) had positive concordance between performance on the two measure types (ie, low cost for both), while 16.5% (293/1778) had negative concordance (ie, high cost for both). There was discordant performance in most instances (67.2%; 1195/1778), reflecting favorable performance on one measure type but not the other.

 

 

DISCUSSION

To our knowledge, this study is the first to describe hospitals’ episode-specific spending performance nationwide. It demonstrated significant variation across hospitals driven by different episode components for different episode types. It also showed low correlation between individual episode spending measures and poor concordance between episode-specific and global hospital spending measures. Two practice and policy implications are noteworthy.

First, our findings corroborate and build upon evidence from bundled payment programs about the opportunity for hospitals to improve their cost efficiency. Findings from bundled payment evaluations of surgical episodes suggest that the major area for cost savings is in the reduction of institutional post-acute care use such as that of SNFs.7-9 We demonstrated similar opportunity in a national sample of hospitals, finding that, for the three evaluated procedural CEBPs, SNF care accounted for more variation in overall episode spending than did other components. While variation may imply opportunity for greater efficiency and standardization, it is important to note that variation itself is not inherently problematic. Additional studies are needed to distinguish between warranted and unwarranted variation in procedural episodes, as well as identify strategies for reducing the latter.

Though bundled payment evaluations have predominantly emphasized procedural episodes, existing evidence suggests that participation in medical condition bundles has not been associated with cost savings or utilization changes.7-15 Findings from our analysis of variance—that there appear to be smaller variation-reduction opportunities for condition episodes than for procedural episodes—offer insight into this issue. Existing episodes are initiated by hospitalization and extend into the postacute period, a design that may not afford substantial post-acute care savings opportunities for condition episodes. This is an important insight as policymakers consider how to best design condition-based episodes in the future (eg, whether to use non–hospital based episode triggers). Future work should evaluate whether our findings reflect inherent differences between condition and procedural episodes16 or whether interventions can still optimize SNF care for these episodes despite smaller variation.

Second, our results highlight the potential limitations of global performance measures such as Medicare Spending Per Beneficiary. As a general measure of hospital spending, Medicare Spending Per Beneficiary is based on the premise that hospitals can be categorized as high or low cost with consideration of all inpatient episodic care. However, our analyses suggest that hospitals may be high cost for certain episodes and low cost for others, a fact highlighted by the low correlation and high discordance observed between hospital CEBP and Medicare Spending Per Beneficiary performance. Because overarching measures may miss spending differences related to underlying clinical scenarios, episode-specific spending measures would provide an important perspective that complements global measures for assessing hospital cost performance, particularly in an era of value-based payments. Policymakers should consider prioritizing the development and implementation of such measures.

Our study has limitations. First, it is descriptive in nature, and future work should evaluate the association between episode-specific spending measure performance and clinical and quality outcomes. Second, we evaluated all CEBP-eligible hospitals nationwide to provide a broad view of episode-specific spending. However, future studies should assess performance among hospital subtypes, such as vertically integrated or safety-net organizations, because they may be more or less able to perform on these spending measures. Third, though findings may not be generalizable to other clinical episodes, our results were qualitatively consistent across episode types and broadly consistent with evidence from episode-based payment models. Fourth, we analyzed cost from the perspective of utilization and did not incorporate price considerations, which may be more relevant for commercial insurers than for Medicare.

Nonetheless, the emergence of CEBPs reflects the ongoing shift in policymaker attention toward episode-specific spending. In particular, though further scale or use of CEBP measures has been put on hold amid other payment reform changes, their nationwide implementation in 2017 signals Medicare’s broad interest in evaluating all hospitals on episode-specific spending efficiency, in addition to other facets of spending, quality, safety, and patient experience. Importantly, such efforts complement other ongoing nationwide initiatives for emphasizing episode spending, such as use of episode-based cost measures within the Merit-Based Incentive Payment System17 to score clinicians and groups in part based on their episode-specific spending efficiency. Insight about episode spending performance could help hospitals prepare for environments with an increasing focus on episode spending as policymakers incorporate this perspective into quality and value-based payment policies.

 

 

References

1. Centers for Medicare & Medicaid Services. Fiscal Year 2019 Clinical Episode-Based Payment Measures Overview. https://www.qualityreportingcenter.com/globalassets/migrated-pdf/cepb_slides_npc-6.17.2018_5.22.18_vfinal508.pdf. Accessed November 26, 2019.
2. Centers for Medicare & Medicaid Services. Hospital Inpatient Quality Reporting Program. https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/HospitalRHQDAPU.html. Accessed November 23, 2019.
3. Centers for Medicare & Medicaid Services. Medicare Spending Per Beneficiary (MSPB) Spending Breakdown by Claim Type. https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/hospital-value-based-purchasing/Downloads/Fact-Sheet-MSPB-Spending-Breakdowns-by-Claim-Type-Dec-2014.pdf. Accessed November 25, 2019.
4. Hu J, Jordan J, Rubinfeld I, Schreiber M, Waterman B, Nerenz D. Correlations among hospital quality measure: What “Hospital Compare” data tell us. Am J Med Qual. 2017;32(6):605-610. https://doi.org/10.1177/1062860616684012.
5. Centers for Medicare & Medicaid Services. Hospital Compare datasets. https://data.medicare.gov/data/hospital-compare. Accessed November 26, 2019.
6. American Hospital Association. AHA Data Products. https://www.aha.org/data-insights/aha-data-products. Accessed November 25, 2019.
7. Dummit LA, Kahvecioglu D, Marrufo G, et al. Bundled payment initiative and payments and quality outcomes for lower extremity joint replacement episodes. JAMA. 2016;316(12):1267-1278. https://doi.org/10.1001/jama.2016.12717.
8. Finkelstein A, Ji Y, Mahoney N, Skinner J. Mandatory Medicare bundled payment program for lower extremity joint replacement and discharge to institutional postacute care: Interim analysis of the first year of a 5-year randomized trial. JAMA. 2018;320(9):892-900. https://doi.org/10.1001/jama.2018.12346.
9. Navathe AS, Troxel AB, Liao JM, et al. Cost of joint replacement using bundled payment models. JAMA Intern Med. 2017;177(2):214-222. https://doi.org/10.1001/jamainternmed.2016.8263.
10. Liao JM, Emanuel EJ, Polsky DE, et al. National representativeness of hospitals and markets in Medicare’s mandatory bundled payment program. Health Aff. 2019;38(1):44-53.
11. Barnett ML, Wilcock A, McWilliams JM, et al. Two-year evaluation of mandatory bundled payments for joint replacement. N Engl J Med. 2019;380(3):252-262. https://doi.org/10.1056/NEJMsa1809010.
12. Navathe AS, Liao JM, Polsky D, et al. Comparison of hospitals participating in Medicare’s voluntary and mandatory orthopedic bundle programs. Health Aff. 2018;37(6):854-863. https://www.doi.org/10.1377/hlthaff.2017.1358.
13. Joynt Maddox KE, Orav EJ, Zheng J, Epstein AM. Participation and Dropout in the Bundled Payments for Care Improvement Initiative. JAMA. 2018;319(2):191-193. https://doi.org/10.1001/jama.2017.14771.
14. Navathe AS, Liao JM, Dykstra SE, et al. Association of hospital participation in a Medicare bundled payment program with volume and case mix of lower extremity joint replacement episodes. JAMA. 2018;320(9):901-910. https://doi.org/10.1001/jama.2018.12345.
15. Joynt Maddox KE, Orav EJ, Epstein AM. Medicare’s bundled payments initiative for medical conditions. N Engl J Med. 2018;379(18):e33. https://doi.org/10.1056/NEJMc1811049.
16. Navathe AS, Shan E, Liao JM. What have we learned about bundling medical conditions? Health Affairs Blog. https://www.healthaffairs.org/do/10.1377/hblog20180828.844613/full/. Accessed November 25, 2019.
17. Centers for Medicare & Medicaid Services. MACRA. https://www.cms.gov/medicare/quality-initiatives-patient-assessment-instruments/value-based-programs/macra-mips-and-apms/macra-mips-and-apms.html. Accessed November 26, 2019.

Issue
Journal of Hospital Medicine 16(4)
Page Number
204-210. Published Online First March 18, 2020
© 2020 Society of Hospital Medicine

Correspondence Location
Joshua M. Liao, MD, MSc; Email: [email protected]; Telephone: 206-616-6934; Twitter: @JoshuaLiaoMD

Do Hospitals Participating in Accountable Care Organizations Discharge Patients to Higher Quality Nursing Homes?

Article Type
Changed
Sun, 05/26/2019 - 00:10

Accountable care organizations (ACOs) create incentives for more efficient healthcare utilization. For patients being discharged from the hospital, this may mean more efficient use of postacute care (PAC), including discharging patients to higher quality skilled nursing facilities (SNFs) in an effort to limit readmissions and other costly complications. Public reporting of nursing home quality has been associated with improved performance measures, although improvements in preventable hospitalizations have lagged.1 Evidence to date suggests that patients attributed to an ACO are not going to higher quality SNFs,2,3 but these effects may be concentrated in hospitals that participate in ACOs and face stronger incentives to alter their discharge patterns compared with non-ACO hospitals. Therefore, we examined whether hospitals participating in the Medicare Shared Savings Program (MSSP) increased the use of highly rated SNFs or decreased the use of low-rated SNFs hospital-wide after initiation of their ACO contracts compared with non-ACO hospitals.

METHODS

We used discharge-level data from the 100% MedPAR file for all fee-for-service Medicare beneficiaries discharged from an acute care hospital to an SNF between 2010 and 2013. We measured the SNF quality using Medicare’s Nursing Home Compare star ratings. Our primary outcome was probability of discharge to high-rated (five star) and low-rated (one star) SNFs.

We utilized a difference-in-differences design. Using a linear probability model, we first estimated the change in the probability of discharge to five-star SNFs (compared to all other SNFs) among all beneficiaries discharged from one of the 233 ACO-participating hospitals after the hospital became an ACO provider compared with before, and compared with all beneficiaries discharged from one of the 3,081 non-ACO hospitals over the same time period. Individual hospitals were determined to be “ACO-participating” if they were listed on Medicare’s website as being part of an ACO-participating hospital in the MSSP. ACOs joined the MSSP in three waves: April 1, 2012; July 1, 2012; and January 1, 2013, which were also determined based on information on Medicare’s website. We separately estimated the change in probability of discharge to a one-star SNF (compared to all other SNFs) using the same approach. Models were adjusted for beneficiary demographic and clinical characteristics (age, sex, race, dual eligibility, urban ZIP code, diagnosis-related group code, and Elixhauser comorbidities) and market characteristics (the concentration of hospital discharges, SNF discharges, and the number of five-star SNFs, all measured in each hospital referral region).
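The core of the difference-in-differences comparison can be sketched with an unadjusted two-group, two-period example. This is a deliberate simplification and not the authors' model: it omits the covariate adjustment, hospital fixed effects, and staggered entry dates described above, and all discharge counts are hypothetical. In this simplified setting, the estimate equals the coefficient on the group-by-period interaction term of an unadjusted linear probability model.

```python
def did_estimate(records):
    """Unadjusted difference-in-differences on a binary outcome.

    `records` is an iterable of (treated, post, outcome) tuples, where
    outcome is 1 if the discharge went to a five-star SNF and 0 otherwise.
    """
    cells = {}
    for treated, post, outcome in records:
        cells.setdefault((treated, post), []).append(outcome)
    for key in [(True, True), (True, False), (False, True), (False, False)]:
        if key not in cells:
            raise ValueError(f"no observations for cell {key}")
    rate = {key: sum(vals) / len(vals) for key, vals in cells.items()}
    change_treated = rate[(True, True)] - rate[(True, False)]
    change_control = rate[(False, True)] - rate[(False, False)]
    return change_treated - change_control


def make_cell(treated, post, n_five_star, n_total):
    """Build hypothetical discharge records for one group-period cell."""
    hits = [(treated, post, 1)] * n_five_star
    misses = [(treated, post, 0)] * (n_total - n_five_star)
    return hits + misses


# Hypothetical counts: five-star discharges out of 100 in each cell.
records = (
    make_cell(True, False, 15, 100)    # ACO hospitals, before joining
    + make_cell(True, True, 20, 100)   # ACO hospitals, after joining
    + make_cell(False, False, 10, 100) # non-ACO hospitals, before
    + make_cell(False, True, 12, 100)  # non-ACO hospitals, after
)
effect = did_estimate(records)  # (0.20 - 0.15) - (0.12 - 0.10)
```

Subtracting the change among non-ACO hospitals removes secular trends shared by both groups, so the remaining difference is attributed to ACO participation under the usual parallel-trends assumption.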

RESULTS

We examined a total of 12,736,287 discharges, 11.8% from ACO-participating hospitals and 88.2% from non-ACO-participating hospitals. ACO-participating hospitals cared for fewer black patients and fewer patients who were dually enrolled in Medicare and Medicaid (Table 1), but these characteristics did not change differentially between the two groups of hospitals over our study period. ACO-participating hospitals were also more likely to discharge patients to five-star SNFs prior to joining an ACO (in 2010-2011). After joining an ACO, the percentage of hospital discharges going to a five-star SNF increased by 3.4 percentage points on a base of 15.4% (95% confidence interval [CI] 1.3-5.5, P = .002; Table 2) compared with non-ACO-participating hospitals over the same time period. The differential changes did not extend to SNFs rated as three stars and above (change of 0.5 percentage points, 95% CI, −1.3 to 2.8, P = .600).

 

 

The probability of discharge from an ACO hospital to low-quality (one-star) SNFs did not change significantly from its baseline level of 13.5% after joining an ACO compared with non-ACO-participating hospitals (change of 0.4 percentage points, 95% CI, −0.7 to 1.5, P = .494).

DISCUSSION

Our findings indicate that ACO-participating hospitals were more likely to discharge patients to the highest rated SNFs after they began their ACO contract but did not change the likelihood of discharge to lower rated SNFs in comparison with non-ACO hospitals. Previous research has suggested that patients attributed to a Medicare ACO were not more likely to use high-quality SNFs. However, we examined the effect of hospital participation in an ACO, not individual beneficiaries attributed to an ACO. These contrasting results suggest that hospitals could be instituting hospital-wide changes in discharge patterns once they join an ACO and that hospital-led ACOs could be particularly well positioned to manage postdischarge care relative to physician-led ACOs. One potential limitation of this study is that ACO-participating hospitals may differ in unobservable ways from non-ACO-participating hospitals. However, using hospital fixed effects, we mitigated this limitation to some extent by controlling for time-invariant observed and unobserved characteristics. Further work will need to explore the mechanisms of higher PAC quality, including hospital-SNF integration and coordination.

Disclosures

Dr. Werner reports receiving personal fees from CarePort Health. Dr. Bain reports no conflicts. Mr. Yuan reports no conflicts. Dr. Navathe reports receiving personal fees from Navvis and Company, Navigant Inc., Lynx Medical, Indegene Inc., Sutherland Global Services, and Agathos, Inc.; personal fees and equity from NavaHealth; an honorarium from Elsevier Press, serving on the board of Integrated Services, Inc. without compensation, and grants from Hawaii Medical Service Association, Anthem Public Policy Institute, and Oscar Health, none of which are related to this manuscript.

Funding

This research was funded by R01-HS024266 from the Agency for Healthcare Research and Quality. Rachel Werner was supported in part by K24-AG047908 from the National Institute on Aging.

 

References

1. Ryskina KL, Konetzka RT, Werner RM. Association between 5-star nursing home report card ratings and potentially preventable hospitalizations. Inquiry. 2018;55:46958018787323. doi: 10.1177/0046958018787323.
2. McWilliams JM, Gilstrap LG, Stevenson DG, Chernew ME, Huskamp HA, Grabowski DC. Changes in postacute care in the Medicare Shared Savings Program. JAMA Intern Med. 2017;177(4):518-526. doi: 10.1001/jamainternmed.2016.9115.
3. McWilliams JM, Hatfield LA, Chernew ME, Landon BE, Schwartz AL. Early performance of accountable care organizations in Medicare. N Engl J Med. 2016;374(24):2357-2366. doi: 10.1056/NEJMsa1600142.

Issue
Journal of Hospital Medicine 14(5)
Page Number
288-289. Published online first March 20, 2019.
Sections
Article PDF
Article PDF

Accountable care organizations (ACOs) create incentives for more efficient healthcare utilization. For patients being discharged from the hospital, this may mean more efficient use of postacute care (PAC), including discharging patients to higher quality skilled nursing facilities (SNFs) in an effort to limit readmissions and other costly complications. Public reporting of nursing home quality has been associated with improved performance measures, although improvements in preventable hospitalizations have lagged.1 Evidence to date suggests that patients attributed to an ACO are not going to higher quality SNFs,2,3 but these effects may be concentrated in hospitals that participate in ACOs and face stronger incentives to alter their discharge patterns compared with non-ACO hospitals. Therefore, we examined whether hospitals participating in Medicare’s Shared Saving Program (MSSP) increased the use of highly rated SNFs or decreased the use of low-rated SNFs hospital-wide after initiation of their ACO contracts compared with non-ACO hospitals.

METHODS

We used discharge-level data from the 100% MedPAR file for all fee-for-service Medicare beneficiaries discharged from an acute care hospital to an SNF between 2010 and 2013. We measured the SNF quality using Medicare’s Nursing Home Compare star ratings. Our primary outcome was probability of discharge to high-rated (five star) and low-rated (one star) SNFs.

We utilized a difference-in-differences design. Using a linear probability model, we first estimated the change in the probability of discharge to five-star SNFs (compared to all other SNFs) among all beneficiaries discharged from one of the 233 ACO-participating hospitals after the hospital became an ACO provider compared with before and compared withall beneficiaries discharged from one of the 3,081 non-ACO hospitals over the same time period. Individual hospitals were determined to be “ACO-participating” if they were listed on Medicare’s website as being part of an ACO-participating hospital in the MSSP. ACOs joined the MSSP in three waves: April 1, 2012; July 1, 2012; and January 1, 2013, which were also determined based on information on Medicare’s website. We separately estimated the change in probability of discharge to a one-star SNF (compared to all other SNFs) using the same approach. Models were adjusted for beneficiary demographic and clinical characteristics (age, sex, race, dual eligibility, urban ZIP code, diagnosis-related group code, and Elixhauser comorbidities) and market characteristics (the concentration of hospital discharges, SNF discharges, and the number of five-star SNFs, all measured in each hospital referral region).

RESULTS

We examined a total of 12,736,287 discharges, 11.8% from ACO-participating hospitals and 88.2% from non-ACO-participating hospitals. ACO-participating hospitals cared for fewer black patients and fewer patients who were dually enrolled in Medicare and Medicaid (Table 1), but these characteristics did not change differentially between the two groups of hospitals over our study period. ACO-participating hospitals were also more likely to discharge patients to five-star SNFs prior to joining an ACO (in 2010-2011). After joining an ACO, the percentage of hospital discharges going to a 5-star SNF increased by 3.4 percentage points on a base of 15.4% (95% confidence interval [CI] 1.3-5.5, P = .002; Table 2) compared with non-ACO-participating hospitals over the same time period. The differential changes did not extend to SNFs rated as three stars and above (change of 0.5 percentage points, 95% CI, 1.3-2.8, P = .600).

 

 

The probability of discharge from an ACO hospital to low-quality (one-star) SNFs did not change significantly from its baseline level of 13.5% after joining an ACO compared with non-ACO-participating hospitals (change of 0.4 percentage points, 95% CI, 0.7-1.5, P = .494).

DISCUSSION

Our findings indicate that ACO-participating hospitals were more likely to discharge patients to the highest rated SNFs after they began their ACO contract but did not change the likelihood of discharge to lower rated SNFs in comparison with non-ACO hospitals. Previous research has suggested that patients attributed to a Medicare ACO were not more likely to use high-quality SNFs. However, we examined the effect of hospital participation in an ACO, not individual beneficiaries attributed to an ACO. These contrasting results suggest that hospitals could be instituting hospital-wide changes in discharge patterns once they join an ACO and that hospital-led ACOs could be particularly well positioned to manage postdischarge care relative to physician-led ACOs. One potential limitation of this study is that ACO-participating hospitals may differ in unobservable ways from non-ACO-participating hospitals. However, using hospital fixed effects, we mitigated this limitation to some extent by controlling for time-invariant observed and unobserved characteristics. Further work will need to explore the mechanisms of higher PAC quality, including hospital-SNF integration and coordination.

Disclosures

Dr. Werner reports receiving personal fees from CarePort Health. Dr. Bain reports no conflicts. Mr. Yuan reports no conflicts. Dr. Navathe reports receiving personal fees from Navvis and Company, Navigant Inc., Lynx Medical, Indegene Inc., Sutherland Global Services, and Agathos, Inc.; personal fees and equity from NavaHealth; an honorarium from Elsevier Press, serving on the board of Integrated Services, Inc. without compensation, and grants from Hawaii Medical Service Association, Anthem Public Policy Institute, and Oscar Health, none of which are related to this manuscript.

Funding

This research was funded by R01-HS024266 by the Agency for Healthcare Research and Quality. Rachel Werner was supported in part by K24-AG047908 from the National Institute on Aging.

 

Accountable care organizations (ACOs) create incentives for more efficient healthcare utilization. For patients being discharged from the hospital, this may mean more efficient use of postacute care (PAC), including discharging patients to higher quality skilled nursing facilities (SNFs) in an effort to limit readmissions and other costly complications. Public reporting of nursing home quality has been associated with improved performance measures, although improvements in preventable hospitalizations have lagged.1 Evidence to date suggests that patients attributed to an ACO are not going to higher quality SNFs,2,3 but these effects may be concentrated in hospitals that participate in ACOs and face stronger incentives to alter their discharge patterns compared with non-ACO hospitals. Therefore, we examined whether hospitals participating in Medicare’s Shared Saving Program (MSSP) increased the use of highly rated SNFs or decreased the use of low-rated SNFs hospital-wide after initiation of their ACO contracts compared with non-ACO hospitals.

METHODS

We used discharge-level data from the 100% MedPAR file for all fee-for-service Medicare beneficiaries discharged from an acute care hospital to an SNF between 2010 and 2013. We measured SNF quality using Medicare’s Nursing Home Compare star ratings. Our primary outcomes were the probabilities of discharge to high-rated (five-star) and low-rated (one-star) SNFs.

We used a difference-in-differences design. Using a linear probability model, we first estimated the change in the probability of discharge to five-star SNFs (compared with all other SNFs) among all beneficiaries discharged from one of the 233 ACO-participating hospitals after the hospital became an ACO provider compared with before, relative to the change among all beneficiaries discharged from one of the 3,081 non-ACO hospitals over the same time period. Individual hospitals were classified as “ACO-participating” if Medicare’s website listed them as participating in an MSSP ACO. ACOs joined the MSSP in three waves (April 1, 2012; July 1, 2012; and January 1, 2013); the wave dates were also taken from Medicare’s website. We separately estimated the change in the probability of discharge to a one-star SNF (compared with all other SNFs) using the same approach. Models were adjusted for beneficiary demographic and clinical characteristics (age, sex, race, dual eligibility, urban ZIP code, diagnosis-related group code, and Elixhauser comorbidities) and market characteristics (the concentration of hospital discharges, SNF discharges, and the number of five-star SNFs, all measured in each hospital referral region).
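The core difference-in-differences comparison can be sketched in a few lines of Python. This is an illustrative simplification, not the study's model: it is unadjusted, uses a single pre/post split rather than the three MSSP entry waves, and omits hospital fixed effects. The function name `did_estimate` and the toy records are invented for illustration. In this saturated two-group, two-period setting, the difference of cell means equals the interaction coefficient from a linear probability model.

```python
def did_estimate(records):
    """Unadjusted 2x2 difference-in-differences for a binary outcome.

    Each record is a tuple (aco_hospital, post_period, five_star), with
    every field coded 0 or 1. Returns the pre-to-post change in the
    five-star discharge share at ACO hospitals minus the same change at
    non-ACO hospitals, expressed as a proportion.
    """
    def share(aco, post):
        # Share of discharges to five-star SNFs within one group-period cell.
        cell = [y for a, p, y in records if a == aco and p == post]
        return sum(cell) / len(cell)

    aco_change = share(1, 1) - share(1, 0)
    non_aco_change = share(0, 1) - share(0, 0)
    return aco_change - non_aco_change


# Toy data invented for illustration only (not the study's data).
toy = (
    [(1, 0, 1)] * 3 + [(1, 0, 0)] * 17    # ACO, pre: 3 of 20 to five-star
    + [(1, 1, 1)] * 4 + [(1, 1, 0)] * 16  # ACO, post: 4 of 20
    + [(0, 0, 1)] * 2 + [(0, 0, 0)] * 18  # non-ACO, pre: 2 of 20
    + [(0, 1, 1)] * 2 + [(0, 1, 0)] * 18  # non-ACO, post: 2 of 20
)

effect = did_estimate(toy)
print(f"{effect * 100:.1f} percentage points")
```

In the toy data, the ACO group's share rises by 5 percentage points while the non-ACO share is flat, so the estimate is 5 percentage points. The adjusted models described above would additionally include the beneficiary and market covariates.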

RESULTS

We examined a total of 12,736,287 discharges, 11.8% from ACO-participating hospitals and 88.2% from non-ACO-participating hospitals. ACO-participating hospitals cared for fewer black patients and fewer patients who were dually enrolled in Medicare and Medicaid (Table 1), but these characteristics did not change differentially between the two groups of hospitals over our study period. ACO-participating hospitals were also more likely to discharge patients to five-star SNFs prior to joining an ACO (in 2010-2011). After joining an ACO, the percentage of hospital discharges going to a five-star SNF increased by 3.4 percentage points on a base of 15.4% (95% confidence interval [CI], 1.3 to 5.5; P = .002; Table 2) compared with non-ACO-participating hospitals over the same time period. The differential change did not extend to SNFs rated three stars and above (change of 0.5 percentage points; 95% CI, −1.3 to 2.8; P = .600).

The probability of discharge from an ACO hospital to low-quality (one-star) SNFs did not change significantly from its baseline level of 13.5% after joining an ACO compared with non-ACO-participating hospitals (change of 0.4 percentage points; 95% CI, −0.7 to 1.5; P = .494).

DISCUSSION

Our findings indicate that, in comparison with non-ACO hospitals, ACO-participating hospitals were more likely to discharge patients to the highest rated SNFs after they began their ACO contract but did not change the likelihood of discharge to the lowest rated SNFs. Previous research has suggested that patients attributed to a Medicare ACO were not more likely to use high-quality SNFs. However, we examined the effect of hospital participation in an ACO, not of individual beneficiaries' attribution to an ACO. These contrasting results suggest that hospitals could be instituting hospital-wide changes in discharge patterns once they join an ACO and that hospital-led ACOs could be particularly well positioned to manage postdischarge care relative to physician-led ACOs. One potential limitation of this study is that ACO-participating hospitals may differ in unobservable ways from non-ACO-participating hospitals. However, by using hospital fixed effects, we mitigated this limitation to some extent by controlling for time-invariant observed and unobserved characteristics. Further work will need to explore the mechanisms behind the use of higher quality PAC, including hospital-SNF integration and coordination.

Disclosures

Dr. Werner reports receiving personal fees from CarePort Health. Dr. Bain reports no conflicts. Mr. Yuan reports no conflicts. Dr. Navathe reports receiving personal fees from Navvis and Company, Navigant Inc., Lynx Medical, Indegene Inc., Sutherland Global Services, and Agathos, Inc.; personal fees and equity from NavaHealth; an honorarium from Elsevier Press; serving on the board of Integrated Services, Inc. without compensation; and grants from Hawaii Medical Service Association, Anthem Public Policy Institute, and Oscar Health, none of which are related to this manuscript.

Funding

This research was funded by grant R01-HS024266 from the Agency for Healthcare Research and Quality. Rachel Werner was supported in part by grant K24-AG047908 from the National Institute on Aging.

References

1. Ryskina KL, Konetzka RT, Werner RM. Association between 5-star nursing home report card ratings and potentially preventable hospitalizations. Inquiry. 2018;55:46958018787323. doi: 10.1177/0046958018787323.
2. McWilliams JM, Gilstrap LG, Stevenson DG, Chernew ME, Huskamp HA, Grabowski DC. Changes in postacute care in the Medicare Shared Savings Program. JAMA Intern Med. 2017;177(4):518-526. doi: 10.1001/jamainternmed.2016.9115.
3. McWilliams JM, Hatfield LA, Chernew ME, Landon BE, Schwartz AL. Early performance of accountable care organizations in Medicare. N Engl J Med. 2016;374(24):2357-2366. doi: 10.1056/NEJMsa1600142.

Issue
Journal of Hospital Medicine 14(5)
Page Number
288-289. Published online first March 20, 2019.

© 2019 Society of Hospital Medicine

Correspondence Location
Amol S. Navathe, MD, PhD; E-mail: [email protected]; Telephone: (215) 573-4047; Twitter: @AmolNavathe