Analytic Forces for the Ever-Important SDOH Battle

By Leigh McCormack, MS; Nathan Albers, BS; Jeremiah Lowhorn, MS; and Michael Stearns, MD, CPC, CRC, CFPC

The National Center for Health Statistics estimates that US emergency departments (EDs) see more than 138 million visits each year. Research into factors beyond clinical need that drive ED utilization points to healthcare access and availability, patient preferences, demographics, and policies such as the Emergency Medical Treatment and Labor Act (EMTALA), which requires EDs to provide care for anyone regardless of their ability to pay. More recently, as new care delivery and payment models bring a shift from fee-for-service to value-based models, health systems, health insurers, and policymakers are recognizing the influence of social and economic factors, referred to as social determinants of health (SDOH), on ED utilization.

Sparse Data for a Growing Need

While evidence of the impact of social risk factors on health outcomes, utilization, and costs is mounting, efforts to capture and standardize these data are lacking. A subset of social and economic Z codes found in ICD-10-CM captures “factors that influence health status and contact with health services.” This set of SDOH Z codes has the potential to collect standardized information regarding patients’ social and economic risk factors. However, documentation of these factors and use of these codes are low, likely due to the lack of universally accepted standards and the lack of reimbursement for the services that require these specific indicators. The Centers for Medicare and Medicaid Services (CMS) reported that Z codes were documented for only 1.4 percent of the 33.7 million fee-for-service (FFS) Medicare beneficiaries in 2017.1 This low utilization inhibits the proper targeting of resources and limits the accuracy of advanced analytics meant to identify socially vulnerable patients. Many innovation models and value-based programs will rely on social information to be successful. Those healthcare models include:

  • Community Paramedicine: Requires understanding the social needs of rural patients and prioritizing workloads
  • Accountable Health Communities: Relies on proper identification of social needs in order to align clinical and community services
  • Hospital Readmission Reduction Program: Augments clinical, analytical, and operational models with social needs and integrated workflows
  • Population Health Programs: Helps create robust stratification efforts for success under capitated and shared risk models
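
Before relying on Z codes for any of these programs, an organization can measure how often they actually appear in its own coded data. The following is a minimal sketch, assuming the commonly cited ICD-10-CM range Z55 through Z65 as the SDOH code set; the encounter identifiers, the diagnosis lists, and the function names are hypothetical and for illustration only.

```python
import re

# Assumption: SDOH Z codes are the ICD-10-CM categories Z55 through Z65.
SDOH_Z_PATTERN = re.compile(r"^Z(5[5-9]|6[0-5])")


def has_sdoh_z_code(codes):
    """Return True if any diagnosis code falls in the Z55-Z65 range."""
    return any(SDOH_Z_PATTERN.match(code.upper()) for code in codes)


def sdoh_z_code_rate(encounters):
    """Share of encounters carrying at least one SDOH Z code."""
    if not encounters:
        return 0.0
    flagged = sum(1 for codes in encounters.values() if has_sdoh_z_code(codes))
    return flagged / len(encounters)


# Hypothetical encounters: encounter id -> list of diagnosis codes.
encounters = {
    "enc-001": ["I10", "E11.9"],
    "enc-002": ["J44.1", "Z60.2"],  # Z60.2: problems related to living alone
    "enc-003": ["N39.0"],
    "enc-004": ["Z59.0", "F32.9"],  # Z59.0: homelessness
}

rate = sdoh_z_code_rate(encounters)
print(f"{rate:.1%} of encounters carry an SDOH Z code")
```

A rate far below the prevalence of social need expected in the served population suggests that coded data alone will undercount socially vulnerable patients.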

An Unstructured Army

As healthcare turns to data in this SDOH integration era, will artificial intelligence lead the way much like its statistical ancestor, the randomized controlled trial, did for evidence-based medicine? Health informatics professionals are searching for innovative ways to uncover, define, and bring SDOH data into clinical practice. Recent advances in natural language processing (NLP) provide a scalable way to broaden the capture of social risk factors from the narrative data found in unstructured clinical notes. NLP is a branch of artificial intelligence that enables computers to make sense of free-form text. Using NLP methods to analyze clinical notes for social and economic needs can contribute to understanding trends in outcomes and utilization without relying on broad adoption of SDOH Z codes.

Arming the Value-Based Fight

Lack of social support, specifically, has been linked to higher rates of ED utilization.2,3 Base Camp Health, an SDOH analytics company, conducted a retrospective case-control study using inpatient admission data from multiple acute care facilities, including SSM Health St. Anthony in Oklahoma City, OK, to understand the impact of social support risk factors on ED admission. Less than a third of a percent of admissions recorded an ICD-10-CM Z code related to social support. This volume was not large enough to determine the association between social support risk factors and whether an admission occurred through the ED. Therefore, a lightweight, semi-supervised topic modeling approach was used to identify social support risk factors in clinical notes. The NLP algorithm allows the specification of anchor words that define topics that may otherwise be underrepresented in the data. Specific parts of speech, namely verbs and nouns, are considered when choosing topics. Rather than building topics only from single words, the model also defined topics from phrases. Choosing anchor words to guide the model was an iterative process that required identifying words and phrases believed to define social support risk. Key examples include:

  • Divorced, estranged, separated from spouse
  • Parent, spouse, or caregiver passed away
  • Lonely, depressed, isolated, invisible
  • [Lack of] intimacy, validation, support
  • [Lack of] family, network, friends, community
  • [Does not feel] valued, accepted, understood
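
To make the idea of anchor words and phrases concrete, the sketch below shows only the seeding step: matching single-word and multi-word anchors against a clinical note. It is not the semi-supervised topic model used in the study, which also learns topics from how words co-occur; it is a simplified stand-in. The topic name, the anchor list (drawn loosely from the examples above), and the sample notes are hypothetical.

```python
import re

# Hypothetical anchor phrases for one topic; illustrative, not the study's list.
ANCHORS = {
    "social_support_risk": [
        "divorced", "estranged", "separated from spouse", "passed away",
        "lonely", "isolated", "lack of support", "lack of family",
    ],
}


def tokenize(text):
    """Lowercase the note and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, max_n=3):
    """All word n-grams up to max_n, so multi-word anchors can match."""
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def anchor_hits(note, anchors=ANCHORS):
    """Map each topic to the sorted anchor phrases found in the note."""
    grams = ngrams(tokenize(note))
    return {
        topic: sorted(p for p in phrases if p in grams)
        for topic, phrases in anchors.items()
    }


note = ("Patient recently divorced and estranged from daughter. Reports "
        "feeling lonely since her mother passed away; lack of support at home.")
hits = anchor_hits(note)
print(hits["social_support_risk"])
```

Simple matching like this ignores negation and context ("denies feeling lonely" would still match), which is one reason the anchor list has to be refined iteratively against real notes.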

The Proper Battle Plan

Incorporating NLP methods or insights into electronic health record (EHR) workflows at the point of care would be valuable in identifying needs upstream in the primary care and community setting, alleviating barriers to care and helping avoid potentially unnecessary ED visits and costs. Base Camp Health encourages organizations looking to implement such approaches to consider the following:

  • Each organization must still develop the appropriate documentation, ontology and topics, and coding policies and guidelines for incorporating SDOH that meet its needs, whether from standardized code sets or analytic methods.
  • Organizations must be willing to live within the error that any statistical approach brings and must partner with health informatics professionals who are transparent about how this error translates into operational and financial impact.
  • Leaders need to relax the ideology that the EHR is the panacea of documentation and recall, as successful SDOH efforts will require many different data systems and actors, not all of which will be directly compatible with existing EHR vendors.
  • Health informatics professionals need to be cognizant of where clinical text data are collected and how that may limit how they can be leveraged. The higher odds of ED admission given the presence of a social support risk factor found in this study may be indicative of screening tools and protocols executed explicitly in the ED versus during an inpatient stay or at discharge.
  • Realize that any NLP process is iterative and takes human capital on the front end and throughout the lifecycle to tune and optimize the output.
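
The "odds of ED admission" mentioned in the list above are typically summarized in a case-control design with an odds ratio from a 2x2 table. The sketch below is a generic illustration using the standard log-based (Woolf) confidence interval; the counts are invented for the example and are not results from the study, and the function name is hypothetical.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a log-based (Woolf) confidence interval.

    a: risk factor present, admitted through the ED
    b: risk factor present, not admitted through the ED
    c: risk factor absent, admitted through the ED
    d: risk factor absent, not admitted through the ED
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Invented counts for illustration only; not data from the study.
estimate, lower, upper = odds_ratio(a=120, b=80, c=900, d=1100)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests an association, but as the list above cautions, where and how the notes were collected can drive that association as much as the underlying social risk does.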

Winning the War

The current shift in healthcare to value-based efforts will continue to require an understanding of patients’ social, economic, and behavioral needs as a supplement to medical data. However, SDOH data collection and definition will not become standard and universal without proper incentives or policy requirements. With fewer than three percent of clinicians documenting detailed SDOH, further refinement and adoption of standardized codes for SDOH should be a focus of all value-based initiatives and reform. A draft of the International Classification of Diseases, Eleventh Revision (ICD-11) reveals enhancements to these specific code sets.4 But we don’t have to sit around and wait. Many healthcare organizations will turn to unique data assets and analytic strategies such as NLP to understand holistic patient risk, preparing them to better serve patients and the healthcare ecosystem.


1. Centers for Medicare and Medicaid Services. “Z Codes Utilization among Medicare Fee-for-Service (FFS) Beneficiaries in 2017.” Data Highlight no. 18, January 2020.

2. Hastings, Nicole et al. “Does Lack of Social Support Lead to More ED Visits for Older Adults?” The American Journal of Emergency Medicine 26, no. 4 (May 2008): 454-461.

3. Sandoval, Elizabeth et al. “A Comparison of Frequent and Infrequent Visitors to an Urban Emergency Department.” The Journal of Emergency Medicine 38, no. 2 (February 2010): 115-121.

4. World Health Organization. “International Classification of Disease for Mortality and Morbidity Statistics – 11th Revision (Reference Guide Draft).” 2018.


Leigh McCormack is the CEO of Base Camp Health, LLC. Nathan Albers is an analyst and developer at Base Camp Health, LLC. Jeremiah Lowhorn is a senior data scientist at Data Tapestry. Michael Stearns is the founder and CEO of Apollo HIT, LLC.