Health Data

The Machine Learning Conundrum in the Care Delivery Ecosystem

The healthcare industry has made significant progress over the past few years in embracing machine learning and data science throughout its ecosystem. However, an interesting conundrum is now surfacing around the use of data science and machine learning in the care delivery setting. The conundrum itself surprises no one, but the difficulty of resolving it, or even of working alongside it, is only beginning to sink in.

When a healthcare provider sees a score or metric calculated using machine learning (e.g., the probability that a patient will be readmitted within the next 30 days) listed in an electronic medical record system, their instinct is to factor that score into their decision-making and treat it like any other information they have on the patient. While the standard advice is that medical decisions should not be made solely on the basis of a machine learning-based score, and that the score should only augment clinical decision-making, providers are still left with two questions:

  1. Should they give the machine learning-based score the same weight as other information they have on the patient (i.e., believe in the “trueness” of that score 100 percent)? If not, how much, or how little, should they trust the score? How do they incorporate that fuzziness into their decision?
  2. Should the score be considered equally true for all patients? Are there patient segments for which the score is more accurate than it is for others? What about minority groups? Can the score be considered sufficiently accurate for them as well?

Machine learning models are probabilistic by design, and while they aim for increasing accuracy with every training/retraining cycle, 100 percent accuracy is not a realistic goal in any situation. What happens when the statistic the model predicted is inaccurate, but it is the only available measure of that medical condition? Consider, for example, a value extracted from a lab report using optical character recognition (OCR) technology. What if the wrong value gets extracted? Should we always recommend that the physician consult the source document as well? Does this not defeat the purpose of using machine learning? These are pertinent questions that need to be addressed.
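
One middle ground for the OCR scenario is to surface the extraction engine’s own confidence and route only low-confidence values back to the source document, rather than asking physicians to re-check everything. The sketch below is a minimal, hypothetical illustration of that idea; the threshold, field names, and routing labels are assumptions, not part of any specific product.

```python
# Minimal sketch: triage OCR-extracted lab values by engine confidence.
# Threshold and field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ExtractedValue:
    field: str          # e.g., "hemoglobin_a1c"
    value: str          # raw text the OCR engine produced
    confidence: float   # engine-reported confidence, 0.0 to 1.0

REVIEW_THRESHOLD = 0.95  # assumed cutoff; tune against audit data

def triage(extraction: ExtractedValue) -> str:
    """Route an OCR-extracted lab value based on confidence."""
    if extraction.confidence >= REVIEW_THRESHOLD:
        return "auto-accept"          # displayed in the EMR as-is
    return "flag-for-source-review"   # physician is prompted to check the lab report

print(triage(ExtractedValue("hemoglobin_a1c", "6.8", 0.99)))  # auto-accept
print(triage(ExtractedValue("hemoglobin_a1c", "0.8", 0.62)))  # flag-for-source-review
```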

Based on experiential learning, we now know that there are at least three tangible steps that data science teams and healthcare providers can take when it comes to using machine learning/data science in the care delivery ecosystem:

1. Look at the Exclusion List

When the data science team hands over the model for use, there should be an accompanying report that outlines the distribution of the population on which the model was trained. The model should be used to make inferences only for populations that match that training distribution. This might limit the scope of model usage to begin with, but it is the right thing to do. With time, as more data is collected for a broader population, the model can be expanded to cover other population segments.
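
A minimal sketch of what such an exclusion check might look like, assuming the handover report documents the training population as simple ranges and categories (the variable names and cutoffs below are illustrative, not from any actual model):

```python
# Minimal sketch: exclude patients who fall outside the documented
# training population instead of showing them an unreliable score.
# All values below are hypothetical examples of a handover report.

TRAINING_POPULATION = {
    "age_range": (40, 85),                  # years, per the handover report
    "sexes": {"female", "male"},
    "payer_types": {"medicare", "commercial"},
}

def in_scope(patient: dict) -> bool:
    """Return True only if the patient matches the model's documented training population."""
    lo, hi = TRAINING_POPULATION["age_range"]
    return (
        lo <= patient["age"] <= hi
        and patient["sex"] in TRAINING_POPULATION["sexes"]
        and patient["payer_type"] in TRAINING_POPULATION["payer_types"]
    )

patient = {"age": 34, "sex": "female", "payer_type": "medicaid"}
if in_scope(patient):
    print("score with model")
else:
    print("excluded: outside training population; do not display a score")
```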

2. Pass Back the Feedback and Help (Re)Train the Model

Machine learning models are probabilistic; the data they learn from is not always reflective of how healthcare decisions are actually made; and healthcare is a complex field. Given these facts, it is easier to appreciate that the first iteration of a machine learning model, once put in production, will spring surprises with regard to both usability and accuracy. It is critical for healthcare providers to pass on that feedback and let the model be recalibrated/retrained on it, not once or twice, but on an ongoing basis. The buffer to let the model evolve over time has to be accounted for and provisioned. Crawl-walk-run, cliché though it is, accurately describes how training and using machine learning models should happen.
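
One lightweight way to make that feedback loop concrete is to log clinician agreement or disagreement with each score and feed the log into scheduled retraining. The sketch below assumes a simple append-only CSV log; the schema, file name, and retraining cadence are illustrative assumptions, not a prescribed architecture.

```python
# Minimal sketch: capture clinician feedback on model scores so a
# scheduled retraining job can consume it. Schema is hypothetical.
import csv
from datetime import datetime, timezone

FEEDBACK_LOG = "readmission_model_feedback.csv"

def record_feedback(patient_id: str, model_score: float,
                    clinician_agrees: bool, note: str = "") -> None:
    """Append one clinician judgment about a model score to the feedback log."""
    with open(FEEDBACK_LOG, "a", newline="") as f:
        csv.writer(f).writerow([
            datetime.now(timezone.utc).isoformat(),
            patient_id, model_score, clinician_agrees, note,
        ])

# A periodic retraining job would fold this log back in: disagreements
# become corrective signals, and agreement rates are tracked release
# over release so the model is recalibrated on an ongoing basis.
record_feedback("pt-001", 0.82, False, "score high; planned discharge to skilled nursing")
```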

3. View the Machine Learning-Based Score with Skepticism, and Spot-Check for Accuracy

When it comes to care delivery, no one knows the patient better than the physician. That fact remains. Hence, when physicians make decisions incorporating machine learning-driven statistics, they should maintain a healthy skepticism about those scores. That skepticism should then be passed back to the model as feedback so that, over time, the model can be made more accurate and the skepticism allayed. Spot-checking a sample of patient records and validating the score values against physicians’ clinical knowledge, expertise, and patient-specific diagnoses can also go a long way toward ensuring that we do not let machine learning go astray.
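
A spot-check can be as simple as sampling scored records, recording the physician’s independent judgment, and tracking agreement, broken out by patient segment so that the earlier question about minority groups and other subpopulations is answered with data rather than assumption. The records, threshold, and segment labels below are hypothetical.

```python
# Minimal sketch: sample scored charts, compare the model's call with
# the physician's independent judgment, and report agreement per
# patient segment. All data below is hypothetical.
import random
from collections import defaultdict

records = [
    # (patient_segment, model_score, physician_says_high_risk)
    ("65+", 0.81, True), ("65+", 0.20, False), ("under-40", 0.70, False),
    ("under-40", 0.15, False), ("65+", 0.55, True), ("under-40", 0.90, True),
]

SAMPLE_SIZE = 4
THRESHOLD = 0.5  # assumed cutoff mapping a score to "high risk"

sample = random.sample(records, SAMPLE_SIZE)
agreement = defaultdict(lambda: [0, 0])  # segment -> [agreements, total]
for segment, score, clinician_call in sample:
    agreement[segment][0] += int((score >= THRESHOLD) == clinician_call)
    agreement[segment][1] += 1

for segment, (agree, total) in agreement.items():
    print(f"{segment}: model/physician agreement {agree}/{total}")
```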

The Transformative Potential of Technology

There is no denying that these new-age digital technologies (including data science and machine learning) can be exasperatingly ambiguous. The solution lies not in turning a blind eye to these technologies but in finding innovative ways to make decisions despite those ambiguities, with an understanding of their potential and value. The need of the hour is to recognize that the transformative potential of these technologies outweighs the challenges they pose for decision-making. In healthcare, though, and especially in care delivery, this road needs to be trodden with extra caution.

Disclaimer: The views expressed in this article are the author’s own and do not represent the views of the author’s employer or of any other entity with which the author is associated.


Kapila Monga (kapila.monga@gmail.com) is an artificial intelligence (AI) and machine learning (ML) professional with more than 15 years of experience designing AI/ML solutions for healthcare and life sciences clients across North America. She currently serves as head of data science at Bon Secours Mercy Health.