
Are LLMs Coming for Coding? Yes, and Medical Coders Should Prepare

Friday, June 28, 2024

First, it was robotic process automation. Then machine learning and natural language processing. These technologies continue to shape the medical coding profession in unprecedented ways, increasingly tasking coders with novel responsibilities like auditing and validating codes rather than assigning them directly.

More recently, generative artificial intelligence (AI), specifically the application of large language models (LLMs), has emerged as the next wave of technology that some experts hope will improve coding productivity and accuracy at a time when coding rules are more complex than ever and medical coder shortages continue to plague the industry.

However, there’s a big caveat: Several proprietary and open-source LLMs in their current form aren’t exactly ready for prime time.

Opportunities, Limitations of LLMs in Medical Coding

Researchers evaluated whether and how often four LLMs (i.e., GPT-3.5, GPT-4, Gemini Pro, and Llama2-70b Chat) assigned correct CPT, ICD-9-CM, and ICD-10-CM codes. Their findings, published in an NEJM AI article titled “Large Language Models Are Poor Medical Coders—Benchmarking of Medical Code Querying,” indicate that there’s obvious room for improvement. More specifically, researchers revealed a maximum accuracy of 50 percent, and they found that LLMs often generate codes conveying imprecise or fabricated information.

“This is not surprising when we consider how these models were trained and built,” says Ali Soroush, MD, MS, lead author of the study and assistant professor of medicine in the division of digital and data-driven medicine at the Icahn School of Medicine at Mount Sinai in New York City. “They’re general-purpose tools trained on general-purpose text. Our point was to say that out of the box, you shouldn’t be using LLMs for coding. You need to use something that’s tailor made for healthcare.”

One of the challenges with ICD-10-CM in particular is that it’s so nuanced, says Soroush. “There are a lot of combinations of letters and numbers that don’t have any code associated with them, so there is a higher propensity to generate nonexistent codes,” he adds.
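Soroush's point about nonexistent codes can be illustrated with a minimal Python sketch. It assumes a hypothetical `check_code` helper and a three-entry sample set standing in for the official ICD-10-CM release; neither comes from the study. The regular expression only checks the general shape of a code, so a string can look right and still not exist.

```python
import re

# General shape of an ICD-10-CM code: letter, digit, alphanumeric, then an
# optional decimal point followed by one to four alphanumeric characters.
ICD10CM_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")

# Hypothetical stand-in for the official code set. A real check would load the
# full current-year ICD-10-CM release instead of three sample entries.
SAMPLE_CODE_SET = {"E11.9", "I10", "J45.909"}


def check_code(code: str) -> str:
    """Classify a model-suggested code as 'valid', 'nonexistent', or 'malformed'."""
    normalized = code.strip().upper()
    if not ICD10CM_SHAPE.match(normalized):
        return "malformed"      # does not even look like an ICD-10-CM code
    if normalized not in SAMPLE_CODE_SET:
        return "nonexistent"    # plausible shape, but absent from the code set
    return "valid"


# Three hypothetical model outputs, classified against the sample set.
suggestions = ["E11.9", "e11.77", "11E.9"]
results = {s: check_code(s) for s in suggestions}
```

A lookup like this catches fabricated codes, but it cannot tell whether an existing code is the right one for the documentation. That judgment still falls to the coder.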

The same is true for evaluation and management CPT codes, Soroush says. LLMs performed poorly because these codes require a multi-step process that includes determining whether patients are new versus established, identifying time spent with the patient, and calculating medical decision-making.
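To show why a multi-step process is harder than a single lookup, here is a rough Python sketch of the three steps Soroush describes. Everything in it is an assumption made for illustration: the `Visit` fields, the time floors, the MDM-to-level mapping, and the rule of taking the higher of the two levels are placeholders, not published CPT guidance.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative minimum minutes per visit level, highest level first.
# Placeholder values; verify against current CPT guidance before any real use.
TIME_FLOORS = {
    "new": [(60, 5), (45, 4), (30, 3), (15, 2)],
    "established": [(40, 5), (30, 4), (20, 3), (10, 2)],
}

# Illustrative mapping from a medical decision-making (MDM) rating to a level.
MDM_LEVELS = {"straightforward": 2, "low": 3, "moderate": 4, "high": 5}


@dataclass
class Visit:
    patient_status: str  # step 1: "new" or "established"
    total_minutes: int   # step 2: time spent with the patient
    mdm: str             # step 3: MDM rating, a key of MDM_LEVELS


def level_from_time(visit: Visit) -> Optional[int]:
    """Return the highest level whose time floor the visit meets, if any."""
    for floor, level in TIME_FLOORS[visit.patient_status]:
        if visit.total_minutes >= floor:
            return level
    return None


def select_level(visit: Visit) -> int:
    """Combine the steps; assumes the higher of the time and MDM levels applies."""
    time_level = level_from_time(visit)
    mdm_level = MDM_LEVELS[visit.mdm]
    return max(mdm_level, time_level or 0)


level = select_level(Visit(patient_status="new", total_minutes=50, mdm="low"))
```

Each step depends on a different part of the record, so an error in any one of them changes the final level. That is one plausible reason a general-purpose model asked for a code in a single pass struggles here.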

However, with fine-tuning, the technology is capable of significantly higher coding accuracy rates, says Amy Raymond, senior vice president of revenue cycle operations and deployments at AKASA, a San Francisco, CA, generative AI company focused on the healthcare revenue cycle. Rather than relying only on publicly available data, the most successful LLMs are further trained on healthcare- and hospital-specific data.

“Every health system is different in terms of how its coders code,” she says. “There’s a lot of nuances in terms of patient mix. You can’t take generic data and apply it to the entire industry, and you can’t necessarily even take one hospital’s data and apply it across the industry.”

Soroush agrees. “It’s important to highlight the weaknesses of general-purpose models but to also understand that it’s because it’s a general-purpose model,” he says. “That’s ok because all it means is that you need to do a little extra work to create the version you need. Doing that is more feasible now than it ever was.”

How Coders, Coding Managers Can Prepare

As LLMs continue to develop, experts say coders can — and should — prepare for inevitable shifts in roles and responsibilities. They provide these strategies:

Move past the fear. “Instead of being fearful, coders need to embrace the changes and be curious,” says Tami Montroy, MS, RHIA, CCS, director of central fee abstraction at the University of Pennsylvania Health System in Philadelphia. “When we educate ourselves, we become less fearful of the unknown because it’s not unknown anymore.” She says this is more of a challenge with seasoned coders—not necessarily current health information (HI) students and new graduates. “Some of the HI students I talk to now are really interested in the technology,” she says. “They want to help build and fine-tune it, and I think that’s great.”

While many coders fear LLMs will take their jobs, Montroy says that won’t happen. “As long as the United States healthcare system is going to reimburse providers based on CPT and ICD-10-CM and procedural coding system (PCS) rules of coding, you’re going to need coders. There will be a shift in roles and responsibilities, but it will happen slowly over time.”

Raymond agrees. “This isn’t about taking away people’s jobs,” she says. “It’s about using the technology to empower teams. Coders will work side-by-side with this amazing technology to level up.”

Become a “well-rounded” medical coder. Having a variety of tools in one’s toolbelt will be increasingly important because it will enable coders to apply their skills in areas where the technology may need refinement. For example, if you have an inpatient credential, seek an outpatient one and vice versa, says Montroy. Or look beyond your current specialty to gain additional experience. “Cross-credentialing so you can perform facility and professional coding, that single path, will give you a huge advantage,” she adds.

Provide input and expertise early in the technology selection process. “Coders should be at the forefront of the decision to adopt AI in coding,” says Montroy, whose own facility is in the early stages of adopting computer-assisted coding and exploring the use of generative AI in other areas before taking the leap of applying the technology in its revenue cycle.

Likewise, coding managers and other healthcare leaders should start exploring LLMs now, says Raymond. “Start the investigation and research so you can be an informed buyer,” she says. “Approach vendors with skepticism and ask questions. You deserve answers.”

She provides these three questions to ask potential vendors:

How is the LLM updated over time?
On what data does the LLM train?
What is the user experience?

Looking Ahead

Medical coders should anticipate more widespread use of LLMs as the technology evolves and improves, says Soroush. “As the base models become more effective and efficient, everything downstream of that will be updated and iterated as well. Overcoming challenges will be easier because the base model can learn and adapt,” he adds. “You can use these as the foundation of your downstream tool for various purposes—summarizing notes, extracting data, or assigning codes.”

Raymond agrees. “If you think about it, LLMs have only been widely publicly accessible since the fall of 2022, so the technology is still in its infancy,” she says. “But the possibilities are endless and promising when you look at some of the newer developments.”

What’s most important for today’s medical coders to know about LLMs? “It’s coming,” says Soroush. “Figure out how you’re going to adapt. The people who figure out how to use it to make themselves more effective will be successful. If you’re the person who knows how to leverage them to code more effectively, you’ll be better off.”

Lisa A. Eramo, MA, is a freelance healthcare writer based in Cranston, RI.

Tags: coding, workforce development, artificial intelligence, ICD-10, ICD-10-CM, ICD-10-PCS, upskilling, large language model