Ready or not, AI-based decision support tools are entering oncology clinics

Urologic oncologist Wayne Brisbane thought his patient might be a good candidate for focal therapy. 

The minimally invasive treatment for prostate cancer delivers a cell-destroying form of energy, such as freezing gas or focused ultrasound, directly to the tumor. It’s ideal for patients whose cancer remains contained within the prostate but does not lie near the urinary sphincter or urethra. 

But Brisbane needed to further probe the cancer’s characteristics before recommending a treatment. So, he turned to Unfold AI, a clinical decision support tool that uses artificial intelligence to combine a patient’s biopsy, MRI, and prostate-specific antigen test data to produce a 3D tumor map. 

“It looks like a heat map,” Brisbane, assistant professor of urology at UCLA Health, said to The Cancer Letter. “You can say, ‘Here’s the cancer. This is the extent.’ And then you can use that fairly simplistic model to make important decisions.” 

Unfold AI was developed by medical technology company Avenda Health in conjunction with the University of California, Los Angeles. It was cleared by FDA in 2022. 

On further examination, Brisbane’s patient turned out to be less likely to benefit from the initially projected treatment. 

“Once I saw that map, I was like, ‘Oh, wow, this doesn’t look like focal therapy is going to work,’” Brisbane said. Much of the patient’s prostate appeared cancerous—a far too extensive volume for focal therapy. 

That’s usually a difficult conversation to have with a patient, Brisbane said. “But when I showed the patient this map, it made a lot of sense to him.” 

Instead, Brisbane surgically removed the tumor, using the 3D map generated by Unfold AI as a guide, which helped him preserve as much healthy tissue as possible. 

“We were able to spare his nerves really aggressively, because I knew the exact location where I was predicting the cancer coming out of the prostate. And so, we were able to go wide in that single location and get a negative margin, but also preserve his erectile function pretty well,” Brisbane said. 

AI-based medical tools are being developed at a faster rate than ever before. FDA has authorized 950 artificial intelligence- and machine learning-enabled medical devices as of Aug. 7. That’s nearly twice as many as just two years earlier; on Oct. 5, 2022, the number of FDA-authorized AI/ML-enabled medical devices stood at 521.

That doesn’t mean that hundreds of AI-based decision support tools have entered clinics. In fact, most of those devices lack a code for reimbursement from health insurers, despite FDA authorization. 

While Brisbane’s patient was not charged for the use of Unfold AI because it was provided in the context of clinical research, UCLA Health normally charges $3,984 for this procedure. As of July, Unfold AI-related services have a reimbursement code that insurers can use to pay for the tool’s use. That same month, the Centers for Medicare and Medicaid Services assigned a national payment rate of around $1,000 for such services. 

“There’s only a handful of AI technologies that do have a national payment rate,” said Brit Berry-Pusey, chief operating officer and co-founder of Avenda Health, who was “very happily surprised” at CMS’s decision on Unfold AI. 

These rapidly emerging technologies raise questions about defining meaningful benefits and developing criteria for regulating such tools. However, murky processes, technical and financial barriers, and an onslaught of tools that may or may not be helpful mean a long road lies ahead before AI-based medical devices become widespread in oncology clinics.

“The vast majority of oncologists have never touched an AI-based device and probably won’t for a number of years,” Ravi Parikh, associate professor in the Department of Hematology and Medical Oncology at Emory University School of Medicine and medical director of Winship Data and Technology Applications Shared Resource at Winship Cancer Institute of Emory University, said to The Cancer Letter.

FDA and AI 

AI isn’t a new concept for FDA. The agency first authorized an AI/ML-enabled medical device in 1995. 

The device, called the PAPNET Testing System, was a “semi-automated test indicated to aid in the rescreening of cervical Papanicolaou (Pap) smears previously reported as negative.” The PAPNET scanner and computers used neural networks to analyze smears in search of potentially cancerous cells, which would then be displayed on a screen for review by a cytologist, according to a 1994 Cancer Letters paper by Laurie J. Mango, of Neuromedical Systems Inc., which developed PAPNET. 

However, the testing system came with enormous costs, and many insurers refused to pay for it. “This just duplicates the efforts of a human being,” a Cigna HealthCare spokesperson said to the Associated Press in 1996.

PAPNET was withdrawn from the market in 2009. 

The same problems plague many of today’s AI-based clinical decision support tools, said Sean Khozin, CEO of the CEO Roundtable on Cancer, founder of consulting firm PhyusionBio, and research affiliate at the MIT Laboratory for Financial Engineering. 

“A lot of startups are developing faster horses versus bringing truly novel solutions to the market. Faster horses are important, but they’re not going to be transformative,” he said to The Cancer Letter. “I think the transformative power of AI will come from developing de novo solutions, and those are in the minority.” 

There are three FDA authorization pathways for AI/ML-enabled medical devices: premarket approval, 510(k) clearance, and de novo classification. 

PMA is the most stringent type of device marketing application required by FDA. Approval in this category is based on the agency’s determination that the PMA contains sufficient valid scientific evidence showing that the device is safe and effective for its intended use.

PMA is given to devices that “support or sustain human life, are of substantial importance in preventing impairment of human health, or which present a potential, unreasonable risk of illness or injury,” according to FDA. PMA is necessary for such devices because of the level of risk associated with them, and because “general and special controls alone are insufficient to assure the safety and effectiveness.” 

Only four of the 950 devices authorized by FDA—including the defunct PAPNET Testing System—have gone through this pathway. 

But “the approval process for an AI-based device is different from a drug, and we should think about it differently than a drug,” Parikh said.

Continued Parikh: 

An FDA approval of a drug is a culmination of years and years of preclinical through phase III testing, and really is the final guardrail towards making these things available for patients. But devices are held to a much lower standard, and particularly AI devices.

For medical devices in general, the FDA approaches them with a risk-based framework. So, depending on how risky, how consequential the decision that’s being informed by the device, the FDA has different standards of evidence. 

At the low end, you can just have statistical metrics—not even patient-specific testing—that justify clearance, versus at the high end, when you’re actually intending to inform a clinical decision, there may be a requirement for some prospective testing. But it’s not nearly to the extent, even in that situation, that a typical drug would be—there’s not really a requirement for multi-institution testing in a phase III randomized controlled study.

Clearance through the 510(k) pathway—the “low end,” as Parikh put it—is how FDA has authorized the vast majority of AI/ML-enabled medical devices, including Unfold AI. Those 924 devices have been cleared on the basis that they are “substantially equivalent to a legally marketed device,” commonly known as a “predicate,” according to FDA.

Most devices cleared via the 510(k) pathway are simply the “faster horses” Khozin alluded to, which he said primarily speed up existing processes. “These are essentially sort of meaningful, but very simple ways of using AI,” Khozin said. 

The third authorization pathway, de novo classification, lies in between. It “provides a marketing pathway to classify novel medical devices for which general controls alone, or general and special controls, provide reasonable assurance of safety and effectiveness for the intended use, but for which there is no legally marketed predicate device,” according to FDA.

Twenty-two of the 950 FDA-authorized AI/ML-enabled medical devices have received de novo classification. 

FDA has been attempting to keep pace with technological advances spurred by the AI boom. 

According to an FDA spokesperson:

The traditional paradigm of medical device regulation was not designed for adaptive AI/ML technologies, which have the potential to adapt and optimize device performance in real-time to continuously improve health care for patients. The highly iterative, autonomous, and adaptive nature of these tools requires a new, total product lifecycle regulatory approach that facilitates a rapid cycle of product improvement and allows these devices to continually improve while providing effective safeguards.

We seek to foster a collaborative approach and alignment within the healthcare ecosystem around AI in health care. There are several ways to achieve this. 

First, agreeing on and adopting standards and best practices at the health care sector level for the AI development lifecycle, as well as risk management frameworks, can help address risks associated with the various phases of an AI model. This includes, for instance, approaches to ensure that data suitability, collection, and quality match the intent and risk profile of the AI model that is being trained. This could significantly reduce the risks of these models and support their providing appropriate, accurate, and beneficial recommendations.

Top of mind for device safety is quality assurance applied across the lifecycle of a model’s development and use in health care. Continuous performance monitoring before, during, and after deployment is one way to accomplish this, as well as by identifying data quality and performance issues before the model’s performance becomes unsatisfactory.

The agency has been working with the International Medical Device Regulators Forum to create documents related to a variety of topics affecting medical devices to keep up with evolving technologies, according to the FDA spokesperson. 

The agency has also taken a number of its own actions when considering regulation of AI/ML-enabled medical devices, the spokesperson said.

Additionally, President Joe Biden ordered federal agencies and technology companies to develop guardrails for AI through an executive order, signed in October 2023 (The Cancer Letter, Nov. 17, 2023). It will likely vanish soon after President-elect Donald Trump takes office in January. 

“We will repeal Joe Biden’s dangerous executive order that hinders AI innovation, and imposes radical leftwing ideas on the development of this technology,” states the 2024 GOP Platform. “In its place, Republicans support AI development rooted in free speech and human flourishing.” 

Even if a device receives a green light from FDA, success isn’t guaranteed for that product or the company behind it.

“We’ve seen a number of digital health AI companies that have imploded, even after getting clearance from the FDA, because payers haven’t paid for their solutions, and guidelines haven’t found any convincing evidence that they should incorporate their solutions,” Khozin said. 

For instance, digital therapeutics developer Pear Therapeutics created a mobile app to aid treatment for substance use disorder, which, in 2017, became the first-ever digital therapeutic to be cleared by FDA. The company’s 10-year run produced numerous apps to help treat conditions including insomnia, PTSD, and chronic pain. 

In 2021, Pear Therapeutics went public through a special purpose acquisition company deal that valued it at about $1.6 billion. Two years later, when the company filed for bankruptcy, its assets were sold for around $6 million. 

“It went from a unicorn to, unfortunately, having a valuation of zero,” Khozin said. 

On the day Pear Therapeutics announced its bankruptcy, the company’s CEO, Corey McCann, cited insurers’ refusal to pay for digital therapeutics and challenging market conditions as the main reasons for the demise of his and other digital therapeutics companies. 

Who is paying? 

To get insurers to pay for a new technology, its services need a Current Procedural Terminology, or CPT, code. 

This uniform language for communicating medical services and procedures helps healthcare providers generate bills for assistance provided to patients during medical encounters. Insurance companies receive that coded information to determine reimbursement. 

There are more than 11,000 CPT codes. Few of them cover procedures assisted by AI. 

While there is no single database of AI-related CPT codes, authors of a 2023 paper in NEJM AI found only 32 unique CPT codes associated with AI among the 521 AI/ML-enabled medical devices publicly known to have been authorized by FDA at the time. The researchers examined 11 billion claims processed from Jan. 1, 2018 to June 1, 2023. 

Those 32 codes accounted for only 16 medical AI procedures, since some of the procedures could be reimbursed through multiple codes. 
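
To see the kind of bookkeeping that grouping requires, here is a minimal sketch of tallying claims per AI procedure when several CPT codes bill for the same service. The codes, mapping, and claims records below are hypothetical placeholders, not the paper’s actual data:

```python
from collections import defaultdict

# Hypothetical code-to-procedure mapping: several CPT codes can bill for the
# same AI procedure, so claims must be grouped by procedure, not by code.
CODE_TO_PROCEDURE = {
    "0001X": "coronary artery AI analysis",  # illustrative placeholder codes,
    "0002X": "coronary artery AI analysis",  # not the paper's actual mapping
    "0003X": "prostate cancer mapping",
}

def count_ai_claims(claims):
    """Tally claim volume per AI procedure from (claim_id, cpt_code) records."""
    counts = defaultdict(int)
    for _claim_id, cpt_code in claims:
        procedure = CODE_TO_PROCEDURE.get(cpt_code)
        if procedure is not None:  # skip the vast majority of non-AI codes
            counts[procedure] += 1
    return dict(counts)

# Two different codes resolve to the same coronary procedure; "99213" is not AI.
claims = [("c1", "0001X"), ("c2", "0002X"), ("c3", "99213")]
print(count_ai_claims(claims))  # {'coronary artery AI analysis': 2}
```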

“FDA clearance is rare,” Parikh said. “Getting a reimbursement code is even rarer.” 

Before granting a CPT code, the CPT Editorial Panel, which currently includes 20 health experts across medical disciplines, assesses whether an applicant procedure meets specific criteria, such as being consistent with current medical practice and showing clinical efficacy in peer-reviewed literature. The panel, which is appointed by the American Medical Association Board of Trustees, meets three times per year to update the CPT code set. 

The panel began to see a huge influx of CPT code applications for AI-related services around three or four years ago, said Barbara Levy, vice chair of the CPT Editorial Panel and chief medical officer of Visana Health Inc. Code criteria do not differ for procedures with an AI component, but the panel has made a few tweaks, she said. 

“We were getting applications that were squishy; there wasn’t a clear delineation of what exactly the algorithm was doing in the background,” Levy said to The Cancer Letter. “We’ve added questions to the code change application to ensure that we understand what the background was for development of the AI, and we understand clearly whether the AI is assistive, augmentative, or autonomous.

“We want to be very clear: What is the physician interpreting? What is the algorithm doing in the background that enhances our understanding of the disease condition or whatever’s going on with the patient? So, adding those questions just makes sure that we are clear on what the AI is doing, what the algorithm is doing in the background,” she said.

The new questions, which were added for new and revised CPT code applications reviewed at the most recent editorial panel meeting in September, include:

  1. Consistent with the Artificial Intelligence concepts addressed in Appendix S, is this a request for a service or procedure that relies on output from software which has performed more than data processing (data processing includes helping to aggregate, organize/arrange, transmit, develop, or otherwise visually enhance the data)?

  2. Can you provide a description of data, from input to output (PDF or Word file), such as a flow chart, or step-by-step narrative noting the following. If you respond no, please explain. 

    • The point(s) where the software acts, and the nature of the action.

    • Quality check of input (separately from data acquisition, preparation, & transmission), if any

    • Specify reporting mechanism and content of the output. 

  3. Does the proposed NEW or REVISED code describe the software output in relationship to the work performed by the physician or other QHP as assistive, augmentative, or autonomous? (Note for Augmentative or Autonomous, clinical meaningfulness may be established in part by evidence linking the output from this device to evidence for a service or procedure in current clinical use. Refer to Appendix S for criteria.)

    1. Assistive

    2. Augmentative

    3. Autonomous 

  4. For each NEW and/or REVISED code, provide documentation of FDA Summary of Safety and Effectiveness (SSED or DEN Summary) and the FDA approval letter. If not approved by the FDA, please provide other relevant documentation of the proposed indication for use (IFU), e.g. De Novo Classification Order (DENCO), Breakthrough Designation Determination (BDD), etc. These will be used to determine the relevance of any published evidence.

  5. For each NEW and/or REVISED code, if not yet approved by the FDA, please indicate current regulatory status. Indicate N/A if not applicable.

  6. Please describe efforts to ensure broadest generalizability of this software, e.g. curation of training databases, plans for surveillance of real world data, etc, (give references). (Note: Your answer to this question should be reflected also in the “Typical Patient” section.)

  7. Characterize the potential for perpetuation, propagation, or mitigation of social bias, which might reasonably be anticipated (give references). 

  8. Please identify and summarize (with clinicaltrials.gov references) any ongoing clinical trials related to the service(s) or procedure(s). Indicate N/A if not applicable.

Many of the AI-related procedures that have made it into the CPT code set are young and sparsely used by health systems, Levy said. These procedures have received category III codes, temporary tracking codes used to collect data and assess emerging technologies. 

Unfold AI, the prostate cancer mapping tool, was granted a category III code last November, and it went into effect July 1, 2024. Prior to that, the technology was paid for via clinical research or charged to patients, said Berry-Pusey, of Avenda Health. 

To advance from a temporary code to a permanent category I code, a procedure typically needs high-quality peer-reviewed literature comparing the service in question to the standard of care, support from a specialty society, and widespread use relative to the addressed health condition’s prevalence, Levy said. 

Getting a code—either category I or category III—doesn’t guarantee the procedure will be covered by Medicare, Medicaid, or commercial payers. 

“It definitely feels very ad hoc as to which devices get paid for versus not. I think even Medicare would be hard pressed to tell you that it’s only those that have evidence bases that are the ones getting paid for,” Parikh said. 

Unfold AI’s CMS national payment rate of $996.18 was assigned for the second half of 2024 as a standard clinical ambulatory payment classification, rather than a temporary one. An updated rate of $1,017.39 will go into effect in 2025. 

“That was very, very happy for us, and we’re grateful that Medicare saw the benefit of our technology,” Berry-Pusey said. “CMS is a little bit more opaque than FDA, so it’s a little harder to know what their expectations are, or what they need.” 

While Medicare has been processing most claims for procedures with Unfold AI, there currently isn’t enough data to assess whether private insurers are covering the costs, Berry-Pusey said. “We’re still in the early stages of seeing these claims go through.” 

Health system uptake 

Even if a procedure is granted a reimbursement code, which can provide a financial incentive to clinics, health systems must decide whether they want to—and are able to—implement the technology in question. 

“Usually, health systems and clinicians are making the ultimate decision about whether they take up a device or tool. And I would argue that many of them aren’t really paying attention to FDA clearance or not, with the exception of a few discrete clinical areas,” Parikh said. 

According to the NEJM AI paper, only four of the 16 medical AI procedures with CPT codes had more than 1,000 total claims from Jan. 1, 2018 to June 1, 2023. Nearly three-quarters—or 67,306—of the roughly 92,000 AI-related claims found in the study were associated with procedures for coronary artery disease. These procedures involved products such as HeartFlow FFRCT, which creates a 3D model of coronary arteries from CT scans. 

However, those coronary artery disease procedures had CPT codes that were three years older than the other 15 procedures, which became effective in 2021 or later. 

Unfold AI is finding its own pockets of use. At the time of this writing, the tool is available in five health practices: UCLA Health, University of Alabama at Birmingham Medicine Urology, Scionti Prostate Center in Sarasota, FL, Kasraeian Urology in Jacksonville, FL, and Urology Associates in Cumberland, MD. 

“I hope that every patient who comes in and has one of these targeted biopsies will be eligible and will get an Unfold AI map so that they can review with a urologist and then discuss their options,” said Alan Priester, senior data scientist at Avenda Health who helped develop Unfold AI. “My goal is to give physicians the most accurate and the clearest picture of where that tumor is and is not likely to be, which is very powerful information to act upon in terms of making care more effective and safer.” 

Technical issues can thwart the use of innovations like Unfold AI. 

Healthcare providers in radiology departments in the Netherlands enumerated such hindrances in a survey published in 2023 in European Radiology. Respondents reported that IT and integration issues were the second most common obstacle, behind costs, to getting AI tools into clinical practice. 

“You think about something like a pathology-based decision support tool, like the prostate cancer one, for example. Well, that requires a hospital, one, to be collecting all of its pathology and largely to be digitizing it as well,” Parikh said. 

“Many of these startups market themselves as being able to digitize your X-rays and radiology and digitize your pathology and the like, but that is often a pretty costly endeavor. And for a regular mom-and-pop hospital that’s never digitized any of its pathology, it has to do some of that to be able to make this sustainable, even if there is a diagnosis code,” Parikh said. 

The University of Texas MD Anderson Cancer Center recognized the need to streamline its data, and named Caroline Chung its first chief data and analytics officer in 2021. 

“Up until this point, we’ve been generating data for human consumption, because the only entities that were consuming data up until this point have been humans,” said Chung, vice president, director of data science development and implementation at the Institute for Data Science in Oncology, and professor of radiation oncology and neuroradiology at MD Anderson. “Now, all of a sudden, we’re trying to have some other entity that doesn’t have that same intuition—that just takes everything at face value, and we have not been generating data for machine consumption.” 

To future-proof its data, MD Anderson has begun to capture and flow high-quality data accompanied by contextual description, Chung said, because “10 years ago, we weren’t thinking about how we could potentially be using it, and so not all of that data may be as useful as it could be.”

Still, the art of medicine contains nuances that aren’t—or can’t be—captured in data that will be fed into a machine, Chung said. Information exchanged during a conversation between physicians or a phone call with a patient’s caregiver, for instance, might not be entered into an electronic health record. 

“So, now you’re training an algorithm with a black box, and the black box is not the AI—it’s what we’ve actually not put into the EHR,” Chung said. 

Levy shares these concerns. 

“What if I, as the clinician, have already filtered out what you as a patient told me about your condition?” Levy said. “I’ve already categorized it in a way before I wrote it down. And I may have dismissed a lot of what you told me because it didn’t fit into my framework for what I think is going on with you. 

“If we mine all of that for data and analyze it, we’re missing big chunks of what patients have told us, but we didn’t write down,” she said. 

AI-based clinical tools—and the health professionals interpreting their outputs—can also be afflicted by a cluster of biases. 

Besides their potential to magnify biases against minoritized groups, Chung said, there could be recency bias, in which a recent event influences a person’s decisions; measurement bias, in which an AI tool might be trained on data of different quality than what is available in a hospital; and automation bias, in which people put too much trust in an AI’s output.

“A lot of the models have been developed to present the data in the most certain way,” Chung said. “I know that sells, but I think that especially when it comes to clinical decision support, presenting that uncertainty would actually be really helpful.” 

Physicians must also be mindful to use an AI tool within the specific population that the tool was designed for and tested on, Levy said. 

“I think some of these tools can be used to reduce the cognitive load on providers and clinicians, as long as it doesn’t stop us from thinking, as long as we don’t end up relying on them,” she said. “It’s like the people who use GPS and drive into a lake.” 

Health professionals must also be ready to receive and use said tools prior to their implementation, said Chung. “Is it going to fit into their workflow and it makes sense, or is it an additional add-on that’s actually going to disrupt their workflow?” 

Even before that, health systems must consider the value proposition of a new technology, Chung said. “So, (a) what is your baseline bar, and can you actually raise that bar? And, (b) are you going to take action based on the output of the tool? Because if your answer is no to both, why are you spending time evaluating the tool?” 

But a yes to those questions, she said, means asking would-be users: Is this tool actually useful? Would you implement it? Is that realistic? 

“If all those are yes, then we evaluate the performance of the tools,” Chung said. “But you’d be surprised at how many tools come across that don’t actually meet any of those check marks. There are a lot of tools being built out there because data was available, and so it was a hammer looking for a nail.” 

AI’s future in oncology 

Oncology is just starting to appreciate AI’s potential to revolutionize the field. 

That said, “we haven’t reached a point where we’re bringing transformative solutions to market,” Khozin said. “Part of that is because of the lack of general understanding of the needs of the biomedical ecosystem or the world of oncology, in this case, clinicians. And then also, the fact that clinicians also may not necessarily know what AI can or cannot solve.” 

The Institute for Data Science in Oncology at MD Anderson, which was launched last year, seeks to change that, Chung said. “One of our biggest goals is, ‘How do we take the data science tools that are both emerging as well as already developed, but drive them to impact cancer?’ That really requires that crosstalk between the clinicians and basic scientists who have this data, and bringing together the conversations with data scientists.” 

Another issue is approaching AI in the wrong way, Khozin said. “We have to look at developing AI not like developing software, which is how we look at all these devices, but as a startup that’s developing an AI solution.” 

AI companies should be thought of as biotech companies, he said. That’s because their solutions will require years of research and development followed by clinical trials. 

Khozin believes this framework will bring about novel AI solutions instead of replicating what humans are already doing, such as matching current gold standards in medicine. 

“I’m much more interested in bringing new standards and challenging existing gold standards,” Khozin said. “None of these gold standards are set in stone—they’re just based on our current level of understanding. So, once we start to challenge the existing gold standards, I think that’s where we bring true transformation.” 

Chung also believes that AI developers need to shift their mindsets. 

“What we’ve been asking these tools to do is try to replace humans, as opposed to asking, ‘How can they actually augment humans?’” she said. “Asking a different question may actually change the way we actually leverage these tools to the best of our ability.”

Unfold AI, for instance, can boost the accuracy of clinicians’ assessments of prostate tumor contours, according to a paper published in July in The Journal of Urology.

Manually determining tumor outlines with MRI data, which is the standard of care, led to 67.2% balanced accuracy (mean of voxel-wise sensitivity and specificity) among 10 clinicians. Using Unfold AI to generate a map led to 84.7% balanced accuracy among those same doctors. Further, the negative margin rate jumped from 1.6% to 72.8%—a 45-fold increase—with the help of the AI tool. 
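
For readers curious how that headline metric works, here is a minimal sketch of computing voxel-wise balanced accuracy from a predicted contour and a reference contour. The masks, shapes, and noise model are hypothetical, and this is not Avenda Health’s implementation:

```python
import numpy as np

def balanced_accuracy(pred, truth):
    """Mean of voxel-wise sensitivity and specificity for binary 3D masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    sensitivity = (pred & truth).sum() / truth.sum()        # tumor voxels caught
    specificity = (~pred & ~truth).sum() / (~truth).sum()   # healthy voxels spared
    return (sensitivity + specificity) / 2

# Hypothetical 3D masks: True marks a voxel labeled cancerous.
rng = np.random.default_rng(0)
truth = rng.random((64, 64, 32)) > 0.9            # ~10% tumor voxels
pred = truth ^ (rng.random(truth.shape) > 0.98)   # reference plus ~2% label noise
print(f"Balanced accuracy: {balanced_accuracy(pred, truth):.1%}")
```

Because benign voxels vastly outnumber cancerous ones, raw voxel accuracy would reward simply under-drawing the tumor; averaging sensitivity and specificity penalizes both missed tumor and sacrificed healthy tissue.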

Unfold AI can even perform functions beyond those for which it was designed. A study published in August in BJUI Compass found that Unfold AI outperformed five conventional predictors of extracapsular extension risk, a condition in which the tumor grows outside the prostate and a sign that focal therapy should not be used. 

“It seems to be better than MRI at predicting extracapsular extension and better than some other nomograms, even though it was tuned for something different,” said UCLA Health’s Brisbane, who was a co-author of the BJUI Compass paper. “Because it’s describing a biological principle, or biological feature, which is tumor volume, it does seem to be accurate in other scenarios, even when it was not designed to give us information. 

“I do think that that’s an important call to arms for all the development of AI,” he said.
