AI has enormous potential in healthcare, where it could improve diagnoses and help find new, more effective drugs. However, as a recent piece in Scientific American discussed, the speed with which AI is entering the healthcare field also creates many new challenges and risks.
Over the course of the past five years, the US Food and Drug Administration has approved over 40 different AI products. However, as reported by Scientific American, none of the products cleared for sale in the US have had their performance evaluated in randomized controlled clinical trials. Many AI medical tools don’t even require approval by the FDA.
Eric Topol, the author of “Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again,” told Scientific American that many of the AI products which claim to be effective at tasks like diagnosing diseases have not actually been rigorously tested in such a fashion, with the first major randomized trial of an AI detection and diagnosis tool having been carried out only this past October. Furthermore, very few tech startups publish their research in peer-reviewed journals, where their work would be scrutinized by other scientists.
When properly tested and controlled, AI systems can be powerful tools that can help medical professionals detect otherwise unnoticed symptoms, improving health outcomes.
As an example, an AI tool for detecting diabetic eye disease was tested across hundreds of patients and appeared to be reliable. The company responsible for the test worked alongside the FDA for over eight years to refine the product. The test, IDx-DR, is making its way to primary care clinics, where it could help detect early signs of diabetic retinopathy and refer patients to eye specialists if suspicious signs are found.
If they are not tested carefully, AI systems that medical professionals use to guide diagnosis and treatment have the potential to cause harm rather than prevent it.
The Scientific American article details one potential problem with relying on AI to diagnose ailments, pointing to the example of an AI intended to analyze chest X-rays and detect which patients might develop pneumonia. While the system proved accurate when tested at the Mount Sinai Hospital in New York, it failed when tested on images taken at other hospitals. The researchers found that the AI was distinguishing between images created by portable X-ray systems and those created in a radiology department. Doctors use portable chest X-ray systems on patients who are often too sick to leave their beds, and these patients are at greater risk of developing pneumonia. In effect, the system appears to have picked up on how an image was taken rather than on signs of the disease itself, a shortcut that does not carry over to hospitals with different equipment and practices.
False alarms are also a concern. DeepMind created an AI mobile app that is capable of predicting acute kidney failure in hospitalized patients up to 48 hours in advance. However, the system reportedly also made two false alarms for every kidney failure that was successfully predicted. False positives can be harmful as they can encourage doctors to spend unnecessary time and resources ordering further tests or altering prescribed treatments.
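To put the reported ratio in perspective, the short sketch below works out what share of alerts would be correct if a system raises two false alarms for every correctly predicted case. The function name and the use of a single correct prediction as the baseline are illustrative assumptions, not details from the DeepMind system or the Scientific American article.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return the share of raised alerts that correspond to a real event."""
    total_alerts = true_positives + false_positives
    if total_alerts == 0:
        raise ValueError("no alerts were raised")
    return true_positives / total_alerts


# Assumed baseline: 1 correct prediction, hence 2 false alarms (the reported 2:1 ratio).
ppv = positive_predictive_value(true_positives=1, false_positives=2)
print(f"{ppv:.0%} of alerts are correct")  # prints "33% of alerts are correct"
```

Under that ratio, roughly one alert in three points to a real case, so a clinician would follow up on about two unnecessary alerts for each useful one.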
In another incident, one AI system incorrectly concluded that patients who had pneumonia were more likely to survive if they had asthma, which could cause doctors to alter treatments for patients with asthma.
AI systems that are developed for one hospital often underperform when they are used in a different hospital. There are multiple causes for this. For one, AI systems are frequently trained on electronic health records, but many of these records are incomplete or incorrect because their primary purpose is billing rather than patient care. For instance, one investigation carried out by KHN found that patients’ medical records sometimes contained life-threatening errors, such as medication lists containing the wrong medications. Beyond that, diseases are more complicated, and the healthcare system more complex, than AI engineers and scientists tend to anticipate.
As AI becomes ever more prevalent, it will be important for AI developers to work alongside health authorities to ensure that their AI systems are thoroughly tested, and for regulatory bodies to set and enforce standards for the reliability of AI diagnostic tools.