Medical AI meets the mind: assessing more accurately whether AI is better than human diagnosis in medicine. From the article:

In The Lancet Digital Health, Xiaoxuan Liu and colleagues present a systematic review and meta-analysis in an attempt to answer the question of whether deep learning is better than human health-care professionals across all imaging domains of medicine. Despite the plethora of headlines proclaiming how the latest artificial intelligence (AI) has outperformed a human physician, the authors found surprisingly few studies that compare the performance of humans and these models….

The meta-analysis suggests equivalent performance of deep learning algorithms and health-care professionals in the 14 studies that used the same out-of-sample validation dataset to compare their performances…. This work nicely illustrates the challenge of attempting to compare AI with humans for medical applications, and the authors rightly qualify their conclusion with a detailed list of potential confounders and limitations….

AI cannot yet replicate the essence of the diagnostic process. In medicine, different datapoints become available at different times during a work-up. One test might be ordered because of the result of another. So, when AI algorithms are trained on a complete corpus of retrospective data that eliminates both the temporal variation and the dependency within the data, can it actually be compared with the human physician who made a series of related decisions to create that comprehensive dataset?

Additionally, formulation of a differential diagnosis often gets tossed aside when training an AI algorithm, because the focus shifts to making a single diagnosis rather than highlighting the relevant data that lead a physician to a particular set of diagnoses with associated likelihoods—ie, the differential diagnosis.
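To make the distinction in the last two quoted paragraphs concrete, here is a minimal sketch of my own, not taken from the article. It assumes a toy Bayesian update in which findings arrive one at a time and are treated as conditionally independent given the diagnosis. The condition names, priors, and likelihoods are all hypothetical and chosen only for illustration. The differential diagnosis is the full ranked list with probabilities; the single-label output that many classifiers are trained to produce is just its top entry.

```python
from typing import Dict, List, Tuple


def update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """One Bayes step: multiply the prior by P(finding | diagnosis), then renormalize."""
    unnormalized = {d: prior[d] * likelihood[d] for d in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("finding has zero probability under every diagnosis")
    return {d: p / total for d, p in unnormalized.items()}


def differential(posterior: Dict[str, float], top_k: int = 3) -> List[Tuple[str, float]]:
    """Ranked list of (diagnosis, probability): the differential diagnosis."""
    return sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


def single_label(posterior: Dict[str, float]) -> str:
    """What a single-diagnosis classifier reports: only the most probable entry."""
    return max(posterior, key=posterior.get)


# Hypothetical numbers, for illustration only.
prior = {"condition_a": 0.5, "condition_b": 0.3, "condition_c": 0.2}

# Findings become available in sequence during the work-up;
# each dict holds the assumed P(finding | diagnosis).
findings = [
    {"condition_a": 0.2, "condition_b": 0.6, "condition_c": 0.6},
    {"condition_a": 0.5, "condition_b": 0.9, "condition_c": 0.3},
]

belief = prior
history = [differential(belief)]
for likelihood in findings:
    belief = update(belief, likelihood)
    history.append(differential(belief))

for step, ranked in enumerate(history):
    print(step, [(d, round(p, 3)) for d, p in ranked])
print("single label:", single_label(belief))
```

The ranking changes as each finding arrives, which is the temporal dependency the authors describe. A model trained only on the final, complete record sees neither the intermediate rankings nor the decisions they prompted.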

For the article by Liu and colleagues, see here.

For other articles on AI in medicine, see here.

h/t Philippa Göranson (@Bokofil)