During the pandemic, hundreds of image-based ML models have been developed.
However, few if any models achieved widespread clinical use for COVID-19 detection or prognostication because of several issues, including (1) data bias, (2) data shift, and (3) methodological limitations.
JAMA Network
Yuan Luo, PhD1; Richard G. Wunderink, MD2; Donald Lloyd-Jones, MD, ScM3
January 27, 2022
Executive Summary
by Joaquim Cardoso MSc.
Chief Editor of The Health Strategy Institute blog
February 22, 2022
What is the context?
- The growing momentum of machine learning (ML) — algorithms that leverage statistical methods to learn useful patterns from data — in health care before the COVID-19 pandemic created high expectations for its contributions to health care during the pandemic.
- However, these expectations have been largely unrealized except for a few notable successes.
What are the issues? (see infographic below)
- Each step of conventional ML workflows reacts to human expert input, whereas proactive ML has 2 key cycles to automate feature engineering (level 1) and refine upstream data collection/preparation (level 2) (Figure).
- Recent adoption of deep learning in health care research has started to enable automated feature learning toward completing the level 1 cycle.
- However, level 2 proactive ML is needed to evolve solutions in highly dynamic situations such as the pandemic.
- Reactive machine learning (ML) workflow requires input from human experts at each step.
- Proactive ML can assist human experts and automate monitoring and continuous improvement of the ML workflow, reducing the routine need for their input at each step.
Infographic
Moving From Reactive to Proactive Machine Learning in Health Care
The Pandemic as a Stress Test for ML
- During the pandemic, hundreds of image-based ML models have been developed.
- However, few if any models achieved widespread clinical use for COVID-19 detection or prognostication because of several issues,2 including (1) data bias, (2) data shift, and (3) methodological limitations.
- Beyond pandemics, the data bias and shift issues occur broadly in dynamic health care scenarios and result in progressively underperforming ML algorithms, necessitating proactive mitigation strategies.
The Greek application
- One notably successful example of ML during the pandemic used reinforcement learning to target more efficient COVID-19 testing for travelers entering Greece. This reinforcement-learning system improved testing efficiency by up to 2 to 4 times during peak travel compared with random testing.
Moving From Reactive to Proactive ML in Health Care
- Although health care has begun to adopt level 1 proactive ML, moving toward level 2 may open new avenues.
- An emerging form of adaptive platform trials has drawn increasing attention during the pandemic (see REMAP: the Randomized, Embedded, Multifactorial Adaptive Platform).
- Level 2 proactive ML also enables augmented data preparation, especially for otherwise difficult-to-extract information, using natural language processing (NLP).
- Natural language processing also has potential to crowdsource patient perspectives into the level 2 feedback loop …
Conclusions
- The limited contributions of ML for addressing COVID-19 prompt reexamination of best practices during the pandemic and in the future.
- This fresh approach (proactive ML) could lessen the potential influence on ML of data issues under challenging and evolving situations and may allow ML to coevolve with data and meaningfully influence more health care decisions and policies.
ORIGINAL PUBLICATION
Proactive vs Reactive Machine Learning in Health Care — Lessons From the COVID-19 Pandemic
JAMA Network
Yuan Luo, PhD1; Richard G. Wunderink, MD2; Donald Lloyd-Jones, MD, ScM3
January 27, 2022
The growing momentum of machine learning (ML) — algorithms that leverage statistical methods to learn useful patterns from data — in health care before the COVID-19 pandemic created high expectations for its contributions to health care during the pandemic.
However, these expectations have been largely unrealized except for a few notable successes.
This Viewpoint reflects on the underlying reasons and proposes to shift the approach from reactive to proactive ML to unleash its full potential.
Each step of conventional ML workflows reacts to human expert input, whereas proactive ML has 2 key cycles to automate feature engineering (level 1) and refine upstream data collection/preparation (level 2) (Figure).
Figure. Moving From Reactive to Proactive Machine Learning in Health Care
Recent adoption of deep learning in health care research has started to enable automated feature learning1 toward completing the level 1 cycle.
However, level 2 proactive ML is needed to evolve solutions in highly dynamic situations such as the pandemic.
Reactive machine learning (ML) workflow requires input from human experts at each step.
Proactive ML can assist human experts and automate monitoring and continuous improvement of the ML workflow, reducing the routine need for their input at each step.
The Pandemic as a Stress Test for ML
During the pandemic, hundreds of image-based ML models have been developed.
However, few if any models achieved widespread clinical use for COVID-19 detection or prognostication because of several issues,2 including
- data bias (training ML on small sample sizes with insufficient demographic coverage),
- data shift (mismatch between the training and deployment data sets, especially as the pandemic expanded to other populations and health care systems), and
- methodological limitations (overlooking the data issues, such as no validation with data from an institution different from the one in which the ML algorithm was developed).
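Data shift of the kind listed above can be monitored with simple distribution checks. The following is a minimal sketch, not a method from the Viewpoint: it computes a population stability index (PSI) between the training and deployment distributions of one categorical feature. The age bands, the counts, and the function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def category_shares(values, categories, eps=1e-6):
    """Share of each category, floored at eps so the logarithm stays defined."""
    counts = Counter(values)
    total = len(values)
    return {c: max(counts.get(c, 0) / total, eps) for c in categories}


def population_stability_index(train_values, deploy_values):
    """PSI = sum over categories of (q - p) * ln(q / p)."""
    categories = sorted(set(train_values) | set(deploy_values))
    p = category_shares(train_values, categories)   # training shares
    q = category_shares(deploy_values, categories)  # deployment shares
    return sum((q[c] - p[c]) * math.log(q[c] / p[c]) for c in categories)


# Hypothetical age bands: training data skews young, deployment data skews old.
train = ["18-39"] * 50 + ["40-64"] * 40 + ["65+"] * 10
deploy = ["18-39"] * 20 + ["40-64"] * 40 + ["65+"] * 40

psi_same = population_stability_index(train, train)
psi_shift = population_stability_index(train, deploy)
print(f"PSI train vs train:  {psi_same:.3f}")
print(f"PSI train vs deploy: {psi_shift:.3f}")
```

A PSI of zero means the two distributions match; larger values mean a larger mismatch. Any alert threshold is a convention that would need to be set for the specific deployment.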
Beyond pandemics, the data bias and shift issues occur broadly in dynamic health care scenarios and result in progressively underperforming ML algorithms,3 necessitating proactive mitigation strategies.
The Greek application
One notably successful example of ML during the pandemic used reinforcement learning to target more efficient COVID-19 testing for travelers entering Greece.4
Observing that population-level epidemiologic metrics poorly identified infected asymptomatic travelers, investigators collected traveler-specific features, including age, sex, and travel history, to stratify 6 084 954 travelers into risk phenotypes with higher resolution than merely country of origin.
Their reinforcement-learning system (known as Eva)
- allocated scarce polymerase chain reaction (PCR) tests to maximize detection of infected asymptomatic travelers among those tested (exploitation), and
- automatically provided feedback that enabled Greece to target data collection to improve risk-estimation precision for undersampled groups (exploration).
This reinforcement-learning system improved testing efficiency up to 2 to 4 times during peak travel compared with random testing and informed Greece’s policy for “gray listing” high-risk countries to require travelers from them to provide proof of a negative PCR test result before arrival in Greece.4
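The exploitation and exploration trade-off described above can be illustrated with a small bandit simulation. This sketch is not the Eva system's algorithm, which the text does not detail. It assumes Thompson sampling over Beta posteriors, three hypothetical traveler groups with made-up prevalences, and a fixed daily budget of 200 tests.

```python
import random


def allocate_tests(posteriors, n_tests, rng):
    """Allocate a batch of tests by Thompson sampling.

    posteriors maps group -> [alpha, beta] of a Beta distribution over
    that group's positivity rate.
    """
    allocation = {group: 0 for group in posteriors}
    for _ in range(n_tests):
        # One plausible positivity rate per group, drawn from its posterior.
        draws = {g: rng.betavariate(a, b) for g, (a, b) in posteriors.items()}
        # Spend this test on the group whose draw is highest.
        allocation[max(draws, key=draws.get)] += 1
    return allocation


def update(posteriors, group, positives, negatives):
    """Fold one batch of test results back into the group's posterior."""
    posteriors[group][0] += positives
    posteriors[group][1] += negatives


rng = random.Random(0)
# Hypothetical true prevalences, unknown to the allocator.
true_prevalence = {"A": 0.01, "B": 0.05, "C": 0.20}
posteriors = {g: [1, 1] for g in true_prevalence}  # uniform priors

for day in range(30):
    allocation = allocate_tests(posteriors, n_tests=200, rng=rng)
    for group, n in allocation.items():
        positives = sum(rng.random() < true_prevalence[group] for _ in range(n))
        update(posteriors, group, positives, n - positives)

tested = {g: a + b - 2 for g, (a, b) in posteriors.items()}
print(tested)
```

Groups with wide posteriors still win some draws, so they continue to receive tests (exploration), while groups with high estimated positivity receive most of the budget (exploitation).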
Data bias and shift problems
Compared with other ML applications that have experienced data bias and shift problems,2 the Greek system has experienced success that can be partially attributed to using reinforcement learning to complete the level 2 cycle to guide upstream targeted data collection.
The system prioritizes interpretability to increase transparency and adoption (eg, illustrating large CIs when communicating the need for testing targeted traveler groups [exploration]).
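To make the point about large CIs concrete, the sketch below compares 95% confidence intervals for two hypothetical traveler groups with the same observed positivity rate but very different sample sizes. It assumes a Wilson score interval; the text does not say which interval the Greek system reported, and the counts are invented for illustration.

```python
import math


def wilson_interval(positives, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p_hat = positives / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def width(interval):
    return interval[1] - interval[0]


# Both hypothetical groups show 2% positivity, from very different sample sizes.
well_sampled = wilson_interval(positives=100, n=5000)
under_sampled = wilson_interval(positives=1, n=50)

print(f"well sampled:  {well_sampled[0]:.4f} to {well_sampled[1]:.4f}")
print(f"under sampled: {under_sampled[0]:.4f} to {under_sampled[1]:.4f}")
```

The much wider interval for the small group is the kind of evidence that can justify directing more tests to it.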
Exploitation for short-term reward by focusing tests on current high-risk groups will fail when new higher-risk groups emerge, which necessitates ongoing exploration for long-term benefits by strategically allocating tests to increase sampling rate and improve risk estimation for new groups.
Dynamically balancing exploitation-exploration objectives to guide continuous targeted data collection makes reinforcement learning a promising candidate for level 2 proactive ML.
Moving From Reactive to Proactive ML in Health Care
Although health care has begun to adopt level 1 proactive ML,1 moving toward level 2 may open new avenues.
For example, conventional clinical trials are expensive and often require large sample sizes, making their initiation and completion especially challenging for rapidly changing health care conditions such as the pandemic.
An emerging form of adaptive platform trials has drawn increasing attention during the pandemic.
The Randomized, Embedded, Multifactorial Adaptive Platform (REMAP) originally established for community-acquired pneumonia was pivoted to conduct trials for multiple COVID-19 treatments simultaneously and showed promising results for tocilizumab and sarilumab in critically ill patients.5
In REMAP, as clinical evidence and knowledge evolve, response-adaptive randomization preferentially randomizes patients to the most promising emerging interventions.5
Machine learning algorithms can predict which patients may benefit from being included in alternative and new treatment groups and help guide response-adaptive randomization.
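Response-adaptive randomization can be sketched in a few lines. This is not REMAP's statistical model. It assumes a binary outcome, independent Beta posteriors per arm, and allocation weights based on the estimated probability that each arm is best. The arm names, the outcome counts, and the `floor` parameter are hypothetical.

```python
import random


def prob_each_arm_best(outcomes, n_draws, rng):
    """Monte Carlo estimate of P(arm has the highest response rate).

    outcomes maps arm -> (responders, non_responders); each arm gets a
    Beta(1 + responders, 1 + non_responders) posterior.
    """
    wins = {arm: 0 for arm in outcomes}
    for _ in range(n_draws):
        draws = {
            arm: rng.betavariate(1 + r, 1 + f) for arm, (r, f) in outcomes.items()
        }
        wins[max(draws, key=draws.get)] += 1
    return {arm: w / n_draws for arm, w in wins.items()}


def randomize(probabilities, rng, floor=0.1):
    """Assign one patient to an arm.

    Each arm's weight is at least `floor`, so no arm is starved of
    patients; random.choices normalizes the weights.
    """
    arms = list(probabilities)
    weights = [max(probabilities[arm], floor) for arm in arms]
    return rng.choices(arms, weights=weights, k=1)[0]


rng = random.Random(1)
# Hypothetical interim results: (responders, non_responders) per arm.
outcomes = {"control": (30, 70), "drug_x": (45, 55), "drug_y": (33, 67)}

probs = prob_each_arm_best(outcomes, n_draws=5000, rng=rng)
assignments = [randomize(probs, rng) for _ in range(1000)]
print(probs)
print({arm: assignments.count(arm) for arm in outcomes})
```

In a real trial, the probabilities would be recomputed at each interim analysis, so allocation follows the evidence as it accumulates.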
The dynamic nature of adaptive platform trials and the logistic constraints (eg, finite sample sizes) call for targeted patient recruitment and data collection.
Level 2 proactive ML can address these needs and improve trial efficiency, analogous to Greece’s targeted border PCR testing.
Improving trial efficiency is important to accelerate discoveries, for even with REMAP, definitive conclusions for interventions may be reached much later than desirable due to competing interventions and evolving knowledge (eg, the futility of convalescent plasma on organ support–free days in critically ill patients was apparent only after 18 months into the pandemic 6).
Level 2 proactive ML also enables augmented data preparation, especially for otherwise difficult-to-extract information.
For example, as knowledge about chronic COVID-19 sequelae evolves with increasing numbers of patients (especially outpatients), structured data capture has struggled to adequately describe new symptoms and findings.
Much of the key information around COVID-19 sequelae comes from patient-authored text (eg, messages/discussions in patient portals/forums), which is not analysis ready. Traditional approaches to leveraging patient-authored text rely on manual review, which is labor intensive and difficult to scale.
Natural language processing can identify symptom patterns undetected by physicians from patient-authored text and harvest important clinical variables over time to advance evidence-based research.7
Such extracted data could add to the descriptions for chronic COVID-19 sequelae and enhance surveillance for novel or emerging manifestations.
Natural language processing has potential to crowdsource patient perspectives into the level 2 feedback loop to augment data preparation and refine downstream analysis, including exploring patient subgroups with more coherent clinical trajectories and creating computable phenotypes defining COVID-19 sequelae.
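As a rough illustration of turning patient-authored text into analysis-ready variables, the sketch below maps free-text phrases to canonical symptoms and counts them across messages. It assumes a tiny hand-written lexicon and invented messages. Practical clinical NLP relies on far richer methods than keyword matching, and this example does not represent the approach in the cited work.

```python
import re
from collections import Counter

# Hypothetical lexicon: canonical symptom -> phrases patients might write.
SYMPTOM_LEXICON = {
    "fatigue": ["fatigue", "exhausted", "tired all the time"],
    "cognitive impairment": ["brain fog", "can't concentrate", "memory problems"],
    "dyspnea": ["short of breath", "shortness of breath", "breathless"],
    "anosmia": ["loss of smell", "can't smell"],
}

# One case-insensitive pattern per symptom, matching any phrase as whole words.
PATTERNS = {
    symptom: re.compile(
        "|".join(r"\b" + re.escape(phrase) + r"\b" for phrase in phrases),
        re.IGNORECASE,
    )
    for symptom, phrases in SYMPTOM_LEXICON.items()
}


def extract_symptoms(message):
    """Return the set of canonical symptoms mentioned in one message."""
    return {s for s, pattern in PATTERNS.items() if pattern.search(message)}


# Invented patient-portal messages.
messages = [
    "Three months on and I'm still exhausted, plus the brain fog is awful.",
    "Short of breath after one flight of stairs. Still can't smell coffee.",
    "Brain fog again today, can't concentrate at work.",
]

counts = Counter(s for m in messages for s in extract_symptoms(m))
print(counts.most_common())
```

This sketch ignores negation, so "no shortness of breath" would still count as dyspnea. Limitations like this are one reason manual lexicons scale poorly and more capable NLP is needed.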
Conclusions
The limited contributions of ML for addressing COVID-19 prompt reexamination of best practices during the pandemic and in the future.
The pandemic, and the attendant need for adapting to the rapidly evolving landscape of health care, has acted as a stress test for ML.
Understanding successes and unrealized opportunities not only highlights the nonrepresentativeness of the underlying data but also reveals the need to move from reactive toward more proactive ML.
This fresh approach could lessen the potential influence on ML of data issues under challenging and evolving situations and may allow ML to coevolve with data and meaningfully influence more health care decisions and policies.
About the authors & affiliations
Yuan Luo, PhD1;
Richard G. Wunderink, MD2;
Donald Lloyd-Jones, MD, ScM3
- 1Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- 2Division of Pulmonary and Critical Care, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- 3Division of Cardiology, Department of Medicine, and Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois
Originally published at https://jamanetwork.com