Lack of transparency in AI research limits reproducibility, renders work ‘worthless’


institute for health transformation (IHT)

research, strategy and advisory


Joaquim Cardoso MSc
Senior Advisor, Chief Researcher and Editor
January 18, 2023



KEY MESSAGE(S)


A lack of transparency in artificial intelligence (AI) research can make it difficult to apply in the real world, rendering the work “worthless” when the results, no matter how positive, are not reproducible.


  • The analysis, which included 194 radiology and nuclear medicine research papers on AI, found that most studies do not provide enough information about their raw data, source code or model, and that up to 97% did not produce systems ready for use in real-world clinical scenarios. 

  • The authors argue that wider access to code and data is necessary for the scientific community to build upon and improve original work, and they propose solutions to address privacy concerns when data cannot be made publicly available.



INFOGRAPHIC







SOURCE

Image by Xresch via Pixabay

Lack of transparency in AI research limits reproducibility, renders work ‘worthless’


Health Imaging
Hannah Murphy
December 19, 2022


A lack of transparency in artificial intelligence research can make it difficult to apply in the real world, rendering the work “worthless” when the results — no matter how positive — are not reproducible.


A new analysis recently shared in Academic Radiology found that a significant number of studies do not provide information about their raw data, source code or model. 


As a result, up to 97% of these studies do not produce systems that are fit to be used in real-world clinical scenarios, according to the experts’ data.


Corresponding author Burak Kocak, MD, of Basaksehir Cam and Sakura City Hospital in Turkey and colleagues explained that, because code and data are so intertwined in AI research, this information should be made more readily available when studies are published.




“Through this, a wider scientific community can build upon and improve the original work. Otherwise, the contribution of the researchers to the AI field will be ineffective, simply being no more than ‘show-and-tell’ projects,” the authors suggested.




Kocak and colleagues included 194 radiology and nuclear medicine research papers on AI in their analysis. 


Raw data was available for around 18% of these papers, but private data was accessible in just a single paper from the entire batch. 

A shortage of modeling information in the AI studies was also observed, with just around one-tenth of them sharing their pre-modeling, modeling and post-modeling files.




The authors attributed the nearly non-existent availability of private data to the regulatory hurdles that must be overcome to address privacy concerns, acknowledging that it can be a tedious process.




“It is time-consuming and labor-intensive [if], for instance, an institutional review board approval is required. In addition, there can be adverse consequences related to patient privacy, financial investments, and ownership of the intellectual property if the data sharing is not done properly,” they explained.


There are, however, solutions to address privacy concerns when data cannot be made publicly available, such as granting access to designated independent investigators for validation purposes, the experts suggested, adding that similar approaches can be taken when sharing modeling code as well.


The authors also suggested that manuscript authors, peer-reviewers and journal editors could play a role in making AI studies reproducible in the future by being more cognizant of transparency, data, code and model availability when publishing research results.








DEEP DIVE [excerpt]








ScienceDirect (Academic Radiology)

Burak Kocak MD, Aytul Hande Yardimci MD, Sabahattin Yuzkan MD, Ali Keles MD, Omer Altun MD, Elif Bulut MD, Osman Nuri Bayrak MD, Ahmet Arda Okumus MD

14 December 2022


Rationale and Objectives


  • Reproducibility of artificial intelligence (AI) research has become a growing concern. 

  • One of the fundamental reasons is the lack of transparency in data, code, and model. 

  • In this work, we aimed to systematically review the radiology and nuclear medicine papers on AI in terms of transparency and open science.

Materials and Methods


  • A systematic literature search was performed in PubMed to identify original research studies on AI. 
  • The search was restricted to studies published in Q1 and Q2 journals that are also indexed on the Web of Science. 
  • A random sampling of the literature was performed. 
  • Besides six baseline study characteristics, a total of five availability items were evaluated. 
  • Two groups of independent readers (eight readers in total) participated in the study. Inter-rater agreement was analyzed (an illustrative sketch of one such agreement measure follows this list). Disagreements were resolved by consensus.
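The excerpt does not state which agreement statistic the authors used. Purely as an illustration, inter-rater agreement on a binary availability item (available vs. not available) could be quantified with Cohen's kappa; the reader scores below are hypothetical and are not data from the study.

```python
# Illustrative only: Cohen's kappa for two readers scoring the same binary
# "availability" item across ten papers (1 = available, 0 = not available).
# The scores are hypothetical; this is not the authors' code or data.
from sklearn.metrics import cohen_kappa_score

reader_1 = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
reader_2 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

kappa = cohen_kappa_score(reader_1, reader_2)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement, 0 = chance level
```

Kappa corrects for chance agreement, so a value near 1 indicates agreement well beyond what random scoring would produce.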

Results


  • Following eligibility criteria, we included a final set of 194 papers.

  • The raw data was available in about one-fifth of the papers (34/194; 18%). 

  • However, the authors made their private data available only in one paper (1/161; 1%).

  • About one-tenth of the papers made their pre-modeling (25/194; 13%), modeling (28/194; 14%), or post-modeling files (15/194; 8%) available. 

  • Most of the papers (189/194; 97%) did not attempt to create a ready-to-use system for real-world usage.

  • Data origin, use of deep learning, and external validation had statistically significantly different distributions. 
  • The use of private data alone was negatively associated with the availability of at least one item (p < 0.001; an illustrative sketch of testing such an association follows this list).
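The excerpt does not specify which statistical test produced this p-value. As a minimal sketch under that caveat, an association between "private data only" and "at least one item available" could be examined with Fisher's exact test on a 2×2 contingency table; the counts below are invented for illustration and are not the paper's numbers.

```python
# Illustrative only: Fisher's exact test on a hypothetical 2x2 table relating
# data source (rows) to availability of at least one item (columns).
# These counts are made up; they are not taken from the study.
from scipy.stats import fisher_exact

#                >= 1 item available | nothing available
contingency = [[5, 120],   # private data only (hypothetical)
               [25, 44]]   # public or mixed data (hypothetical)

odds_ratio, p_value = fisher_exact(contingency)
print(f"odds ratio = {odds_ratio:.3f}, p = {p_value:.4g}")
```

In this layout, an odds ratio below 1 would indicate that papers relying only on private data were less likely to make at least one item available.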

Conclusion


  • Overall rates of availability for items were poor, leaving room for substantial improvement.


INTRODUCTION


Hundreds of papers on the success of artificial intelligence (AI) in various tasks are submitted to journals every month (1, 2). 



AI research can be considered worthless for others when it is not usable, reproducible, replicable, robust, or generalizable. 


Therefore, ensuring reliability should be the most important objective in AI …





MATERIALS AND METHODS


Ethical approval was not required for this systematic review.



A flowchart of the literature search is presented in Figure 2. Syntax-based search resulted in 5195 entries. A collection of 358 articles was randomly sampled. Following eligibility criteria in screening and full-text reading, we included a final set of 194 papers in this work (Supplementary Table S2).


Baseline Characteristics


All the studies were published between 2017 and 2022. Most (143/194, 74%) were published after 2020. The numbers of studies published by year are presented in Supplementary Figure S1. Most of the …


Study Overview


In this work, we systematically reviewed 194 radiology and nuclear medicine papers on AI in terms of open science and transparency issues, focusing on high-quality journals. 


Our study demonstrated inadequate data availability practices within radiology and nuclear medicine research on AI. 


Overall, availability rates of individual items, minimum availability (i.e., at least one item), complete availability, and other relevant combinations of items were too low. Data origin, use of deep learning, …


CONCLUSIONS


Overall rates of availability for items were poor, leaving room for substantial improvement. 


Because of their critical role in patient care, radiology and nuclear medicine research must be grounded in reproducible methodology through increased transparency. 


To ensure the reproducibility of AI research, the manuscript authors, peer-reviewers, and journal editors need to pay full attention to transparency, data, code, and model availability in future studies. Availability items presented in this …


References


See the original publication (this is an excerpt version)


Originally published at https://www.sciencedirect.com.


About the author(s) & affiliation(s)


Burak Kocak MD 1
Aytul Hande Yardimci MD 1
Sabahattin Yuzkan MD 1
Ali Keles MD 1
Omer Altun MD 1
Elif Bulut MD 1
Osman Nuri Bayrak MD 1
Ahmet Arda Okumus MD 1

1 Department of Radiology, University of Health Sciences, Basaksehir Cam and Sakura City Hospital, Basaksehir, 34480, Istanbul, Turkey
