AI Diagnosis with GPT-4: Enhancing Accuracy in Complex Cases

by Joaquim Cardoso

August 22, 2023


the health strategist
multidisciplinary institute

Joaquim Cardoso MSc.

Chief Research and Strategy Officer (CRSO),
Chief Editor and Senior Advisor

August 22, 2023

What is the message?

The research letter “Use of GPT-4 to Analyze Medical Records of Patients With Extensive Investigations and Delayed Diagnosis”, published by JAMA Network, highlights the application of the GPT-4 program, an artificial intelligence tool, in analyzing medical records of older patients with delayed diagnoses.

In this six-patient case series, GPT-4 identified the correct primary diagnosis more often (4 of 6 patients) than clinicians (2 of 6) or a diagnostic decision support system (0 of 6), and it remained ahead when differential diagnoses were included.

While GPT-4 shows promise in aiding clinicians and improving diagnostic outcomes, its effectiveness depends on comprehensive patient information and careful interpretation within the clinical context.

Key takeaways:

What is the main focus of the article?

The article discusses the use of GPT-4 (Generative Pre-trained Transformer 4), an artificial intelligence (AI) program, to analyze the medical records of older patients whose diagnosis was delayed. It asks whether GPT-4 can improve diagnostic accuracy in complex cases.

What is the hypothesis of the study?

The study hypothesizes that GPT-4 can improve the diagnostic accuracy of clinicians by providing the most probable diagnosis or suggesting differential diagnoses in complex cases, especially for patients in low-income countries where specialist care might be lacking.

How was the study conducted?

The medical histories of six patients aged 65 years or older, who had experienced a delay of more than a month in receiving a definitive diagnosis, were entered into GPT-4 without revealing the actual diagnosis. The responses generated by GPT-4, as well as those by clinicians and a diagnostic decision support system, were collected and compared. The study analyzed the accuracy of primary diagnoses and differential diagnoses provided by GPT-4 and clinicians.

What were the key findings of the study?

GPT-4 made the correct primary diagnosis in 4 of 6 patients (66.7%), compared with 2 of 6 (33.3%) for clinicians and 0 of 6 (0%) for the diagnostic decision support system. When differential diagnoses were included, accuracy was 83.3% for GPT-4, 50.0% for clinicians, and 33.3% for the decision support system. GPT-4 also suggested diagnoses that clinicians had not previously considered, which could lead to better diagnostic outcomes.

What are the implications and limitations of the study?

The study suggests that GPT-4 could help clinicians diagnose older patients with complex cases, particularly where specialist care is limited. However, its effectiveness depends on comprehensive entry of patient information. The authors acknowledge several limitations: GPT-4 may miss multifocal infections, and its suggestions must be interpreted in the clinical context. They conclude that the use of AI in diagnosis is both promising and challenging.

DEEP DIVE

Use of GPT-4 to Analyze Medical Records of Patients With Extensive Investigations and Delayed Diagnosis [excerpt]

JAMA Network

Yat-Fung Shea, MBBS¹; Cynthia Min Yao Lee, MBBS¹; Whitney Chin Tung Ip, MBBS, BSc¹; Dik Wai Anderson Luk, MBBS, MRes¹; Stephanie Sze Wing Wong, MBChB¹

August 14, 2023

Introduction

Artificial intelligence (AI), especially machine learning, has been increasingly used in diagnosing conditions such as skin or breast cancer and Alzheimer disease. However, AI relies on clinical imaging.¹ In low-income countries, where specialist care may be lacking, AI may be useful for making clinical diagnoses. The GPT-4 (Generative Pre-trained Transformer 4) program allows analysis of clinical history in daily practice.² We hypothesized that GPT-4 could improve the diagnostic accuracy of clinicians by supplying the most probable diagnosis or suggesting differential diagnoses in complex cases.

Methods

The medical histories of 6 patients from the Division of Geriatrics in the Department of Medicine at Queen Mary Hospital who were aged 65 years or older and had delay of definitive diagnosis longer than 1 month in 2022 were retrieved after resolution.³⁻⁵ The full medical histories were entered chronologically on April 16, 2023 (at admission, 1 week after admission, and before final diagnosis) into GPT-4 (powered by OpenAI via Platform for Open Exploration) without information about definitive diagnosis. The GPT-4 responses were copied out and further analyzed (eMethods in Supplement 1). One patient has been described previously.⁶ Responses by GPT-4 and clinicians were collected and compared. Differential diagnoses were also generated using a medical diagnostic decision support system (Isabel DDx Companion; Isabel Healthcare). The study was approved by the Institutional Review Board of the University of Hong Kong and Hospital Authority Hong Kong West Cluster. Written consent was provided for all patients. This report followed the reporting guideline for case series studies.
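[Editor's illustration, not part of the original letter] The staged, chronological entry described above can be pictured as a series of cumulative prompts, each containing everything known up to that point and never the definitive diagnosis. The sketch below is a rough illustration only: the function name, the placeholder notes, and the wording of the question are all hypothetical, and the authors' actual prompts (eMethods in Supplement 1) are not reproduced here. It builds the text only and does not call any model.

```python
from typing import List, Tuple


def build_staged_prompts(stages: List[Tuple[str, str]], question: str) -> List[str]:
    """Return one prompt per stage; each prompt holds all notes up to that stage."""
    prompts = []
    history_so_far = []
    for label, notes in stages:
        history_so_far.append(f"[{label}]\n{notes}")
        # The definitive diagnosis is never part of the notes, so it cannot leak in.
        prompts.append("\n\n".join(history_so_far) + "\n\n" + question)
    return prompts


# Stage labels follow the three time points named in the Methods.
# The notes are placeholders (hypothetical), not real patient data.
stages = [
    ("At admission", "<history, examination, and initial results>"),
    ("1 week after admission", "<progress and further investigations>"),
    ("Before final diagnosis", "<remaining investigations, diagnosis withheld>"),
]

# Hypothetical wording; the study's actual question is not given in this excerpt.
QUESTION = "What is the most probable diagnosis, and what are the differential diagnoses?"

prompts = build_staged_prompts(stages, QUESTION)
for i, prompt in enumerate(prompts, start=1):
    print(f"--- Prompt {i} ---\n{prompt}\n")
```

Comparing the responses from one stage to the next is one way to see which newly added detail changes the suggested diagnosis.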

Results

Six patients 65 years or older (2 women and 4 men) were included in the analysis. The accuracy of the primary diagnoses made by GPT-4, clinicians, and Isabel DDx Companion was 4 of 6 patients (66.7%), 2 of 6 patients (33.3%), and 0 patients, respectively. If including differential diagnoses, the accuracy was 5 of 6 (83.3%) for GPT-4, 3 of 6 (50.0%) for clinicians, and 2 of 6 (33.3%) for Isabel DDx Companion (Table). By studying the changes in GPT-4’s responses, we determined that certain key words were required to make an appropriate clinical response, including abdominal aortic aneurysm (patient 1), proximal stiffness (patient 2), acid-fast bacilli in urine (patient 3), metronidazole (patient 4), and retroperitoneal lymphadenopathy (patient 6). GPT-4 could suggest diagnoses not considered by clinicians before definitive investigations: mycotic aneurysm for patient 1 after computed tomography showing an abdominal aortic aneurysm; a drug cause of seizure in patient 5; and the presence of necrotic lymph nodes from a previous computed tomographic scan, which should have led to the diagnosis of lymphoma, in patient 6.
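[Editor's illustration, not part of the original letter] The percentages above follow directly from the reported counts out of 6 patients. A minimal sketch of that arithmetic, with variable and function names chosen here purely for illustration:

```python
# Correct diagnoses out of 6 patients, as reported in the Results.
N_PATIENTS = 6
correct = {
    "GPT-4": {"primary": 4, "with_differentials": 5},
    "Clinicians": {"primary": 2, "with_differentials": 3},
    "Isabel DDx Companion": {"primary": 0, "with_differentials": 2},
}


def accuracy_pct(n_correct: int, n_total: int = N_PATIENTS) -> float:
    """Return accuracy as a percentage rounded to one decimal place."""
    return round(100 * n_correct / n_total, 1)


table = {
    name: {kind: accuracy_pct(n) for kind, n in counts.items()}
    for name, counts in correct.items()
}

for name, row in table.items():
    print(f"{name}: primary {row['primary']}%, "
          f"with differentials {row['with_differentials']}%")
```

With only 6 patients, each additional correct or incorrect diagnosis shifts the accuracy by about 16.7 percentage points, which is worth keeping in mind when comparing the three figures.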

Discussion

Overall, GPT-4 has potential clinical use in older patients without a definitive clinical diagnosis after 1 month but requires comprehensive entry of demographic and clinical (including radiological and pharmacological) information. GPT-4 may increase confidence in diagnosis and earlier commencement of appropriate treatment, alert clinicians missing important diagnoses, and offer suggestions similar to specialists to achieve the correct clinical diagnosis, which has potential value in low-income countries with lack of specialist care. Clinicians need to be aware that GPT-4 is limited in multifocal infection, and the suggested management plan should be correlated with clinical context, as suggestions may be redundant. Clinicians should consider a drug review and review the possible diagnosis of malignant disease if suggested.

This study has several limitations. First, GPT-4 may not detect 2 focuses of infection or pinpoint the source of recurrent infection. Second, GPT-4 did not suggest the use of gallium scan or 18-fluorodeoxyglucose positron emission tomography to look for infections or malignant neoplasms in all but 1 patient. Third, some investigations may not be appropriate (eg, temporal artery biopsy in the absence of typical symptoms of giant cell arteritis). Overall, our findings suggest that the use of AI in diagnosis is both promising and challenging.

Article Information

See the original publication (this is an excerpt version)

References

See the original publication (this is an excerpt version)

Authors and Affiliations

Yat-Fung Shea, MBBS¹; Cynthia Min Yao Lee, MBBS¹; Whitney Chin Tung Ip, MBBS, BSc¹; Dik Wai Anderson Luk, MBBS, MRes¹; Stephanie Sze Wing Wong, MBChB¹

¹Department of Medicine, Queen Mary Hospital, University of Hong Kong, Hong Kong

Originally published at https://jamanetwork.com

