The newest version of ChatGPT diagnosed a 1 in 100,000 condition in seconds — showing better clinical judgment than “many doctors.” [BUT: it is not a substitute for ethical human doctors or nurses]


the health strategist

research and strategy institute 
for value-based health, care and tech transformation 


Joaquim Cardoso MSc
Chief Researcher & Editor of the Site
March 31, 2023


ONE PAGE SUMMARY


The latest version of ChatGPT, GPT-4, has shown better clinical judgment than many doctors, according to Dr. Isaac Kohane, a Harvard computer scientist and physician. 


  • In a forthcoming book, he explains that the AI model can diagnose rare conditions just as well as he can, and can also make suggestions on how to communicate effectively with patients. 

  • Kohane tested GPT-4 by feeding it data on a real-life case involving a newborn baby, which it used to diagnose a 1 in 100,000 condition correctly. 

However, the book also highlights that GPT-4 is not infallible, and errors can occur due to incomplete or inaccurate data. 


  • The AI model is not a substitute for ethical human doctors or nurses, and additional checks are needed to verify its diagnoses. 

Overall, GPT-4 has the potential to save time and resources in clinical settings, but its limitations should be considered when using it.






DEEP DIVE






The newest version of ChatGPT passed the US medical licensing exam with flying colors — and diagnosed a 1 in 100,000 condition in seconds


Insider
Hilary Brueck
Apr 6, 2023


KEY POINTS


  • A doctor and Harvard computer scientist says GPT-4 has better clinical judgment than “many doctors.”

  • The chatbot can diagnose rare conditions “just as I would,” he said.

  • But GPT-4 can also make mistakes, and it hasn’t taken the Hippocratic oath.

Dr. Isaac Kohane, who’s both a computer scientist at Harvard and a physician, teamed up with two colleagues to test-drive GPT-4, with one main goal: to see how the newest artificial intelligence model from OpenAI performed in a medical setting.


“I’m stunned to say: better than many doctors I’ve observed,” he says in the forthcoming book “The AI Revolution in Medicine,” co-authored by independent journalist Carey Goldberg and Microsoft vice president of research Peter Lee. (The authors say neither Microsoft nor OpenAI required any editorial oversight of the book, though Microsoft has invested billions of dollars in developing OpenAI’s technologies.)


In the book, Kohane says GPT-4, which was released in March 2023 to paying subscribers, answers US medical licensing exam questions correctly more than 90% of the time. It’s a much better test-taker than the previous ChatGPT models, GPT-3 and GPT-3.5, and a better one than some licensed doctors, too.


GPT-4 is not just a good test-taker and fact-finder, though. It’s also a great translator. In the book, it translates discharge information for a patient who speaks Portuguese and distills wonky technical jargon into something 6th graders could easily read.
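
As a rough illustration of how such a request might look in practice, here is a minimal sketch. It is not the authors’ setup: it assumes the official `openai` Python client, access to a GPT-4-class model, and an API key in the environment; the discharge note and prompt wording are invented.

```python
# Sketch: asking a GPT-4-class model to translate and simplify discharge
# instructions. Assumes the `openai` package and an OPENAI_API_KEY
# environment variable. The note below is a made-up example.
from openai import OpenAI

client = OpenAI()

discharge_note = (
    "Patient presented with acute exacerbation of COPD; administered "
    "nebulized bronchodilators and systemic corticosteroids. Discharged "
    "with a steroid taper and instructions for incentive spirometry."
)

response = client.chat.completions.create(
    model="gpt-4",  # illustrative; substitute whichever model is available
    messages=[
        {
            "role": "user",
            "content": (
                "Translate the following discharge instructions into "
                "Portuguese, then rewrite them in English at a 6th-grade "
                f"reading level:\n\n{discharge_note}"
            ),
        }
    ],
)

print(response.choices[0].message.content)
```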


As the authors explain with vivid examples, GPT-4 can also give doctors helpful suggestions about bedside manner, offering tips on how to talk to patients about their conditions in compassionate, clear language, and it can read lengthy reports or studies and summarize them in the blink of an eye. The tech can even explain its reasoning through problems in a way that requires some measure of what looks like human-style intelligence.


But if you ask GPT-4 how it does all this, it will likely tell you that all of its intelligence is still “limited to patterns in the data and does not involve true understanding or intentionality.” That’s what GPT-4 told the authors of the book, when they asked it if it could actually engage in causal reasoning. Even with such limitations, as Kohane discovered in the book, GPT-4 can mimic how doctors diagnose conditions with stunning — albeit imperfect — success.


[Photo: OpenAI CEO Sam Altman. OpenAI developed ChatGPT, and its most refined model yet, GPT-4. Jason Redmond / AFP via Getty Images]

How GPT-4 can diagnose like a doctor


Kohane goes through a clinical thought experiment with GPT-4 in the book, based on a real-life case involving a newborn baby he treated several years earlier. Fed a few key details Kohane had gathered from a physical exam, as well as some information from an ultrasound and hormone levels, the machine was able to correctly diagnose a 1 in 100,000 condition called congenital adrenal hyperplasia “just as I would, with all my years of study and experience,” Kohane wrote.


The doctor was both impressed and horrified.


“On the one hand, I was having a sophisticated medical conversation with a computational process,” he wrote, “on the other hand, just as mind blowing was the anxious realization that millions of families would soon have access to this impressive medical expertise, and I could not figure out how we could guarantee or certify that GPT-4’s advice would be safe or effective.”



GPT-4 isn’t always right — and it has no ethical compass


GPT-4 isn’t always reliable, and the book is filled with examples of its blunders. They range from simple clerical errors, like misstating a BMI that the bot had correctly calculated moments earlier, to math mistakes like inaccurately “solving” a Sudoku puzzle or forgetting to square a term in an equation. The mistakes are often subtle, and the system has a tendency to assert it is right, even when challenged. It’s not a stretch to imagine how a misplaced number or miscalculated weight could lead to serious errors in prescribing or diagnosis.


Like previous GPTs, GPT-4 can also “hallucinate” — the technical euphemism for when AI makes up answers, or disobeys requests.


When asked about this issue by the authors of the book, GPT-4 said: “I do not intend to deceive or mislead anyone, but I sometimes make mistakes or assumptions based on incomplete or inaccurate data. I also do not have the clinical judgment or the ethical responsibility of a human doctor or nurse.”


One potential cross-check the authors suggest in the book is to start a new session with GPT-4, and have it “read over” and “verify” its own work with a “fresh set of eyes.” This tactic sometimes works to reveal mistakes — though GPT-4 is somewhat reticent to admit when it’s been wrong. Another error-catching suggestion is to command the bot to show you its work, so you can verify it, human-style.
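
The book describes this cross-check in conversational terms, not as code, but it maps naturally onto two independent API sessions. Below is a minimal sketch under that assumption, again using the `openai` Python client; the model name, prompts, and dosing question are illustrative, not from the book.

```python
# Sketch of the "fresh set of eyes" cross-check: a second, independent
# session reviews the first session's answer. Assumes the `openai`
# package and an OPENAI_API_KEY environment variable; prompts are invented.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"  # illustrative; use whichever GPT-4-class model is available

def ask(prompt: str) -> str:
    """One-shot call with no shared history, i.e., a fresh session."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

question = "A patient weighs 80 kg. What is the total dose at 5 mg/kg?"
first_answer = ask(question)

# Second, independent session: no memory of the first conversation.
# It is asked to show its work step by step so a human can verify it.
review = ask(
    f"Question: {question}\n"
    f"Proposed answer: {first_answer}\n"
    "Check this answer step by step and flag any errors."
)
print(review)
```

Because the reviewing session shares no conversation history with the first, it cannot simply defer to its earlier reasoning, which is the point of the “fresh set of eyes” tactic; a human still has to read the step-by-step check.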


It’s clear that GPT-4 has the potential to free up precious time and resources in the clinic, allowing clinicians to be more present with patients, “instead of their computer screens,” the authors write. But, they say, “we have to force ourselves to imagine a world with smarter and smarter machines, eventually perhaps surpassing human intelligence in almost every dimension. And then think very hard about how we want that world to work.”


Originally published at https://www.insider.com on April 8, 2023.


Selected names


  1. Dr. Isaac Kohane — He is a physician and computer scientist at Harvard.
  2. Carey Goldberg — She is an independent journalist and co-author of the book “The AI Revolution in Medicine.”
  3. Peter Lee — He is a vice president of research at Microsoft and co-author of the book “The AI Revolution in Medicine.”