the health strategist
research institute for continuous transformation
– of health & tech
Joaquim Cardoso MSc
Chief Editor & Researcher
March 19, 2023
EXECUTIVE SUMMARY
Google has announced the release of MedPaLM 2, its latest medical large language model, at its annual event “The Check Up”.
- MedPaLM 2 achieved a score of 85% on medical exam questions, surpassing similar AI models such as GPT-4 and improving by 18% on the previous performance of Med-PaLM.
- Clinicians and non-clinicians evaluated the model against 14 criteria. Google identified significant disparities and pledged to collaborate with researchers and healthcare professionals to narrow them.
- The initial model, MedPaLM, was evaluated using a new open-source medical question-answering benchmark called MultiMedQA and achieved a passing score of over 60% on multiple-choice style questions.
- In addition, Google launched the PaLM API, which for the first time allows businesses and developers to build applications on Google’s large language model.
DEEP DIVE
Google Introduces MedPaLM 2, A GPT-4 Like Model for Healthcare
Analytics India Mag
By Shyam Nandan Upadhyay
March 15, 2023
At its annual event, “The Check Up”, Google announced MedPaLM 2, the latest version of its medical large language model, along with new health initiatives and partnerships.
According to the MedPaLM 2 team, the model achieved a score of 85% on medical exam questions (USMLE MedQA), which is comparable to the level of an “expert” doctor. This is an improvement of 18% over the previous performance of Med-PaLM and surpasses similar AI models such as GPT-4.
The team also obtained results on other benchmarks such as MedMCQA and MMLU clinical topics.
The evaluators, clinicians and non-clinicians from diverse backgrounds and countries, tested the models against 14 criteria, including scientific accuracy, precision, agreement with medical consensus, reasoning, bias, and potential for harm.
Google identified significant disparities but pledged to collaborate with researchers and healthcare professionals to narrow them and enhance healthcare services.
Google Research and DeepMind released the initial model, MedPaLM, in December 2022. MedPaLM was evaluated using a new open-source medical question-answering benchmark called MultiMedQA.
The AI system achieved a passing score of over 60% on multiple-choice style questions similar to those used in U.S. medical licensing exams. It was the first such system to do so.
The researchers utilised PaLM, a large language model with 540 billion parameters, and its instruction-tuned variant, Flan-PaLM, to create the model. They employed these models to evaluate other large language models using MultiMedQA.
Google also launched the PaLM API shortly before OpenAI released GPT-4. The API lets businesses and developers build applications on Google’s state-of-the-art large language model, which is identical to the one employed in Search, YouTube, and Gmail. This is the first time Google has offered access to its underlying models.
Originally published at https://analyticsindiamag.com
Names mentioned
- Shyam Nandan Upadhyay: author of the original article in Analytics India Mag.
- Google: the company that introduced MedPaLM 2, a GPT-4-like model for healthcare, and launched the PaLM API.
- MedPaLM 2 team: the team responsible for developing the MedPaLM 2 model.
- DeepMind: the research lab that, together with Google Research, released the initial MedPaLM model in December 2022.
- OpenAI: the organization that developed GPT-4, the model MedPaLM 2 is compared with.
- MultiMedQA: a new open-source medical question-answering benchmark used to evaluate the MedPaLM model.
- Clinicians and non-clinicians: evaluators from diverse backgrounds and countries who assessed the models against 14 criteria.