Validation of Open-Source LLMs for Healthcare Tasks

Before adopting an open-source LLM for a specific healthcare task, it is critical to validate it on that task, because prediction accuracy can vary significantly from one disease to another. To help doctors choose the right LLM for high-impact diseases, we are validating the top open-source LLMs, including Llama-3.1-70B, Llama-3.1-8B, Llama3-70B, Llama3-8B, and Gemma2-9B.

The table below lists some of the validated diagnostic prediction tasks on which the best open-source LLMs have achieved over 90% accuracy. A few example patient cases are provided for each disease task. Each example link opens AIChat, where you can test any of the open-source LLMs from the LLM dropdown list and compare them with the best commercial models.


Healthcare Tasks                  Patient Case Examples
Predicting Alzheimer's disease    Example-1   Example-2   Example-3
Predicting Parkinson's disease    Example-1   Example-2   Example-3
Predicting stroke                 Example-1   Example-2   Example-3



ELHS GenAI Copilot alpha v1.1.1 Democratizing GenAI in Healthcare to Help Achieve Global Health Equity © 2023-2024 ELHS Institute. All rights reserved.
elhsi.org
Disclaimer: The contents and tools on this website are for informational purposes only. This information does not constitute medical advice or diagnosis.