User Help

The ELHS Copilot is an online generative AI (GenAI) learning and research tool for medical students, doctors, and all other healthcare professionals. Copilot takes a GenAI-first approach to streamlining key tasks that are being transformed by GenAI based on large language models (LLMs). The goal is to make it easier for every doctor to start collaborating with a GenAI copilot in their medical education and healthcare career. Copilot is developed by the ELHS Institute and provided free of charge to accelerate the democratization of GenAI in healthcare and to help achieve global health equity. It has grown out of a series of our published works: the initial concept of a learning copilot published in JAMA, a foundational ChatGPT benchmarking study published in JAMIA, and a literature review on advancing the democratization of GenAI in healthcare published in JHMHP.

Medicine has just entered the new world of GenAI, where the unparalleled capabilities of GenAI in natural language interaction and reasoning make it feasible to analyze clinical cases and provide expert insights and predictions that can accelerate medical learning, clinical learning, and training. Every doctor may eventually have a GenAI copilot for learning, research, and practice. Copilot provides the following key functional modules to help doctors get started on this journey:

Benchmarking Module: Providing scoreboards to show the potential benefits of GenAI in key healthcare tasks across most diseases.
Learning Module: Using multiple top GenAI chatbots to analyze clinical cases or answer any questions, with dynamic links to case-specific clinical learning guides.
Research Module: Studying clinical cases to evaluate the benefits of different LLMs, ensuring data quality in collaborations.
User Module: Managing user settings and contacting us for technical support.

Benchmarking Module

Systematic GenAI Benchmarking

Benchmarking GenAI for every applicable task in healthcare is crucial to the democratization of healthcare AI, setting the foundation for learning, research, and application of GenAI.

GenAI Healthcare Task Benchmarking and Scoreboards

The ELHS Institute is systematically benchmarking the top GenAI LLMs and gradually adding benchmarking scoreboards to Copilot, both on the Copilot home page and in the Benchmarking module.

The current benchmarking results have demonstrated that general-purpose LLMs such as ChatGPT-4 and Gemini-1.5 have high accuracy in diagnostic prediction across a wide range of diseases. Although the diagnostic prediction accuracy of open-source LLMs (OS-LLMs) like Llama-3 and Gemma-2 is much lower than that of ChatGPT-4, it is at a level that warrants using OS-LLMs as the baseline for fine-tuning to provide desired intelligence in some specific healthcare tasks.

Users can find diseases in the benchmarking datasets and test diagnostic prediction by different LLMs using the Learning module. From such comparisons, you may find potential target diseases that can be accurately predicted by the best GenAI but not by OS-LLMs. These differences provide good opportunities for you to bring GenAI abilities into your clinical research, as it is possible to fine-tune OS-LLMs to achieve high accuracy so that you can deploy your own fine-tuned LLMs for optimizing specific tasks in your environment.
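The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the data layout, and the two thresholds are hypothetical, and the numbers in the example are made-up placeholders, not Copilot benchmarking results.

```python
def fine_tuning_candidates(accuracy, best_model, baseline_model,
                           high=0.9, low=0.7):
    """Return diseases the best model predicts well but the baseline does not.

    accuracy maps disease -> {model name -> accuracy in [0, 1]}.
    The thresholds are arbitrary illustrative cut-offs, not Copilot values.
    """
    candidates = []
    for disease, by_model in accuracy.items():
        best = by_model.get(best_model)
        base = by_model.get(baseline_model)
        if best is None or base is None:
            continue  # skip diseases not benchmarked on both models
        if best >= high and base < low:
            candidates.append((disease, best - base))
    # Largest accuracy gap first.
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Made-up placeholder numbers, for illustration only.
example = {
    "disease A": {"commercial": 0.95, "open_source": 0.60},
    "disease B": {"commercial": 0.92, "open_source": 0.88},
    "disease C": {"commercial": 0.70, "open_source": 0.40},
}
print(fine_tuning_candidates(example, "commercial", "open_source"))
```

Diseases at the top of the returned list have the widest gap between the two models and may be the most interesting starting points for a fine-tuning study.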

Learning Module

Streamlined GenAI Learning in Copilot

Use multiple commercial and open-source LLMs together to learn GenAI for healthcare tasks and identify tasks for fine-tuning open-source LLMs.
While chatting with GenAI, learn from in-context clinical learning guides in two languages.

The Learning module integrates multiple top LLMs in the AIChat form from both commercial and open-source domains. While AIChat allows users to chat about anything with GenAI, it also provides predefined tasks for analyzing patient cases. When predicting patient cases, AIChat will link the predicted diseases to clinical learning guides (CLGs), enabling in-context medical learning.

AIChat: Chat with GenAI

Users can chat with any of the top models and learn how LLMs can assist with various educational or healthcare tasks. Additionally, by comparing the best commercial models to the top open-source models, users may identify interesting tasks for creating their own fine-tuned open-source models to apply GenAI in their healthcare practices.

Task:

For predicting disease causes or checking symptoms, select a predefined task and enter a patient case without system instructions. AIChat will create an appropriate prompt for the chat.
For any other tasks, select “Any Task” and enter a complete question with instructions in the chat box.

Model:

Choose a model from the top commercial chatbots (ChatGPT, Gemini, Ernie) and open-source LLMs (Llama, Gemma, Mistral).
Choose different models to compare their performance. The latest versions of ChatGPT, Gemini, or Ernie should provide the best responses. The latest versions of Llama, Gemma, or Mistral should provide baseline responses from which users can build specialized fine-tuned LLMs.

Response Language:

Set the second language if you prefer bilingual responses from GenAI. For example, Chinese users may choose English as the second language, resulting in bilingual chat responses in Chinese and English.
This bilingual response feature makes it more convenient for non-English speaking users to learn medicine in both their native language and English.

Chat Input:

For “Any Task,” input a prompt with clear questions and instructions for GenAI to understand your intent and generate the best response for you.
For predefined tasks, e.g., Diagnostic prediction or Symptom checking, simply input a patient case, and AIChat will create an appropriate prompt to get the best answer from GenAI.
Click the Chat button. Depending on the model, it may take a few seconds to get the response.
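To make the difference between “Any Task” and the predefined tasks concrete, here is a minimal Python sketch of how a prompt could be assembled. The instruction templates and the function name are hypothetical; the prompts that AIChat actually sends to the models are not documented on this page.

```python
# Hypothetical instruction templates, written for illustration only.
TASK_TEMPLATES = {
    "Symptom checking": (
        "Based on the following patient case, list the possible diseases, "
        "a triage level, and the relevant specialty.\n\nPatient case:\n{case}"
    ),
    "Diagnostic prediction": (
        "Based on the following patient case, predict the most likely "
        "diagnoses and the relevant specialty.\n\nPatient case:\n{case}"
    ),
}


def build_prompt(task, user_input, second_language=None):
    """Return the text sent to the model for a given task and chat input."""
    text = user_input.strip()
    if not text:
        raise ValueError("chat input must not be empty")
    if task == "Any Task":
        prompt = text  # the user supplies the full question and instructions
    elif task in TASK_TEMPLATES:
        prompt = TASK_TEMPLATES[task].format(case=text)
    else:
        raise ValueError(f"unknown task: {task!r}")
    if second_language:
        # Assumed wording for requesting a bilingual response.
        prompt += f"\n\nPlease also give the answer in {second_language}."
    return prompt


print(build_prompt("Symptom checking", "Persistent cough for three weeks.",
                   second_language="English"))
```

With a predefined task you only type the patient case; with “Any Task” the text you enter is used as the prompt unchanged.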

GenAI Answer:

For any task, the answer depends on the questions and instructions. You may adjust your prompt to explore better responses.

For the predefined healthcare tasks, GenAI prediction depends on whether the patient case has enough information.

For the “Symptom checking” task, the patient case should include symptoms, medical history, family history, physical exams, environmental factors, etc., if available. This task is typically used to assess patient risk and triage.
For the “Diagnostic prediction” task, the patient case should include symptoms, medical history, family history, physical exams, environmental factors, lab tests, imaging results, pathology results, genetic tests, other diagnostic tests, etc., if available.
From the chat response, the predicted diseases will be dynamically linked to corresponding CLGs for further learning. This linking feature brings the related medical knowledge from GenAI to your clinical study context, enabling convenient “in-context” studying of the case-related learning guides while analyzing patient cases with GenAI.
- For some diseases, the CLGs are pre-created and reviewed based on the medical knowledge of GPT-4.
- For other diseases, the CLGs present a list of questions and will answer questions in real-time with GPT-4 when clicked by users.

AIChat Example 1:

Task: Symptom checking
Model: OpenAI ChatGPT 4o

Chat Input:
GenAI Answer:

Case-related Study: Clinical Learning Guides

Predicted Possible Diseases:

- Lung Cancer: Pulmonology

- Chronic Obstructive Pulmonary Disease (COPD) Exacerbation: Pulmonology

Triage: Urgent

Specialty: Pulmonology

AIChat Example 2:

Task: Diagnostic prediction
Model: OpenAI ChatGPT 4o

Chat Input:
GenAI Answer:

Case-related Study: Clinical Learning Guides

Predicted Possible Diseases:

- Lung Cancer: Oncology

- Mesothelioma: Oncology

Triage: Urgent

Specialty: Oncology

Clinical Learning Guides

The CLGs page lists the latest guides created from ChatGPT's vast medical knowledge. You can search CLGs by name or UMLS CUI. The CUI codes connect CLGs to international standards of medical ontologies, providing a bridge to traditional medical knowledge.

Each CLG for a disease currently focuses on disease diagnosis, which can be expanded to treatment, management, and prevention in the future. In general, disease CLGs cover the following basic questions and more. For example, the lung cancer CLG covers:

What is lung cancer? What are its types? Its synonyms? What is its parent (broader) disease, and what are all its child (narrower) diseases?
What patient information and test results, including symptoms, medical history, physical exams, and diagnostic tests, are required to differentially diagnose lung cancer?
What are the specific results from medical history taking, physical examinations, and diagnostic tests for a patient with lung cancer?
Please provide an example medical record containing symptoms, medical history, physical examination, and results of diagnostic tests required for a differential diagnosis of lung cancer.
Please provide a one-paragraph patient case example of lung cancer.
Briefly describe the diseases that should be ruled out in the differential diagnosis of lung cancer.

Bilingual Learning: CLGs make it more convenient for non-English speaking users to learn medicine in both their native language and English. For automatic CLGs, questions are answered in both the user's native language (as set in user settings) and English (as set in user GenAI settings). For manually reviewed CLGs, translation currently only applies to Chinese.

Please note that AI answers may have errors and are only for learning purposes. Users are responsible for their learning outcomes.

Research Module

Streamlined Healthcare GenAI Evaluation Research in Copilot

Use Copilot to evaluate the effectiveness of one or multiple top LLMs in any healthcare tasks of interest to you.
Standardize data collection in research collaborations to ensure data consistency and publish high-quality GenAI papers.

How To Get Started with Healthcare GenAI Research Now?

GenAI is democratizing AI in healthcare, enabling all medical students, doctors, and health professionals to study how AI can assist them in making more informed and better healthcare decisions. To help you get started now, Copilot's Research module streamlines the GenAI evaluation research process, from ideation to using multiple LLMs, data collection, and project team collaboration.

You can start using GenAI to analyze your clinical cases for diagnosis, treatment selection, or any other predictive tasks you would like to improve. By comparing your clinical decisions before and after using GenAI analysis information, you can evaluate whether GenAI can benefit clinical decision-making.

You may choose from the following top LLMs and compare their performance:

OpenAI ChatGPT
Google Gemini
Baidu Ernie
Meta Llama
Google Gemma
Mistral AI Mistral

Your cases are tracked in Copilot so that you can use them in your collaboration projects. Streamlined research projects make it easier to ensure data consistency and quality, which is crucial for peer-reviewed publication of any new GenAI effectiveness evidence generated from your research.

Next Steps:

To explore healthcare GenAI research ideas.
To understand healthcare GenAI research approaches.
To choose GenAI clinical evaluation research protocols.
To create a new case study.

My Clinical Cases

Copilot tracks your studied clinical cases, offering a platform where you can compare your decision-making before and after using GenAI predictions and analyses over time. This history comparison feature will become instrumental in self-assessing your journey of GenAI learning and research.

New Case Study

From the Research or Cases page, you can start to analyze a new patient's case. Follow these steps to conduct a case evaluation study:

Create a new patient case:
- Enter a Case ID and select a case-related specialty.
- Enter a patient case: If possible, use the same structure as an EHR record to organize the detailed information for the patient case in your first language. For example, to predict diagnosis, a patient's case may include these categories of information: symptoms, histories, physical exams, lab tests, imaging results, pathology results, and other diagnostic tests. Alternatively, you may enter the patient case details in paragraphs with no structure.
- For comparison purposes, enter your initial clinical decision before using GenAI. If a decision hasn't been made yet, simply enter “none”.
- Save the case.
Ask GenAI Copilot:
- Select a commercial or open-source LLM model.
- Select a predefined healthcare task, such as diagnostic prediction or symptom checking. If you instruct the model in the patient case box to do any other tasks, select “Any task”.
- Ask GenAI Copilot to analyze the case. Depending on the model, it may take a few seconds to get the response.
Learn from GenAI and Update Decision:
- After the model's response arrives, carefully review the insights provided by the GenAI analysis and adjust your clinical decision accordingly.
- Record your revised decision in the designated update box.
- Self-evaluate whether the GenAI analysis helped you make a more informed decision, and assign the usefulness category.
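If you also keep your own notes outside Copilot, the case evaluation steps above map naturally onto a small record structure. The following Python sketch is an assumption about how such a record could look; the class and field names are hypothetical and do not describe Copilot's internal data model.

```python
from dataclasses import dataclass
from typing import Optional

USEFULNESS = {"a", "b", "c"}  # the three effectiveness categories


@dataclass
class CaseStudy:
    case_id: str
    specialty: str
    patient_case: str                 # de-identified case text
    initial_decision: str = "none"    # decision before using GenAI
    model: Optional[str] = None
    task: Optional[str] = None
    genai_answer: Optional[str] = None
    updated_decision: Optional[str] = None
    usefulness: Optional[str] = None

    def record_genai_answer(self, model, task, answer):
        self.model, self.task, self.genai_answer = model, task, answer

    def record_update(self, updated_decision, usefulness):
        if self.genai_answer is None:
            raise ValueError("ask GenAI before recording an updated decision")
        if usefulness not in USEFULNESS:
            raise ValueError("usefulness must be 'a', 'b' or 'c'")
        self.updated_decision = updated_decision
        self.usefulness = usefulness

    def decision_changed(self):
        """True if the updated decision differs from the initial one."""
        if self.updated_decision is None:
            raise ValueError("no updated decision recorded yet")
        before = self.initial_decision.strip().lower()
        after = self.updated_decision.strip().lower()
        return before != after


case = CaseStudy("p1", "Family medicine", "De-identified case text ...")
case.record_genai_answer("Google Gemini 1.5", "Diagnostic prediction", "...")
case.record_update("Gastric cancer", "b")
print(case.decision_changed())
```

Keeping the initial and updated decisions side by side is what makes the before-and-after comparison possible later.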

For convenience, you may set your default task, model, bilingual language, specialty, etc., on the “User > GenAI” page.

Remember:

All personal identification information MUST be removed from patient cases to protect patient privacy before entering them into Copilot.
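As a last check before pasting a case, you could scan the text for a few obvious identifier patterns. The Python sketch below is only a rough aid under stated assumptions: the patterns and the function name are hypothetical, they catch only simple e-mail, phone-like, and ISO-date strings, and they cannot detect names, addresses, or record numbers. It is not a de-identification tool and does not replace manual review.

```python
import re

# Illustrative patterns only; real de-identification needs far more than this.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def find_possible_identifiers(text):
    """Return (label, matched text) pairs for strings that look like identifiers."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits


sample = "Seen on 2024-01-15, contact jane.doe@example.org or 555-123-4567."
for label, value in find_possible_identifiers(sample):
    print(label, value)
```

An empty result does not mean the case is safe to enter; it only means none of these few patterns matched.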

Case Study Example 1

Case ID: p1
Specialty: Family medicine

Patient case:

Symptoms and health factors:
Non-specific symptoms.
Symptoms similar to gastritis, ulcer, such as: upper abdominal discomfort or hidden pain after meals, loss of appetite, belching, acid reflux, nausea, vomiting, black stools.
Subsequent symptoms include weight loss, anemia, fatigue, persistent upper abdominal pain leading to severe abdominal pain, difficulty swallowing or reflux, vomiting of stale food, gastrointestinal bleeding, lack of stomach acid, diarrhea due to accelerated gastric emptying.
Late stages include severe weight loss, anemia, edema, fever, ascites, jaundice, cachexia.

Medical and family histories:
Possible history of Helicobacter pylori infection, chronic atrophic gastritis.
Possible family history of stomach cancer.

Physical examination:
No obvious signs.
Signs include deep pressure pain in the upper abdomen, sometimes with mild muscle resistance, upper abdominal mass, gastrointestinal obstruction, bloody ascites, enlarged supraclavicular lymph nodes, anterior rectal vault mass, umbilical mass.

Laboratory tests:
Elevated tumor markers: CA72-4, CEA, CA19-9, AFP, CA125.

Imaging and other tests:
None.

Record Initial Clinical Decision:
Ask GenAI Copilot to Analyze the Case for the Selected Task:

Select Model: Google Gemini 1.5
Select Task: Diagnostic prediction

GenAI Answer:

Prediction:

1. **Most likely causes:**
* **Gastric cancer:** The patient presents with classic symptoms of gastric cancer, including persistent upper abdominal pain, weight loss, anemia, and elevated tumor markers.
* **Chronic atrophic gastritis:** The patient's history of Helicobacter pylori infection and chronic atrophic gastritis, along with the presence of similar symptoms, makes this a possible cause.

2. **Triage:** Urgent referral to a **Gastroenterologist** is recommended due to the potential for gastric cancer.

After Learning from GenAI Analysis, Record Your Updated Decision for Comparison:

Is GenAI analysis helpful in your decision-making?: b: Helpful to improve my decision

My Research Projects

To publish your case study research in SCI journals, it is critical to organize your case studies as comparative effectiveness research (CER) projects. Copilot's project feature streamlines GenAI evaluation studies for you, ensuring data standardization, quality, and consistency. Conducting projects online makes it easy for you to collaborate among team members, standardize data collection, cross-review data, and efficiently manage data, thus helping produce high-quality data and evidence. Online collaboration can be within a hospital, across multiple hospitals, or even within a clinical research network (CRN) led by hospitals and involving community health centers, rural hospitals, and clinics.

Start New Project

From the project list, start creating a new project:

Define the project task and enter the project description, then create the project.
As the project manager, you can add co-leads with their account emails.
Invite members with their account emails to join the project.
Members accept the invitations and then associate their studied cases with the project.
Co-leads review the cases, score the GenAI predictions, and determine the effectiveness of GenAI analysis on the doctor's decision-making.
Conduct statistical analysis and draw CER conclusions.
Write a manuscript for publication.

Simple 3-level scoring:

1 = correct prediction.
0.5 = prediction very close to expectation.
0 = incorrect or any other prediction.

Simple 3-level effectiveness categories:

a: Most helpful to decision-making.
b: Helpful to decision-making.
c: Not helpful to decision-making.
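Once every case has a score and an effectiveness category, the project-level summary is simple arithmetic. The Python sketch below shows one possible way to compute it; the function name and the returned fields are assumptions, and “strict accuracy” here means the fraction of cases scored exactly 1.

```python
from collections import Counter

VALID_SCORES = {0, 0.5, 1}
VALID_CATEGORIES = {"a", "b", "c"}


def summarize(cases):
    """Summarize a list of (score, category) pairs, one pair per reviewed case."""
    if not cases:
        raise ValueError("no cases to summarize")
    for score, category in cases:
        if score not in VALID_SCORES:
            raise ValueError(f"invalid score: {score!r}")
        if category not in VALID_CATEGORIES:
            raise ValueError(f"invalid category: {category!r}")
    scores = [score for score, _ in cases]
    return {
        "n": len(cases),
        # Average of the 3-level scores, so partial credit counts as 0.5.
        "mean_score": sum(scores) / len(scores),
        # Fraction of cases where the prediction was fully correct.
        "strict_accuracy": sum(1 for s in scores if s == 1) / len(scores),
        "categories": dict(Counter(category for _, category in cases)),
    }


print(summarize([(1, "a"), (0.5, "b"), (0, "c"), (1, "b")]))
```

Reporting both the mean score and the strict accuracy makes it clear how much of the result depends on “very close” predictions.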

Conducting a GenAI Project as an Individual User

Any individual user (doctor, student, or health professional) can independently conduct GenAI research in medical education and clinical training. For instance, a project for evaluating GenAI in stroke risk prediction involves the following steps:

1. Create a new project to evaluate the stroke risk prediction task by LLMs.
2. Analyze your clinical cases with GenAI and evaluate the effectiveness of GenAI analysis in your decision-making. Tag the cases for this project.
3. Repeat Step 2 until the project has enough cases.
4. Score GenAI prediction for each case, and calculate overall scores and accuracy.
5. Perform statistical analysis on the effectiveness data, correlating GenAI scores with effectiveness.
6. Invite collaborators to review the GenAI prediction scores and effectiveness data for verification of your analysis and results.
7. If the results generate new evidence for GenAI effectiveness, write a manuscript for publication.
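For the step that correlates GenAI scores with effectiveness, one possible choice is a Spearman rank correlation, since both the 3-level score and the a/b/c category are ordinal. The Python sketch below is an illustration, not a method prescribed by Copilot; the numeric coding of the categories (a = 2, b = 1, c = 0) and the function names are assumptions.

```python
import math

# Assumed ordinal coding: higher means more helpful.
HELPFULNESS = {"a": 2, "b": 1, "c": 0}


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(vx * vy)


def spearman(scores, categories):
    """Spearman correlation between prediction scores and helpfulness categories."""
    if len(scores) != len(categories) or len(scores) < 2:
        raise ValueError("need two equally long lists with at least two cases")
    helpfulness = [HELPFULNESS[c] for c in categories]
    return pearson(average_ranks(scores), average_ranks(helpfulness))


print(spearman([1, 0.5, 0, 1, 0.5], ["a", "b", "c", "b", "b"]))
```

A value near 1 would suggest that more accurate predictions tend to be rated as more helpful. For a publication you would also report a significance test and the sample size, ideally computed with an established statistics package.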

Conducting a GenAI Project as a Project Team

If you are part of a project team, you can add co-leads and invite team members to the project. Members accept your invitation and then tag their clinical cases to the project. You and co-leads review all cases from team members, score GenAI predictions, assign GenAI effectiveness categories, and conduct statistical analysis. If the results generate new evidence for GenAI effectiveness, you can write a manuscript for publication.

Best Practices for Designing GenAI Research for Publication

Here are some key considerations to incorporate into your GenAI study design for publishing your research in SCI journals:

Employment of Rigorous Scientific Methods: Utilize established scientific methods, such as Comparative Effectiveness Research (CER) or Pragmatic Clinical Trial (PCT). An appropriate comparative analysis is pivotal to delineate the impact of AI in clinical care. Outcomes post-GenAI analysis should be compared against a previous baseline or a parallel control group.
Quantification and Statistical Analysis of Outcomes: It is essential to quantify the changes from using GenAI analysis and subject data to thorough statistical analysis to validate the findings.
Result Reproducibility: Ensure that your findings can be consistently reproduced in subsequent experiments to affirm their validity and reliability.
IRB Approval: Secure approval for your study plan from the Institutional Review Board (IRB) to ensure ethical compliance and integrity.
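As an example of the comparative analysis described above, suppose reviewers have judged, for each case, whether the doctor's decision was correct before and after using GenAI. An exact McNemar test on these paired outcomes is one common option. The Python sketch below is a minimal illustration under that assumption; the function names are hypothetical, and your statistician may prefer a different test for your study design.

```python
from math import comb


def discordant_counts(correct_before, correct_after):
    """Count cases that improved or worsened between the two decisions."""
    if len(correct_before) != len(correct_after):
        raise ValueError("both lists must describe the same cases")
    pairs = list(zip(correct_before, correct_after))
    improved = sum(1 for before, after in pairs if not before and after)
    worsened = sum(1 for before, after in pairs if before and not after)
    return improved, worsened


def mcnemar_exact(improved, worsened):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    if improved < 0 or worsened < 0:
        raise ValueError("counts must be non-negative")
    n = improved + worsened
    if n == 0:
        return 1.0  # no case changed, so there is nothing to test
    k = min(improved, worsened)
    # Binomial(n, 0.5) probability of a split at least this uneven, one side.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


before = [False, False, True, False, True, False]
after = [True, True, True, True, True, False]
improved, worsened = discordant_counts(before, after)
print(improved, worsened, mcnemar_exact(improved, worsened))
```

Only the cases whose correctness changed contribute to the test, which is why a study needs enough cases overall to observe a meaningful number of changes.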

For assistance and support in GenAI research and publication, feel free to Contact us.

User Module

The User module lets you manage your account and contact us for support.

User Registration

User self-registration requires a unique email address for verification, login, and communication. One email address can register for only one account. A user ID is automatically generated, but optionally you can provide a unique ID for login convenience. After you click the “Register” button, you will receive an email from support@elhsi.org. To verify your registration, click the verification link within the email.

User Login

User Settings

On the [User > Settings] page, you can update your user information and change your password. You may change your email address, but the new email address requires new email verification.

GenAI Settings

On the [User > GenAI] page, you can set your default GenAI settings:

Parameter	Options
GenAI LLM Model	Choose from top models: OpenAI ChatGPT, Google Gemini, Baidu Ernie, Meta Llama, Google Gemma, Mistral AI Mistral.
Prediction Task	DxP: Diagnostic prediction. DxLC: Diagnostic prediction with localization and characterization. ScP: Symptom checking.
Bilingual Language	Choose the second language for responses. Non-English speakers are advised to select English so that GenAI responses will be bilingual with English for comparison.
Specialty	Select your clinical specialty; the default is family medicine.
Case Type	Real Patient Case (default): For studying de-identified real patient cases. Hypothetical Patient Case: For theoretical studies.

Contact Us

On the [User > Contact] page, you may ask any questions and request more GenAI credits. You are welcome to discuss technical support for your GenAI learning and training, GenAI research, open-source LLM fine-tuning and deployment, GenAI publication, etc. The Contact page also displays your contact history for your records.