Research Protocols

GenAI Clinical Evaluation Research Protocols

By using one or more LLMs to analyze clinical cases, researchers can examine any task within healthcare delivery and assess how effectively GenAI supports informed decision-making. A study may focus on a single disease or cover a group of diseases. Healthcare settings can range from a single hospital or clinic to multiple hospitals, or to research networks that span hospitals and community or rural clinics. Below are three example protocol outlines for different settings.

Disease Study Protocol:

1. A doctor initiates a project, selects a specific disease of interest, and clearly defines a clinical task to enhance with the aid of GenAI.
2. The doctor adds a new clinical case, derived from a de-identified patient case, to the case study page.
3. The doctor instructs Copilot to analyze the case with a selected LLM, for example to predict possible causes.
4. The doctor uses insights learned from the AI analysis to refine the clinical decision for the case.
5. The doctor compares the clinical decisions made before and after using AI analysis to assess the usefulness of the AI information, assigning a category of usefulness (e.g., most useful, useful, not useful).
6. Steps 2-5 are repeated for a series of cases over a designated period, compiling a comprehensive dataset for analysis.
7. A collaborating doctor reviews the clinical decisions and verifies the usefulness assessment.
8. Both doctors also review and determine the accuracy of the GenAI predictions.
9. Conduct a statistical analysis of the effectiveness of GenAI analysis, and correlate the accuracy of the GenAI predictions with their usefulness.
10. Draw conclusions from the quantified real-world evaluation data. For publication, adhere to the common requirements outlined below.
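
The record-keeping and correlation steps above can be sketched in code. The following is a minimal illustration, not part of the protocol itself: the `CaseEvaluation` fields, the 0-2 accuracy scale, and the ordinal coding in `USEFULNESS_RANK` are all assumptions made for the example, and the sample records are made up. It tallies the usefulness categories and computes a Spearman rank correlation between accuracy and usefulness, one reasonable choice for ordinal scores.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt

# Hypothetical ordinal coding of the usefulness categories named in the protocol.
USEFULNESS_RANK = {"not useful": 0, "useful": 1, "most useful": 2}


@dataclass
class CaseEvaluation:
    case_id: str
    decision_before: str  # clinical decision recorded before AI analysis
    decision_after: str   # clinical decision refined after AI analysis
    usefulness: str       # one of the USEFULNESS_RANK keys
    accuracy: int         # assumed reviewer score: 0 (wrong) to 2 (correct)


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / sqrt(var_x * var_y)


# Made-up records, for illustration only.
cases = [
    CaseEvaluation("c1", "plan A", "plan B", "most useful", 2),
    CaseEvaluation("c2", "plan A", "plan A", "useful", 2),
    CaseEvaluation("c3", "plan C", "plan D", "useful", 1),
    CaseEvaluation("c4", "plan E", "plan E", "not useful", 0),
    CaseEvaluation("c5", "plan F", "plan F", "not useful", 1),
]

usefulness_counts = Counter(c.usefulness for c in cases)
rho = spearman(
    [c.accuracy for c in cases],
    [USEFULNESS_RANK[c.usefulness] for c in cases],
)
print(usefulness_counts, round(rho, 3))
```

A rank-based measure is used here because both scores are ordinal categories rather than interval measurements; a real study would choose its statistic with a biostatistician.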

Specialty Study Protocol:

1. Any departmental lead researcher can use this protocol to explore the impact of GenAI on clinical decisions related to a variety of diseases within their medical specialty.
2. The project lead initiates a research project by setting clear goals and inviting fellow doctors to join the project.
3. Each team member uses Copilot to examine clinical cases derived from de-identified patient cases arising in the course of routine clinical care over a specified timeframe.
4. For each case, the doctor records the clinical decisions made before and after using AI analysis, scores the accuracy of the AI prediction, and assesses the usefulness of the AI information by assigning a category of usefulness.
5. Repeat steps 3-4 until most diseases have enough cases for analysis.
6. Team members cross-review each other's assessments and resolve any differences.
7. Conduct a statistical analysis of the effectiveness of GenAI analysis for each disease, and correlate the accuracy of the GenAI predictions with their usefulness.
8. Draw conclusions from the quantified real-world evaluation data for all the diseases under study.
9. For publication, adhere to the common requirements outlined below.
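
The cross-review step can be made quantitative by measuring how often two reviewers assign the same usefulness category. The sketch below uses Cohen's kappa for this; the protocol does not prescribe an agreement statistic, so this is one possible choice, and the function names, case IDs, and ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two reviewers labelling the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both reviewers must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: share of cases given the same label by both reviewers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each reviewer's label frequencies, summed.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both reviewers used one identical label throughout
    return (observed - expected) / (1 - expected)


def disagreements(case_ids, ratings_a, ratings_b):
    """List the cases whose ratings differ and still need to be resolved."""
    return [cid for cid, a, b in zip(case_ids, ratings_a, ratings_b) if a != b]


# Made-up ratings, for illustration only.
case_ids = ["s1", "s2", "s3", "s4"]
reviewer_1 = ["useful", "useful", "not useful", "most useful"]
reviewer_2 = ["useful", "not useful", "not useful", "most useful"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
to_resolve = disagreements(case_ids, reviewer_1, reviewer_2)
print(round(kappa, 3), to_resolve)
```

The list returned by `disagreements` gives the team a concrete worklist for resolving differences before the statistical analysis.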

Clinical Research Network (CRN) Study Protocol:

1. Any Principal Investigator affiliated with a teaching hospital may use the CRN protocol to facilitate a collaborative effort to study and enhance care delivery across a diverse range of hospitals and clinics, including those in community and rural settings.
2. The project lead is responsible for initiating a project, defining a specific task for assessing GenAI's impact, and inviting collaborators to join the team.
3. All team members use Copilot to analyze clinical cases derived from de-identified patient cases encountered during routine clinical care over a designated period.
4. For each case, the clinical decisions made before and after AI analysis are compared to ascertain the degree of improvement or the benefits realized from incorporating AI analysis. Assign a score to the accuracy of the GenAI prediction and a category of usefulness to the GenAI information.
5. Repeat steps 3-4 until each disease has enough cases for analysis.
6. Team members cross-review each other's assessments and resolve any differences.
7. Conduct a statistical analysis of the effectiveness of GenAI analysis for each disease, and correlate the accuracy of the GenAI predictions with their usefulness.
8. Characterize any disparities in the task outcome between doctors from hospitals and doctors from low-resource clinics. Compare the impact of GenAI between these doctor groups.
9. Draw conclusions from the quantified real-world evaluation data for all the diseases under study and determine whether access to GenAI in clinical delivery can reduce healthcare disparities.
10. For publication, adhere to the common requirements outlined below.
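
One way to compare the impact of GenAI between hospital doctors and low-resource clinic doctors is to compare the share of cases each group rated as useful. The sketch below applies a two-proportion z-test with a normal approximation. It is an assumed analysis choice, not one the protocol specifies; the counts are invented, and treating "useful" and "most useful" together as a single "useful" outcome is also an assumption.

```python
from math import erfc, sqrt


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (difference in proportions, z statistic, two-sided p-value)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled proportion under the null hypothesis that both groups are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return p_a - p_b, float("nan"), float("nan")
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return p_a - p_b, z, p_value


# Invented counts: cases rated "useful" or "most useful" out of all cases reviewed.
clinic_useful, clinic_total = 30, 50      # low-resource clinic doctors
hospital_useful, hospital_total = 20, 50  # hospital doctors

diff, z, p_value = two_proportion_z_test(
    clinic_useful, clinic_total, hospital_useful, hospital_total
)
print(f"difference={diff:.2f}, z={z:.2f}, p={p_value:.4f}")
```

The normal approximation is only reasonable when each group has a fair number of cases; with small samples an exact test would be the safer choice.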

Common Requirements for Research Publication

Employ a rigorous scientific method, such as Comparative Effectiveness Research (CER) or a Pragmatic Clinical Trial (PCT). The study may compare the results of using AI analysis against a previous baseline or a parallel control group.
Quantify the outcomes of using AI analysis to facilitate statistical analysis.
Ensure that your results are reproducible in additional experiments.
Obtain approval for the study plan from your Institutional Review Board (IRB).

Further reading:

Exploring Healthcare GenAI Research Ideas

Understanding Healthcare GenAI Research Approaches