Title: Stanford Study Reveals ChatGPT Can Outperform Medical Students in Answering Tricky Clinical Care Exam Questions

Subtitle: Researchers Call for New Approach to Teaching as AI Technology Shines in Evaluating Clinical Reasoning Skills

Stanford University researchers have conducted a study showing that GPT-4, the latest version of ChatGPT, outperforms first- and second-year medical students when answering challenging clinical care exam questions. The researchers say the result points to the need for a new teaching approach in medical education.

While ChatGPT, developed by OpenAI, had already shown that it can answer multiple-choice questions on the United States Medical Licensing Examination (USMLE), the researchers at Stanford wanted to examine the AI system’s performance on more difficult, open-ended questions that assess clinical reasoning skills.

The assessment involved questions that provided detailed information about patients’ medical cases and required doctors-in-training to apply analytical and diagnostic skills. To their surprise, the researchers found that ChatGPT scored, on average, more than four points higher than human trainees on this section of the exam.

Eric Strong, a hospitalist and associate clinical professor at Stanford School of Medicine and one of the study’s authors, spoke about their astonishment at ChatGPT’s success in tackling medical reasoning questions. He noted that “ChatGPT exceeded human examinees’ scores in answering these types of medical reasoning questions.” This achievement underscores the AI’s potential to contribute significantly to the medical field.

The study utilized the latest version of ChatGPT, GPT-4, which was released in March 2023. This investigation builds upon an earlier study that focused on GPT-3.5, its predecessor, released by OpenAI in November 2022.

While ChatGPT’s prowess in handling multiple-choice questions was expected, since these questions mainly require information recall, the researchers emphasize that open-ended, free-response questions pose a much greater challenge. This finding has prompted Stanford Medical School to make adjustments, such as switching from open-book exams with internet access to closed-book exams where students must rely solely on memory. However, this change comes at the expense of assessing students’ ability to gather information from various sources, a vital skill in clinical care.

Recognizing the significant impact of ChatGPT, Stanford Medical School is considering incorporating artificial intelligence tools into the curriculum to enhance student learning. The aim is to strike a balance between leveraging the benefits of AI and ensuring that doctors are adequately trained to reason through cases independently.

“While we don’t want doctors to overly rely on AI during their education, we also fear a world in which doctors are ill-prepared to use AI effectively, despite its prevalence in modern practice,” said one of the researchers, highlighting the importance of finding the right balance in utilizing AI technologies in medical education.

As the medical community explores and embraces the potential of AI, this study sheds light on how technology can supplement traditional medical education and enhance clinical reasoning skills. It is hoped that these advancements will ultimately contribute to the provision of higher-quality healthcare in the future.

