Newswise — ChatGPT correctly answered fewer than half of the questions from a study resource commonly used by physicians preparing for ophthalmology board certification exams, a new study reports.

The study, led by researchers at St. Michael's Hospital and published in JAMA Ophthalmology, found that the artificial intelligence chatbot achieved 46% accuracy when first tested in January 2023. When the same test was repeated a month later, ChatGPT scored more than 10 percentage points higher.

Since ChatGPT was released to the public in November 2022, many have been excited about its potential in medicine and exam preparation. However, there are also concerns about incorrect information and about cheating in academic settings. ChatGPT is free to anyone with internet access and can be used in a conversational way.

Dr. Rajeev H. Muni, who led the study, emphasized that while ChatGPT may become more important in medical education and clinical practice in the future, it is crucial to use such AI systems responsibly. He added that ChatGPT did not provide enough correct answers to multiple-choice questions to be considered a substantial aid for preparing for board certification at this time.

The researchers used a set of multiple-choice questions from the free trial of OphthoQuestions, a commonly used resource for preparing for board certification exams in ophthalmology. To ensure that ChatGPT's responses were not influenced by earlier conversations, they cleared all previous entries and used a new ChatGPT account for each question. Questions requiring image or video input were excluded, since ChatGPT accepts only text.
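The protocol described above can be sketched in Python. Note that the helper names, question fields, and scoring logic here are hypothetical stand-ins for illustration (the authors worked through the public ChatGPT interface, not code); the sketch only captures the two constraints the study describes: each question is sent in a fresh session, and media-based questions are excluded.

```python
# Illustrative sketch of the study's querying protocol.
# query_model and the question fields are hypothetical stand-ins;
# the study used the public ChatGPT interface, not an API.

def is_text_only(question):
    """Exclude questions that require image or video input,
    since the model accepts text only."""
    return not question.get("has_media", False)

def run_trial(questions, query_model):
    """Send each text-only question independently, so that earlier
    exchanges cannot influence later answers (mirroring the study's
    cleared-history, fresh-account approach)."""
    results = {}
    for q in (q for q in questions if is_text_only(q)):
        # query_model is assumed to start a brand-new conversation
        # on every call.
        results[q["id"]] = query_model(q["prompt"])
    return results

def accuracy(results, answer_key):
    """Fraction of answered questions matching the answer key."""
    correct = sum(results[qid] == answer_key[qid] for qid in results)
    return correct / len(results)
```

Under this scheme, the reported first-test result corresponds to 58 correct answers out of 125 text-based questions, i.e. an accuracy of about 0.46.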

In the first test, conducted in January 2023, ChatGPT answered 58 of 125 text-based multiple-choice questions correctly, for 46% accuracy. When the test was repeated in February 2023, its performance improved to 58% correct.

Dr. Marko Popovic, a co-author of the study and a resident physician in the Department of Ophthalmology and Vision Sciences at the University of Toronto, said that ChatGPT has great potential in medical education, even though it answered ophthalmology board certification questions incorrectly about half the time. He added that the team anticipates ChatGPT's body of knowledge will evolve quickly.

According to the study, ChatGPT's multiple-choice selections closely tracked those of ophthalmology trainees. It chose the most popular answer among trainees 44% of the time, the second most popular 22% of the time, the second least popular 18% of the time, and the least popular only 11% of the time.

ChatGPT performed better on general medicine questions, answering 79% of them correctly, while its accuracy was considerably lower on questions from ophthalmology subspecialties such as oculoplastics (20% correct) and retina (0% correct). The authors suggest that ChatGPT's accuracy in niche subspecialties may improve in the future.

CITATIONS

JAMA Ophthalmology