A recent study by Mass General Brigham found that ChatGPT, an artificial intelligence chatbot, shows promise in clinical decision-making. Researchers found that ChatGPT achieved an overall accuracy of 72 percent when making clinical decisions, comparable to the proficiency of an intern or resident who has just graduated from medical school.
The study involved feeding 36 clinical scenarios into ChatGPT and asking it for a list of possible diagnoses, known as a differential diagnosis. The chatbot was then given additional information and asked for a final diagnosis and treatment plan. ChatGPT achieved a 77 percent accuracy rate in final diagnoses, a 68 percent accuracy rate in clinical management decisions, and a 60 percent accuracy rate in differential diagnoses.
Despite these results, the study also highlighted ChatGPT's limitations. The chatbot performed best at providing final diagnoses but struggled to generate differential diagnoses. This suggests that ChatGPT may be more effective at confirming a diagnosis than at suggesting potential alternative conditions.
Dr. Marc Succi, the study’s corresponding author and executive director of MESH Incubator at Mass General Brigham, said that while no definitive benchmarks exist for evaluating chatbot performance, ChatGPT’s results suggest a competency level akin to that of someone who has recently completed medical school.
Looking ahead, Dr. Adam Landman, Chief Information Officer and Senior Vice President of Digital at Mass General Brigham, emphasized the potential of large language models like ChatGPT. He stated, “We see great promise in these models and are assessing their accuracy, reliability, safety, and equity for clinical documentation and drafting responses to patient messages.” Dr. Landman stressed that rigorous studies like this one are needed to ensure that these language models are integrated into clinical care responsibly.
1. What is ChatGPT?
ChatGPT is an artificial intelligence chatbot developed by OpenAI that uses large language models to engage in conversations and provide responses based on the information it has been trained on.
2. How accurate is ChatGPT in clinical decision-making?
According to the study by Mass General Brigham, ChatGPT demonstrated an overall accuracy rate of 72 percent in clinical decision-making, comparable to the level of an intern or resident who has just graduated from medical school.
3. What were the main findings of the study?
The study found that ChatGPT performed better in providing final diagnoses, with a 77 percent accuracy rate. However, it was less accurate in determining possible differential diagnoses, achieving a 60 percent accuracy rate.
4. How will ChatGPT be used in clinical care?
Mass General Brigham is currently evaluating the integration of large language models like ChatGPT into clinical care. It is specifically assessing these models’ accuracy, reliability, safety, and equity for tasks such as clinical documentation and drafting responses to patient messages.
Mass General Brigham: https://www.massgeneralbrigham.org