Congress: ECR25
Poster Number: C-28099
Type: Poster: EPOS Radiologist (scientific)
Authorblock: D. Pisarcik, M. Kissling, J. Heimer, R. Kubik-Huch, A. Euler; Baden/CH
Disclosures:
Dusan Pisarcik: Other: Sirius Medical
Marc Kissling: Nothing to disclose
Jakob Heimer: Nothing to disclose
Rahel Kubik-Huch: Nothing to disclose
Andre Euler: Speaker: Siemens
Keywords: Artificial Intelligence, Mammography, Ultrasound, Biopsy, Cancer, Neoplasia
Purpose

Previous research has primarily evaluated the readability and accuracy of AI-simplified radiology reports based on expert feedback. Our study instead assessed how patients interpret and perceive AI-translated mammography and sonography reports covering benign to potentially malignant pathologies. Using a patient survey, we evaluated the comprehensibility of the diagnosis and follow-up procedures as well as the empathy conveyed.
Methods and materials

Three fictional mammography reports with accompanying sonography (BI-RADS 3, 4, and 5) were generated and translated into plain language using ChatGPT-4, ChatGPT-4o, and Google Gemini. The following standardized prompt was used: "Simplify this medical report so that it is clear, correct, and easily understandable by the patient without requiring any medical expertise." Each report was translated three times to account for variability in AI output. New accounts were used to minimize bias from prior model training. Two expert radiologists reviewed the AI-translated reports for factual correctness, completeness, and...
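A minimal sketch of the repeated-translation protocol described above, assuming the OpenAI Python SDK; the model identifiers, the placeholder report texts, and the loop structure are illustrative rather than taken from the poster, and the Google Gemini arm would use Google's SDK analogously.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The standardized prompt quoted in the methods.
PROMPT = (
    "Simplify this medical report so that it is clear, correct, and easily "
    "understandable by the patient without requiring any medical expertise."
)

# Fictional reports, one per BI-RADS category (placeholder text).
reports = {"BI-RADS 3": "...", "BI-RADS 4": "...", "BI-RADS 5": "..."}

translations = {}
for model in ("gpt-4", "gpt-4o"):  # assumed identifiers; Gemini would run via Google's SDK
    for label, report in reports.items():
        for run in range(3):  # three repeats per report to capture output variability
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": f"{PROMPT}\n\n{report}"}],
            )
            translations[(model, label, run)] = resp.choices[0].message.content
```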
Results

Pairwise comparisons derived from the model across question categories showed that the odds of receiving a higher rating were significantly greater for ChatGPT-4 and ChatGPT-4o than for Google Gemini (p < .001 for all comparisons) (Figs. 1 and 2). The Plackett-Luce model indicated that ChatGPT-4o had the highest probability of being preferred (0.48), followed by ChatGPT-4 (0.37), while Google Gemini had the lowest probability of preference (0.15). Most participants ranked ChatGPT-4o and ChatGPT-4 first, while Google Gemini...
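To make the Plackett-Luce figures concrete, the sketch below treats the reported preference probabilities as normalized worth parameters and computes the probability the model assigns to each full ranking via its sequential-choice formula; the worth values come from the results above, everything else is illustrative.

```python
from itertools import permutations

# Reported preference probabilities used as normalized Plackett-Luce worths.
worth = {"ChatGPT-4o": 0.48, "ChatGPT-4": 0.37, "Google Gemini": 0.15}

def ranking_probability(ranking, worth):
    """P(ranking) = product over positions of w_item / (sum of worths not yet ranked)."""
    prob, remaining = 1.0, sum(worth.values())
    for item in ranking:
        prob *= worth[item] / remaining
        remaining -= worth[item]
    return prob

for ranking in permutations(worth):
    print(" > ".join(ranking), f"{ranking_probability(ranking, worth):.3f}")

# e.g. ChatGPT-4o > ChatGPT-4 > Google Gemini: 0.48 * 0.37 / (0.37 + 0.15) ≈ 0.342,
# the single most likely ordering under the fitted model.
```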
Conclusion

Results showed that AI-generated translations can improve patient access to complex radiological reports while maintaining factual correctness and clarity. Patient preferences varied, with most ranking ChatGPT-4o and ChatGPT-4 highest, while Google Gemini was rarely preferred. Clarity and empathy were especially valued in BI-RADS 4 and 5 reports, where patient anxiety is heightened. ChatGPT-4 was preferred for its superior clarity and emotional sensitivity, particularly when optimized prompts were used. However, translation quality varied, emphasizing the need for careful prompt design and expert oversight. Limitations include...
References

1. Bhayana R. Chatbots and Large Language Models in Radiology: A Practical Primer for Clinical and Research Applications. Radiology. 2024 Jan;310(1):e232756. doi: 10.1148/radiol.232756. PMID: 38226883.
2. Rao A, Kim J, Kamineni M, Pang M, Lie W, Dreyer KJ, Succi MD. Evaluating GPT as an Adjunct for Radiologic Decision Making: GPT-4 Versus GPT-3.5 in a Breast Imaging Pilot. J Am Coll Radiol. 2023 Oct;20(10):990-997. doi: 10.1016/j.jacr.2023.05.003. Epub 2023 Jun 21. PMID: 37356806; PMCID: PMC10733745.
3. Mese I, Taslicay CA, Sivrioglu AK. Improving radiology workflow using ChatGPT...