Maayan Mandelbaum MD, Daniella Levy-Erez MD, Shelly Soffer MD, Eyal Klang MD, Sarina Levy-Mendelovich MD
Artificial intelligence (AI), particularly large language models (LLMs) like OpenAI's ChatGPT, has shown potential in various medical fields, including pediatrics. We evaluated the utility and integration of LLMs in pediatric medicine. We conducted a search in PubMed using specific keywords related to LLMs and pediatric care. Studies were included if they assessed LLMs in pediatric settings, were published in English, were peer-reviewed, and reported measurable outcomes. Sixteen studies spanning pediatric sub-specialties such as ophthalmology, cardiology, otology, and emergency medicine were analyzed. The findings indicate that LLMs provide valuable diagnostic support and information management. However, their performance varied, with limitations in complex clinical scenarios and decision-making. Although the models excelled in tasks requiring data summarization and basic information delivery, their effectiveness in nuanced clinical decision-making was limited. LLMs, including ChatGPT, show promise in enhancing pediatric medical care but perform inconsistently in complex clinical situations. This finding underscores the importance of continuous human oversight. Future integration of LLMs into clinical practice should be approached with caution to ensure they supplement, rather than supplant, expert medical judgment.