AI in pediatric surgery: a narrative review
Introduction
Background
Artificial intelligence (AI) encompasses a range of concepts and technologies that enable machines to simulate human intelligence by analyzing, inferring, and generating predictions from data. These systems use algorithms and statistical models to perform tasks such as problem-solving, object recognition, and decision-making. Applications and devices equipped with AI can see and identify objects. They can understand and respond to human language. They can learn from new information and experience. They can make detailed recommendations to users and experts. They can act independently, replacing the need for human intelligence or intervention.
This narrative review aims to describe and analyze the potential applications of AI in pediatric surgery, providing an up-to-date synthesis of the available literature and focusing on its utility across different stages of care.
To fully understand AI’s potential in this field, it is important to recognize the underlying technologies: machine learning (ML) and deep learning (DL).
ML encompasses methods enabling computers to learn from data and make predictions without explicit programming, using algorithms such as regression models, decision trees, support vector machines, and clustering. A prominent subset of ML, DL, uses multilayered neural networks to automatically extract meaningful features from large datasets, excelling in complex tasks such as natural language processing and medical image interpretation. Key AI-related terms and their interrelationships are summarized in Figure 1 (1).
AI’s integration into healthcare has evolved through major milestones, from the first neural network model in 1943 to the emergence of DL and real-world clinical applications in the 21st century (2). A visual timeline of these developments is presented in Figure 2 (3-5). We present this article in accordance with the Narrative Review reporting checklist (available at https://tp.amegroups.com/article/view/10.21037/tp-2025-437/rc).
Methods
We searched the PubMed and Web of Science databases in January 2025 using “artificial intelligence” and “pediatric surgery” as keywords. To capture all potential applications of AI in pediatric surgery, the search covered a 10-year period [2015–2025]. Selection was limited to English-language articles.
Studies regarding adult patients were excluded. The PubMed search yielded 382 results. After removing duplicates and applying the inclusion/exclusion criteria, 46 articles were included in the final review. The included articles addressed the applications of AI in pediatric surgery, ranging from diagnosis to postoperative care. The search strategy is summarized in Table 1.
Table 1
| Items | Specification |
|---|---|
| Date of search | From January to May, 2025 |
| Databases | PubMed and Web of Science |
| Search terms used | PubMed: (“artificial intelligence”[MeSH Terms] OR “artificial intelligence”[All Fields]) AND (“pediatric surgery”[MeSH Terms] OR “pediatric surgery”[All Fields]). Web of Science: TS=(“artificial intelligence” OR “machine learning” OR “deep learning”) AND TS=(“pediatric surgery” OR “paediatric surgery” OR “child surgery”). Language: English. The search yielded 382 results; after removing duplicates and applying inclusion/exclusion criteria, 46 articles were included in the final review |
| Timeframe | From 2015/01/01 to 2025/01/15 |
| Inclusion and exclusion criteria | Inclusion: English-language articles focused on AI applications in pediatric surgery (diagnosis to postoperative care); original research articles. Exclusion: editorials, commentaries, conference abstracts, adult-only studies, studies unrelated to surgery or AI |
| Selection process | Selection was conducted independently by M.R. and B.C., disagreements resolved by consensus |
AI, artificial intelligence.
Discussion
AI in pediatric surgery
AI has emerged as a transformative technology in medicine, promising significant enhancements in healthcare delivery, patient outcomes, and clinical efficiency. Its applications span multiple domains, including diagnostics, administrative workload reduction, clinical decision support, and predictive analytics (2,6).
AI is increasingly used in pediatric surgery to enhance diagnosis, surgical planning and procedures, and post-operative care. Recent advancements in DL have significantly improved image recognition; in surgery, for example, AI helps automate the recognition of instruments and procedures and provides real-time anatomical navigation during operations. The following sections review the existing literature on the main areas of AI application in pediatric surgery.
AI in diagnosis
An AI-based diagnostic framework can be used to analyze vital signs, medical history, and images to make diagnostic suggestions. These assessments help prioritize patients based on the severity of their condition, improve the efficiency of healthcare providers, and reduce wait times. Additionally, the framework can assist in diagnosing complex or rare conditions by using large-scale data, reducing individual bias, and expanding the range of possible diagnoses (7).
- Appendicitis: AI models are being developed to improve the diagnosis of appendicitis in children, a condition that can be challenging due to its varied presentations and the reliance on subjective assessments. These models combine clinical findings, laboratory values, and imaging to predict and grade appendicitis (8,9). Some AI tools are able to differentiate between uncomplicated and complicated appendicitis (10), as well as distinguish it from other conditions such as Henoch-Schönlein purpura (11). For instance, AI-powered decision trees, such as the AiPAD model described by Shikha and Kasem (12), can significantly enhance the diagnostic accuracy of trainees, bringing it closer to the level of expert clinicians (13).
This hybrid approach, which combines the strengths of ML with human expertise, offers a promising path for the development of AI-driven diagnostic tools and case-based guidelines for various medical conditions.
- Intussusception: a DL framework has been developed to detect the presence of “concentric circles” in ultrasound images for this clinical condition. The model demonstrates high accuracy in identifying “concentric circles” and is also highly effective in classifying pediatric intussusception (14).
- Necrotizing enterocolitis (NEC): AI is being explored for diagnosing NEC using abdominal radiographs, with some models incorporating perinatal factors to predict surgical NEC in very low birth weight infants (15). The effectiveness of DL algorithms in detecting NEC signs and assisting physicians in early detection is still being studied. Weller et al. demonstrated that a deep convolutional neural network (DCNN) can accurately identify pneumatosis intestinalis in neonatal radiographs, enhancing diagnostic accuracy and clinician confidence for quicker, more effective interventions (16). Additionally, a 2023 study by van Varsseveld et al. and a 2024 study by Wu et al. explored DL models such as ResNet18, DenseNet121, and SimpleViT to predict the need for surgery in NEC patients (17,18), demonstrating that AI can help surgeons in challenging decisions. Indeed, ResNet18, trained on bedside chest and abdominal imaging, demonstrated strong diagnostic performance with high accuracy, and represents a reliable tool to support surgical decision-making in neonatal NEC.
- Hirschsprung’s disease (HD): recent advances have shown that AI can significantly enhance the histological diagnosis of HD. Braun et al. developed an ML model that analyzed acetylcholinesterase (AChE) staining in rectal biopsies to detect parasympathetic hyperinnervation, a hallmark of HD. Their AI-assisted image analysis achieved an impressive diagnostic accuracy, which increased when only rectal samples were considered, demonstrating its value even when biopsy samples are limited in submucosal tissue (19). Complementing this, Duci et al. trained DL U-net models on hematoxylin and eosin (H&E)-stained post-operative tissue sections to identify ganglionic cells and hypertrophic nerves. Their approach reached over 91% accuracy for both targets, highlighting AI’s potential to automate and standardize diagnosis while aiding less-experienced pathologists (20).
- Biliary atresia: AI is being developed to screen for biliary atresia using ultrasound images, as ultrasound is operator-dependent and can often lead to misdiagnosis. Specifically, pediatric hepatobiliary ultrasounds require highly trained imaging professionals. The AI method was not intended to replace liver biopsy or intraoperative cholangiography, but to reduce human error in the interpretation of ultrasounds. The trained DL models could assist physicians beyond pediatric surgeons, pediatric gastroenterologists, or pediatric radiologists, helping to prevent misinterpretation of pediatric hepatobiliary ultrasound images (21).
- Hydronephrosis: the Hydronephrosis Severity Index (HSI) described by Erdman et al. is a grading system that assesses the severity of urinary tract dilatation. Coupled with AI, it can estimate the severity of hydronephrosis based on ultrasound images alone; it is accurate and generalizable and avoids problems related to subjective interpretation. ML models can incorporate clinical and imaging predictors to determine outcomes, the need for surgery, or the severity of hydronephrosis. The use of this technology can help reduce invasive testing for children whose condition can resolve without intervention and expedite treatment for those who can benefit (22).
- Vesicoureteral reflux (VUR): the international radiographic grading system for VUR is a widely used tool for assessing the clinical progression and guiding treatment decisions in children with VUR. Grading is determined by analyzing voiding cystourethrogram (VCUG) images and accurate grading is critical, as it can influence patient management. However, studies have shown that inter-rater agreement using this system can be below 60%, largely due to the subjective nature of image interpretation. In response to this limitation, Khondker et al. introduced quantitative VUR (qVUR)—an automated, quantitative method that uses supervised ML to assign VUR grades based solely on VCUG images. Using measures for the ureteropelvic junction, ureterovesical junction, ureter width, and tortuosity, qVUR is able to predict low or high-grade VUR with high accuracy and explainability (23).
- Pulmonary diseases: AI provides valuable decision support for detecting pneumothorax and pleural effusion in radiology images. Chest radiographs are crucial for diagnosing pneumonia in children, although they can be subjective, with varying interpretations among radiologists. A DL system has been developed to assist in diagnosing common pediatric pulmonary diseases, aiding in the differentiation between normal X-rays and those showing lower respiratory tract issues. This system helps clinicians review X-ray interpretations and reduce oversights. In recent years, AI has also proven effective in assessing pulmonary nodules, with DL algorithms enhancing the detection of small nodules and osteosarcoma-related micro-metastases. AI plays a key role in improving early diagnosis, reducing missed diagnoses, and guiding treatment decisions, ultimately improving survival rates for patients with osteosarcoma (24,25).
- Esophageal caustic burns: the debate about the management of corrosive ingestion is still open and represents a problem both for the patients and healthcare systems. Aydin et al. demonstrate that the presence and the severity of the esophageal burn after caustic substance ingestion can be predicted with blood count parameters; in this study they used ML algorithms to forecast caustic burn and determined that the type of the caustic and the platelet distribution width (PDW) values were the most important predictors (26).
- Testicular torsion (TT): TT is a common emergency that requires immediate exploration. Diagnosing TT can be challenging due to overlap with other scrotal conditions, and imaging may not always be available. Different scoring systems have been published to improve diagnostic consistency; in a 2020 study, Klinke et al. evaluated different TT scores (including an AI-based score) in a large cohort of children with acute scrotum, demonstrating that the Boettcher Alert Score (BAL) and AI scores can predict TT with high sensitivity and specificity (27).
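Several of the diagnostic tools above are summarized in terms of sensitivity and specificity. As a reminder of how these two figures are derived from a confusion matrix, the following minimal sketch computes them for a binary classifier. The function name `sensitivity_specificity` and the ten-patient cohort are illustrative assumptions and do not reproduce data from any of the cited studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = condition present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort of 10 children with acute scrotum (illustrative only).
confirmed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # torsion confirmed at exploration
flagged = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]    # score or model flags torsion
sens, spec = sensitivity_specificity(confirmed, flagged)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a time-critical condition such as TT, a missed case (false negative) is usually the costlier error, which is why sensitivity tends to be the figure examined first.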
AI in surgical planning and procedures
AI is improving surgery by creating three-dimensional (3D) models from computed tomography (CT) or magnetic resonance imaging (MRI) scans, helping surgeons plan more precisely, especially in complex surgeries. It also helps assess surgical skills by giving real-time feedback on a surgeon’s performance. During surgery, AI aids navigation and decision-making using real-time images, helping surgeons find structures and detect problems. It can also predict outcomes based on patient data, supporting better decisions, with robotic surgery improving precision in delicate steps. The main applications are outlined below:
- 3D modeling: AI can generate 3D models from cross-sectional images, providing valuable assistance in surgical planning. By using AI algorithms, it is possible to segment organs and tumors from imaging data, which enables the visualization of complex anatomical structures. These 3D models play a crucial role in preoperative planning, surgical simulation, and patient education, offering a more comprehensive understanding of the patient’s condition. Additionally, AI tools have made 3D modeling more accessible by reducing the need for specialized technical expertise and lowering financial barriers, making these advanced technologies available to a wider range of healthcare professionals (28). For example, in Wilms’ tumor, 3D reconstruction of the neoplastic kidney provides valuable support for surgical planning, risk assessment, patient selection, and communication with families. However, it is not widely used in clinical practice due to the time-consuming and error-prone process of manual image segmentation. To address this, Hild et al. developed an AI-based tool to automate kidney tumor segmentation and reduce the need for expert input, making 3D reconstruction more practical in everyday clinical settings. They used a convolutional neural network (CNN) with a U-Net architecture, designed to automatically fill in missing areas during segmentation. This semi-automated method reached expert-level accuracy while reducing manual work by up to 80% (29).
- Surgical skill assessment: AI can analyze surgical videos to assess the quality of surgical techniques and provide objective feedback to trainees. By using DL, AI systems can identify specific characteristics that distinguish skilled surgical techniques and can evaluate the movement of surgical instruments, such as forceps, during procedures, offering insights into the precision and efficiency of the surgeon’s actions. For example, in recent years thoracoscopy has become the surgical choice for the correction of esophageal atresia, as it allows young patients to recover more quickly without increasing the risk of perioperative complications; at the same time, pediatric minimally invasive surgery requires advanced technical skills. Yasui et al. built a DL-based system that recognizes the movement of the forceps and automatically evaluates the quality of the surgical technique, with the aim of identifying the important procedures for suture manipulation and thus reducing operating times (28).
- Intraoperative guidance: AI technologies, such as ML and image recognition, show great potential in assisting with the identification and classification of hypospadias, guiding the surgical approach. For example, an experienced surgeon may opt for a different surgical approach than a less experienced one. Training algorithms with intraoperative images can improve decision-making. Fernandez et al. developed an algorithm with 90% accuracy in identifying and classifying hypospadias, aiming to standardize classification methods and provide expert-level guidance. Abbas et al. proposed a three-stage AI framework using the plate objective scoring tool (POST) score for objective assessment of distal hypospadias, achieving 99.1% sensitivity. This model helps standardize evaluation, reducing subjectivity in surgical planning and ensuring more reliable outcomes across clinical settings (30,31).
AI for postoperative care and outcomes
Pain assessment
Managing postoperative pain in children is challenging due to the subjective nature of pain and the need for age-appropriate assessment tools. While self-reporting scales such as the Visual Analogue Scale (VAS), Numerical Rating Scale (NRS), and Faces Pain Scale-Revised (FPS-R) work well for children over seven, they are less effective for younger or non-verbal patients. Behavioral and physiological scales offer alternatives but come with limitations such as inter-rater variability and increased complexity. AI, particularly using ML and DL, shows promise in improving pain assessment accuracy and efficiency. However, studies reveal varying performance and limitations, such as small datasets and lack of generalizability. The AI models used, primarily DL and ML, demonstrated accuracy rates ranging from 79% to 85.62% (32). In a 2021 study, Salekin et al. introduced a temporal, multimodal AI system designed to evaluate postoperative pain in neonates. The system independently analyses video inputs (facial expressions and body movements) and audio signals (crying sounds) to generate pain scores, which are then integrated through decision fusion to produce a final pain assessment. Experimental findings indicate that this multimodal approach offers greater reliability in real-world clinical settings (33). As infants are often discharged the same day as surgery, pain management typically takes place at home, where it is often reported as inadequate, and there are limited data on how parents use standardized tools to assess and manage their child’s pain at home. AI-enabled tools, such as the PainChek Infant app, can help parents assess their infant’s pain post-surgery by analysing facial expressions. Sada et al. demonstrated that PainChek Infant and ObsVAS have similar utility, helping parents choose and assess the effectiveness of pain interventions, thus supporting their decision-making (34).
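To make the idea of decision fusion concrete, the sketch below combines independent per-modality pain scores into one assessment by a weighted average, skipping any modality that produced no score (for example, when the infant is not crying). This is a generic illustration only: the fusion rule, the function name `fuse_pain_scores`, the modality names, the 0 to 1 score range, and the 0.5 threshold are all assumptions and are not taken from Salekin et al.

```python
def fuse_pain_scores(modality_scores, weights=None):
    """Decision-level fusion: weighted mean of per-modality pain scores in [0, 1].

    Modalities whose score is None (signal unavailable) are ignored.
    If weights are given, they must contain every available modality.
    """
    available = {m: s for m, s in modality_scores.items() if s is not None}
    if not available:
        raise ValueError("no modality produced a score")
    if weights is None:
        weights = {m: 1.0 for m in available}  # equal weighting by default
    total_weight = sum(weights[m] for m in available)
    return sum(weights[m] * s for m, s in available.items()) / total_weight


# Hypothetical per-modality outputs for one observation window.
scores = {"face": 0.8, "body": 0.6, "cry": None}  # no usable audio in this window
fused = fuse_pain_scores(scores)
label = "pain" if fused >= 0.5 else "no pain"  # 0.5 is an assumed threshold
print(f"fused score={fused:.2f} -> {label}")
```

Fusing at the decision level means that each modality can fail independently without blocking the final assessment, which is one plausible reason such designs are more robust in clinical settings where a face may be occluded or background noise may mask crying.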
AI can also predict patients’ pain management needs based on their medical history, surgery type, and recovery progress, optimizing pain control while minimizing medication side effects. These examples underscore the potential of AI to revolutionize postoperative care, making it more responsive, personalized, and effective (35).
Predicting complications
AI models are increasingly being developed to predict the risk of complications, such as intra-abdominal abscesses (IAAs) after appendectomy, as well as in a variety of other clinical contexts including tumor-related conditions and congenital malformations. These predictive tools aim to improve patient outcomes by enabling early identification of high-risk cases across diverse medical scenarios. A 2022 study by Alramadhan et al. demonstrated that artificial neural networks (ANNs) can effectively predict the risk of IAAs after appendectomy using selected clinical variables: they showed superior accuracy, sensitivity, and specificity. Key predictive factors included the surgeon’s intraoperative diagnosis, antibiotic therapy completion, and known clinical risks such as longer surgery time, leukocytosis, and elevated body temperature. Notably, the surgical approach (laparoscopic vs. open) had little influence on IAA prediction. Despite limitations, these findings suggest ANNs can aid decision-making and improve postoperative care (36). In 2021, Bhambhvani pioneered the use of deep neural networks (DNNs) to predict 5-year survival in genitourinary rhabdomyosarcoma (GU-RMS) patients. Using the National Cancer Institute’s Surveillance Epidemiology and End Results (SEER) data, DNNs outperformed traditional Cox models in accuracy and calibration. Although Cox models offer better interpretability by identifying key risk factors such as age, tumor site, and disease spread, DNNs delivered superior predictive performance, indicating DL’s promise for prognosis in rare cancers (37). Another example is from a 2024 retrospective study by Richter et al. addressing pregnancy decisions in suspected fetal malformations such as lower urinary tract obstruction (LUTO). This machine-learning model used prenatal ultrasound features to predict critical postnatal outcomes, such as death, dialysis, or transplantation. Unlike previous invasive tests, this non-invasive approach relies on widely available ultrasound data, enhancing clinical counseling and supporting personalized decision-making for families facing complex pregnancies (38).
Length of stay (LOS)
AI can be used to predict the length of hospital stay for pediatric patients, which varies widely depending on chronic conditions and institutional practices and is influenced by numerous factors including patient characteristics, surgical complexity, anesthesia, and postoperative course. Understanding these determinants is essential for optimizing postoperative management and resource allocation. In this context, the application of AI and ML offers a promising approach to more accurately predict discharge times. These adaptive models can identify modifiable risk factors associated with extended stays, supporting more efficient care planning and improved clinical outcomes (39).
The use of mathematical forecasting models to estimate the LOS offers multiple advantages. From the patient’s perspective, accurately predicting LOS can help prevent complications and provide a clearer expectation of the likely discharge date, contributing to better-informed care and reduced anxiety. From a managerial standpoint, in pediatric settings the identification and quantification of factors influencing pediatric LOS (LOS-P) are particularly impactful. Understanding these determinants enables more effective management of pediatric wards, facilitating improved distribution of healthcare teams and bed capacity while alleviating operational constraints (40).
A study by Elrod et al. compares traditional prediction methods with AI models for burn patients. Although AI and linear regression outperformed the simple “1 day per 1% burn” rule, the improvements were modest, especially for larger burns where treatment complexity and complications make predictions difficult. AI performed best with smaller burns and larger datasets but was limited by variability in data and missing non-medical factors such as hospital policies. Still, even small gains in accuracy can support better resource management, highlighting the need for richer data and tailored models for different treatment centers (41).
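The comparison described above can be illustrated with a small worked example that sets the “1 day per 1% burn” rule against a least-squares line fitted to the same patients and compares their mean absolute errors. The burn sizes and stays below are invented for illustration and are not data from Elrod et al.; the function names are likewise assumptions, and the error is measured in-sample, so it flatters the fitted model.

```python
def rule_of_thumb_los(tbsa_percent):
    """Predicted length of stay in days under the '1 day per 1% burn' rule."""
    return 1.0 * tbsa_percent


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(predicted, actual):
    """Average absolute difference between predicted and observed stays."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical patients: burn size (% total body surface area) and observed stay (days).
tbsa = [2, 5, 8, 10, 15, 20]
los = [3, 6, 7, 12, 13, 25]

slope, intercept = fit_line(tbsa, los)
mae_rule = mean_abs_error([rule_of_thumb_los(x) for x in tbsa], los)
mae_fit = mean_abs_error([slope * x + intercept for x in tbsa], los)
print(f"rule MAE={mae_rule:.2f} days, regression MAE={mae_fit:.2f} days")
```

On this toy data the fitted line improves on the rule only slightly, and the largest burn remains the hardest case for both, which mirrors the modest gains reported in the text. A fair comparison would evaluate both approaches on patients not used for fitting.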
Trauma
AI and ML are being used to predict outcomes in pediatric trauma, such as massive transfusion, need for operative management, and mortality risk. AI can also be used to predict trauma volumes in pediatric centers. In the study conducted by Liu et al., trauma team activation levels in pediatric patients were approached as a classification task, utilizing ML models to enhance decision-making in emergency settings. The study presented a proof-of-concept for an ML-based clinical decision support tool designed to predict the appropriate activation levels for trauma teams in children with traumatic injuries. The results demonstrated that the model used in the study significantly reduced variability in trauma activation decisions. The ML models also outperformed emergency department (ED) clinicians in minimizing under-triage while maintaining similar over-triage rates. This highlights the potential of ML-driven tools to support, rather than replace, clinical judgment by reducing bias and ensuring more consistent decision-making. Ultimately, the goal of any trauma system is to deliver the right level of care to the right patient at the right time. ML applications, such as the one developed in this study, offer the potential to support real-time clinical decision-making, improving patient outcomes and optimizing hospital resource utilization (42). Similarly, DL models analyzed in a 2021 retrospective study by Shahi et al. have shown the ability to improve early decision-making in children with blunt solid organ injuries. These models outperformed traditional scoring systems in identifying patients needing urgent intervention and highlighted the value of early laboratory results and thromboelastography in risk assessment. The findings emphasize the potential of AI to capture complex, non-linear patterns in clinical data, offering a more personalized and efficient approach to trauma care (43).
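Under-triage and over-triage can be computed in more than one way, so the sketch below fixes one simple convention as an assumption: under-triage is the share of patients who needed the highest activation level but did not receive it, and over-triage is the share of patients who did not need it but received it. The patient data and the function name `triage_rates` are hypothetical and are not drawn from Liu et al.

```python
def triage_rates(needed_high, assigned_high):
    """Return (under_triage, over_triage) rates.

    needed_high[i]   -- True if patient i truly required full trauma team activation
    assigned_high[i] -- True if patient i was assigned full activation
    """
    assigned_when_needed = [a for n, a in zip(needed_high, assigned_high) if n]
    assigned_when_not = [a for n, a in zip(needed_high, assigned_high) if not n]
    if not assigned_when_needed or not assigned_when_not:
        raise ValueError("need patients in both the 'needed' and 'not needed' groups")
    under = sum(1 for a in assigned_when_needed if not a) / len(assigned_when_needed)
    over = sum(1 for a in assigned_when_not if a) / len(assigned_when_not)
    return under, over


# Hypothetical cohort of 10 injured children (illustrative only).
needed = [True, True, True, True, False, False, False, False, False, False]
clinician = [True, True, False, False, False, False, False, True, True, False]
model = [True, True, True, False, False, False, False, True, True, False]

clin_under, clin_over = triage_rates(needed, clinician)
model_under, model_over = triage_rates(needed, model)
print(f"clinician: under={clin_under:.2f}, over={clin_over:.2f}")
print(f"model:     under={model_under:.2f}, over={model_over:.2f}")
```

In this invented cohort the model halves under-triage while leaving over-triage unchanged, which is the pattern the text describes. Other conventions, such as rates derived from the Cribari matrix, use different denominators and would give different numbers on the same data.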
AI for caregivers and families
AI offers valuable support in family-centered pediatric perioperative care by easing the burden on healthcare providers and assisting caregivers in meaningful ways. AI technologies can generate personalized educational content, provide emotional support through chatbots, aid in preoperative preparation, and enhance postoperative communication. A key benefit is the ability to make complex medical information more accessible by translating it into simpler, more inclusive language. While AI cannot replace the empathy and nuance of human interaction, it can complement clinical care by streamlining processes and offering consistent support. AI-powered chatbots have shown promise in delivering emotional reassurance and educational resources to caregivers, potentially reducing anxiety. This aligns with the growing recognition of the need to better prepare children for hospital procedures, especially surgery, as poor preparation is closely linked to heightened anxiety and distress. In this context, the study by Bray et al. evaluated the effectiveness of a digital therapeutic platform, Xploro, which provides health information through gamification, serious games, a chatbot, and an augmented reality avatar. This before-and-after study found that children using Xploro experienced significantly lower levels of procedural anxiety before and after surgery, greater perceived procedural knowledge, and increased involvement in their care. Parents also reported reduced preoperative anxiety. These findings underscore the vital role that digital tools can play in improving psychological readiness and the overall hospital experience for pediatric patients undergoing surgery (44,45).
Beneficial outcomes
The integration of AI into pediatric surgery offers several potential benefits and promises to significantly transform surgical practice within this specialty. By providing surgeons with advanced tools for improved decision-making, more accurate procedural planning, and increased surgical precision, AI has the potential to minimize operative risks, shorten recovery periods, and enhance patient outcomes.
Improved accuracy
AI can improve the accuracy of diagnoses and reduce the risk of human error.
ML algorithms have the capacity to continually refine their diagnostic accuracy by learning from new data, thus playing a key role in enhancing patient care and clinical outcomes. This is especially crucial in pediatric surgery, where the number of patients is limited and multiple compounding factors may influence patient outcomes. By leveraging predictive analytics and decision-support systems, AI can significantly enhance pattern recognition, improve healthcare monitoring, and reliably address the individualized care requirements of pediatric patients (46,47).
Increased efficiency
AI can automate tasks such as image analysis and data extraction, saving time for medical professionals and reducing surgical costs. Integrating AI technology into surgical practice can require a significant initial investment, but long-term benefits may lead to substantial cost savings through reduced complications and improved resource management (29,48). Quality improvement initiatives utilizing AI can analyze various factors influencing quality measures, thereby decreasing surgical complications, hospital readmissions, and repeat interventions, ultimately lowering healthcare expenses. Additionally, optimized use of surgical equipment, operating room scheduling, and staffing allocation can minimize waste and enhance overall efficiency (42).
Personalized medicine
Personalized medicine in pediatric surgery is significantly enhanced by the integration of AI, enabling tailored therapeutic strategies for young patients. AI facilitates the analysis of extensive clinical datasets, imaging, and patient-specific information, thus improving diagnostic accuracy, surgical planning, and postoperative management. AI algorithms can predict patient-specific risks, optimize surgical outcomes, and minimize complications by identifying subtle patterns that are often missed by conventional methods. Additionally, ML-driven models offer decision support by predicting outcomes based on genetic, phenotypic, and clinical variables unique to each child. This precision-driven approach not only fosters safer, more effective surgical interventions but also reduces healthcare costs and enhances patient and family satisfaction, paving the way toward truly individualized pediatric surgical care (28).
Challenges and considerations
The implementation of AI in pediatric surgery presents significant potential but comes with notable challenges and considerations:
- Data quality: the accuracy of AI models relies heavily on the data used for training. Pediatric populations typically have limited sample sizes and encompass patients from newborns to adolescents across varying developmental stages, creating distinct biases and limitations.
- Reporting standards: there is a lack of standardization in reporting AI studies, which can make it difficult to compare studies and implement models clinically. Guidelines such as STREAM-URO and APPRAISE-AI are being developed to improve reporting and critical appraisal of AI studies (49).
- Empathy in medical communication: the application of large language models (LLMs) in medical communication has great potential, though it faces notable limitations. The most significant challenges are establishing a reliable standard for accurate communication, managing potential biases inherent in these models, addressing critical privacy concerns, and acknowledging that audio analysis alone is not enough to capture emotional subtleties. While facial expression analysis may increase emotional understanding, it also raises significant ethical and privacy concerns that require careful consideration (50).
- Ethical and legal considerations: pediatric data require careful handling, and data safety concerns should be addressed. Legal accountability for AI-generated errors, such as incorrect diagnoses or recommendations, remains unresolved and calls for clearly defined responsibilities among developers, surgeons, and healthcare institutions. Ultimately, pediatric surgical practice should evolve toward a hybrid model, synergizing human expertise with sophisticated AI-driven support tools to address decisional complexities inherent in current clinical care (51).
- Parental consent: evidence suggests that while many parents are open to AI-driven interventions in their child’s care, this openness is contingent upon perceived accuracy, convenience, cost, and, critically, the extent to which shared decision-making is preserved (52). Furthermore, recent data indicate pervasive concern about misdiagnosis and accountability, together with an enduring belief that AI should augment rather than replace the physician (53). These findings highlight the urgent need for future research to explore how informed consent frameworks for pediatric AI applications can be structured to reflect parental expectations, ethical transparency, and demographic diversity.
Future directions
To fully realize the potential of AI in pediatric surgery, research should focus on enhancing the performance and reliability of AI-driven models. Specifically, future initiatives must prioritize the creation and use of pediatric-specific datasets, which are crucial to improving model accuracy given the unique physiological and anatomical characteristics of the pediatric population. These datasets must span all age groups, from neonates to adolescents, to ensure broad applicability and reliable predictions.
Moreover, significant attention should be directed towards integrating these AI models into clinical practice through rigorous, prospective, real-world studies. Such practical assessments would enable healthcare providers to evaluate the direct impact of AI on patient care outcomes, operational efficiency, and overall treatment effectiveness. Understanding how AI tools interact with existing clinical workflows, as well as their acceptance and usability among healthcare professionals, will be vital to their successful implementation. Furthermore, assessing AI’s influence on clinical decision-making processes and patient-centered outcomes, including improved diagnostic precision, reduced complications, shorter hospital stays, and enhanced patient safety, will be essential in validating the clinical value of AI.
Overall, dedicated efforts to advance pediatric-specific AI development and rigorous clinical integration studies are imperative for translating promising AI technologies into effective, reliable, and meaningful improvements in pediatric surgical care.
Conclusions
This review explored the current application of AI in pediatric surgery and depicted AI as a powerful ally in delivering safer, more precise, and more personalized surgical care for pediatric patients.
AI is poised to significantly reshape pediatric surgery by improving diagnostic accuracy, enhancing surgical planning, supporting intraoperative decision-making, and optimizing postoperative care.
Despite its transformative potential, current applications remain limited by challenges such as insufficient pediatric-specific data, lack of standardized validation, and unresolved ethical and legal issues.
To fully integrate AI into clinical practice, future efforts must focus on the development of robust datasets, transparent methodologies, and prospective validation studies.
Acknowledgments
We would like to thank the Health Innovation Factory (HIF) Department Research Center/Center Office, University of Verona for their support in revising this review.
Footnote
Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://tp.amegroups.com/article/view/10.21037/tp-2025-437/rc
Peer Review File: Available at https://tp.amegroups.com/article/view/10.21037/tp-2025-437/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tp.amegroups.com/article/view/10.21037/tp-2025-437/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Miyake Y, Retrosi G, Keijzer R. Artificial intelligence and pediatric surgery: where are we? Pediatr Surg Int 2024;41:19.
2. Tsai AY, Carter SR, Greene AC. Artificial intelligence in pediatric surgery. Semin Pediatr Surg 2024;33:151390.
3. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. 1943. Bull Math Biol 1990;52:99-115; discussion 73-97.
4. Turing AM. Computing machinery and intelligence. Mind 1950;59:433-60.
5. The Research Conference Where AI Began. Available online: https://home.dartmouth.edu/about/artificial-intelligence-ai-coined-dartmouth
6. Aung YYM, Wong DCS, Ting DSW. The promise of artificial intelligence: a review of the opportunities and challenges of artificial intelligence in healthcare. Br Med Bull 2021;139:4-15.
7. Liang H, Tsui BY, Ni H, et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat Med 2019;25:433-8.
8. Abu-Ashour W, Emil S, Poenaru D. Using Artificial Intelligence to Label Free-Text Operative and Ultrasound Reports for Grading Pediatric Appendicitis. J Pediatr Surg 2024;59:783-90.
9. Hayashi K, Ishimaru T, Lee J, et al. Identification of Appendicitis Using Ultrasound with the Aid of Machine Learning. J Laparoendosc Adv Surg Tech A 2021;31:1412-9.
10. Reismann J, Kiss N, Reismann M. The application of artificial intelligence methods to gene expression data for differentiation of uncomplicated and complicated appendicitis in children and adolescents - a proof of concept study. BMC Pediatr 2021;21:268.
11. Nie D, Zhan Y, Xu K, et al. Artificial intelligence differentiates abdominal Henoch-Schönlein purpura from acute appendicitis in children. Int J Rheum Dis 2023;26:2534-42.
12. Shikha A, Kasem A. The Development and Validation of Artificial Intelligence Pediatric Appendicitis Decision-Tree for Children 0 to 12 Years Old. Eur J Pediatr Surg 2023;33:395-402.
13. Shikha A, Kasem A, Han WSP, et al. AI-augmented clinical decision in paediatric appendicitis: can an AI-generated model improve trainees' diagnostic capability? Eur J Pediatr 2024;183:1361-6.
14. Li Z, Song C, Huang J, et al. Performance of Deep Learning-Based Algorithm for Detection of Pediatric Intussusception on Abdominal Ultrasound Images. Gastroenterol Res Pract 2022;2022:9285238.
15. Kim SH, Oh YJ, Son J, et al. Machine learning-based analysis for prediction of surgical necrotizing enterocolitis in very low birth weight infants using perinatal factors: a nationwide cohort study. Eur J Pediatr 2024;183:2743-51.
16. Weller JH, Scheese D, Tragesser C, et al. Artificial Intelligence vs. Doctors: Diagnosing Necrotizing Enterocolitis on Abdominal Radiographs. J Pediatr Surg 2024;59:161592.
17. Wu Z, Zhuo R, Liu X, et al. Enhancing surgical decision-making in NEC with ResNet18: a deep learning approach to predict the need for surgery through x-ray image analysis. Front Pediatr 2024;12:1405780.
18. van Varsseveld OC, Ten Broeke A, Chorus CG, et al. Surgery or comfort care for neonates with surgical necrotizing enterocolitis: Lessons learned from behavioral artificial intelligence technology. Front Pediatr 2023;11:1122188.
19. Braun Y, Friedmacher F, Theilen TM, et al. Diagnosis of Hirschsprung disease by analyzing acetylcholinesterase staining using artificial intelligence. J Pediatr Gastroenterol Nutr 2024;79:729-37.
20. Duci M, Magoni A, Santoro L, et al. Enhancing diagnosis of Hirschsprung's disease using deep learning from histological sections of post pull-through specimens: preliminary results. Pediatr Surg Int 2023;40:12.
21. Hsu FR, Dai ST, Chou CM, et al. The application of artificial intelligence to support biliary atresia screening by ultrasound images: A study based on deep learning models. PLoS One 2022;17:e0276278.
22. Erdman L, Rickard M, Drysdale E, et al. The Hydronephrosis Severity Index guides paediatric antenatal hydronephrosis management based on artificial intelligence applied to ultrasound images alone. Sci Rep 2024;14:22748.
23. Khondker A, Kwong JCC, Rickard M, et al. A machine learning-based approach for quantitative grading of vesicoureteral reflux from voiding cystourethrograms: Methods and proof of concept. J Pediatr Urol 2022;18:78.e1-7.
24. Akay MA, Tatar OC, Tatar E, et al. XRAInet: AI-based decision support for pneumothorax and pleural effusion management. Pediatr Pulmonol 2024;59:2809-14.
25. Ni YL, Zheng XC, Shi XJ, et al. Thoracoscopic resection of pulmonary osteosarcoma metastases guided by artificial intelligence: A case series. J Pediatr Surg Case Rep 2023;99:102729.
26. Aydın E, Khanmammadova N, Aslanyürek B, et al. A simple machine learning approach for preoperative diagnosis of esophageal burns after caustic substance ingestion in children. Pediatr Surg Int 2023;40:20.
27. Klinke M, Elrod J, Stiel C, et al. The BAL-Score Almost Perfectly Predicts Testicular Torsion in Children: A Two-Center Cohort Study. Front Pediatr 2020;8:601892.
28. Ryan ML, Wang S, Pandya SR. Integrating Artificial Intelligence Into the Visualization and Modeling of Three-Dimensional Anatomy in Pediatric Surgical Patients. J Pediatr Surg 2024;59:161629.
29. Hild O, Berriet P, Nallet J, et al. Automation of Wilms' tumor segmentation by artificial intelligence. Cancer Imaging 2024;24:83.
30. Fernandez N, Lorenzo AJ, Rickard M, et al. Digital Pattern Recognition for the Identification and Classification of Hypospadias Using Artificial Intelligence vs Experienced Pediatric Urologist. Urology 2021;147:264-9.
31. Abbas TO, AbdelMoniem M, Khalil IA, et al. Deep learning based automated quantification of urethral plate characteristics using the plate objective scoring tool (POST). J Pediatr Urol 2023;19:373.e1-9.
32. Kasundra A, Chanchlani R, Lal B, et al. Role of Artificial Intelligence in the Assessment of Postoperative Pain in the Pediatric Population: A Systematic Review. Cureus 2025;17:e77074.
33. Salekin MS, Zamzmi G, Goldgof D, et al. Multimodal spatio-temporal deep learning approach for neonatal postoperative pain assessment. Comput Biol Med 2021;129:104150.
34. Sada F, Chivers P, Cecelia S, et al. Parental Assessment of Postsurgical Pain in Infants at Home Using Artificial Intelligence-Enabled and Observer-Based Tools: Construct Validity and Clinical Utility Evaluation Study. JMIR Pediatr Parent 2024;7:e64669.
35. Yue JM, Wang Q, Liu B, et al. Postoperative accurate pain assessment of children and artificial intelligence: A medical hypothesis and planned study. World J Clin Cases 2024;12:681-7.
36. Alramadhan MM, Al Khatib HS, Murphy JR, et al. Using Artificial Neural Networks to Predict Intra-Abdominal Abscess Risk Post-Appendectomy. Ann Surg Open 2022;3:e168.
37. Bhambhvani HP, Zamora A, Velaer K, et al. Deep learning enabled prediction of 5-year survival in pediatric genitourinary rhabdomyosarcoma. Surg Oncol 2021;36:23-7.
38. Richter J, Shinar S, Erdman L, et al. Use of prenatal ultrasound findings to predict postnatal outcome in fetuses with lower urinary tract obstruction. Ultrasound Obstet Gynecol 2024;64:768-75.
39. Cascella M, Guerra C, Atanasov AG, et al. Predicting Post-surgery Discharge Time in Pediatric Patients Using Machine Learning. Transl Med UniSa 2024;26:69-80.
40. Boff Medeiros N, Fogliatto FS, Karla Rocha M, et al. Predicting the length-of-stay of pediatric patients using machine learning algorithms. Int J Prod Res 2023;63:483-96.
41. Elrod J, Mohr C, Wolff R, et al. Using Artificial Intelligence to Obtain More Evidence? Prediction of Length of Hospitalization in Pediatric Burn Patients. Front Pediatr 2020;8:613736.
42. Liu CW, Chacon M, Crawford L, et al. Machine Learning Improves the Accuracy of Trauma Team Activation Level Assignments in Pediatric Patients. J Pediatr Surg 2024;59:74-9.
43. Shahi N, Shahi AK, Phillips R, et al. Decision-making in pediatric blunt solid organ injury: A deep learning approach to predict massive transfusion, need for operative management, and mortality risk. J Pediatr Surg 2021;56:379-84.
44. Chaker SC, Hung YC, Saad M, et al. Easing the Burden on Caregivers- Applications of Artificial Intelligence for Physicians and Caregivers of Children with Cleft Lip and Palate. Cleft Palate Craniofac J 2025;62:574-87.
45. Bray L, Sharpe A, Gichuru P, et al. The Acceptability and Impact of the Xploro Digital Therapeutic Platform to Inform and Prepare Children for Planned Procedures in a Hospital: Before and After Evaluation Study. J Med Internet Res 2020;22:e17367.
46. Stiel C, Elrod J, Klinke M, et al. The Modified Heidelberg and the AI Appendicitis Score Are Superior to Current Scores in Predicting Appendicitis in Children: A Two-Center Cohort Study. Front Pediatr 2020;8:592892.
47. Aydin E, Türkmen İU, Namli G, et al. A novel and simple machine learning algorithm for preoperative diagnosis of acute appendicitis in children. Pediatr Surg Int 2020;36:735-42.
48. Ryan ML, Knod JL, Pandya SR. Creation of Three-dimensional Anatomic Models in Pediatric Surgical Patients Using Cross-sectional Imaging: A Demonstration of Low-cost Methods and Applications. J Pediatr Surg 2024;59:426-31.
49. Khondker A, Kwong JCC, Rickard M, et al. Application of STREAM-URO and APPRAISE-AI reporting standards for artificial intelligence studies in pediatric urology: A case example with pediatric hydronephrosis. J Pediatr Urol 2024;20:455-67.
50. Liévin V, Hother CE, Motzfeldt AG, et al. Can large language models reason about medical questions? Patterns (N Y) 2024;5:100943.
51. Celi LA, Fine B, Stone DJ. An awakening in medicine: the partnership of humanity and intelligent machines. Lancet Digit Health 2019;1:e255-7.
52. Sisk BA, Antes AL, Burrous S, et al. Parental Attitudes toward Artificial Intelligence-Driven Precision Medicine Technologies in Pediatric Healthcare. Children (Basel) 2020;7:145.
53. Huang YD, Zeng SL, Lin J, et al. Parents' understanding and attitudes toward the application of AI in pediatric healthcare: a cross-sectional survey study. Front Public Health 2025;13:1654482.



