Transfer learning prediction of surgical necrotizing enterocolitis in preterm infants without pneumoperitoneum on abdominal X-ray

Dayan Sun; Chuanping Xie; Yong Zhao; Junmin Liao; Yanan Zhang; Kaiyun Hua; Yichao Gu; Jingbin Du; Shuangshuang Li; Dingding Wang; Jinshi Huang

doi:10.21037/tp-2025-1-867

Original Article

Dayan Sun^#, Chuanping Xie^#, Yong Zhao, Junmin Liao, Yanan Zhang, Kaiyun Hua, Yichao Gu, Jingbin Du, Shuangshuang Li, Dingding Wang, Jinshi Huang

Department of Neonatal Surgery, Beijing Children’s Hospital, Capital Medical University, National Center for Children’s Health, Beijing, China

Contributions: (I) Conception and design: D Sun, C Xie; (II) Administrative support: J Huang; (III) Provision of study materials or patients: J Huang; (IV) Collection and assembly of data: C Xie, D Wang; (V) Data analysis and interpretation: C Xie, D Wang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

^#These authors contributed equally to this work.

Correspondence to: Jinshi Huang; Dingding Wang. Department of Neonatal Surgery, Beijing Children’s Hospital, Capital Medical University, National Center for Children’s Health, No. 56 Nalishi Road, Xicheng District, Beijing 100045, China. Email: hjsbch@163.com; wddlene@126.com.

Background: Necrotizing enterocolitis (NEC) remains a leading cause of mortality in preterm infants, with 30–39% requiring surgical intervention. However, existing models for predicting surgical NEC lack accuracy and clinical utility, especially for infants without pneumoperitoneum on abdominal X-ray (AXR). In this study, we aimed to develop a prediction model for earlier identification of NEC requiring surgical intervention.

Methods: All preterm infants diagnosed with NEC (modified Bell’s stage ≥ II) without pneumoperitoneum on AXR at Beijing Children’s Hospital between January 2016 and December 2022 were retrospectively reviewed. Demographic, perinatal, clinical, laboratory, and imaging findings were analyzed. Six machine learning (ML) algorithms (logistic regression, decision tree, random forest, support vector machine, multilayer perceptron, and extreme gradient boosting) were trained and optimized via ten-fold cross-validation. The best-performing support vector machine model was further enhanced using transfer learning. The optimal algorithm was deployed into a web-based graphical user interface (GUI) for real-time risk stratification.

Results: A total of 144 preterm infants with NEC without pneumoperitoneum on AXR were included, comprising a surgical NEC group (n=31) and a medical NEC group (n=113). Multivariate analysis identified lower gestational age (P=0.010), pregnancy vaginitis (P=0.014), respiratory support (P=0.005), positive abdominal examinations (P<0.001), elevated C-reactive protein (P=0.003), and turbid peritoneal fluid on abdominal ultrasonography (P<0.001) as independent risk factors for surgical NEC. We then constructed six ML models to predict surgical NEC using five variables derived from clinical, laboratory, and imaging findings. Of the six models, the support vector machine achieved the highest area under the receiver operating characteristic curve (AUC) in the validation set [0.924, 95% confidence interval (CI): 0.812–1.000]. The transfer-learning model, built on the support vector machine base, further improved performance in the training set (AUC =0.964, 95% CI: 0.921–0.995) and validation set (AUC =0.937, 95% CI: 0.829–1.000). SHapley Additive exPlanations analysis highlighted positive abdominal examinations, turbid fluid on abdominal ultrasound, and bowel sound grades as the top predictors. Furthermore, we developed a transfer-learning based GUI for the predictive model to facilitate clinical application.

Conclusions: This study pioneered an interpretable ML framework integrating multimodal data to predict surgical NEC with near-perfect discrimination. Furthermore, the transfer-learning based GUI represented a transformative approach to optimizing surgical timing.

Keywords: Necrotizing enterocolitis (NEC); surgery; prediction; machine learning (ML)

Submitted Dec 02, 2025. Accepted for publication Jan 27, 2026. Published online Feb 27, 2026.

Highlight box

Key findings

• In this study, we constructed machine learning (ML) models that predicted the need for surgical treatment with near-perfect discrimination in preterm infants with necrotizing enterocolitis without pneumoperitoneum on abdominal X-ray (AXR).

What is known and what is new?

• Emerging evidence underscores the prognostic significance of multifactorial risk stratification integrating clinical, laboratory, and imaging variables. However, conventional logistic regression models often inadequately capture nonlinear interactions among predictors, limiting their clinical utility for real-time risk stratification.

• This study developed a multimodal ML framework incorporating clinically actionable variables derived from the clinical, laboratory, and imaging findings to identify necrotizing enterocolitis (NEC) neonates requiring surgical intervention. In addition, we deployed the optimized ML algorithm into a clinician-centric graphical user interface (GUI) for real-time risk stratification. By enabling early, data-driven surgical decision-making, this tool holds significant potential to mitigate NEC-associated mortality.

What is the implication, and what should change now?

• The findings suggest that the ML predictive model achieves excellent accuracy in predicting surgical NEC. Positive abdominal signs, turbid fluid on abdominal ultrasonography (AUS), and bowel sound grades were the most influential predictors of surgical NEC. Further prospective research is required to verify and improve the predictive performance of the models.

Introduction

Necrotizing enterocolitis (NEC), a life-threatening gastrointestinal pathology predominantly affecting preterm neonates, is characterized by intestinal inflammation, necrosis, and potential progression to bowel perforation and systemic sepsis (1). Despite advancements in neonatal intensive care, NEC remains a leading cause of mortality in preterm infants, with surgical intervention required in 30–39% of cases to address complications such as perforation or fulminant ischemia (2,3). Pneumoperitoneum on abdominal X-ray (AXR) remains the established radiographic criterion for definitive surgical indication in NEC; however, 25–50% of NEC-associated perforations lack radiographic evidence of free air, a phenomenon attributed to localized inflammatory exudates and fibrinous adhesions that trap gas and promote fluid-dominated peritoneal leakage (4). Early identification of infants without pneumoperitoneum on AXR who are at risk for surgical NEC is critical, as delayed intervention exacerbates adverse outcomes, including multiorgan failure and long-term neurodevelopmental impairment (5). At present, the timing of surgical intervention relies on clinical signs, laboratory markers, and imaging findings, and no appropriate model exists for predicting the optimal timing of surgery for NEC (1,2,6).

Emerging evidence underscores the prognostic significance of multifactorial risk stratification integrating clinical, laboratory, and imaging variables. Lower gestational age (GA), congenital heart disease (CHD), and systemic inflammatory markers [for example, C-reactive protein (CRP), thrombocytopenia] are established predictors of NEC severity (7,8). Additionally, advanced imaging techniques such as abdominal ultrasonography (AUS) enable the detection of pathognomonic features (e.g., pneumoperitoneum, turbid peritoneal fluid) associated with transmural necrosis and surgical urgency (9). Despite these advancements, conventional logistic regression (LR) models often inadequately capture nonlinear interactions among predictors, limiting their clinical utility for real-time risk stratification.

Machine learning (ML) algorithms, renowned for their ability to analyze high-dimensional datasets and identify complex predictive patterns, offer a transformative approach to prognostication in neonatal critical care. Prior applications of ML in neonatology have demonstrated efficacy in sepsis prediction and mortality risk modeling, yet its utility in surgical NEC prediction remains underexplored (10). In this study, we aimed to develop a multimodal ML framework incorporating clinically actionable variables derived from the clinical, laboratory, and imaging findings to identify NEC neonates requiring surgical intervention. In addition, we deployed the optimized ML algorithm into a clinician-centric graphical user interface (GUI) for real-time risk stratification. By enabling early, data-driven surgical decision-making, this tool holds significant potential to mitigate NEC-associated mortality. We present this article in accordance with the TRIPOD reporting checklist (available at https://tp.amegroups.com/article/view/10.21037/tp-2025-1-867/rc).

Methods

Study design

We conducted a retrospective review of electronic medical records of all neonates diagnosed with NEC at Beijing Children’s Hospital from January 2016 to December 2022. Inclusion criteria comprised: (I) GA ≤37 weeks; (II) diagnosis of NEC at ≤28 days of age; (III) NEC of modified Bell’s stage ≥ IIA; (IV) no pneumoperitoneum on AXR. The exclusion criteria were as follows: (I) NEC of modified Bell’s stage I; (II) severe NEC for which treatment was abandoned; (III) pneumoperitoneum on AXR. Finally, a total of 144 infants were enrolled in this study and were divided into two groups according to the main clinical outcome to analyze the risk factors for surgical intervention and construct a prediction model for surgical NEC (Figure 1). The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Beijing Children’s Hospital (2023-E-005-R). Informed consent was waived for this retrospective study.

Figure 1 Flowchart of patient screening. AXR, abdominal X-ray; NEC, necrotizing enterocolitis.

Data collected

Demographic, perinatal, clinical, laboratory, and imaging data were collected within 24 hours after the established NEC diagnosis. Perinatal variables included gender, GA, birth weight, pregnancy complications (including diabetes, hypertension, hypothyroidism, vaginitis), the application of corticosteroids, maternal age, delivery method (vaginal delivery, cesarean section), prelabor rupture of membranes (PROM), Apgar score (1 min, 5 min), the presence of CHD, in vitro fertilization, and feeding type. Complex CHD was defined as severe structural heart defects at birth, such as tetralogy of Fallot, ventricular septal defect, and hypoplastic left heart syndrome (11). Clinical parameters encompassed NEC onset timing, hemodynamic instability, respiratory support modality (non-invasive/invasive ventilation), hematochezia, metabolic acidosis, and abdominal examinations (distension, tenderness, rigidity, visible peristalsis, and bowel sound grades). Positive abdominal signs were defined as obvious abdominal tenderness (manifested by leg curling and signs of distress or pain on abdominal palpation) or abdominal tension. Bowel sounds were classified as normal, weakened, or absent based on their intensity. Laboratory indices involved CRP, leukocyte count, hemoglobin, platelet count, serum albumin, pH, HCO₃, and lactate. Imaging evaluations incorporated AXR and AUS findings: pneumatosis intestinalis (PI), portal venous gas (PVG), pneumoperitoneum, and turbid peritoneal fluid. For infants with surgical NEC, imaging findings were collected only from AXR and AUS studies performed before the surgical intervention. For infants without surgical NEC, imaging findings were collected from studies performed during the work-up for NEC. All imaging findings on AXR or AUS were read by experienced pediatric radiologists.

The diagnosis of NEC was based on the modified Bell’s criteria, which incorporate clinical symptoms and radiological manifestations. Indications for surgical intervention included highly suspected bowel perforation or failure to respond to conservative treatment (7). The surgical approach was determined by the condition of the intestinal tract. If intestinal necrosis was identified, an exploratory laparotomy was performed, followed by one-stage intestinal resection or intestinal diversion after resection of the necrotic bowel segment. If the intestinal tract did not exhibit necrosis, exploratory surgery was followed by intra-abdominal lavage and placement of intra-abdominal drains.

Predictive modeling and evaluation

A total of six ML algorithms—LR, random forest (RF), decision tree (DET), support vector classifier (SVC), multilayer perceptron (MLP), and extreme gradient boosting (XGBoost)—were investigated for the prediction of surgical NEC. All models were developed in Python (version 3.10).

To further improve predictive capability, a transfer learning (TL) framework was designed. This model adopted a two-phase methodology: initially, an SVC model was pre-trained on the complete derivation cohort to derive feature importance weights. Subsequently, an artificial neural network (ANN) comprising three fully connected layers (64-32-1 nodes) was constructed. The ReLU activation function and Dropout (rate =0.3) were incorporated for regularization. Knowledge transfer was achieved by initializing the input layer weights with the feature importance scores from the pre-trained SVC model. Training employed the Adam optimizer (learning rate =0.001) and binary cross-entropy loss, with a sample weighting strategy applied to address class imbalance. The optimal model based on validation set performance was retained for final assessment.
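The knowledge-transfer step described above can be sketched as follows. This is a minimal illustration in plain Python, not the PyTorch implementation used in the study. The text does not state how the SVC importance scores were mapped onto the input-layer weights, so the column-scaling rule in `init_input_weights`, the example importance scores, the patient vector, the class weight, and all function names are assumptions made for illustration only.

```python
import math
import random

def init_input_weights(importance, n_hidden=64, seed=0):
    """Random input-layer weights, with each feature's column scaled by its
    normalised importance score (ASSUMED mapping; not specified in the text)."""
    rng = random.Random(seed)
    n_in = len(importance)
    total = sum(abs(s) for s in importance)
    scale = [n_in * abs(s) / total for s in importance]  # mean scale is 1
    bound = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-bound, bound) * scale[i] for i in range(n_in)]
            for _ in range(n_hidden)]

def random_layer(n_in, n_out, rng):
    """Plain random initialisation for the layers that receive no transfer."""
    bound = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]

def dense(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(v):
    return [max(0.0, a) for a in v]

def forward(x, w1, w2, w3):
    """Inference-time pass through the 64-32-1 network (dropout off, biases omitted)."""
    h1 = relu(dense(w1, x))
    h2 = relu(dense(w2, h1))
    logit = dense(w3, h2)[0]
    return 1.0 / (1.0 + math.exp(-logit))

def weighted_bce(p, y, pos_weight):
    """Binary cross-entropy with a larger weight on the minority (surgical) class."""
    w = pos_weight if y == 1 else 1.0
    return -w * (y * math.log(p) + (1 - y) * math.log(1 - p))

# Hypothetical importance scores for the five predictors (placeholders only).
importance = [0.40, 0.25, 0.20, 0.10, 0.05]
rng = random.Random(1)
w1 = init_input_weights(importance)
w2 = random_layer(64, 32, rng)
w3 = random_layer(32, 1, rng)

x = [1.0, 1.0, 2.0, 0.5, 0.0]  # one made-up, already scaled patient vector
risk = forward(x, w1, w2, w3)
# Assumed weighting: inverse class ratio of the cohort (113 medical, 31 surgical).
loss = weighted_bce(risk, 1, pos_weight=113 / 31)
```

Under this assumed mapping, a feature with zero importance starts with an all-zero weight column, so its influence has to be learned entirely during fine-tuning; other mappings are equally compatible with the description in the text.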

To mitigate multicollinearity among features, Spearman correlation analysis was conducted. Pairwise correlation coefficients were computed, and features exhibiting high correlation (r>0.9) were removed. The dataset was randomly partitioned into training and validation subsets at a 7:3 ratio. Feature selection was performed using LASSO regression. Predictive models were built employing ten-fold cross-validation on the training data, with performance evaluated on the validation set. Nine metrics were recorded per iteration, including the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, and F1 score. Models were compared based on the mean AUC across ten iterations, and the top-performing ML model was selected. Calibration curves were generated to assess agreement between predicted and observed risks.
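As an illustration of the multicollinearity filter, the sketch below computes Spearman's coefficient from average ranks and drops any feature that is too strongly correlated with one already kept. The toy feature values and the function names are hypothetical. Applying the threshold to the absolute value of the coefficient, and keeping the earlier feature of each correlated pair, are also assumptions: the text specifies only the cut-off of r>0.9.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks.
    Raises ZeroDivisionError if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def drop_correlated(features, threshold=0.9):
    """Keep features in insertion order, skipping any whose |rho| with an
    already kept feature exceeds the threshold."""
    kept = []
    for name in features:
        if all(abs(spearman(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy data: the second feature is a monotone transform of the first.
toy = {
    "crp": [8, 15, 41, 90, 134, 8],
    "crp_rescaled": [1, 2, 3, 4, 5, 1],
    "gestational_age": [33, 29, 34, 31, 30, 32],
}
kept = drop_correlated(toy)
```

Here `kept` retains `crp` and `gestational_age`, because the rescaled CRP column has the same ranks as the original and is therefore redundant.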

SHapley Additive exPlanations (SHAP) were utilized to interpret feature contributions to model predictions. For clinical translation, a web-based GUI was developed using the Flask framework (Python 3.10) and deployed as an online tool. This interface enables real-time input of five clinical variables and instant prediction of surgical NEC risk with corresponding management guidance.
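To make the attribution idea behind SHAP concrete, the sketch below computes exact Shapley values for a single sample by enumerating all feature subsets, which is feasible with only five inputs. It is an illustration rather than the SHAP library's algorithm: the SHAP package uses approximations and a background dataset, whereas here absent features are simply replaced by one reference value. The linear model, its coefficients, and the patient and reference vectors are all made up.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample; absent features take baseline values."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Hypothetical linear risk score over five inputs (coefficients are made up).
coefs = [2.0, 1.5, 1.0, 0.02, -0.1]

def toy_model(z):
    return sum(c * v for c, v in zip(coefs, z))

patient = [1, 1, 2, 41.0, 30.0]
reference = [0, 0, 0, 8.0, 32.0]
phi = shapley_values(toy_model, patient, reference)
```

For a linear model each attribution reduces to the coefficient times the difference between the patient value and the reference value, and the attributions sum to the difference between the two predictions. This additivity is what allows the contributions to be ranked and plotted feature by feature.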

Statistical analysis

Descriptive statistics are reported as mean (standard deviation, SD) or median (interquartile range, IQR) for continuous variables, and as percentages for categorical variables. Group comparisons were conducted using the Kruskal-Wallis or Wilcoxon rank-sum tests for continuous data, and the Chi-square or Fisher’s exact tests for categorical data. Variables with P<0.1 in univariate analysis were included in multivariable analysis to identify predictors of surgical NEC. Results are presented as unadjusted and adjusted odds ratios (OR) with 95% confidence intervals (CI). A two-sided P value <0.05 was considered statistically significant.

The entire statistical analysis pipeline—including descriptive statistics, hypothesis testing, feature scaling, ML modeling (RF, SVM, and neural networks), cross-validation, model evaluation (ROC, calibration, decision curve analysis), and interpretability analysis (SHAP)—was implemented in Python (version 3.10) using libraries including pandas, NumPy, scikit-learn, PyTorch, and SHAP. R (version 4.2.1) was used only to corroborate the results of selected nonparametric tests, ensuring consistency across platforms.

Results

Patient characteristics

A total of 144 preterm infants diagnosed with NEC (stage ≥ IIA) without pneumoperitoneum on AXR were enrolled in this study. To enable early identification of neonates requiring surgical intervention, participants were stratified into two cohorts: the surgical NEC group (n=31) and the medical NEC group (n=113). Univariate analysis of baseline characteristics demonstrated that the surgical NEC group had significantly lower GA (P=0.002) and birth weight (P=0.020), and a higher incidence of vaginitis during pregnancy (P=0.008). Multivariate LR identified lower GA (P=0.010) and pregnancy vaginitis (P=0.014) as independent risk factors for surgical NEC.

Regarding clinical presentation, the surgical NEC group had a higher incidence of respiratory support (P<0.001), shock (P<0.001), positive abdominal signs (P<0.001), and absent bowel sounds (P<0.001). Multivariate analysis further identified respiratory support (P=0.005) and positive abdominal signs (P<0.001) as significant predictors of surgical NEC (Table 1).

Table 1

Univariate and multivariate analysis of demographic, perinatal, and clinical findings associated with surgical NEC

Baseline characteristics	Surgical NEC (n=31)	Medical NEC (n=113)	Unadjusted P value	Unadjusted OR (95% CI)	Adjusted P value	Adjusted OR (95% CI)
Demographic and clinical variable
Gender male	18 (58.1)	59 (52.2)	0.56	0.78 (0.35–1.76)
Gestational age (weeks)	31.91±2.57	32.72±0.38	0.002	0.77 (0.66–0.91)	0.01	1.25 (1.06–1.48)
Birth weight (kg)	1.70±0.49	1.78±0.09	0.02	1.00 (1.00–1.00)
Pregnancy complications
Diabetes	5 (16.1)	25 (22.1)	0.47	0.68 (0.24–1.94)
Hypertension	5 (16.1)	26 (23.0)	0.31	0.58 (0.21–1.67)
Hypothyroidism	3 (9.7)	12 (10.6)	0.88	0.90 (0.24–3.42)
Vaginitis	8 (25.8)	6 (5.3)	0.001	6.20 (1.96–19.60)	0.01	4.46 (1.35–14.76)
Corticosteroids	4 (12.9)	18 (15.9)	0.68	0.78 (0.24–2.51)
Age mother at birth (years)	30.42±6.46	32.15±0.48	0.13	0.94 (0.88–1.02)
Delivery
In vitro fertilization	5 (16.1)	16 (14.2)	0.78	1.17 (0.39–3.48)
Vaginal delivery	9 (29.0)	34 (30.1)	0.91	0.95 (0.40–2.28)
Multiple births	6 (19.4)	18 (15.9)	0.65	1.27 (0.46–3.53)
Meconium amniotic fluid	2 (6.5)	8 (7.1)	0.90	0.91 (0.18–4.50)
PROM	11 (35.4)	24 (21.4)	0.10	2.04 (0.86–4.83)
1-min Apgar	8.74±1.85	8.76±1.99	0.52	1.05 (0.90–1.24)
5-min Apgar	9.56±0.75	9.41±1.26	0.62	1.10 (0.77–1.57)
Complex congenital heart disease	2 (6.5)	3 (2.7)	0.31	2.53 (0.40–15.85)
Any breastfeeding before NEC	9 (29.0)	43 (38.1)	0.35	0.67 (0.28–1.58)
Any formula feeding before NEC	25 (80.6)	77 (68.1)	0.18	1.95 (0.74–5.16)
Clinical presentation and physical examination
NEC onset days	18.23±15.70	17.27±13.25	0.73	1.01 (0.98–1.03)
Respiratory support	26 (83.9)	41 (36.3)	<0.001	9.09 (3.26–25.64)	0.005	9.17 (1.93–43.54)
Shock	9 (29.0)	5 (4.4)	<0.001	8.84 (2.70–28.91)
Hematochezia	14 (45.2)	61 (54.0)	0.38	0.70 (0.32–1.56)
Positive abdominal sign	27 (87.1)	4 (3.5)	<0.001	102.21 (27.89–347.72)	<0.001	76.92 (15.63–406.34)
Visible peristalsis	4 (12.9)	7 (7.2)	0.21	2.21 (0.73–6.65)
Absent bowel sound	14 (45.2)	3 (2.7)	<0.001	30.20 (7.85–116.19)

Data are presented as n (%), mean ± SD, or median (IQR). CI, confidence interval; IQR, interquartile range; NEC, necrotizing enterocolitis; OR, odds ratio; PROM, prelabor rupture of membranes; SD, standard deviation.
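For the binary variables in Table 1, the unadjusted odds ratios can be checked directly from the group counts. The sketch below uses the standard Woolf (log-scale Wald) interval. The function name is ours, and the choice of interval method is an assumption, since the text does not state how the confidence intervals were computed.

```python
import math

def odds_ratio_ci(exposed_cases, total_cases, exposed_controls, total_controls, z=1.96):
    """Odds ratio with a Woolf (log-scale Wald) 95% confidence interval."""
    a = exposed_cases                      # surgical NEC, factor present
    b = total_cases - exposed_cases        # surgical NEC, factor absent
    c = exposed_controls                   # medical NEC, factor present
    d = total_controls - exposed_controls  # medical NEC, factor absent
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Pregnancy vaginitis: 8 of 31 surgical NEC versus 6 of 113 medical NEC (Table 1).
or_vaginitis, lower, upper = odds_ratio_ci(8, 31, 6, 113)
```

With these counts the result rounds to 6.20 (1.96–19.60), which agrees with the unadjusted entry for vaginitis in Table 1. The function fails with a division by zero if any cell of the 2×2 table is empty.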

Laboratory and imaging findings

Comparative analyses of laboratory and imaging parameters are summarized in Table 2. Univariate analysis revealed that CRP levels (P<0.001) were significantly elevated in the surgical NEC group, whereas leukocyte count (P=0.028), hemoglobin (P=0.001), platelet count (P=0.015), serum albumin (P<0.001), and pH (P=0.013) were markedly reduced. Multivariate analysis confirmed elevated CRP (P=0.003) as an independent risk factor for surgical NEC.

Table 2

Univariate and multivariate analysis of laboratory and imaging findings associated with surgical NEC

Variable	Surgical NEC (n=31)	Medical NEC (n=113)	Unadjusted P value	Unadjusted OR (95% CI)	Adjusted P value	Adjusted OR (95% CI)
Laboratory value prior to clinical onset
C-reactive protein (mg/L)	41.0 (15.0, 133.8)	8.0 (8.0, 8.0)	<0.001	1.02 (1.01–1.03)	0.003	1.02 (1.01–1.03)
Leukocytes (×10⁹/L)	6.7 (3.9, 11.5)	11.6 (8.7, 13.8)	0.03	0.91 (0.84–0.99)	0.046	0.90 (0.80–1.00)
Hemoglobin (g/L)	108.5 (95.3, 119.0)	129.0 (106.0, 148.0)	0.001	0.98 (0.96–0.99)
Platelet (×10⁹/L)	175.0 (112.3, 263.5)	301.0 (186.0, 382.0)	0.02	1.00 (0.99–1.00)
Na (mmol/L), mean (SD)	135.5 (132.4, 138.0)	135.9 (133.8, 138.6)	0.17	0.94 (0.86–1.03)
Albumin (g/L), mean (SD)	28.0 (23.1, 31.5)	31.9 (27.2, 34.7)	<0.001	0.83 (0.75–0.92)
pH, mean (SD)	7.4 (7.3, 7.4)	7.4 (7.4, 7.5)	0.01	0.00 (0.00–0.26)
HCO₃ (mmol/L)	23.9 (21.1, 27.4)	24.4 (22.2, 26.1)	0.55	0.96 (0.84–1.09)
Lactate (mmol/L)	1.8 (1.0, 3.5)	1.8 (1.2, 3.3)	0.71	0.95 (0.72–1.26)
Imaging findings
Pneumatosis on AXR	11 (35.5)	35 (31.0)	0.63	1.23 (0.53–2.83)
Portal venous gas on AXR	4 (12.9)	7 (6.2)	0.21	2.24 (0.61–8.22)
Pneumatosis on AUS	16 (53.3)	88 (79.3)	0.004	0.30 (0.13–0.70)	0.047	0.30 (0.09–0.98)
Portal venous gas on AUS	8 (26.7)	54 (48.6)	0.03	0.38 (0.15–0.94)
Pneumoperitoneum on AUS	7 (23.3)	3 (2.7)	<0.001	10.96 (2.63–45.58)
Turbid peritoneal fluid on AUS	18 (60.0)	4 (3.6)	<0.001	40.13 (11.65–138.22)	<0.001	35.71 (8.55–145.83)

Data are presented as n (%) or median (IQR) unless otherwise specified. AUS, abdominal ultrasonography; AXR, abdominal X-ray; CI, confidence interval; IQR, interquartile range; NEC, necrotizing enterocolitis; OR, odds ratio; SD, standard deviation.

Regarding imaging findings, pneumoperitoneum on AUS (P<0.001) and turbid peritoneal fluid on AUS (P<0.001) were more prevalent in the surgical group. Conversely, pneumatosis (P=0.004) and PVG (P=0.031) on AUS were less frequent. Multivariate analysis identified turbid peritoneal fluid on AUS (P<0.001) as the strongest independent predictor of surgical intervention.

Development of an ML prediction model

To improve surgical NEC prognostication, an ML model was developed using variables screened from clinical, laboratory, and imaging findings. Spearman correlation analysis confirmed the absence of significant multicollinearity. LASSO regression was employed for feature selection, yielding five optimal predictors based on diagnostic efficacy (Figure S1). Six algorithms (LR, denoted LRM in the tables and figures, DET, MLP, SVC, RF, and XGBoost) were trained using 10-fold cross-validation and optimized via Optuna hyperparameter tuning (Figure 2A-2C).

Figure 2 Performance of multiple machine learning models based on selected features. (A) Correlation analysis of 5 features extracted by LASSO regression. (B,C) AUC of all six machine learning models in the training and validation sets, including LRM, DET, MLP, SVC, RF, and XGBoost. (D) Calibration curves for all six machine learning models. AUC, area under the receiver operating characteristic curve; AUS, abdominal ultrasonography; CRP, C-reactive protein; DET, decision tree; LASSO, least absolute shrinkage and selection operator; LRM, logistic regression; MLP, multilayer perceptron; RF, random forest; SVC, support vector classifier; XGBoost, extreme gradient boosting.

As shown in Table 3, the six traditional ML models exhibited varying performance, with SVC showing the best performance in the validation set (AUC =0.924, 95% CI: 0.812–1.000). Calibration curves (Figure 2D) indicated a strong alignment between predicted and observed probabilities.

Table 3

Predictive efficacy of the six machine learning models and the transfer learning model in the training and validation sets

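Algorithms	Training (cross-validation mean): precision	Training (cross-validation mean): recall	Training (cross-validation mean): F1-score	Training (cross-validation mean): accuracy	Training (cross-validation mean): AUC (95% CI)	Validation: precision	Validation: recall	Validation: F1-score	Validation: accuracy	Validation: AUC (95% CI)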
LRM	0.850	0.750	0.757	0.910	0.938 (0.871–1.000)	0.875	0.636	0.737	0.886	0.803 (0.635–0.971)
DET	0.690	0.700	0.660	0.870	0.806 (0.687–0.925)	0.625	0.455	0.526	0.796	0.682 (0.488–0.876)
MLP	0.883	0.750	0.777	0.920	0.888 (0.786–0.989)	0.875	0.636	0.737	0.886	0.803 (0.635–0.971)
SVC	0.633	0.650	0.600	0.880	0.825 (0.680–0.970)	0.833	0.909	0.870	0.932	0.924 (0.812–1.000)
RF	0.833	0.750	0.733	0.900	0.906 (0.808–1.000)	0.857	0.546	0.667	0.864	0.758 (0.577–0.938)
XGBoost	0.817	0.700	0.713	0.910	0.916 (0.823–1.000)	0.714	0.455	0.556	0.818	0.697 (0.505–0.889)
TL	0.773	0.850	0.810	0.920	0.964 (0.921–0.995)	0.833	0.909	0.870	0.932	0.937 (0.829–1.000)

AUC, area under the receiver operating characteristic curve; CI, confidence interval; DET, decision tree; LRM, logistic regression; MLP, multilayer perceptron; RF, random forest; SVC, support vector classifier; XGBoost, extreme gradient boosting; TL, transfer learning.
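The threshold-based metrics in Table 3 follow from confusion-matrix counts. As a worked check, the sketch below uses counts that we inferred from the reported validation-set values for SVC, assuming a 44-infant validation set containing 11 surgical cases. These counts are not reported in the article and should be read as a reconstruction, not as source data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1-score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Inferred (not reported) validation-set counts for the SVC model:
# 10 true positives, 2 false positives, 1 false negative, 31 true negatives.
svc_validation = classification_metrics(tp=10, fp=2, fn=1, tn=31)
rounded = {name: round(value, 3) for name, value in svc_validation.items()}
```

These counts give a precision of 0.833, recall of 0.909, F1-score of 0.870, and accuracy of 0.932, matching the SVC validation row. With only 11 surgical cases in such a validation set, a single reclassified infant shifts recall by about 9 percentage points, which helps explain the wide confidence intervals around the validation AUCs.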

To further improve predictive efficacy, a TL model was built on the foundation of the best-performing SVC model. The TL model outperformed all traditional models in the training set (mean AUC =0.964, 95% CI: 0.921–0.995) and validation set (AUC =0.937, 95% CI: 0.829–1.000). Its performance is further illustrated in Figure 3A-3F: the ROC curve in the validation set shows a higher AUC (0.937) than the base SVC model (0.924), the confusion matrix in the validation set highlights classification accuracy, and the predicted probability distribution histogram demonstrates clear separation between the medical NEC and surgical NEC groups.

Figure 3 Performance of the transfer learning model developed based on the SVC model. (A,B) ROC curve of the TL model in the training and validation sets, showing improved performance over the base SVC model. (C,D) Predicted probability distribution histogram for the TL model, demonstrating clear separation between operated NEC and non-operated NEC cases in the training and validation sets. (E,F) Calibration curve for the TL model in the training and validation sets. AUC, area under the receiver operating characteristic curve; CI, confidence interval; NEC, necrotizing enterocolitis; ROC, receiver operating characteristic; SVC, support vector classifier; TL, transfer learning.

Model interpretation and clinical deployment

SHAP analysis highlighted positive abdominal signs, turbid fluid on AUS, and bowel sound grades as the most influential predictors (Figures S2,S3, Figure 4A-4D). Feature impacts were visualized using color gradients (red: higher values, blue: lower values). Decision curve analysis (DCA) of the TL model showed a net benefit across a range of threshold probabilities, supporting its clinical utility (Figure 4E,4F).

Figure 4 Model interpretability of the transfer learning model. (A,B) Summary bar plot of mean absolute SHAP values, ranking the relative importance of the 5 input features. Abdominal sign, turbid fluid on AUS, and bowel sound grade were the top predictors. (C,D) SHAP dependence plot for the features in the training and validation sets. (E,F) DCA for the TL model in the (E) training and (F) validation cohorts. The DCA curves depict the net benefit across a range of threshold probabilities, demonstrating the model’s clinical utility for predicting surgical NEC. AUS, abdominal ultrasonography; CRP, C-reactive protein; DCA, decision curve analysis; NEC, necrotizing enterocolitis; TL, transfer learning; SHAP, SHapley Additive exPlanations.
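The net benefit plotted in a decision curve is computed at each threshold probability as the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold. The sketch below shows the calculation for a single threshold. The counts (44 infants, 11 surgical, 10 true positives, 2 false positives) and the threshold of 0.5 are illustrative assumptions, not values reported in the article.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of acting on the model's positive calls at one threshold probability."""
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(n_events, n, threshold):
    """Reference strategy: treat every infant as requiring surgery."""
    return net_benefit(n_events, n - n_events, n, threshold)

# Illustrative counts and threshold (assumptions, not reported values).
n_infants, n_surgical = 44, 11
threshold = 0.5
model_nb = net_benefit(tp=10, fp=2, n=n_infants, threshold=threshold)
treat_all_nb = net_benefit_treat_all(n_surgical, n_infants, threshold)
treat_none_nb = 0.0  # treating no one yields neither benefit nor harm
```

A model is clinically useful at a given threshold when its net benefit exceeds both reference strategies. Repeating the calculation over a grid of thresholds, with the confusion counts recomputed at each one, produces the full curve.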

The TL model was operationalized into a clinician-friendly, publicly accessible web GUI application. Healthcare providers can access the tool online (https://nec-idmf.onrender.com) to input five clinical parameters and obtain instant surgical risk probability with stratified clinical recommendations, facilitating point-of-care decision support.

Discussion

Early identification of NEC requiring surgical intervention is crucial to prevent severe complications such as bowel perforation, sepsis, and multiorgan failure, especially for infants without pneumoperitoneum on AXR (the definite indication for surgery) (12). In this study, we pioneered a novel approach by leveraging five variables derived from the clinical, laboratory, and imaging findings to construct ML models for predicting surgical NEC. Notably, the TL algorithm emerged as the most robust predictive model, demonstrating excellent predictive accuracy in both the training and validation sets. To enhance clinical applicability, the TL model was integrated into a GUI application, enabling real-time assessment of the risk of surgical NEC. This practical tool holds significant potential to support clinicians in making timely and data-driven decisions, facilitating early intervention, and enhancing the prognosis of preterm infants with severe NEC.

Patient characteristics and clinical presentation

Consistent with previous literature, our study corroborated that lower GA was a critical factor for surgical NEC, a relationship likely mediated by the triad of immature intestinal mucosal integrity, dysregulated motility, and compromised immunocompetence in preterm neonates (13). The underdeveloped intestinal epithelium in this population exhibits heightened vulnerability to ischemic injury and inflammatory cascades, predisposing to rapid progression of mucosal necrosis and transmural perforation (14). Furthermore, the immunological immaturity of preterm infants and dysregulated cytokine responses aggravate uncontrolled inflammation, thereby accelerating NEC pathogenesis (15). Our study was the first to identify maternal vaginitis during pregnancy as a risk factor for severe NEC. This condition might trigger intrauterine inflammation via ascending infection, leading to fetal exposure to elevated levels of pro-inflammatory cytokines and microbial components. Preterm infants prenatally programmed into this state of heightened susceptibility are predisposed to developing fulminant, dysregulated inflammatory responses postnatally, progressing to transmural intestinal necrosis and perforation (3,16).

Clinical manifestations and physical examination findings are pivotal for the timely determination of surgical intervention in NEC neonates. Khalak et al. (17) developed a predictive model incorporating seven physical examination parameters (abdominal girth, erythema, discoloration, tenderness, tension, absent bowel sounds, and delayed capillary refill), which demonstrated high sensitivity and specificity in identifying neonates requiring surgery. These signs collectively reflect the progression of intestinal inflammation, ischemia, and peritoneal irritation, pathognomonic features of advanced NEC (18). In alignment with these findings, multivariate regression analysis in our study identified positive abdominal signs and respiratory support as independent risk factors for surgical NEC, which underscores the critical role of abdominal signs in the early identification of severe NEC requiring surgical intervention (18).

Laboratory and imaging findings

Our multivariate analysis demonstrated that higher CRP was a prognostic factor for surgical NEC, while ML models further identified CRP as a critical predictive factor for surgical intervention. These findings align with a prior investigation by Duci et al. (19), who similarly reported the association between metabolic acidosis, hyperinflammatory states, and severe NEC requiring surgery. The pathophysiological cascade involved neutrophil-driven cytokine storms exacerbating intestinal microcirculatory failure and metabolic acidosis potentially signaling transmural necrosis (20,21).

AUS plays a critical role in predicting surgical NEC (22). Allin et al. (23) demonstrated that hypoechoic ascites, absent bowel peristalsis, pneumoperitoneum, bowel wall thinning/thickening, and luminal dilation correlate with surgical intervention or mortality in NEC. Our study also identified turbid peritoneal fluid on AUS as the strongest independent predictor of surgical NEC. This finding highlights the unique capacity of AUS to detect pathological changes in a timely manner, which is critical for evaluating the necessity of surgical intervention (22).

Notably, although PI and PVG have been variably linked to NEC severity in prior studies, our analysis found no association between these signs and surgical NEC (22,24-26). Some studies suggested that PI/PVG portend a poor prognosis, whereas others argued that their presence or resolution did not consistently correlate with clinical severity (9,19). Our results align with the latter view, suggesting that PI and PVG may reflect specific pathophysiological processes rather than serve as universal indicators of surgical urgency. Clinicians should therefore interpret these signs cautiously and avoid overreliance on PI/PVG to guide surgical decisions.

ML prediction model and interpretation

ML is a computational methodology that builds analytical models through adaptive learning, with the primary objective of improving predictive accuracy. With advancements in computational science, ML has been widely adopted across medical disciplines, facilitating the development of diagnostic and prognostic models from real-world clinical data (13,26). For example, Ashoori et al. (27) found that an XGBoost model using the dynamic characteristics of regional cerebral oxygen saturation (rcSO₂) could predict short-term brain injury in infants with hypoxic-ischemic encephalopathy (HIE) (AUC =0.73) and HIE grading (AUC =0.81), thereby providing an objective and non-invasive assessment tool for clinical practice. In this study, we pioneered the application of ML to predict surgical NEC using five variables derived from clinical, laboratory, and imaging findings. Among the seven algorithms evaluated for predicting surgical NEC (six conventional ML algorithms and the TL-enhanced model), the TL algorithm demonstrated superior predictive performance and robust reproducibility across both the training and validation datasets. Using SHAP to interpret the models, we found that positive abdominal signs, turbid fluid on AUS, and bowel sound grade were the most influential predictors of surgical NEC. These findings align closely with the conventional LR analyses. To enhance clinical utility, the TL model was embedded into a GUI application, enabling real-time risk stratification and supporting clinicians in making timely, evidence-based decisions.
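To illustrate what a SHAP-style attribution measures, the following minimal sketch computes exact Shapley values by enumerating every feature coalition for a toy linear risk score. It is not the study's model or the SHAP package: the feature names, weights, baseline, and patient values are hypothetical placeholders chosen only for illustration, and a linear score stands in for the trained classifier.

```python
from itertools import combinations
from math import factorial

# Hypothetical predictors and weights, chosen only for illustration;
# they are NOT the fitted parameters of the study's model.
WEIGHTS = {
    "abdominal_signs": 1.2,
    "turbid_fluid_aus": 1.5,
    "bowel_sound_grade": 0.8,
    "crp": 0.5,
    "respiratory_support": 0.6,
}
# Hypothetical reference patient: every feature absent or at zero.
BASELINE = {name: 0.0 for name in WEIGHTS}


def risk_score(features):
    """Toy linear risk score standing in for a trained model."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


def shapley_values(model, patient, baseline):
    """Exact Shapley attributions by enumerating every feature coalition."""
    names = list(patient)
    n = len(names)
    values = {}
    for target in names:
        others = [name for name in names if name != target]
        total = 0.0
        for size in range(n):
            # Shapley weight shared by all coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                # Coalition members take the patient's value; the rest stay at baseline.
                without = {
                    k: (patient[k] if k in coalition else baseline[k]) for k in names
                }
                with_target = dict(without)
                with_target[target] = patient[target]
                total += weight * (model(with_target) - model(without))
        values[target] = total
    return values


# Hypothetical patient, for illustration only.
patient = {
    "abdominal_signs": 1.0,
    "turbid_fluid_aus": 1.0,
    "bowel_sound_grade": 2.0,
    "crp": 0.0,
    "respiratory_support": 1.0,
}
attributions = shapley_values(risk_score, patient, BASELINE)
for name, value in sorted(attributions.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the property that makes Shapley-based rankings of predictors interpretable. Exhaustive enumeration is feasible here only because there are five features; practical tools approximate these values for non-linear models.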

Our study has several limitations. First, it was a single-center retrospective study with a relatively small sample size, which may have introduced bias. Second, the positive abdominal signs (abdominal tension or tenderness) and bowel sound grade in the predictive model are not quantitative indicators and could be influenced by the examiner's clinical experience or subjective judgment. Third, the calibration curve of the ML predictive model showed some bias, mainly because of the limited number of patients; a larger prospective clinical cohort is therefore required to verify and improve the predictive performance of the models.
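As a rough illustration of how calibration bias can be quantified, the sketch below computes a Brier score and a binned comparison of mean predicted risk against observed event rate. The probabilities and outcomes are synthetic values invented for the example, not study data, and the function names and the choice of five equal-width bins are assumptions.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_bins(probs, outcomes, n_bins=5):
    """Group predictions into equal-width bins and compare predicted vs observed rates."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue  # skip empty bins rather than report an undefined rate
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        rows.append((mean_predicted, observed_rate, len(members)))
    return rows


# Synthetic predicted risks and outcomes (1 = surgical NEC), for illustration only.
probs = [0.05, 0.10, 0.15, 0.30, 0.35, 0.55, 0.60, 0.80, 0.85, 0.95]
outcomes = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]

print(f"Brier score: {brier_score(probs, outcomes):.3f}")
for mean_predicted, observed_rate, count in calibration_bins(probs, outcomes):
    print(f"predicted {mean_predicted:.2f}  observed {observed_rate:.2f}  n={count}")
```

With few patients per bin, the observed rates are unstable, which is one reason a small cohort can produce a calibration curve that departs from the diagonal.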

Conclusions

In conclusion, the ML predictive model developed using five clinical, laboratory, and imaging variables demonstrated excellent accuracy in predicting surgical NEC. Positive abdominal signs, turbid fluid on AUS, and bowel sound grade were the most influential predictors of surgical NEC. Furthermore, we implemented a GUI-based clinical decision support system powered by the TL algorithm, providing clinicians with a practical tool for predicting surgical NEC.

Acknowledgments

None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tp.amegroups.com/article/view/10.21037/tp-2025-1-867/rc

Data Sharing Statement: Available at https://tp.amegroups.com/article/view/10.21037/tp-2025-1-867/dss

Peer Review File: Available at https://tp.amegroups.com/article/view/10.21037/tp-2025-1-867/prf

Funding: This work was supported by the Beijing Municipal Science & Technology Commission (No. Z2102921062) and Beijing High Innovation Plan (No. 20250058).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tp.amegroups.com/article/view/10.21037/tp-2025-1-867/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Beijing Children’s Hospital (2023-E-005-R). Informed consent was waived in this retrospective study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Golubkova A, Hunter CJ. Updates and recommendations on the surgical management of NEC. Semin Perinatol 2023;47:151698.
2. Speer AL, Lally KP, Pedroza C, et al. Surgical Necrotizing Enterocolitis and Spontaneous Intestinal Perforation Lead to Severe Growth Failure in Infants. Ann Surg 2024;280:432-43.
3. Jones IH, Hall NJ. Contemporary Outcomes for Infants with Necrotizing Enterocolitis-A Systematic Review. J Pediatr 2020;220:86-92.e3.
4. Alexander KM, Chan SS, Opfer E, et al. Implementation of bowel ultrasound practice for the diagnosis and management of necrotising enterocolitis. Arch Dis Child Fetal Neonatal Ed 2021;106:96-103.
5. Wang Y, Liu S, Lu M, et al. Neurodevelopmental outcomes of preterm with necrotizing enterocolitis: a systematic review and meta-analysis. Eur J Pediatr 2024;183:3147-58.
6. Rausch LA, Hanna DN, Patel A, et al. Review of Necrotizing Enterocolitis and Spontaneous Intestinal Perforation Clinical Presentation, Treatment, and Outcomes. Clin Perinatol 2022;49:955-64.
7. Guo H, Li Y, Wang L. Assessment of inflammatory biomarkers to identify surgical/death necrotizing enterocolitis in preterm infants without pneumoperitoneum. Pediatr Surg Int 2024;40:191.
8. Lu P, Gong X, Gu X, et al. Mortality and extrauterine growth restriction of necrotizing enterocolitis in very preterm infants with heart disease: a multi-center cohort study. Eur J Pediatr 2024;183:3579-88.
9. Cuna AC, Reddy N, Robinson AL, et al. Bowel ultrasound for predicting surgical management of necrotizing enterocolitis: a systematic review and meta-analysis. Pediatr Radiol 2018;48:658-66.
10. Zhang WY, Chen ZH, An XX, et al. Analysis and validation of diagnostic biomarkers and immune cell infiltration characteristics in pediatric sepsis by integrating bioinformatics and machine learning. World J Pediatr 2023;19:1094-103.
11. Brunse A, Deng L, Pan X, et al. Fecal filtrate transplantation protects against necrotizing enterocolitis. ISME J 2022;16:686-94.
12. Bethell GS, Knight M, Hall NJ, et al. Surgical necrotizing enterocolitis: Association between surgical indication, timing, and outcomes. J Pediatr Surg 2021;56:1785-90.
13. Kim SH, Oh YJ, Son J, et al. Machine learning-based analysis for prediction of surgical necrotizing enterocolitis in very low birth weight infants using perinatal factors: a nationwide cohort study. Eur J Pediatr 2024;183:2743-51.
14. Cao X, Zhang L, Jiang S, et al. Epidemiology of necrotizing enterocolitis in preterm infants in China: A multicenter cohort study from 2015 to 2018. J Pediatr Surg 2022;57:382-6.
15. Duess JW, Sampah ME, Lopez CM, et al. Necrotizing enterocolitis, gut microbes, and sepsis. Gut Microbes 2023;15:2221470.
16. Willers M, Ulas T, Völlger L, et al. S100A8 and S100A9 Are Important for Postnatal Development of Gut Microbiota and Immune System in Mice and Infants. Gastroenterology 2020;159:2130-2145.e5.
17. Khalak R, D’Angio C, Mathew B, et al. Physical examination score predicts need for surgery in neonates with necrotizing enterocolitis. J Perinatol 2018;38:1644-50.
18. Chen Q, Yao W, Xu F, et al. Application of abdominal ultrasonography in surgical necrotizing enterocolitis: a retrospective study. Front Microbiol 2023;14:1211846.
19. Duci M, Fascetti-Leon F, Erculiani M, et al. Neonatal independent predictors of severe NEC. Pediatr Surg Int 2018;34:663-9.
20. Mohd Amin AT, Zaki RA, Friedmacher F, et al. C-reactive protein/albumin ratio is a prognostic indicator for predicting surgical intervention and mortality in neonates with necrotizing enterocolitis. Pediatr Surg Int 2021;37:881-6.
21. El Manouni El Hassani S, Niemarkt HJ, Derikx JPM, et al. Predictive factors for surgical treatment in preterm neonates with necrotizing enterocolitis: a multicenter case-control study. Eur J Pediatr 2021;180:617-25.
22. Lazow SP, Tracy SA, Staffa SJ, et al. Abdominal ultrasound findings contribute to a multivariable predictive risk score for surgical necrotizing enterocolitis: A pilot study. Am J Surg 2021;222:1034-9.
23. Allin B, Long AM, Gupta A, et al. A UK wide cohort study describing management and outcomes for infants with surgical Necrotising Enterocolitis. Sci Rep 2017;7:41149.
24. Lin PC, Lin CY, Chang HY. Sonographic diagnosis of portal venous gas in necrotizing enterocolitis. Pediatr Neonatol 2022;63:93-4.
25. Seliga-Siwecka J, Rutkowski J, Margas W, et al. Sensitivity and specificity of different imaging modalities in diagnosing necrotising enterocolitis in a Polish population of preterm infants: a diagnostic test accuracy study protocol. BMJ Open 2020;10:e033519.
26. Weller JH, Scheese D, Tragesser C, et al. Artificial Intelligence vs. Doctors: Diagnosing Necrotizing Enterocolitis on Abdominal Radiographs. J Pediatr Surg 2024;59:161592.
27. Ashoori M, O’Toole JM, Garvey AA, et al. Machine learning models of cerebral oxygenation (rcSO2) for brain injury detection in neonates with hypoxic-ischaemic encephalopathy. J Physiol 2024;602:6347-60.

Cite this article as: Sun D, Xie C, Zhao Y, Liao J, Zhang Y, Hua K, Gu Y, Du J, Li S, Wang D, Huang J. Transfer learning prediction of surgical necrotizing enterocolitis in preterm infants without pneumoperitoneum on abdominal X-ray. Transl Pediatr 2026;15(3):68. doi: 10.21037/tp-2025-1-867
