Construction and validation of a diagnostic model for Kawasaki disease based on neutrophil-related genes and analysis of immune infiltration
Original Article

Construction and validation of a diagnostic model for Kawasaki disease based on neutrophil-related genes and analysis of immune infiltration

Jing Chen, Huan Zhang

Department of Rehabilitation, Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China

Contributions: (I) Conception and design: J Chen; (II) Administrative support: J Chen; (III) Provision of study materials or patients: J Chen; (IV) Collection and assembly of data: Both authors; (V) Data analysis and interpretation: Both authors; (VI) Manuscript writing: Both authors; (VII) Final approval of manuscript: Both authors.

Correspondence to: Huan Zhang, BS. Department of Rehabilitation, Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binjiang Campus, No. 3333 Binsheng Road, Hangzhou 310052, China. Email: zhanghuan12012@163.com.

Background: Kawasaki disease (KD) is an acute self-limiting systemic vasculitis of unknown etiology, lacking specific diagnostic biomarkers. Neutrophils and their released extracellular traps play a central role in disease pathogenesis and coronary artery injury, making targeting this pathological process a key direction for exploring novel diagnostic and therapeutic strategies. Therefore, this study aimed to identify neutrophil-related genes associated with KD and to construct and validate a diagnostic model for improving the early and precise diagnosis of KD.

Methods: Based on Gene Expression Omnibus (GEO) database, this study integrated datasets for training (GSE18606, GSE68004) and validation (GSE63881, GSE73461). Key diagnostic genes were identified through differential analysis and machine learning (least absolute shrinkage and selection operator, extreme gradient boosting, random forest). A nomogram model was constructed and evaluated via receiver operating characteristic, calibration and decision curves. Immune infiltration was analyzed using single-sample gene set enrichment analysis (ssGSEA) and CIBERSORT, followed by molecular subtyping and functional enrichment based on the diagnostic genes.

Results: This study identified four neutrophil-related diagnostic genes (ALPL, F5, HK3, MGAM) and developed a high-performance nomogram model. Immune analysis showed increased neutrophil/macrophage infiltration and decreased adaptive immune cells in KD. Two immune-distinct molecular subtypes were identified, and upstream regulatory networks with potential therapeutic drugs were predicted, offering new insights into KD mechanisms and treatment.

Conclusions: This study developed a neutrophil-based diagnostic model for KD, revealing key immune microenvironment features and molecular subtypes. These findings enhance the understanding of disease heterogeneity and support early molecular diagnosis and targeted treatment strategies.

Keywords: Kawasaki disease (KD); neutrophil-related genes; immune infiltration; diagnostic model; subtype


Submitted Jan 10, 2026. Accepted for publication Mar 04, 2026. Published online Apr 23, 2026.

doi: 10.21037/tp-2026-1-0031


Highlight box

Key findings

• A novel diagnostic model based on four neutrophil-related genes (ALPL, F5, HK3, MGAM) associated with Kawasaki disease (KD) was established.

What is known and what is new?

• In recent years, neutrophils have attracted increasing attention due to its close involvement in the onset and progression of various inflammatory diseases.

• The identified key neutrophil-related genes were associated with the development and progression of KD.

What is the implication, and what should change now?

• The study suggests that neutrophil-related genes may serve as promising diagnostic biomarkers for KD, and the next step should be to validate the model in prospective clinical cohorts and confirm key genes through experimental assays.


Introduction

Kawasaki disease (KD) is an acute, self-limiting systemic vasculitis that primarily affects children under the age of 5 years (1). Since its initial report, KD has become the leading cause of acquired heart disease in children in developed countries (2). Although KD occurs worldwide, its epidemiology exhibits significant ethnic disparities, with a notably higher incidence in East Asian populations compared to European and American populations (3). This suggests a potential role for genetic susceptibility in the pathogenesis of KD (4). Due to the lack of a gold standard for specific laboratory diagnosis, clinical diagnosis primarily relies on persistent fever (>5 days) accompanied by typical mucocutaneous symptoms such as bilateral bulbar conjunctival injection, strawberry tongue, polymorphous rash, extremity changes, and cervical lymphadenopathy (5). The main pathological feature of KD is extensive inflammation of small- and medium-sized arteries throughout the body, with the coronary arteries being the primary target (6,7). If not treated promptly and effectively during the acute phase, approximately 25% of affected children may develop coronary artery aneurysms (8,9). These coronary artery lesions can lead to coronary artery stenosis, thrombosis, and even myocardial infarction and sudden death in childhood (10). Although the inflammatory response in most KD children is self-limiting, the resulting vascular endothelial injury and remodeling may have long-term implications for cardiovascular health in adulthood (8). Currently, the first-line standard treatment for KD is intravenous immunoglobulin (IVIG) combined with high-dose aspirin (11). This therapy can effectively suppress the systemic inflammatory response, reducing the incidence of coronary artery lesions from 25% to less than 5% (12). However, the exact etiology of KD remains unknown, and 10–20% of children exhibit IVIG resistance (13,14). These refractory cases are highly prone to severe coronary artery damage (15). Therefore, identifying specific diagnostic biomarkers and elucidating the immunopathogenesis of KD remain critical challenges in current clinical research.

There is a close pathological connection between the acute systemic vasculitis of KD and neutrophils (16). Neutrophils are not only central participants in the inflammatory response but also key effector cells leading to coronary artery injury (17). During the acute phase of the disease, an abnormal immune response triggered by an unknown etiology induces an intense cytokine storm [e.g., interleukin-1 beta (IL-1β), IL-6], which drives massive activation, proliferation, and infiltration of neutrophils into the vascular walls (18,19). The core pathogenic mechanism lies in the release of neutrophil extracellular traps (NETs) by activated neutrophils (20). NETs are reticula structures composed of a DNA backbone and toxic granular proteins such as myeloperoxidase and elastase (21,22). They can directly damage vascular endothelial cells, compromise endothelial integrity, and act as a scaffold to promote platelet aggregation and thrombus formation, thereby directly contributing to the development of coronary arteritis, aneurysms, and thrombotic complications (23-25). Therefore, targeting neutrophils and their NETs has emerged as an important frontier in exploring specific biomarkers and novel therapeutic strategies for KD.

Building upon the close pathological association between KD and neutrophils, this study focuses on neutrophil-related genes (NRGs). Through systematic analysis and modeling, it aims to explore their potential value in the diagnosis of KD. We further investigate the dynamic characteristics and inherent heterogeneity of the immune microenvironment during disease progression, attempting to construct and validate a multi-omics-based diagnostic model. This effort aims to provide new insights and theoretical foundations for the precise identification and mechanistic understanding of KD. We present this article in accordance with the TRIPOD reporting checklist (available at https://tp.amegroups.com/article/view/10.21037/tp-2026-1-0031/rc).


Methods

Data acquisition

Microarray data for KD were downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) (GSE18606 and GSE68004). The datasets were merged and the batch effects were removed using the removeBatchEffect function from the limma package, subsequently serving as the training set. This training set comprised a total of 161 samples, including 115 KD patients and 46 normal controls. Additional KD microarray data (GSE63881 and GSE73461) were downloaded from the GEO database to serve as the testing datasets. GSE63881 contained a total of 341 samples, with 171 KD patients and 170 normal controls. GSE73461 included 132 samples, with 77 KD patients and 55 normal controls. Furthermore, 1,394 NRGs were obtained from a previously published relevant literature (26). This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Differential gene expression analysis

The “limma” package was used to perform differential analysis between the control and KD disease groups [adjusted P value <0.05 and |log fold change (FC)| >1]. The differentially expressed genes (DEGs) were displayed by a volcano plot. Subsequently, weighted gene co-expression network analysis (WGCNA) was conducted using the “WGCNA” package, and genes in the module with the highest correlation were identified as key genes. Functional enrichment and Gene Ontology (GO) pathway analyses were performed on these key genes using the “clusterProfiler” package. Finally, differentially expressed neutrophil-related genes (DENRGs) were identified by taking the intersection of the DEGs, key module genes, and NRGs.

Machine learning (ML) was used to identify important variables

The ML algorithm was further employed to identify key features from the hub genes. Using R packages “glmnet”, “xgboost”, and “randomForest”, least absolute shrinkage and selection operator (LASSO), extreme gradient boosting (XGBoost), and random forest (RF) ML approaches were applied respectively to identify significant feature genes. The results from these three methods were subsequently integrated to determine candidate diagnostic genes for KD.

Development and validation of a diagnostic model

In the GSE18606 and GSE68004 datasets, we performed receiver operating characteristic (ROC) curve analysis on the key hub genes identified through ML using the “pROC” package. Genes with an area under the curve (AUC) greater than 0.7 were considered to have good diagnostic value. Based on these diagnostic genes, a predictive nomogram model was constructed using the rmda package, and decision curve analysis (DCA) was applied to evaluate the clinical utility and diagnostic effectiveness of the model. Furthermore, the robustness and generalizability of the diagnostic model were validated in independent testing datasets (GSE63881 and GSE73461). Finally, a comprehensive comparative analysis was conducted to evaluate the diagnostic performance of our model against previously reported KD diagnostic models.

Expression analysis and single-gene enrichment analysis

The expression levels of the diagnostic genes were compared between the KD group and the normal group, and the results were visualized using box plots. To further explore the potential biological functions associated with each diagnostic gene, single-gene set enrichment analysis (GSEA) was performed. In addition, GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were conducted using the “clusterProfiler” package.

Immune infiltration analysis

To characterize the immune infiltration landscape in KD, multiple computational algorithms were employed. First, single-sample gene set enrichment analysis (ssGSEA) was performed to quantify immune cell infiltration and immune-related function scores between the KD and normal groups. Subsequently, the CIBERSORT algorithm was applied to estimate the relative proportions of immune cell types, and the microenvironment cell population (MCP)-counter method implemented in the “IOBR” package was used to further assess immune cell abundance in the samples. Finally, Pearson correlation analysis was conducted to evaluate the association between infiltrating immune cells and the expression levels of the diagnostic genes.

Construction of potential TF-microRNA (miRNA) target gene regulatory networks and prediction of small molecule drugs

The miRNet online database (https://www.mirnet.ca/) was used to identify potential miRNAs targeting the diagnostic genes, and the results were visualized using Cytoscape software. Upstream transcription factors (TFs) were predicted using NetworkAnalyst (https://www.networkanalyst.ca/). The Drug Signatures Database (DSigDB; http://dsigdb.tanlab.org/DSigDBv1.0/) was employed to evaluate potential protein-drug interactions. Specifically, the target genes were uploaded to the Enrichr suite of the GSEA tool (https://maayanlab.cloud/modEnrichr/) to leverage the DSigDB database and predict potential candidate drugs that may target the genes of interest.

Subtype identification of diagnostic genes

To identify potential molecular subtypes of KD and investigate their immune characteristics, the consensus clustering algorithm from the “ConsensusClusterPlus” package was applied to cluster samples from the KD group based on the obtained diagnostic genes. PCA was then used to visualize the sample distribution before and after adjustment. Box plots were employed to compare the expression levels of diagnostic genes across different subtypes. Immune infiltration disparities among subtypes were assessed using ssGSEA, while the CIBERSORT algorithm was utilized to estimate immune cell proportions. Differential expression analysis between subtypes was performed using the limma package, with thresholds set at an adjusted P value <0.05 and |logFC| >0.585. Finally, GO and KEGG enrichment analyses of subtype-related genes were conducted using the clusterProfiler package to explore underlying biological pathways.

Statistical analysis

All analyses were performed using relevant packages in R (version 4.4.0). The Wilcoxon test was used to determine statistical differences between two groups. Associations between two variables were assessed using the Pearson method. Statistical significance was defined as a P value <0.05, with P values summarized as follows: not significant (ns), P≥0.05; *, 0.01≤P<0.05; **, 0.001≤P<0.01; ***, 0.0001≤P<0.001.


Results

Differential expression analysis between the KD and normal groups

Analysis of the GSE18606 and GSE68004 datasets shows that after batch correction, the sample distribution becomes more uniform in the PCA plot (Figure 1A,1B). Differential expression analysis between KD and normal control samples identified a total of 1,541 DEGs, of which 899 genes were up-regulated and 642 genes were down-regulated (Figure 1C). The “WGCNA” package was used to analyze key modules in the normal and disease groups of samples. Based on a soft threshold of 5 (Figure 1D), 8 key modules were ultimately obtained (Figure 1E). Among them, the turquoise module exhibited the highest correlation with KD (Figure 1F), and 132 genes from the turquoise module [|module membership (MM)| >0.8, |gene significance (GS)| >0.2] were identified as key genes (Figure 1G). GO enrichment analysis of the key genes revealed that these genes were highly associated with immune system activation. They were primarily involved in the differentiation and immunoregulatory processes of lymphocytes and T cells. Their protein products were mostly localized in secretory granules (such as specific granules) and may exert their immune defense or signal transduction functions through carbohydrate binding or specific enzymatic activities (e.g., NAD+ metabolism). This suggests a possible inflammatory state, infection response, or immune developmental process (Figure 1H). The intersection of DEGs, key genes, and NRGs yielded 47 DENRGs (Figure 1I). GO enrichment analysis showed that the significantly up-regulated genes were mainly enriched in biological processes such as bacterial defense response and immune regulation, revealing that the body initiates strong immune activation and defense mechanisms against stimuli such as bacterial infection (Figure 1J).

Figure 1 Identification of DEGs and key co-expression modules in KD. (A) PCA plot showing sample distribution before batch correction. (B) PCA plot showing sample distribution after batch correction. (C) Volcano plot displaying DEGs between KD and normal control samples. (D) Scale-free topology fit index analysis for selecting the soft-thresholding power in WGCNA. (E) Cluster dendrogram illustrating gene module identification by WGCNA, with each color representing a distinct co-expression module. (F) Heatmap showing the correlation between gene modules and clinical traits. (G) WGCNA turquoise module scatter plot. (H) GO enrichment analysis of key genes. (I) Intersection analysis of DEGs, key module genes, and NRGs. (J) GO enrichment analysis of intersection genes. BP, biological process; CC, cellular component; DEGs, differentially expressed genes; FC, fold change; GO, Gene Ontology; KD, Kawasaki disease; MF, molecular function; NRGs, neutrophil-related genes; PCA, principal component analysis; WGCNA, weighted gene co-expression network analysis.

Identification of important variables using ML algorithms

To further identify key diagnostic genes among the 47 intersecting genes, we employed three independent ML algorithms: LASSO regression, XGBoost, and RF analysis. Feature selection was performed based on the minimum Lambda value (lambda.min) in LASSO (Figure 2A,2B), gain >0 in XGBoost (Figure 2C), and importance score (MeanDecreaseGini >1) in RF analysis (Figure 2D,2E), yielding 12, 14, and 12 important variables, respectively. By taking the intersection of these three algorithms, four key feature genes—ALPL, F5, HK3, and MGAM—were identified (Figure 2F). Furthermore, analysis of the expression levels of these four genes revealed that they were all significantly upregulated in the KD group compared to the normal control group (Figure 2G). To further explore the potential biological functions of these diagnostic genes, single-gene enrichment analysis was performed. GSEA results showed that all four target genes exhibited similar enrichment patterns (Figure 2H-2K). High expression of these genes was strongly negatively correlated with the significant downregulation of pathways related to ribosome biogenesis (such as rRNA processing and ribosomal subunit assembly) and translation processes (such as cytoplasmic and mitochondrial translation). This suggests that these four genes may collectively participate in biological processes that suppress cellular protein synthesis and metabolic activities.

Figure 2 Identification and functional analysis of key diagnostic genes using machine learning algorithms. (A) Feature selection using the LASSO regression model. (B) Coefficient distribution plot generated for the sequence of log(λ) in the LASSO model. (C) Scoring of feature variables in XGBoost. (D) RF model showing classification error rate as a function of the number of trees. (E) Variable importance ranking in RF based on MeanDecreaseAccuracy and MeanDecreaseGini scores. (F) Intersection analysis of the three machine learning algorithms. (G) Box plots showing the expression of the four genes in the normal group and the KD group. Single-gene enrichment analyses of ALPL (H), F5 (I), HK3 (J), MGAM (K). The x-axis represents the distribution of Spearman correlation coefficients of ranked genes in GSEA. ***, 0.0001≤P<0.001. KD, Kawasaki disease; LASSO, least absolute shrinkage and selection operator; NES, normalized enrichment score; RF, random forest; XGBoost, extreme gradient boosting.

Development and validation of a diagnostic model

ROC curve analysis was performed on the four important feature genes identified through ML algorithms. In both the GSE18606 and GSE68004 datasets, these four genes—ALPL, F5, HK3, and MGAM—demonstrated excellent diagnostic performance, with AUC values all exceeding 0.7, indicating strong distinguishing capability between KD samples and normal samples, and thus they can serve as diagnostic genes for the disease (Figure 3A). Subsequently, a diagnostic nomogram model was developed based on these four diagnostic genes to provide a diagnostic tool for KD (Figure 3B). DCA showed that this combined model yielded the highest net clinical benefit across a wide range of threshold probabilities, confirming its favorable clinical applicability (Figure 3C). The calibration curve indicated good consistency between the model’s predicted probabilities and the actual observed probabilities (Figure 3D). The diagnostic model was further validated in independent test sets (GSE63881 and GSE73461), achieving satisfactory predictive efficiency, with an AUC of 0.904 for GSE63881 (Figure 3E) and 0.919 for GSE73461 (Figure 3F). Benchmarked against established models, our diagnostic signature exhibited enhanced performance in AUC prediction (Figure 3G, Table S1).

Figure 3 Diagnostic performance evaluation and validation. (A) ALPL, F5, HK3, MGAM and the combined model in the training cohort. (B) Nomogram of the KD diagnostic model. (C) DCA curve of KD diagnostic genes. (D) Calibration curve of the KD diagnostic model. ROC curves of the diagnostic model in the independent validation sets GSE63881 (E) and GSE73461 (F). (G) Comparative analysis of AUC values between different models. AUC, area under the curve; DCA, decision curve analysis; KD, Kawasaki disease; ROC, receiver operating characteristic.

Immune infiltration analysis

Based on ssGSEA, the KD group showed significantly lower scores than the normal control group in various immune cells and immune-related functions, particularly in B cells, CD8+ T cells, mast cells, NK cells, and immune checkpoints. In contrast, the KD group exhibited significantly higher scores in macrophages, neutrophils, Treg cells, type I interferon (IFN) response, and type II IFN response (Figure 4A). The results from the CIBERSORT algorithm were consistent, showing significantly reduced infiltration of CD8+ T cells and resting NK cells in the KD group, while the infiltration levels of monocytes, M0 macrophages, M2 macrophages, and neutrophils were significantly elevated (Figure 4B). MCP-counter analysis further confirmed that neutrophils and endothelial cells were significantly higher in the KD group, whereas T cells, CD8+ T cells, NK cells, monocytic lineage cells, and myeloid dendritic cells were significantly lower compared to the normal group (Figure 4C). Correlation analysis between immune cell infiltration and model genes indicated that the four genes—ALPL, F5, HK3, and MGAM—were significantly positively correlated with neutrophils, Treg cells, and macrophages, while showing negative correlations with tumor-infiltrating lymphocytes (TILs), CD8+ T cells, NK cells, B cells, immune checkpoints, and other factors (Figure 4D-4G).

Figure 4 Immune infiltration and correlation analysis of diagnostic genes in KD. (A) Box plots of ssGSEA immune cell and functional scores in the KD group and normal group. (B) Immune cell composition estimated by the CIBERSORT algorithm. (C) Immune infiltration analysis using the MCP-counter method. Correlation between the expression of diagnostic genes ALPL (D), F5 (E), HK3 (F), and MGAM (G) with immune cell infiltration or immune functional signatures. ns, P≥0.05; *, 0.01≤P<0.05; **, 0.001≤P<0.01; ***, 0.0001≤P<0.001. KD, Kawasaki disease; MCP, microenvironment cell population; ssGSEA, single-sample gene set enrichment analysis.

Construction of potential TF and miRNA target gene regulatory networks and prediction of small molecule drugs

Using NetworkAnalyst, several potential upstream TFs including JUN, FOS, CEBPB, FOXC1, and FOXL1 were predicted (Figure 5A). Additionally, miRNet analysis revealed that multiple miRNAs, such as hsa-miR-205-5p, hsa-miR-93-5p, hsa-miR-15b-5p, and hsa-miR-124-3p, exhibit potential binding affinity with the diagnostic genes (Figure 5B). Furthermore, based on the DSigDB database, Table 1 summarizes the top five predicted candidate drugs (Table S2).

Figure 5 Regulatory network construction of diagnostic genes. (A) TF network of diagnostic genes. (B) miRNA network of diagnostic genes. Red represents diagnostic genes, green represents predicted upstream transcription factors, and yellow represents predicted miRNAs. Deeper red indicates that the diagnostic gene is associated with a greater number of TFs or miRNAs. miRNA, microRNA; TF, transcription factor.

Table 1

Drug sensitivity predictions for diagnostic genes

Term P value Adjusted P value Odds ratio Combined score Genes
Norgestrel CTD 00006422 6.19E−04 0.03 97.50246 720.3128 ALPL; F5
17-Ethynyl estradiol CTD 00005932 0.001 0.03 65.21192 430.702 ALPL; F5
(−)-Epicatechin TTD 00000019 0.002 0.03 555.1111 3,304.666 ALPL
Isoquercitrin TTD 00008703 0.002 0.03 512.3846 3,012.375 ALPL
Caffeic acid TTD 00002640 0.002 0.03 512.3846 3,012.375 ALPL

CTD, Comparative Toxicogenomics Database; TTD, Therapeutic Target Database.

Diagnostic gene subtype classification

Based on the identified diagnostic genes, cluster analysis was performed on disease group samples from the GSE18606 and GSE68004 datasets. Consensus clustering results showed that the clustering effect is most distinct and stable at k =2 with a curve of a gentle slope (Figure 6A-6C). Therefore, patients were divided into two subtypes. PCA analysis clearly distinguished these two clusters (Figure 6D). The expression levels of all four diagnostic genes were significantly higher in Subtype 1 than in Subtype 2 (Figure 6E). ssGSEA results indicated that the infiltration levels of the vast majority of adaptive immune cells were significantly elevated in Subtype 2, including CD8+ T cells, B cells, NK cells, Th1 cells, Th2 cells, and TILs. In contrast, Subtype 1 exhibited a significant increase in the abundance of macrophages and neutrophils. Additionally, Subtype 1 showed a significantly higher APC co-inhibition score (Figure 6F). The CIBERSORT algorithm revealed that the abundance of various cells related to adaptive immune responses was significantly higher in Subtype 2, including CD8+ T cells, memory B cells, plasma cells, naive CD4+ T cells, Tregs, and resting NK cells. This suggests that Subtype 2 possesses stronger anti-tumor immune potential and lymphocyte recruitment capability. In contrast, Subtype 1 exhibited an immunosuppressive pattern dominated by myeloid cells, particularly neutrophils and M0 macrophages (Figure 6G). Differential analysis between the subtypes identified DEGs (Figure 6H). GO enrichment analysis showed that the DEGs were significantly enriched in pathways related to T cell and lymphocyte-mediated immune responses, such as cell activation and differentiation, and overall exhibited a significantly activated state (Figure 6I). KEGG analysis confirmed that the functions of the DEGs were primarily concentrated in cytokine signaling and the activation of Th17 cell differentiation (Figure 6J). To evaluate the clinical relevance of the identified subtypes, we compared the clinical characteristics between Subtype 1 and Subtype 2. Our analysis revealed that there was no significant difference in age or gender distribution between the two subtypes (Table S3).

Figure 6 Molecular subtypes of KD. (A) Consensus clustering heatmap. (B) CDF for cluster stability. (C) Relative change in area under CDF. (D) PCA of two subtype genes. (E) Box plot of diagnostic gene expression between subtypes. (F) ssGSEA immune infiltration differences between subtypes. (G) CIBERSORT box plot. (H) Volcano plot of DEGs between subtypes. GO (I) and KEGG (J) pathway enrichment analysis of DEGs between subtypes. ns, P≥0.05; *, 0.01≤P<0.05; **, 0.001≤P<0.01; ***, 0.0001≤P<0.001. CDF, cumulative distribution function; DEGs, differentially expressed genes; FC, fold change; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; PCA, principal component analysis; ssGSEA, single-sample gene set enrichment analysis.

Discussion

Given the central role of neutrophils in the immunopathology of KD, this study systematically developed and validated a diagnostic model based on NRGs, while exploring the underlying immune infiltration characteristics and potential molecular subtypes. By integrating multi-omics data and ML algorithms, we successfully identified a set of hub genes with high diagnostic efficacy. Building on this foundation, we revealed the existence of a differentiated immune microenvironment in KD patients. These findings not only provide new candidate biomarkers and visualization tools for the molecular diagnosis of KD but also suggest the potential existence of distinct immune response patterns within the disease. This opens new perspectives for understanding the heterogeneity of KD, exploring targeted therapeutic strategies, and advancing personalized clinical management.

The four signature genes (ALPL, F5, HK3, MGAM) were all significantly upregulated in the KD group, and their biological functions are closely associated with the pathophysiological processes of neutrophils. Under bacterial infection or chronic inflammatory stimulation, ALPL expression/enzyme activity rapidly increases and can serve as an early indicator of neutrophil activation (27). Additionally, ALPL is implicated in regulating tissue mineralization and vascular remodeling, which are believed to be linked to coronary artery lesions and endothelial injury in the later stages of KD (28). The identification of F5 (coagulation factor V) suggests a potential pathological connection with severe complications of KD, such as thrombosis and coronary artery aneurysms (29). NETs have been confirmed to act as scaffolds promoting platelet aggregation and thrombosis (30), and the high expression of F5 may further exacerbate the hypercoagulable state in children with KD. This provides new molecular insights into the mechanism of coronary thrombosis caused by vasculitis in KD. HK3 (hexokinase 3), a key enzyme in the glycolytic pathway, is upregulated, indicating that neutrophils in children with KD may undergo metabolic reprogramming. Specifically, glycolysis may be enhanced to meet the substantial energy demands during acute inflammatory bursts (31). In summary, the upregulation of these genes collectively delineates a core pathological axis in KD centered on neutrophil metabolic reprogramming, hyperactivation, and a procoagulant state. This provides an integrated molecular perspective for further elucidating the key driving mechanisms of the disease.

Multiple algorithms consistently indicate that the immune microenvironment in KD patients exhibits a significant state of imbalance. Among these, adaptive immune cells (such as CD8+ T cells, B cells, and NK cells) show a general reduction in infiltration levels, which may suggest the suppression or exhaustion of adaptive immune function during the disease process (32). Concurrently, innate immune cells, represented by neutrophils and monocytes/macrophages, are significantly increased in the KD group, particularly the elevation of M2 macrophages and neutrophils. This suggests that the overactivation of the innate immune system may drive the acute inflammatory response in the vascular walls. During the acute phase of KD, immune complex deposition induces macrophage polarization toward the M2b subtype, promoting the release of pro-inflammatory factors (33). Simultaneously, local oxidative stress [reactive oxygen species (ROS)] or inflammatory signals can reprogram M2 macrophages, causing them to lose their anti-inflammatory functions and even switch to a pro-inflammatory phenotype, continuously releasing inflammatory factors and thereby exacerbating vascular endothelial injury and coronary artery lesions (34). Additionally, excessive NETs formation by neutrophils activates the NLRP3 inflammasome through ROS-dependent pathways, induces pyroptosis in peripheral blood mononuclear cells, and promotes the cascade release of pro-inflammatory cytokines such as IL-1β, further driving inflammatory damage to the vascular endothelium (35).

Based on the aforementioned discussion, we speculate that a potential pathological chain originates from metabolic reprogramming, which subsequently modulates the immune microenvironment and drives the progression of KD. Specifically, the upregulation of HK3, a rate-limiting enzyme in glycolysis, suggests that neutrophils undergo an energetic shift to meet the substantial metabolic demands required for the acute inflammatory burst. This metabolic fuel facilitates the hyperactivation of neutrophils, as evidenced by the rapid increase in ALPL expression, which serves as a definitive marker for early neutrophil activation. Consequently, these hyperactivated neutrophils reshape the immune landscape through the massive release of NETs. Our immune infiltration analysis (ssGSEA and CIBERSORT) underscores this microenvironment remodeling, where the KD group exhibits a significant surge in innate immune cells (neutrophils and M2 macrophages) alongside a marked suppression or exhaustion of adaptive immune populations, such as CD8+ T cells and B cells. Mechanistically, excessive NETs formation triggers the NLRP3 inflammasome via ROS-dependent pathways, inducing cell pyroptosis and the cascade release of pro-inflammatory cytokines like IL-1β, which further intensifies the inflammatory assault on the vascular endothelium. Ultimately, this sustained immunological imbalance leads to irreversible structural damage. The upregulation of F5 suggests a critical pathological link to the hypercoagulable state in KD, where F5 synergizes with NETs—acting as physical scaffolds—to promote platelet aggregation and coronary thrombosis. Concurrently, the downregulation of protein synthesis processes associated with MGAM expression reflects cellular adaptation to severe inflammatory stress, collectively culminating in the development of coronary artery aneurysms and systemic vasculitis.

Through single-gene GSEA analysis, we found that high expression of these four diagnostic genes was negatively correlated with significant downregulation of ribosome biogenesis and translation processes. Viral or bacterial infections are often the triggering factors for KD (36), and inhibiting host cell protein synthesis is generally a defensive mechanism of the organism to prevent pathogen replication (37), or a metabolic adaptation of cells under severe inflammatory stress (38). In inflammatory-related diseases such as KD, the PKR-eIF2α pathway can be activated, leading to global suppression of protein synthesis while promoting the selective translation of inflammatory factors [such as IL-6 and tumor necrosis factor-alpha (TNF-α)], thereby exacerbating vascular endothelial inflammatory responses (39). Furthermore, endoplasmic reticulum stress and its downstream PERK-eIF2α pathway have been shown to inhibit protein synthesis, activate TFs such as nuclear factor kappa-B (NF-κB) and C/EBP homologous protein (CHOP), and promote chronic inflammation and tissue damage in inflammatory vascular diseases (40).

Finally, through the construction of a regulatory network, we predicted that multiple miRNAs, including hsa-miR-93-5p, may be involved in the expression regulation of these diagnostic genes. This points the way for further in-depth research into the epigenetic regulatory mechanisms of KD. Studies have shown that miRNAs in KD are no longer merely associated molecules but have evolved into actionable biomarkers with triple value: immediate diagnosis, treatment decision-making, and early warning of complications (41). Drug prediction analysis also identified potential small-molecule drugs such as norgestrel. Although their clinical application still requires validation, this opens up new avenues for the treatment of KD.

Although the findings of this study have significant potential for clinical translation, certain limitations remain. Firstly, the study is entirely based on a retrospective analysis of public databases and lacks prospective validation with external clinical cohorts. To address these limitations, our future efforts will prioritize targeted prospective investigations within well-annotated clinical cohorts characterized by standardized treatment protocols, rigorous outcome assessments, and longitudinal follow-up. These studies are designed to comprehensively validate the diagnostic accuracy (sensitivity and specificity), clinical utility, and feasibility of the proposed model. Secondly, while we predicted potential drugs and regulatory mechanisms, there is a lack of in vitro or in vivo experiments, such as reverse transcription-quantitative polymerase chain reaction, western blot, or animal models, to validate the expression levels of core genes and their specific biological functions. Future research should focus on collecting clinical samples for multi-center validation and further investigating the specific molecular mechanisms through which these genes mediate neutrophil dysfunction.


Conclusions

This study successfully constructed a diagnostic model based on neutrophil-related genes, highlighting the central role of neutrophils in the immunopathology of KD. By elucidating the characteristics of the immune microenvironment and identifying molecular subtypes, our findings offer a new perspective for understanding the heterogeneity of the disease. This also provides a crucial theoretical foundation for the early molecular diagnosis of KD and the development of precision treatment strategies.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tp.amegroups.com/article/view/10.21037/tp-2026-1-0031/rc

Peer Review File: Available at https://tp.amegroups.com/article/view/10.21037/tp-2026-1-0031/prf

Funding: None.

Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://tp.amegroups.com/article/view/10.21037/tp-2026-1-0031/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Qiu Y, Zhang Y, Li Y, et al. Molecular mechanisms of endothelial dysfunction in Kawasaki-disease-associated vasculitis. Front Cardiovasc Med 2022;9:981010. [Crossref] [PubMed]
  2. Selmek K, Harding M. Kawasaki Disease. Pediatr Rev 2024;45:425-7. [Crossref] [PubMed]
  3. Sharma K, Vignesh P, Srivastava P, et al. Epigenetics in Kawasaki Disease. Front Pediatr 2021;9:673294. [Crossref] [PubMed]
  4. Lin MT, Wu MH. The global epidemiology of Kawasaki disease: Review and future perspectives. Glob Cardiol Sci Pract 2017;2017:e201720. [Crossref] [PubMed]
  5. Tsoukas P, Yeung RSM. Kawasaki Disease-Associated Cytokine Storm Syndrome. Adv Exp Med Biol 2024;1448:365-83. [Crossref] [PubMed]
  6. Cangelosi GA, Hung L, Puvanesarajah V, et al. Common loci for Agrobacterium tumefaciens and Rhizobium meliloti exopolysaccharide synthesis and their roles in plant interactions. J Bacteriol 1987;169:2086-91. [Crossref] [PubMed]
  7. Lo MH, Lin YJ, Kuo HC, et al. Assessment of vascular and endothelial function in Kawasaki disease. Biomed J 2023;46:100525. [Crossref] [PubMed]
  8. Jone PN, Tremoulet A, Choueiter N, et al. Update on Diagnosis and Management of Kawasaki Disease: A Scientific Statement From the American Heart Association. Circulation 2024;150:e481-500. [Crossref] [PubMed]
  9. Kuo HC. Diagnosis, Progress, and Treatment Update of Kawasaki Disease. Int J Mol Sci 2023;24:13948. [Crossref] [PubMed]
  10. Saliba T, Nevesny F, Antiochos P, et al. Progressive coronary aneurysms in Kawasaki disease: A case report and long-term follow-up. Radiol Case Rep 2025;20:2459-62. [Crossref] [PubMed]
  11. Kuo HC, Lin MC, Kao CC, et al. Intravenous Immunoglobulin Alone for Coronary Artery Lesion Treatment of Kawasaki Disease: A Randomized Clinical Trial. JAMA Netw Open 2025;8:e253063. [Crossref] [PubMed]
  12. Broderick C, Kobayashi S, Suto M, et al. Intravenous immunoglobulin for the treatment of Kawasaki disease. Cochrane Database Syst Rev 2023;1:CD014884. [Crossref] [PubMed]
  13. Burns JC. The etiologies of Kawasaki disease. J Clin Invest 2024;134:e176938. [Crossref] [PubMed]
  14. Aggarwal R, Pilania RK, Sharma S, et al. Kawasaki disease and the environment: an enigmatic interplay. Front Immunol 2023;14:1259094. [Crossref] [PubMed]
  15. Gorelik M, Chung SA, Ardalan K, et al. 2021 American College of Rheumatology/Vasculitis Foundation Guideline for the Management of Kawasaki Disease. Arthritis Care Res (Hoboken) 2022;74:538-48. [Crossref] [PubMed]
  16. Hara T, Sakai Y. The Etiopathogenesis of Kawasaki Disease: Evolving Understanding of Diverse Triggers. Immun Inflamm Dis 2025;13:e70267. [Crossref] [PubMed]
  17. Yoshida Y, Takeshita S, Kawamura Y, et al. Enhanced formation of neutrophil extracellular traps in Kawasaki disease. Pediatr Res 2020;87:998-1004. [Crossref] [PubMed]
  18. Gao L, Xu Z, Hu J, et al. Impact of COVID-19 infection on Kawasaki disease and immune status in children. Sci Rep 2025;15:6417. [Crossref] [PubMed]
  19. Agrafiotou A, Sapountzi E, Margoni A, et al. Immunophenotype of Kawasaki Disease: Insights into Pathogenesis and Treatment Response. Life (Basel) 2025;15:1012. [Crossref] [PubMed]
  20. Islam MM, Takeyama N. Role of Neutrophil Extracellular Traps in Health and Disease Pathophysiology: Recent Insights and Advances. Int J Mol Sci 2023;24:15805. [Crossref] [PubMed]
  21. Jin J, Zhao Y, Fang Y, et al. Neutrophil extracellular traps promote the activation of the NLRP3 inflammasome and PBMCs pyroptosis via the ROS-dependent signaling pathway in Kawasaki disease. Int Immunopharmacol 2025;145:113783. [Crossref] [PubMed]
  22. Hu J, Qian W, Wang T, et al. Neutrophil Extracellular Traps Formation and Citrullinated Histones 3 Levels in Patients with Kawasaki Disease. Iran J Immunol 2023;20:327-34. [Crossref] [PubMed]
  23. Natorska J, Ząbczyk M, Undas A. Neutrophil extracellular traps (NETs) in cardiovascular diseases: From molecular mechanisms to therapeutic interventions. Kardiol Pol 2023;81:1205-16. [Crossref] [PubMed]
  24. Xu X, Wu Y, Xu S, et al. Clinical significance of neutrophil extracellular traps biomarkers in thrombosis. Thromb J 2022;20:63. [Crossref] [PubMed]
  25. Li W, Wang Z, Su C, et al. The effect of neutrophil extracellular traps in venous thrombosis. Thromb J 2023;21:67. [Crossref] [PubMed]
  26. Dong W, Li J, Zhuang Z. Neutrophil-related Signature Characterizes Immune Landscape and Predicts Prognosis of Invasive Breast Cancer. Biochem Genet 2025;63:4464-90. [Crossref] [PubMed]
  27. Xu P, Tao Z, Zhang C. Integrated multi-omics and artificial intelligence to explore new neutrophils clusters and potential biomarkers in sepsis with experimental validation. Front Immunol 2024;15:1377817. [Crossref] [PubMed]
  28. Xie Y, Shi H, Han B. Bioinformatic analysis of underlying mechanisms of Kawasaki disease via Weighted Gene Correlation Network Analysis (WGCNA) and the Least Absolute Shrinkage and Selection Operator method (LASSO) regression model. BMC Pediatr 2023;23:90. [Crossref] [PubMed]
  29. Sukorini U, Ninggar GAW, Lesmana MHS, et al. Genome-wide association study-driven identification of thrombomodulin and factor V as the best biomarker combination for deep vein thrombosis. Genomics Inform 2025;23:11. [Crossref] [PubMed]
  30. Yao M, Ma J, Wu D, et al. Neutrophil extracellular traps mediate deep vein thrombosis: from mechanism to therapy. Front Immunol 2023;14:1198952. [Crossref] [PubMed]
  31. Cao M, Zhang Z, Liu Q, et al. Identification of hub genes and pathogenesis in Kawasaki disease based on bioinformatics analysis. Indian J Pathol Microbiol 2024;67:297-305. [Crossref] [PubMed]
  32. Zhong F, Lin Y, Zhao L, et al. Reshaping the tumour immune microenvironment in solid tumours via tumour cell and immune cell DNA methylation: from mechanisms to therapeutics. Br J Cancer 2023;129:24-37. [Crossref] [PubMed]
  33. Guo MM, Chang LS, Huang YH, et al. Epigenetic Regulation of Macrophage Marker Expression Profiles in Kawasaki Disease. Front Pediatr 2020;8:129. [Crossref] [PubMed]
  34. Zheng F, Tao Y, Liu J, et al. KCa3.1 Inhibition of Macrophages Suppresses Inflammatory Response Leading to Endothelial Damage in a Cell Model of Kawasaki Disease. J Inflamm Res 2021;14:719-35. [Crossref] [PubMed]
  35. Wang N, Sun L, Qian G, et al. Neutrophil heterogeneity in Kawasaki disease and multisystem inflammatory syndrome in children. Pediatr Res 2026;99:490-501. [Crossref] [PubMed]
  36. Harrison MJ, Githinji L, Butters C, et al. Add in a virus: four cases of severe Kawasaki disease and concurrent adenovirus infection. BMC Pediatr 2025;25:707. [Crossref] [PubMed]
  37. Bhardwaj P, Raigond B, Raigond P, et al. Antiviral activity of ribosome inactivating proteins for management of plant viral infection. Virology 2025;603:110403. [Crossref] [PubMed]
  38. Akiyama Y, Ivanov P. Oxidative Stress, Transfer RNA Metabolism, and Protein Synthesis. Antioxid Redox Signal 2024;40:715-35. [Crossref] [PubMed]
  39. Chukwurah E, Farabaugh KT, Guan BJ, et al. A tale of two proteins: PACT and PKR and their roles in inflammation. FEBS J 2021;288:6365-91. [Crossref] [PubMed]
  40. Ma K, Zhang Y, Zhao J, et al. Endoplasmic reticulum stress: bridging inflammation and obesity-associated adipose tissue. Front Immunol 2024;15:1381227. [Crossref] [PubMed]
  41. Xiong Y, Xu J, Zhang D, et al. MicroRNAs in Kawasaki disease: An update on diagnosis, therapy and monitoring. Front Immunol 2022;13:1016575. [Crossref] [PubMed]
Cite this article as: Chen J, Zhang H. Construction and validation of a diagnostic model for Kawasaki disease based on neutrophil-related genes and analysis of immune infiltration. Transl Pediatr 2026;15(5):185. doi: 10.21037/tp-2026-1-0031

Download Citation