A machine learning–based triage system for systemic EBV-positive T/NK cell lymphoproliferative diseases of childhood
Pujun Guan, Zihang Chen, Hanze Dong, Xia Guo, Juan Huang, Tian Dong, Mi Wang, Xiaoxi Lu, Fei Huang, Wenbin Li, Yuan Tang, Li Zhang, Ling Pan, Ju Gao, Shikun Wang, Rongbo Liu, Wenyan Zhang, Sha Zhao, Weiping Liu
Research Article | Hematology | Infectious disease

Abstract

Systemic Epstein-Barr virus–positive (EBV-positive) T/NK cell lymphoproliferative diseases of childhood (sEBV+T/NK-LPD) are a spectrum of rare diseases that have highly variable biological behavior, from indolent conditions to highly aggressive malignancies. Clinicians currently face substantial challenges in promptly assessing disease severity and predicting patient outcomes, leading to limitations in treatment planning. To address this challenge, we constructed a comprehensive triage system to aid in rapid clinical interventions. The study included 156 patients with newly diagnosed sEBV+T/NK-LPD from 42 institutions. An independent prospective cohort of 35 newly enrolled patients was further included to evaluate the model’s performance. An additional 45 patients from the literature and 18 patients who underwent hematopoietic stem cell transplantation were included to test the score’s generalizability. An integrative machine learning strategy was applied to identify robust and optimal factors and to integrate multiple algorithms to enhance the system’s performance and stability. This system, termed COLLAPSED, identifies critical factors and provides a stable, high-performing ensemble. This model was validated externally and simplified into a risk score to improve interpretability and accessibility. The COLLAPSED system substantially enhances clinicians’ ability to rapidly and precisely identify high-risk patients, thus enabling timely clinical decision-making and expedited initiation of potentially lifesaving treatments.


Figure 1

Overview of study design and COLLAPSED system development.

Suspected cases from 42 institutions were screened and reviewed according to the diagnostic criteria of the WHO classification. Eligible patients were randomly split into a training dataset (n = 94) and an internal validation dataset (n = 62). A total of 58 clinical, imaging, laboratory, and pathological features were collected from which 7 robust predictors were selected through an integrative feature selection strategy. These features were used to develop an ensemble machine learning model (COLLAPSED) for risk prediction, and a simplified COLLAPSED score was derived for clinical risk stratification. Model accuracy, calibration, and risk-stratification performance were evaluated using the internal validation dataset, and further assessed using an external literature cohort (n = 45), an extension cohort (n = 18, patients who received hematopoietic stem cell transplantation [HSCT]), and a prospective cohort (n = 35).
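
The random split described in the legend can be illustrated with a minimal sketch. It reproduces only the reported cohort sizes (156 eligible patients divided into 94 for training and 62 for internal validation). The function name `split_cohort`, the placeholder patient IDs, and the seed are assumptions made for illustration; the legend does not state whether the authors stratified the split or which software they used.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into training and validation lists.

    Hypothetical helper: a plain shuffle-and-cut split, not necessarily
    the procedure used in the study.
    """
    if not 0 <= n_train <= len(patient_ids):
        raise ValueError("n_train must be between 0 and the cohort size")
    rng = random.Random(seed)  # local generator so the split is reproducible
    shuffled = list(patient_ids)  # copy, leaving the caller's list untouched
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 156 eligible patients.
patient_ids = [f"P{i:03d}" for i in range(1, 157)]
train_ids, validation_ids = split_cohort(patient_ids, n_train=94)

print(len(train_ids), len(validation_ids))  # prints "94 62"
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the split reproducible without changing global random state elsewhere in an analysis script.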

Copyright © 2026 American Society for Clinical Investigation
ISSN 2379-3708
