Interpretable machine learning uncovers epithelial transcriptional rewiring and a role for Gelsolin in COPD
Justin Sui, Hanxi Xiao, Ugonna Mbaekwe, Nai-Chun Ting, Kaley Murday, Qianjiang Hu, Alyssa D. Gregory, Theodore S. Kapellos, Ali Öender Yildirim, Melanie Königshoff, Yingze Zhang, Frank Sciurba, Jishnu Das, Corrine R. Kliment
Research Article Cell biology Pulmonology

Abstract

Transcriptomic analyses have advanced the understanding of complex disease pathophysiology, including that of chronic obstructive pulmonary disease (COPD). However, identifying relevant causative biological factors has been limited by the difficulty of integrating high-dimensional data. COPD is characterized by lung destruction and inflammation, with smoke exposure being a major risk factor. To define previously unknown biological mechanisms in COPD, we utilized unsupervised and supervised interpretable machine learning analyses of single-cell RNA-Seq data from the mouse smoke-exposure model to identify significant latent factors (context-specific coexpression modules) impacting pathophysiology. The machine learning transcriptomic signatures, coupled to protein networks, uncovered a reduction in network complexity and new biological alterations in actin-associated gelsolin (GSN), which was transcriptionally linked to disease state. GSN was altered in airway epithelial cells in the mouse model and in human COPD. GSN was increased in plasma from patients with COPD, and smoke exposure resulted in enhanced GSN release from airway cells from patients with COPD. This method provides insights into the rewiring of transcriptional networks associated with COPD pathogenesis and provides a translational analytical platform for other diseases.

Figure 5

A Gsn-centric latent factor gene set can predict the smoke-exposed group within other epithelial cell types.

scRNA-Seq data from mice exposed to 6 months of air or CS (n = 3 per group; 5,000 cells per mouse) were analyzed using SLIDE for dataset B for AT2 cells. (A) Standalone significant latent (marginal) factors for AT2 cells are in teal and interacting latent factors are in red. Genes comprising each latent factor for the AT2 cells are reported in the table and network connectivity maps, with genes that characterized CS exposure in red/squares and air in blue/circles. Of note, no air-associated latent factors (blue/circles) were present in this analysis. Cross-prediction analysis was completed between datasets A and B to determine whether the CS treatment group could be identified. Area under the curve (AUC) is reported. Statistical comparison by 2-tailed Student’s t test with Mann-Whitney test for data in B–D. P values are noted. (B) AT2 dataset B predicting the CS group from AT2 cells in dataset A, (C) AT2 dataset B predicting the CS group from ciliated cells in dataset B, (D) AT2 dataset A predicting the CS group from ciliated cells in dataset A.
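
For readers unfamiliar with the AUC reported in panels B–D, the following is a minimal sketch of how an AUC can be computed from per-sample prediction scores. It is not the authors' code or the SLIDE implementation. The function name `auc_from_scores` and the score values are hypothetical and chosen only for illustration. The sketch uses the standard rank-based definition: the AUC is the probability that a randomly chosen sample from the positive group (here, CS) receives a higher score than a randomly chosen sample from the negative group (here, air), with ties counted as one half.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Return the AUC for two groups of prediction scores.

    The AUC equals the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; tied pairs contribute 0.5.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for illustration only; these are NOT data from the study.
cs_scores = [0.9, 0.8, 0.6, 0.55]   # assumed model scores for CS-exposed samples
air_scores = [0.7, 0.4, 0.3, 0.2]   # assumed model scores for air-exposed samples

auc = auc_from_scores(cs_scores, air_scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level separation of the two groups, and an AUC of 1.0 corresponds to perfect separation. This pairwise count is the Mann-Whitney U statistic divided by the number of pairs.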

Copyright © 2026 American Society for Clinical Investigation
ISSN 2379-3708
