Transcriptomics and machine learning predict diagnosis and severity of growth hormone deficiency
Philip G. Murray, Adam Stevens, Chiara De Leonibus, Ekaterina Koledova, Pierre Chatelain, Peter E. Clayton
Clinical Research and Public Health Endocrinology

Abstract

BACKGROUND. The impact of gene expression data on diagnosis remains limited. Here, we show how diagnosis and classification of growth hormone deficiency (GHD) can be achieved from a single blood sample using a combination of transcriptomics and random forest analysis.

METHODS. Prepubertal treatment-naive children with GHD (n = 98) were enrolled from the PREDICT study, and controls (n = 26) were acquired from online data sets. Whole blood gene expression was correlated with peak growth hormone (GH) using rank regression, and a random forest algorithm was tested for prediction of the presence of GHD and for classification of GHD as severe (peak GH <4 μg/l) or nonsevere (peak GH ≥4 μg/l). Performance was assessed using area under the receiver operating characteristic curve (AUC-ROC).

RESULTS. Rank regression identified 347 probe sets in which gene expression correlated with peak GH concentrations (r = ± 0.28, P < 0.01). These 347 probe sets yielded an AUC-ROC of 0.95 for prediction of GHD status versus controls and an AUC-ROC of 0.93 for prediction of GHD severity.

CONCLUSION. This study demonstrates highly accurate diagnosis and disease classification for GHD using a combination of transcriptomics and random forest analysis.

TRIAL REGISTRATION. NCT00256126 and NCT00699855.

FUNDING. Merck and the National Institute for Health Research (CL-2012-06-005).
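The severity cutoff and the AUC-ROC metric named in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the study's pipeline: the random forest itself is not reproduced, the peak GH values and classifier scores are invented toy data, and the function names `classify_ghd` and `auc_roc` are hypothetical. The AUC is computed with the rank-sum (Mann-Whitney) identity, one standard way to obtain it.

```python
# Toy illustration only: all values below are invented, not study data.
SEVERE_CUTOFF = 4.0  # ug/l; severe GHD is peak GH < 4 ug/l (from the abstract)


def classify_ghd(peak_gh):
    """Label a peak GH value (ug/l) as 'severe' or 'nonsevere' GHD."""
    return "severe" if peak_gh < SEVERE_CUTOFF else "nonsevere"


def auc_roc(labels, scores):
    """AUC-ROC via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier output, higher means more likely positive.
    Returns the fraction of (positive, negative) pairs that the scores
    rank correctly, counting ties as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical peak GH values (ug/l) and hypothetical scores for "severe".
peak_gh = [1.2, 3.5, 3.9, 4.0, 6.8, 9.1]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

labels = [1 if classify_ghd(v) == "severe" else 0 for v in peak_gh]
auc = auc_roc(labels, scores)
print(labels, round(auc, 3))
```

An AUC of 1.0 would mean every severe case scored above every nonsevere case; 0.5 corresponds to chance-level ranking.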


Figure 1

Heatmap of gene expression for those probe sets whose expression correlated with peak GH levels.

(A) Normal children (n = 26) were combined with GHD patients (n = 98). Rank regression analysis was adjusted for sex and age as covariates, and clusters of similar gene expression were identified using the Euclidean metric and marked using a dendrogram and white boxes (347 probe sets, 271 unique genes). The distinction between normal and GHD subjects is marked by the break in the heatmap; GHD is defined by a cutoff level of 10 μg/l growth hormone, as measured by provocation testing. The vertical white line demarcates the point of inflexion for gene expression at a peak GH level of 4.75 μg/l, while the horizontal white line demarcates those probe sets positively and negatively associated with peak GH levels (< or >4.75 μg/l). (B) Two-way cluster analysis of gene expression in GHD and control subjects. Four distinct clusters of GHD subgroups can be seen from the dendrogram on the horizontal axis, derived via a Euclidean metric. A large number of subjects, however, could not be classified (right of white line). This group contained all but 1 of the normal control subjects and 20 GHD subjects.
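Clustering on a Euclidean metric, as mentioned in the legend, can be sketched as a small agglomerative procedure. The legend does not state the linkage rule, so average linkage is an assumption here; the expression profiles are invented toy values, and the names `euclidean` and `cluster` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster(profiles, n_clusters):
    """Agglomerative clustering, stopping at n_clusters groups.

    Average linkage is assumed (the legend does not name a linkage):
    the distance between two clusters is the mean pairwise Euclidean
    distance between their members. Returns lists of profile indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair; j > i, so deleting j leaves i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Invented toy profiles: one row per subject, one column per probe set.
profiles = [
    [0.1, 0.2, 0.1],
    [0.2, 0.1, 0.0],
    [2.0, 2.1, 1.9],
    [2.1, 1.9, 2.0],
    [-1.5, -1.4, -1.6],
]
groups = sorted(cluster(profiles, 3))
print(groups)
```

The order in which pairs merge is what a dendrogram draws; cutting the merge sequence at a chosen number of groups gives subgroups like those described in panel B.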

Copyright © 2026 American Society for Clinical Investigation
ISSN 2379-3708
