First published April 11, 2019
Extracellular mRNAs (ex-mRNAs) potentially supersede extracellular miRNAs (ex-miRNAs) and other RNA classes as biomarkers. We performed conventional small-RNA-sequencing (sRNA-seq) and sRNA-seq with T4 polynucleotide kinase (PNK) end-treatment of total exRNA isolated from serum and platelet-poor EDTA, ACD, and heparin plasma to study the effect on ex-mRNA capture. Compared to conventional sRNA-seq, PNK treatment increased the detection of informative ex-mRNA reads up to 50-fold. The exRNA pool was dominated by hematopoietic cells and platelets, with additional contribution from the liver. About 60% of the 15- to 42-nt reads originated from the coding sequences, in a pattern reminiscent of ribosome profiling. Blood sample type had a considerable influence on the exRNA profile. On average, approximately 350 to 1,100 distinct ex-mRNA transcripts were detected, depending on plasma type. In serum, additional transcripts from neutrophils and hematopoietic cells increased this number to nearly 2,300. EDTA and ACD plasma showed a destabilizing effect on ex-mRNA and non-coding RNA ribonucleoprotein complexes compared to other plasma types. In a proof-of-concept study, we investigated differences between the exRNA profiles of patients with acute coronary syndrome (ACS) and healthy controls. The improved tissue resolution of ex-mRNAs after PNK treatment enabled us to detect a neutrophil signature in ACS that escaped detection by ex-miRNA analysis.
View Supplemental Data 1. Master table in Excel spreadsheet format containing individual level sample information and read annotation as per initial annotation settings (reads 12-nt and longer, up to two mismatches, multi-mapping allowed) shown as read counts and percentages.
View Supplemental Data 2. Effect of mismatches and multi-mapping on the percentage of reads 12-nt and longer annotated as mRNAs using initial mapping/annotation settings (up to two mismatches to reference, multi-mapping allowed).
View Supplemental Data 3. Excel spreadsheet containing the results of the differential analysis of ex-miRNAs comparing the 4 different sample types used in libraries 1 to 4. Shown are raw and Benjamini-Hochberg corrected (FDR) P values for the overall/ANOVA-like analysis and Benjamini-Hochberg corrected (FDR) P values of the pairwise comparisons.
View Supplemental Data 4. Excel spreadsheet containing the results of the differential analysis of ex-mRNAs comparing the 4 different sample types used in libraries 1 to 4. Only ex-mRNA fragments without mismatch, 15-nt or longer, and mapping to at most two transcripts of the transcriptome reference were allowed for this analysis. Shown are raw and Benjamini-Hochberg corrected (FDR) P values for the overall/ANOVA-like analysis and Benjamini-Hochberg corrected (FDR) P values of the pairwise comparisons.
View Supplemental Data 5. Excel spreadsheet containing the results of the gene set analysis of mRNAs comparing the 4 different sample types used in libraries 1 to 4. C2 gene sets (from the MSigDB Collections of the Broad Institute) containing the terms “ribosome”, “translation”, or “inflammation” were used as input. A self-contained test as implemented in the mroast function of the Bioconductor package edgeR was used. The mRNA input count matrix and the design model were identical to the differential analysis (Dataset 3).
View Supplemental Data 6. Excel spreadsheet containing the tissue and cell RNA-seq samples used to calculate the tissue-specific score (TSS). Details about the tissue and cells samples are given in sheet “README”, and the gene expression matrix (transcripts per million, TPM) in sheet “Supplemental Data 6”.
View Supplemental Data 7. Excel spreadsheet containing the 169 mRNA genes (first column) with a tissue-specificity score > 3 detected in circulation in either serum or the plasma samples EDTA, ACD, or heparin. The other columns show the corresponding expression values in TPM (transcripts per million) of these genes in the selected tissues.
View Supplemental Data 8. Percentage of the top 200, 500, and 1,000 cellular transcripts by gene expression in the displayed cells and tissues that were captured as ex-mRNAs in each sample type. A transcript was counted as captured if the mRNA transcript reference collected 5 or more unique reads (15-nt and longer, no mismatch, up to 2 mapping locations) in 3 or more of the 6 donors per sample type.
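The capture criterion described for Supplemental Data 8 can be illustrated with a short sketch. It assumes the reads have already been filtered (15-nt and longer, no mismatch, up to 2 mapping locations) and summarized as unique read counts per donor; the function names, transcript labels, and counts are hypothetical and are not taken from the study.

```python
def is_captured(unique_reads_per_donor, min_reads=5, min_donors=3):
    """Return True if at least `min_donors` donors have `min_reads` or more unique reads."""
    donors_passing = sum(1 for n in unique_reads_per_donor if n >= min_reads)
    return donors_passing >= min_donors


def percent_captured(top_transcripts, counts, min_reads=5, min_donors=3):
    """Percentage of the listed top transcripts that meet the capture criterion."""
    captured = [
        t for t in top_transcripts
        if is_captured(counts.get(t, []), min_reads, min_donors)
    ]
    return 100.0 * len(captured) / len(top_transcripts)


# Hypothetical unique read counts for 6 donors of one sample type.
counts = {
    "TX_A": [7, 5, 0, 12, 3, 1],  # 3 donors with >= 5 reads: captured
    "TX_B": [9, 4, 4, 2, 0, 0],   # 1 donor with >= 5 reads: not captured
    "TX_C": [5, 5, 5, 5, 5, 5],   # 6 donors with >= 5 reads: captured
}
top = ["TX_A", "TX_B", "TX_C", "TX_D"]  # TX_D has no reads at all

print(percent_captured(top, counts))  # 50.0
```

Passing the top 200, 500, or 1,000 transcripts of a tissue as `top_transcripts` would give one percentage per list size and sample type.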
View Supplemental Data 9. ACS and control group demographics. Basic demographics and blood parameters of 6 patients with acute coronary syndrome (ACS) and 10 matched controls used in libraries 5 (untreated RNA) and 6 (PNK-treated RNA). All 16 donors were male. P values were calculated using the Wilcoxon rank sum test.
View Supplemental Data 10. Excel spreadsheet containing the individual level clinical chemistry and laboratory data for the six ACS patients and the 10 matched controls whose RNA was processed in sRNA-seq libraries 5 (untreated) and 6 (PNK-treated).
View Supplemental Data 12. Excel spreadsheet containing the results of the differential analysis of ex-mRNAs (library 6 only) comparing the 6 ACS patients with age-matched controls. Only mRNA fragments without mismatch, 15-nt or longer, and mapping to at most two transcripts of the transcriptome reference were allowed for this analysis. Shown are raw and Benjamini-Hochberg corrected (FDR) P values for the overall/ANOVA-like analysis (any difference among the samples?) and Benjamini-Hochberg corrected (FDR) P values of the pairwise comparisons.
View Supplemental Figure 1. exRNA sample characteristics for libraries 1-4. (A) Quantification of total exRNA from 450 µl sample from healthy control donors used for libraries 1-4. Samples were quantified using the Qubit RNA HS Assay as described in the supplemental methods. Only samples with residual sample volume left after sRNA-seq cDNA library preparation were measured. Differences were tested with a linear-fixed-effects model and Tukey’s post-hoc test in the R language. It should be noted that the T4 PNK treatment started with an aliquot of the untreated RNA. Box plots show the median and first and third quartiles (lower and upper hinges). Whiskers extend at most 1.5 × the interquartile range from the hinges; data beyond the whiskers are shown as individual outlier points. (B) Read length distribution of uniquely mapping and multi-mapping mRNA reads across different blood sample types. Only reads without mismatch to the reference transcriptome were included.
View Supplemental Figure 2. Read coverage of ex-tRNAs. (A) Read coverage of the Gly-tRNA isoacceptor from the untreated heparin sample of donor “Control 1” (sample “Control1_heparin_untreated” in Supplemental Data 1) together with the tRNA secondary structure (box). The anticodon is highlighted in yellow and bases protected by the RNA-binding protein ZNF598 based on PAR-CLIP(11) are colored blue. The pileup was scaled to 11 bins using the following formula for each base: (((count/maximum count) ** 1/4) + 0.5)/0.1. (B) The 10 most frequently sequenced reads (corresponding to the 5’-end) with their read counts.
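The scaling formula in panel A can be read in more than one way. The sketch below assumes that the exponent is 1/4 (a fourth root) and that the result is rounded to the nearest integer, which maps relative coverage onto the 11 integer levels 5 to 15. Neither detail is confirmed by the caption, and the function name and input counts are hypothetical.

```python
def scale_coverage(counts):
    """Scale per-base read counts to integer levels 5..15 (11 bins).

    Assumption: the caption's "** 1/4" means a fourth root, and the
    result is rounded to the nearest integer.
    """
    max_count = max(counts)
    if max_count == 0:
        # No coverage anywhere: every base falls in the lowest bin.
        return [5 for _ in counts]
    return [round(((c / max_count) ** 0.25 + 0.5) / 0.1) for c in counts]


# Hypothetical per-base counts, chosen so the fourth roots are easy to check.
pileup = [0, 1, 16, 81, 256, 10000]
print(scale_coverage(pileup))  # [5, 6, 7, 8, 9, 15]
```

The fourth root compresses the dynamic range, so bases with low coverage remain visible next to the most abundant positions.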
View Supplemental Figure 3. Differences in the ex-miRNA and ex-mRNA profile and read coverage by blood sample type. (A) Differences in abundance of cell-free, circulating miRNAs (top row) and mRNAs (bottom row); see Supplemental Data 14 for hypothesis tests. (B) t-SNE plot based on the exRNA profiles of all RNA categories of the four different sample types from six healthy donors (perplexity 4). (C) Read coverage for the small nuclear RNAs (snRNAs) U1 (left) and U2 (right) in the different sample types. The rectangle in the left sub-figure indicates the loss of coverage of a U1 region in EDTA and ACD plasma. The bar (core) on the left (U1) and right (U2) indicates the core region resistant to nuclease digestion.
View Supplemental Figure 4. Capture of top expressed transcripts from selected solid tissues in different sample types. The 1,000 most abundant nuclear mRNA transcripts from the selected cell types and tissues from bulk RNA-seq that collected 5 unique or 10 total ex-mRNA reads in at least 3 of the 6 donors per sample type were considered captured. The x out of 1,000 captured transcripts (x axis) were ordered in descending order by the tissue specificity score (TSS, y axis). Transcripts with a TSS greater than 3 were highlighted in red and listed space permitting. Note that bulk sequencing data of heterogeneous and well-perfused tissues like lung will contain hematopoietic or erythropoietic/hematopoietic-specific transcripts within the top 1,000 genes of the sample. It should also be noted that most reads for, e.g., MYBPC3 (left lower panel) and all reads for MIOX (kidney panels) were shorter than 17-nt, with a high likelihood of misassignment.
View Supplemental Figure 5. Treatment of total extracellular RNA with T4 polynucleotide kinase (T4 PNK) followed by small RNA-sequencing (sRNA-seq) in a pilot cohort of chest pain patients (ACS) and matched controls. This figure is analogous to Figure 1B, C and Supplemental Figure 1, showing basic read and annotation characteristics (A, B) as well as the effect of PNK end-treatment on mRNA capture (C) for libraries 5 and 6, and the library preparation strategy for libraries 5 and 6. Matching untreated and treated RNA samples from each donor were ligated to the same barcoded 3’-adapter. See Supplemental Data 1 for details on all sub-samples and libraries. (A) Differences in read annotation for endogenous RNA classes for untreated RNA and PNK-treated RNA using initial annotation settings (up to 2 mismatches, unlimited multi-mapping). (B) Read length distribution for reads annotated as mRNAs (nuclear and mitochondrial) with 0, 1, and 2 mismatches to the reference transcriptome; multi-mapping reads and reads mapping to only one transcript (uniquely mapping) are shown in different colors as indicated. Note that for the final mRNA downstream analysis only mRNA reads without mismatch (i.e. 0 mismatch), maximum 2 mapping locations, and 15-nt or longer were considered. (C) Differences in nuclear-encoded mRNA capture between untreated and PNK-treated RNA using strict (final) annotation criteria (no mismatch and up to two mapping locations).
View Supplemental Figure 6. Unsupervised hierarchical clustering of RNA spike-in calibrators. Shown are spike-in calibrator RNA profiles of samples from libraries 5 and 6. The calibrator RNA (cocktail 2) was added at the 3’-adapter ligation step as described. Columns and rows were clustered using complete linkage and Euclidean distance (row dendrogram was removed). Clustering was done in the R language using the aheatmap package.
View Supplemental Figure 7. Changes of the top 9 (by FDR) ex-mRNAs higher in ACS patients (n = 6) than controls (n = 10; library 6, PNK-treated). Box plots show the median and first and third quartiles (lower and upper hinges). Whiskers extend at most 1.5 × the interquartile range from the hinges; data beyond the whiskers are shown as individual outlier points.
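The box plot convention used in these captions can be sketched as follows. This is a minimal illustration that assumes the hinges are the first and third quartiles computed by linear interpolation; plotting software may compute hinges slightly differently, and the function name and values are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Median, hinges, whisker ends, and outliers for one box plot."""
    # Assumption: hinges are quartiles from inclusive linear interpolation.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still within the limits.
    inside = [v for v in values if lower_limit <= v <= upper_limit]
    outliers = [v for v in values if v < lower_limit or v > upper_limit]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical values with one clear outlier.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(stats["whisker_low"], stats["whisker_high"], stats["outliers"])  # 1 8 [30]
```

Here the interquartile range is 4 (hinges at 3 and 7), so the upper whisker may reach at most 13 and stops at 8, the largest value within that limit; 30 is drawn as an individual point.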