Clinico-histopathologic and single-nuclei RNA-sequencing insights into cardiac injury and microthrombi in critical COVID-19

Acute cardiac injury is prevalent in critical COVID-19 and associated with increased mortality. Its etiology remains debated, as initially presumed causes — myocarditis and cardiac necrosis — have proved uncommon. To elucidate the pathophysiology of COVID-19–associated cardiac injury, we conducted a prospective study of the first 69 consecutive COVID-19 decedents at CUIMC in New York City. Of 6 acute cardiac histopathologic features, presence of microthrombi was the most commonly detected among our cohort. We tested associations of cardiac microthrombi with biomarkers of inflammation, cardiac injury, and fibrinolysis and with in-hospital antiplatelet therapy, therapeutic anticoagulation, and corticosteroid treatment, while adjusting for multiple clinical factors, including COVID-19 therapies. Higher peak erythrocyte sedimentation rate and C-reactive protein were independently associated with increased odds of microthrombi, supporting an immunothrombotic etiology. Using single-nuclei RNA-sequencing analysis on 3 patients with and 4 patients without cardiac microthrombi, we discovered an enrichment of prothrombotic/antifibrinolytic, extracellular matrix remodeling, and immune-potentiating signaling among cardiac fibroblasts in microthrombi-positive, relative to microthrombi-negative, COVID-19 hearts. Non–COVID-19, nonfailing hearts were used as reference controls. Our study identifies a specific transcriptomic signature in cardiac fibroblasts as a salient feature of microthrombi-positive COVID-19 hearts. Our findings warrant further mechanistic study as cardiac fibroblasts may represent a potential therapeutic target for COVID-19–associated cardiac microthrombi.


Low Post-Mortem Interval (PMI) Initiative
During the initial COVID-19 surge in New York City, the Columbia University Irving Medical Center (CUIMC) Departments of Pathology and Medicine implemented a multidisciplinary initiative to lower post-mortem interval (PMI). Clinicians directly involved in the care of decedents obtained autopsy consent from next-of-kin at the time of death notification or shortly thereafter. Witnessed verbal consent was approved and accepted by New York Presbyterian-CUIMC since all hospital visitations were prohibited during this period.
Documentation of informed autopsy consent was protocolized for the electronic medical record; a copy of the document was sent to the decedents' next-of-kin by electronic or postal mail.
Timely transfer to the autopsy suite was coordinated immediately upon attainment of autopsy consent. Infographics, cloud-based resources, and multiple daily HIPAA-compliant text communications to frontline clinicians were created and used to support the low PMI initiative.

Measurement of ventricular SARS-CoV-2 viral load
Total RNA was extracted from formalin-fixed paraffin-embedded ( For each RT-qPCR reaction, we used 5 μL of the total 75 μL of extracted RNA per ventricular tissue sample. Hence, the calculated viral load of each N2 RT-qPCR reaction was multiplied by 15 to yield the ventricular viral load of the sample. To account for potential differences in ventricular tissue sample sizes, ventricular viral loads were normalized to RP expression using the ∆ cycle threshold (Ct) method. Specifically, a normalization ratio for RP expression was calculated as 2 -∆Ct , whereby ∆Ct was the difference between each sample's RP Ct and the mean RP Ct of all samples. Thus, the normalized viral load for each ventricular tissue sample was calculated by dividing the raw ventricular viral load by the normalization ratio.

RT-qPCR of canonical genes of fibroblast activation
Total RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue samples of the RV of all COVID-19 hearts for which PMI was less than 24 hours. Total RNA was extracted using the Quick-RNA FFPE Miniprep Kit (Zymo Research, Irvine, CA) according to the manufacturer's instructions. Only those samples with detectable housekeeping genes GAPDH and RPS13 were included in summary analysis.

Data Processing of single nuclei RNA sequences
Raw base call sequencing files were de-multiplexed; FASTQ files were generated using the 10x Genomics CellRanger 4.0.0 in the Cumulus workflow (2). To remove homopolymers (A30, T30, G30, and C30) and the template switch oligo sequence (CCCATGTACTCTGCGTTGATACCACTGCTT and its complement AAGCAGTGGTATCAACGCAGA GTACATGGG), reads were trimmed using Cutadapt v2.8 (3) with default parameters. Trimmed reads were aligned to the GRCh38 pre-mRNA human reference with SARS-CoV2 annotations (NC_045512). Count matrices were generated using CellRanger Count v12.
We inspected each sample for mapping quality based on the number of mapped reads per nucleus, percentage of mapped mitochondrial reads (%mitochondrial reads), and the shape of the unique molecular identifier (UMI) decay curve. One sample (SID 69-RV) demonstrated high levels of mitochondrial reads and was removed from further analyses. The remaining samples were filtered using CellBender v2.0 (4) with default settings to remove the ambient RNA byproducts of nuclear isolation. From this, 108,016 nuclei remained.
Quality control was performed on the individual sample level. For each droplet, we calculated the ratio of reads mapping to exonic regions to total mapped reads using Scrinvex v13 (https://github.com/getzlab/scrinvex). Droplets with an exon ratio greater than the 75 th percentile + IQR range were removed from downstream analyses due to increased cytoplasmic transcripts (3,182 nuclei). We also excluded droplets which contained more than one nucleus as identified by Scrublet (5) (10,866 nuclei). Samples were also filtered to remove nuclei with reads mapped to less than 200 genes (11,280 nuclei) and nuclei with greater than 5% mitochondrial reads (9,825 nuclei).
The gene list was filtered for highly variable genes (minimum mean 0.0125, maximum mean 3, minimum dispersion 0.5), using a subset of 6,255 genes for graph-based clustering. To account for variable complexity per nucleus, counts were normalized to 10,000 unique molecules per nucleus and logarithmized. Total read count and % mitochondrial reads were regressed out (Scanpy preprocessing regress_out); data was scaled to a maximum value of 10.
Ischemic time for non-COVID-19 reference controls (6) was unavailable. Hence, we could not account for effects of ischemic time; results relating to hypoxic signaling should be interpreted with caution.

Marker gene and cell type identification
Genes were ranked using a Wilcoxon rank sum test (scanpy rank_genes_groups) for each cell-type cluster versus all others; and log2-fold change (FC) and percentage of nuclei expressing each gene were calculated. Area under the receiver operator curve (AUC) scores were calculated for all genes within each cluster using SciKit Learn roc_auc_score. Genes were considered markers of a given cluster with an AUC score >0.7 or a log2-FC >0.6.

Compositional analyses
The proportion of each cell type was compared across samples using scCODA version 0. . Credible interval differences in cell type proportions were determined using spikeand-slab inclusion probabilities. Importantly, the proportion of a given cell type in a sample is governed by our ability to liberate the nuclei equivalently from the tissue and our ability to successfully identify cells compared to empty droplets. The former may be affected by cell death or increased fibrosis which our sectioning protocol is designed to mitigate. The latter is a more challenging problem for single nucleus RNA sequencing, which is influenced by the relative transcriptional complexity of various cell types, making transcript rich cell types such as cardiomyocytes and fibroblasts easy to identify with transcript poor cells such as immune cells more apt to be assigned as an empty droplet. Use of a probabilistic cell calling mechanism in CellBender(4) is used to overcome this challenge.

Differential expression testing
Differentially expressed genes (DEGs) were calculated for each major cell cluster separately using the MAST(9) pipeline. We constructed a Hurdle model (zlm(~condition + ngenes + (1 | sample_ID), sca,method='glmer', ebayes = FALSE, strictConvergence = FALSE)) with the normalized reads based on the cellular detection rate, the set condition (either donor control versus COVID-19, or COVID-19 microthrombi-positive versus COVID-19 microthrombinegative), and biological individual. Only genes with non-zero expression in at least 15% of all nuclei were included. DEGs were identified based on an Benjamini-Hochberg false discovery rate (FDR) adjusted P-value <0.05. Importantly, each droplet is contaminated by ambient RNA, which is imperfectly removed informatically by design. Therefore, DE results should be interpreted with caution, particularly when examining non-major cell types (e.g., intracardiac neurons, lymphatic endothelial, mast cells) and genes highly expressed in those as numerous as cardiomyocytes. Genes which serve as markers of another cluster (AUC > 0.7) were blacklisted to exclude potential differential contamination by ambient RNA as a driver of such effect. Gene lists were also compared to the secreted protein list obtained from the Human Protein Atlas (proteinatlas.org). Full lists of DE genes, including those which were blacklisted, are contained in Supplemental Table ST3.

Reactome pathway enrichment
For each major cell type, we performed gene pathway enrichment using Reactome (10 Table   ST11.

Cell-Cell Communication
Cell-cell communication was tested with CellphoneDB version 2.1.7 (12) on each sample separately using normalized count data for the 9 largest cell types. Afterward, significant interactions were aggregated between microthrombi-positive and microthrombi-negative samples. Briefly, CellphoneDB identifies and compares ligand-receptor interaction pairs between cell types and compares the observed interactions to the expected interactions of a null distribution generated from randomly permuted cell labels. Default parameters were used for the analysis (10% threshold for cells expressing ligands and receptors, p-value = 0.05, 1000 iterations for generation of null distribution, curated interactions list compiled by CellphoneDB from UniProt, Ensembl, PDB, IMEx consortium, and IUPHAR).

Regulator Genes
To identify regulator genes that function within and likely drive gene networks within a biological context, we analyzed DEGs with a log 2-FC >0.5 for each cell type using GeneWalk(13) (direction of differential expression was not incorporated). GeneWalk builds biologically relevant networks from provided gene lists, connecting genes and GO terms, and compares the network to random networks. GeneWalk was used with default parameters and an FDR-corrected P-value of 0.1. The identified regulator genes were compared to a list of druggable genes. (14) The full list of regulator genes identified is available in Supplemental