A single-cell atlas of the myometrium in human parturition

Parturition is a well-orchestrated process characterized by increased uterine contractility, cervical ripening, and activation of the chorioamniotic membranes; yet, the transition from a quiescent to a contractile myometrium heralds the onset of labor. However, the cellular underpinnings of human parturition in the uterine tissues are still poorly understood. Herein, we performed a comprehensive study of the human myometrium during spontaneous term labor using single-cell RNA sequencing (scRNA-Seq). First, we established a single-cell atlas of the human myometrium and unraveled the cell type–specific transcriptomic activity modulated during labor. Major cell types included distinct subsets of smooth muscle cells, monocytes/macrophages, stromal cells, and endothelial cells, all of which communicated and participated in immune (e.g., inflammation) and nonimmune (e.g., contraction) processes associated with labor. Furthermore, integrating scRNA-Seq and microarray data with deconvolution of bulk gene expression highlighted the contribution of smooth muscle cells to labor-associated contractility and inflammatory processes. Last, myometrium-derived single-cell signatures can be quantified in the maternal whole-blood transcriptome throughout pregnancy and are enriched in women in labor, providing a potential means of noninvasively monitoring pregnancy and its complications. Together, our findings provide insights into the contributions of specific myometrial cell types to the biological processes that take place during term parturition.

. Clinical and demographic characteristics of the study population.
Supplemental Table 2. Summary of cell populations identified in the human myometrium at term pregnancy.

Provided as separate files:
Supplemental Table 3. List of marker genes used for pathway analysis based on Gene Ontology (GO) terms to differentiate Smooth Muscle Cell (SMC) subsets.
Supplemental Table 4. List of marker genes used for pathway analysis based on Gene Ontology (GO) terms to differentiate stromal cell subsets.
Supplemental Table 5. List of marker genes used for pathway analysis based on Gene Ontology (GO) terms to differentiate macrophage subsets.
Supplemental Table 6. List and summary of labor-associated differentially expressed genes (DEGs) in human myometrial cell types. Supplemental Table 7. Summary of genes and pathways used in the intercellular communication analysis using CellChat.
Supplemental Table 8. Summary of the top 20 genes used to generate signature analysis plots in the comparative analysis between maternal peripheral blood collected throughout gestation and myometrial tissues collected at term. Supplemental Table 9. Summary of quality control metrics calculated for each library and myometrial sample from the 10x Genomics Cell Ranger pipeline.
Supplemental Table 10. List of marker genes for the identification of cell types in the human myometrium.
Supplemental Table 11. Labor-associated differentially expressed genes in cell types that can be detected in the maternal peripheral blood. 1

Study design
This prospective cross-sectional study included women that underwent cesarean section at term (≥ 37 weeks of gestation) and were divided into the following groups: term not in labor (N = 13) and term in spontaneous labor (N = 11). The demographic and clinical characteristics of the study groups are shown in Supplemental Table 1

Sample Collection
Immediately after the delivery of the placenta, myometrial biopsies were obtained from the lower uterine segment at the time of cesarean section. Biopsies were obtained from the midpoint of the superior edge of the uterine incision and transported to the laboratory in Dulbecco's Modified Eagle Medium (DMEM; Life Technologies Corporation, Grand Island, NY, USA) for single-cell dissociation and formalin fixation/paraffin embedding for further histological characterization. Additionally, a fraction of the myometrial biopsy and samples from the umbilical cord tissues were obtained, snap-frozen in liquid nitrogen, and kept at -80°C for maternal and fetal genotyping, respectively. Placentas were also collected and processed for histological evaluation.

Smooth muscle actin (SMA) immunohistochemistry
Formalin-fixed paraffin-embedded uterine tissues were cut in 5-μm-thick sections. Slides were deparaffinized in xylene and rehydrated with a series of decreasing ethanol concentrations. Masson's trichrome staining 3 After deparaffinization and rehydration as described above, 5-μm-thick myometrial tissue sections were stained using the Masson's 2000 Trichrome Stain Kit (Cat. No. KTMTR2; American MasterTech, Lodi, CA), according to the manufacturer's protocol. Following staining, the sections were dehydrated with increasing concentrations of ethanol and then a coverslip was placed.
Brightfield images were taken using the Vectra Polaris Multispectral Imaging System.

Multiplex immunofluorescence
Five-µm-thick sections of formalin-fixed, paraffin embedded myometrial tissues were cut and mounted on microscope slides. Using OPAL Multiplex 7-color IHC kit (Cat. no. NEL811001KT; Akoya Biosciences), myometrial tissue slides were stained according to manufacturer's instructions. Prior to multiplex immunofluorescence staining, each analyte was individually optimized by single-plex antibody staining combined with different fluorescent TSA reagents (Akoya Biosciences). Briefly, after slides were deparaffinized and rehydrated, antigen retrieval (AR) was performed using AR buffer and boiling using a microwave oven. The slides were then incubated in blocking buffer to eliminate non-specific binding prior to incubation with selected primary antibody at room temperature. Next, the slides were rinsed in TBST prior to incubation with anti-mouse and anti-rabbit secondary antibody-HRP conjugate followed by were incubated with DAPI (4′,6-diamidino-2-phenylindole) as nuclear counterstain and mounted using AquaSlip™ Aqueous Permanent Mounting Medium (American MasterTech). Depending on the host species of the primary antibody, mouse or rabbit isotype (FLEX Universal Negative Control, both from Agilent) was used as negative control. Multiplex staining was performed by consecutively staining slide-mounted tissues using the same antibody concentrations and conditions validated through single-plex staining. Immunofluorescence images were taken using the Vectra Polaris Multispectral Imaging System.

Placental histopathological examination
Placentas were obtained at the time of cesarean section and examined histologically by perinatal pathologists blinded to clinical diagnoses and obstetrical outcomes, according to standardized Perinatology Research Branch protocols (1). Briefly, three to nine sections of the placenta were examined, and at least one full-thickness section was taken from the center of the placenta; others were taken randomly from the placental disc. Acute inflammatory lesions of the placenta were diagnosed according to established criteria (2)(3)(4), as shown in Supplemental Table 1.

Myometrial tissue dissociation
Immediately following myometrial biopsy collection, tissues were mechanically and enzymatically homogenized to prepare single-cell suspensions, modified from previously described protocols (5)(6)(7). Briefly, myometrial tissues were minced into small pieces until a fine consistency is achieved. Then, the tissues were enzymatically digested using the Umbilical Cord

Cytospin preparations from myometrial cell suspensions
Myometrial tissue biopsies were dissociated as described above. Using a Shandon Cytospin

Single-cell GEM generation and library construction
Single-cell RNA sequencing library preparation was performed on viable cells using the

Genotyping analysis
Genotype information was converted to vcf format using "iaap-cli gencall" and "gtc_to_vcf.py" from Illumina. Imputation to 37.5 M variants using the 1000 Genomes haplotype references was performed using the University of Michigan Imputation Server (https://imputationserver.sph.umich.edu/). The maternal/fetal relationship of the genotyped samples was verified using plink2 KING-robust kinship analysis (8). The vcf files were then filtered for high quality imputation and coverage for at least ten scRNA-seq transcripts using bcftools before they were used for demultiplexing.

scRNA-seq data normalization and pre-processing
Sequencing data were processed using Cell Ranger version 4.0.0 (10x Genomics). The fastq files were then aligned using kallisto (9), and bustools (10) was used to summarize the cell/gene transcript counts in a matrix for each sample. In parallel, "cellranger counts" was also used to align the scRNA-seq reads by using the STAR aligner (11) to produce the bam files necessary for demultiplexing the individual of origin based on genotype information using 8 souporcell (12) and demuxlet (13). The following quality metrics were calculated for each library: the average number of unique molecular identifiers (UMIs), the average number of detected genes, cell count, the average number of reads per cell, the fraction of reads in cells, the percentage of reads mapping to the mitochondrial genome, and valid barcodes across the prepared libraries (Supplemental Table 9). All libraries had excellent quality based on 10X Genomics recommendations. After demultiplexing, we removed any droplet/Gel Bead-in-Emulsion (GEM) barcode that was assigned to doublet or ambiguous cells in demuxlet or souporcell, and only those cells that could be assigned to a pregnancy case were kept. Additionally, we further filtered any cell with less than 100 genes or more than 10,000 genes detected or that had > 25% mitochondrial reads. All count data matrices were then normalized and combined using the Seurat package in R

Cell type annotation
SingleR (19) package in R version 1.4.1 was used to identify cell types based on their similarities to reference datasets (7,20,21) with known labels. SingleR annotates single cells from query data by computing the Spearman coefficient between the single-cell gene expression data and samples from the reference dataset. The correlation is measured only based on the variable genes in the reference dataset. The multiple correlation coefficients per cell type are combined 9 according to the cell type labels of the reference dataset to assign a score per cell type in the singlecell data. In addition to these previously labeled datasets, a publicly available dataset of 2,700 Peripheral Blood Mononuclear Cells (PBMCs) generated by 10x Genomics (22) was used to annotate the immune single cells. An anchor-based supervised mapping workflow was used to integrate reference and query single-cell samples and annotate query cells based on the shared biological states using the FindTransferAnchors and MapQuery functions from Seurat version 4.0.1. More specifically, the query data were first normalized and scaled using the SCTransform function. Then, the anchors between the query and reference data were found by a supervised principal component analysis (sPCA) (23). sPCA identifies the set of principal components that maximizes the dependency between the reference and query data, where the structure of the multimodal dataset is best captured based on the Hilbert-Schmidt Independence Criterion measure.
Next, the labels were transferred by a binary classification model built on the reference cell type labels. Finally, the neighbor sets representing the nearest neighbors from the reference to each query cell were identified. The UMAP projection was performed using the neighbor sets and the query data was projected onto the coordinates of the provided reference UMAP. Using different annotations obtained from the supervised reference-mapping workflows (SingleR and Seurat) on four datasets (7,(20)(21)(22), the final cell type labels were assigned based on majority vote. We further confirmed their identity and profile by identifying the top differentially expressed genes (DEGs) and performing functional enrichment analysis (see below). If multiple clusters were assigned to the same consensus cell type, we added a sub-index to that cell type for each different original Seurat cluster: Clusters 0 and 9 were annotated as Stromal-1 and Stromal-2; clusters 12, 17, and 20 were annotated as SMC-1, SMC-2 and SMC-3; clusters 1, 2, 14, and 21 were annotated as Macrophage-1 to Macrophage-4, respectively; and clusters 3 and 15 were annotated as Endothelial-1 and Endothelial-2. All remaining clusters were assigned a unique cell type identifier (Supplemental Table 2).

Differential gene expression for cell type analysis
For this analysis, the differential expression of selected marker genes for each cell type/cluster were identified using the Wilcoxon Rank Sum test and the FindAllMarkers function from Seurat. For this analysis, we first compared each cluster to all cell types (Supplemental Table   10). Additionally, we further analyzed clusters of closely related cell types in three major subgroups: i) smooth muscle cells (Supplemental Table 3 Table   4), and iii) macrophages (Supplemental Table 5). For this analysis, we compared each sub-cluster to the others in each sub-group to focus on the smaller differences between each sub-type. We further used the top 100 cell markers [ranked based on log2(Fold change) and requiring q < 0.1] assigned to each sub-cluster to analyze the enrichment of pathways in sub-clusters by performing an over-representation analysis based on the Gene Ontology (GO) database (Supplemental Figure   1-3).

Differential gene expression in parturition
The identification of labor-associated DEGs between study groups was performed using the DESeq2 R package version 1.32.0 (24). A term for each library was added to the DESeq2 model to correct for technical batch effects (library identifier). For each combination, we only used samples with more than 20 cells; otherwise, it was treated as a non-observed combination. Cell types found in less than 3 samples per study group in each combination were dropped (Supplemental Table 6 shows all of the DEGs). 11

Gene ontology and pathway enrichment analysis of genes affected by parturition
The clusterProfiler in R version 4.0.4 (25) was used to perform Gene Set Enrichment Analysis (GSEA) and Over-Representation Analysis (ORA) based on the Gene Ontology (GO), Kyoto Encyclopedia of Gene and Genomes (KEGG), WikiPathways (26), and Reactome databases. The functions "enrichPathway", "enrichKEGG", and "gseGO" from "clusterProfiler" were used to perform the ORA and GSEA analyses separately for each list of genes obtained as differentially expressed for each cell type. In ORA analyses, the universe of genes for each cell type was the subset that was expressed at a level sufficient to be tested in differential gene expression analysis. When results are combined across cell types, any genes tested (with a calculated p-value) in any of the cell types are used for the universe. Only ORA results that were significant after correction were reported with q < 0.05 being considered statistically significant.
STRING analysis was performed using the STRINGdb version 2.4.1 and ggplot2 version 3.3.5 R packages together with the STRING database (https://string-db.org) (27).

Multivariate Adaptive Shrinkage (MASH) analysis of parturition associated genes.
A meta-analysis of the measured changes between the two study groups across cell types was performed using the mashr R package version 0.2.50 (28). This approach, referred to as MASH analysis, tests and estimates multiple effects (e.g., genes) in many conditions (e.g., cell types) by allowing sparse effects and correlation among non-zero effects in different conditions. This method performs a condition-by-condition analysis to estimate the effect of each gene and its corresponding error in each cell type. Using the estimates from condition-by-condition results, the empirical Bayes procedure learns the sparsity pattern as well as correlation among effects and aggregates the learned patterns to improve the effect estimates and the corresponding measures of 12 significance across the different cell types. The metaplot function from the rmeta R package version 3.0 was used to visualize the result of meta-analysis based on selected labor-associated DEGs.

Cell-cell communication analysis
CellChat (29)   context-specific pathways across different cell groups were identified using a cut-off of 0.7 when visualizing the connection.

Comparison to myometrial bulk transcriptomic studies at term pregnancy
Myometrial microarray bulk data comparing women in labor (N = 19) and not in labor (N = 20) were obtained from a previous study (30), and the log2(Fold change) from this previous study were compared to those obtained in the current study across different cell types. The comparison was visualized with scatter plots using the ggplot2 R package version 3.3.3 and Spearman correlation analysis. Additionally, the ranked lists of DEGs from a previous myometrial study and DEGs detected in myometrial tissues from the current study were used to identify enriched cell types in the given list based on GSEA performed using clusterProfiler in R version 3.18.1.

Deconvolution analysis
Cibersortx (31), a deconvolution analysis method, was used to de-convolute previously collected bulk myometrium transcriptome data (30). Cibersortx employs a linear support vector   (24). The P-values were adjusted using false discovery rate (32). The DEGs were selected based on an adjusted P-value (q) < 0.1. The combined list of DEGs across cell types from the bulk data after deconvolution, and the DEGs from single-cell analysis on the "smooth muscle cell-1" cluster were used to perform ORA using the clusterProfiler in R version 4.0.4 (25).

Comparison to maternal peripheral blood bulk transcriptomic longitudinal data
Longitudinal transcriptomic data from maternal peripheral blood collected throughout gestation (N = 49) were taken from a previous study (33) and used to analyze the gestational agerelated changes in average expression of cell type-specific gene signatures (cell type markers) at the time of the blood draw (only non-labor samples were available for the longitudinal analysis).
The cell type-specific gene signatures were defined using the top 20 DEGs for each cell type (Supplemental Table 8). The expression of cell type-specific gene signatures was averaged based on the gestational age at sampling. The change trend was identified using a linear mixed-effect model fitted based on the quadratic splines. The significant trends were analyzed using linear mixed-effects models for the cell types defined from myometrial scRNA-seq data.

Comparison to maternal peripheral blood bulk transcriptomic studies at term pregnancy
Maternal peripheral blood bulk transcriptomic data from women at term in labor (N = 21) and not in labor (N = 28) were obtained from a previous study (34). The ranked list of DEGs in maternal whole blood and DEGs detected in the myometrial tissues from the current study were used to identify enriched cell types in the given list based on GSEA performed using clusterProfiler 15 in R version 3.18.1. Overlapping DEGs between the single-cell and bulk transcriptomic datasets are shown in Supplemental Table 11.

Statistical analysis of the demographic data
Statistical analyses were performed in SPSS v19.0 (IBM, Armonk, NY, USA) or the R package (as described above). Human demographic data were compared using two-tailed Fisher's exact tests for proportions and Mann-Whitney U-tests for non-normally distributed continuous variables.

Gene Expression
Current data Single-cell analysis Mittal et al. 2010 Bulk analysis

Deconvolution analysis
Single-cell signatures Supplemental Figure 12. Projection of the "SMC-1" signature to infer the underrepresented functions of smooth muscle cell types in labor. (A) Diagram illustrating the deconvolution analysis using signatures from the single-cell dataset of the human myometrium in labor (n = 11, red) and not in labor (n = 13, blue) to obtain differentially expressed genes (DEGs) from inferred cell types in the bulk dataset from myometrial tissues in labor (n = 19, red) and not in labor (n = 20, blue). (B) Visualization of "Myometrial relaxation and contraction pathways" from WikiPathway (https://www.wikipathways.org/index.php/Pathway:WP289) that was significant based on the over-representation analysis of DEGs obtained from deconvolution analysis and single-cell SMC-1 (shown in Fig. 8E). Purple and pink lines depict pathways associated with myometrial relaxation and contraction, respectively. DEGs in bold indicate overlap between WP289 pathway and single cell SMC-1 (shown in Supplementary Figure 13A).