Single-cell landscape analysis unravels molecular programming of the human B cell compartment in chronic GVHD

Alloreactivity can drive autoimmune syndromes. After allogeneic hematopoietic stem cell transplantation (allo-HCT), chronic graft-versus-host disease (cGVHD), a B cell–associated autoimmune-like syndrome, commonly occurs. Because donor-derived B cells continually develop under selective pressure from host alloantigens, aberrant B cell receptor (BCR) activation and IgG production can emerge and contribute to cGVHD pathobiology. To better understand molecular programing of B cells in allo-HCT, we performed scRNA-Seq analysis on high numbers of purified B cells from patients. An unsupervised analysis revealed 10 clusters, distinguishable by signature genes for maturation, activation, and memory. Within the memory B cell compartment, we found striking transcriptional differences in allo-HCT patients compared with healthy or infected individuals, including potentially pathogenic atypical B cells (ABCs) that were expanded in active cGVHD. To identify intrinsic alterations in potentially pathological B cells, we interrogated all clusters for differentially expressed genes (DEGs) in active cGVHD versus patients who never had signs of immune tolerance loss (no cGVHD). Active cGVHD DEGs occurred in both naive and BCR-activated B cell clusters. Remarkably, some DEGs occurred across most clusters, suggesting common molecular programs that may promote B cell plasticity. Our study of human allo-HCT and cGVHD provides understanding of altered B cell memory during chronic alloantigen stimulation.


Comparison of allo-HCT patient single-cell RNA-Seq data to a publicly available dataset on purified B cells from HDs, HIV patients, and malaria-exposed individuals
To compare scRNA data from this study to data published by Holla et al. (3), both datasets were filtered for cell quality using the same thresholds (cells with more than 5% mitochondrial gene counts, counts from fewer than 200 genes, or counts from more than 2,500 genes were excluded).
High-quality cells were then filtered to retain only B cells, with cell type inferred based on signature genes from CellAssign (4) and expression of the union of those signature genes and the top 2000 most variable genes (following variance stabilizing transformation (5)) using the R package scSorter (6). To normalize the datasets, gene counts per cell were transformed to counts per million reads mapped (CPM) using the Seurat R package. To compare gene expression, data from the present study and Holla et al. (3) were integrated following steps recommended by the developers of Seurat. Five cohorts were integrated to account for batch effects, as follows: Holla et al. data (3) were split by sample type (HD vs. HIV vs. Malaria), and allo-HCT data was split by the two sequencing runs. To investigate biologically-relevant genes across studies and sample types, we generated a heatmap with the following attributes: We identified each subset based on a detection threshold of 50 CPM for ITGAX and CD27 expression. To illustrate distribution of gene expression, values were plotted lowest to highest for each gene of interest and within each B cell subset (CD27 + ITGAX -, CD27 -ITGAX + , and CD27 + ITGAX + ) and sample type. Populations with more than 100 B cells were randomly down-sampled to 100 cells. Data analyses and plot generation were carried out in the R statistical environment with packages from the comprehensive R archive network (CRAN; https://cran.r-project.org/) and the Bioconductor project (7). Code used to replicate the analyses and figures followed principles of reproducible analysis, leveraging knitr dynamic reports and git source code management (8).

Dermal skin cell isolation and single-cell RNA-Seq analysis
The skin biopsies were washed with cold DPBS and digested with 1X dispase (ThermoFisher Scientific) overnight at 4 o C. The dermis was mechanically separated from the epidermis and then digested at 37 o C and 5% CO2 with 0.25% trypsin/EDTA for 1.5 hr. Dermal cell suspensions were then passed through 70 µm and 40 µm cell strainers, then pelleted by centrifugation and resuspended in Keratinocyte-SFM (ThermoFisher Scientific) containing 0.4% BSA. Single-cell suspensions were then subjected to 10X Genomics Chromium library generation and Illumina sequencing as described above. Signature gene analysis was performed using the R package Seurat to identify B cells within dermal cell clusters. DEG analysis was performed using the R package DESeq2, as described for blood B cells above. The skin cell scRNA-Seq library data is available through the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/), accession ID GSE190338.

Quantitative PCR (qPCR)
Frozen (-80°C) B cell pellets were thawed on ice, and total RNA was isolated using the RNeasy Plus Mini Kit (Qiagen) and quantified with a Qubit ® 3.0 Fluorometer (ThermoFisher). RNA (500 ng) was reverse-transcribed with an iScript cDNA Synthesis Kit (BioRad
PhenoGraph cluster analysis was then performed on the remaining 6 markers after first pre-gating on viable B cells (CD19 + 7AAD -), using this available function in Flowjo in combination with R.
A detailed summary of all antibodies used is as follows: CD19 Pacific blue (clone J3-119,

Western blot analysis
Frozen (-80°C) pellets of purified B cells from allo-HCT patients with Active cGVHD (n=4) or No cGVHD (n=4) were lysed using a commercially available whole cell lysis buffer (Roche Complete Lysis-M) with freshly added protease inhibitors (Roche). Protein lysates were then denatured and loaded into wells of a 4-12% Bis-Tris gel (ThermoFisher) for electrophoresis under reducing conditions. Proteins were transferred from the gels to a nitrocellulose membrane using dry electroblotting (20 V for 9 min). Membranes were blocked in 2% fish gelatin buffer for 75 min and then incubated with rabbit anti-human P27 KIP1 polyclonal antibody (C-19, Santa Cruz Biotechnology, cat#sc-528, lot#KO413) overnight at 4°C. The membrane was thoroughly washed in Tris-buffered saline containing Tween 20 (TBST) before incubation for 1 h with donkey antirabbit polyclonal secondary antibody labeled with a near-infrared fluorescent dye (IRDye ® 680, LI-COR, cat#926-68073, lot#C50821-05). The membrane was then washed thoroughly in TBST, and fluorescently labeled proteins were detected using an LI-COR Odyssey CLX Imaging System.
Band densities were quantified using ImageJ software (https://imagej.nih.gov/ij/).  Table 4). Each symbol represents one patient within the group indicated (Active, Active cGVHD; No, No cGVHD). (B,C) Co-expression analysis of GPR183 and TNFRSF13B using Seurat provides insight into numbers of B cells separately expressing and co-expressing these genes. In (B), normalized expression density UMAP plots are shown for GPR183 only (left panels), TNFRSF13B only (middle panels), and both genes in combination (right panels). The relative expression level of each gene in the density plots is based on the minimum to maximum color threshold range shown at the left. B cells co-expressing the genes at a high threshold level in the combination plots are depicted by magenta color. The boxed areas in the density plots approximate the B cells in Cluster 8 and are enlarged at the bottom to more easily visualize single B cells individually expressing or co-expressing the two genes. Active, Active cGVHD; No, No cGVHD. In (C), all B cells in the scRNA-Seq dataset individually expressing or co-expressing TNFRSF13B and GPR183 were quantitated using this available function in Seurat and separated by patient group. Bars represent total B cells expressing one or both genes as indicated by the X-axis labels, with the ratio of Active cGVHD B cells (filled bars) to No cGVHD B cells (open bars) indicated by the number above each condition, highlighting the increase in Active cGVHD. (D,E) PhenoGraph analyses of representative allo-HCT patients as described in Figure 6, A-E and Supplemental Figure 4 for identical flow cytometry panels with the exception of the inclusion of antibodies to either TACI (left plots) or EBI2 (right plots), as indicated by the red arrows. Cluster prediction was similar for the 2 panels, with 15 clusters predicted in both cases (UMAP plots at top, and not shown), enabling the comparison of similar clusters of B cells between panels. In (D), a similarly positioned and shaped cluster representing ABCs is present for both the TACI panel (blue) and EBI2 panel (green) that is otherwise phenotypically similar based on the other markers used, and is increased in frequency in Active cGVHD B cells (percentages indicated above each UMAP plot, and as with ABCs described in Figure 6, B and D and Supplemental Figure 4). For comparison, (E) shows a different cluster in the PhenoGraph projections (yellow, TACI panel; aqua, EBI2 panel) that has a naïve follicular B cell phenotype, lacks expression of TACI and EBI2, and is similar in frequency between patient groups. Figure 9G (along with the experiment shown in Figure 9F) Table 8. DEGs identified in the bulk-like analysis shown in Figure 9B, assigned to their respective biological pathways after interrogation of the GO database. *Typical direction of impact on cell activation, differentiation or survival based on GO annotation. '?' indicates potential direction based on a thorough search of the available literature. DEGs from Figure 9B without an assigned classification to at least one GO Qualified Term/pathway were omitted from the