Research ArticleImmunologyPulmonology Free access | 10.1172/jci.insight.126556
1Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, National Jewish Health, Denver, Colorado, USA.
2Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado – Anschutz Medical Campus, Aurora, Colorado, USA.
3Center for Genes, Environment, and Health and
4Program for Cell Biology, Department of Pediatrics, National Jewish Health, Denver, Colorado, USA.
Address correspondence to: Kara J. Mould, National Jewish Health, 1400 Jackson Street, Denver, Colorado 80206, USA. Phone: 303.398.1787; Email: MouldK@NJHealth.org.
Authorship note: KJM and NDJ contributed equally to this work. MS and WJJ contributed equally to this work.
Find articles by Mould, K. in: JCI | PubMed | Google Scholar |
1Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, National Jewish Health, Denver, Colorado, USA.
2Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado – Anschutz Medical Campus, Aurora, Colorado, USA.
3Center for Genes, Environment, and Health and
4Program for Cell Biology, Department of Pediatrics, National Jewish Health, Denver, Colorado, USA.
Address correspondence to: Kara J. Mould, National Jewish Health, 1400 Jackson Street, Denver, Colorado 80206, USA. Phone: 303.398.1787; Email: MouldK@NJHealth.org.
Authorship note: KJM and NDJ contributed equally to this work. MS and WJJ contributed equally to this work.
Find articles by Jackson, N. in: JCI | PubMed | Google Scholar
1Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, National Jewish Health, Denver, Colorado, USA.
2Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado – Anschutz Medical Campus, Aurora, Colorado, USA.
3Center for Genes, Environment, and Health and
4Program for Cell Biology, Department of Pediatrics, National Jewish Health, Denver, Colorado, USA.
Address correspondence to: Kara J. Mould, National Jewish Health, 1400 Jackson Street, Denver, Colorado 80206, USA. Phone: 303.398.1787; Email: MouldK@NJHealth.org.
Authorship note: KJM and NDJ contributed equally to this work. MS and WJJ contributed equally to this work.
Find articles by Henson, P. in: JCI | PubMed | Google Scholar
1Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, National Jewish Health, Denver, Colorado, USA.
2Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado – Anschutz Medical Campus, Aurora, Colorado, USA.
3Center for Genes, Environment, and Health and
4Program for Cell Biology, Department of Pediatrics, National Jewish Health, Denver, Colorado, USA.
Address correspondence to: Kara J. Mould, National Jewish Health, 1400 Jackson Street, Denver, Colorado 80206, USA. Phone: 303.398.1787; Email: MouldK@NJHealth.org.
Authorship note: KJM and NDJ contributed equally to this work. MS and WJJ contributed equally to this work.
Find articles by Seibold, M. in: JCI | PubMed | Google Scholar
1Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, National Jewish Health, Denver, Colorado, USA.
2Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado – Anschutz Medical Campus, Aurora, Colorado, USA.
3Center for Genes, Environment, and Health and
4Program for Cell Biology, Department of Pediatrics, National Jewish Health, Denver, Colorado, USA.
Address correspondence to: Kara J. Mould, National Jewish Health, 1400 Jackson Street, Denver, Colorado 80206, USA. Phone: 303.398.1787; Email: MouldK@NJHealth.org.
Authorship note: KJM and NDJ contributed equally to this work. MS and WJJ contributed equally to this work.
Find articles by Janssen, W. in: JCI | PubMed | Google Scholar |
Authorship note: KJM and NDJ contributed equally to this work. MS and WJJ contributed equally to this work.
Published February 5, 2019 - More info
Macrophages are well recognized for their dual roles in orchestrating inflammatory responses and regulating tissue repair. In almost all acutely inflamed tissues, 2 main subclasses of macrophages coexist. These include embryonically derived resident tissue macrophages and BM-derived recruited macrophages. While it is clear that macrophage subsets categorized in this fashion display distinct transcriptional and functional profiles, whether all cells within these categories and in the same inflammatory microenvironment share similar functions or whether further specialization exists has not been determined. To investigate inflammatory macrophage heterogeneity on a more granular level, we induced acute lung inflammation in mice and performed single cell RNA sequencing of macrophages isolated from the airspaces during health, peak inflammation, and resolution of inflammation. In doing so, we confirm that cell origin is the major determinant of alveolar macrophage (AM) programing, and, to our knowledge, we describe 2 previously uncharacterized, transcriptionally distinct subdivisions of AMs based on proliferative capacity and inflammatory programing.
Macrophages perform multiple functions during acute inflammation, including production of inflammatory cytokines, phagocytosis of cellular debris, and production of reparative molecules (1–7). As such, they are attractive candidates for therapeutic intervention in inflammatory diseases. To date, however, targeting of macrophages has been hindered by lack of precise understanding of their dynamic programs in inflamed tissues. For instance, it is unclear whether the seemingly divergent inflammatory and reparative functions are performed by macrophages within a shared microenvironment or whether discrete populations of macrophages provide division of labor during acute inflammation.
In virtually all tissues, 2 main subsets of macrophages exist during acute inflammation. These include resident macrophages, which populate the tissues during homeostasis, and recruited macrophages that arise from circulating monocytes. In the lungs, resident airspace macrophages (RAMs) arise during embryogenesis and self-renew throughout life without replacement from circulating monocytes. They reside in the airspaces prior to the onset of inflammation, persist during the entirety of the inflammatory response, and remain after it resolves (8–11). In comparison, recruited airspace macrophages (RecAMs) originate from the BM, traffic to the site of inflammation, and ultimately undergo apoptosis (10, 12, 13). Based on bulk RNA sequencing (RNA-seq), we recently demonstrated that, during endotoxin-induced acute inflammation, RAMs proliferate locally and have relatively stable programing (14). Conversely, the programing of RecAMs is dynamic and is characterized by production of proinflammatory cytokines during peak inflammation and expression of reparative factors during lung repair. However, whether these functions are shared by all cells within each group, or whether intragroup subspecialization of function occurs, remains unknown.
Macrophage programing has traditionally been described along a dichotomous spectrum of polarization. Historically, macrophage polarization denoted the opposing responses of cells to different pathogens — M1 programing described the response to bacterial or viral infections, often modeled in vitro with LPS or IFN-γ, while M2 programing described the response to parasitic infections, often modeled by exposure to IL-4 or IL-13 (15, 16). Later, similarities in features of M2 programing and those present during wound repair led to the evolution of “M2” as a descriptor of reparative, antiinflammatory macrophages (5, 17, 18). Innumerable studies have sought to identify specific markers of these polarization states and have ascribed phenotypes to macrophages using selected genes or gene products. However, recent evidence suggests that macrophage programing is multidimensional, dynamic, and exceedingly complex and, therefore, unsuitable for a simplified, bipolar scheme (19, 20). To date, whether specific proinflammatory or antiinflammatory programs correlate with markers of macrophage polarization has not been determined at the single cell level.
In this study, we aimed to identify heterogeneity within the inflammatory alveolar macrophage (AM) pool at single cell resolution. We used single cell RNA-seq to explore the transcriptional signatures of AMs within the murine lung during 3 discrete time points: homeostasis, peak neutrophilic inflammation, and resolution of inflammation. In doing so, we confirm that cell origin drives AM heterogeneity and describe 2 previously unrecognized subgroups of RAM and RecAM cells specializing in proliferation and inflammatory signaling, respectively.
Single cell transcriptional profiling identifies 5 discrete AM populations across homeostasis, acute inflammation, and resolving inflammation. We performed single cell RNA-seq of all macrophages isolated from the airspaces of C57BL/6 mice treated with intratracheal LPS. Three time points, representing homeostasis (day 0), peak of neutrophilic inflammation (day 3), and resolution of inflammation (day 6) were analyzed (12, 21). DAPI–, Ly6g–, F480+, CD64+ cells (containing both resident and recruited AM subsets) were isolated from lavage using FACS (Supplemental Figure 1; supplemental material available online with this article; https://doi.org/10.1172/jci.insight.126556DS1). We analyzed profiles from 902 cells that passed strict quality control thresholds and used a combination of t-distributed stochastic neighbor embedding (tSNE) and shared nearest neighbor clustering carried out with the 5,784 most variant genes across cells from all time points to identify robust cell clusters.
Five transcriptionally distinct clusters were revealed (Figure 1A). All clusters expressed macrophage-specific markers, including CD68, Lgals3, Marco, and MerTK (Figure 1B). Overlaying time course information on the tSNE plot revealed that clusters 1 and 2 contained AMs from each of the 3 time points. In comparison, clusters 3 and 4 were composed almost exclusively of AMs from day 3, whereas cluster 5 contained AMs from day 6 (Figure 1C). During homeostasis, more than 90% of cells belonged to cluster 1, while nearly all remaining cells were in cluster 2 (Figure 1D). In contrast, during peak inflammation (day 3), the majority of cells were members of clusters 3 or 4, with the remainder coming from clusters 1 and 2. During the resolution of inflammation (day 6), clusters 3 and 4 essentially disappeared from the sample, and cells were segregated to cluster 5 (40%) or clusters 1 and 2 (59%).
Single cell transcriptional profiling identifies 5 discrete AM populations across homeostasis, acute inflammation, and resolving inflammation. Mice were treated with intratracheal LPS and macrophages were isolated from lavage at days 0 (homeostasis), 3 (peak neutrophil inflammation), and 6 (resolution of lung inflammation). (A) T-distributed stochastic neighbor embedding (tSNE) plot shows clustering of 902 cells based on gene expression. Point coordinates are based on tSNE dimensionality reduction of the top 6 principal components calculated from the 5,784 most informative genes. Cell color specifies assignment of cells to 1 of 5 clusters (c1–5) inferred using shared nearest neighbor clustering. (B) Normalized expression of macrophage markers overlaid on tSNE plot. (C) Time course information overlaid on tSNE plot. (D) Relative proportion of cells in each cluster versus time.
AM populations revealed by single cell RNA-seq reflect cell origin. Since clusters 1 and 2 were present during homeostasis and remained throughout the inflammatory time points, we hypothesized that these populations represented the RAMs. Likewise, we postulated that clusters 3, 4, and 5 largely consisted of RecAMs, as they were only present during inflammation. To test these characterizations, we examined whether known RAM and RecAM markers were differentially expressed between the 2 hypothesized groups. We found that RAM marker genes, including Mrc1 (CD206), Itgax (CD11c), SiglecF, and Siglec1 (CD169) (9, 14, 22), were significantly upregulated in clusters 1 and 2 compared with clusters 3, 4, and 5. In comparison, RecAM marker genes, including CD14, Ly6c1 (Ly6c), ApoE, Ccr5, Mafb, and Sell (L-selectin) (9, 14, 23), were significantly upregulated in clusters 3, 4, and 5 compared with clusters 1 and 2 (Figure 2, A and B). Furthermore, mean expression across the panel of RAM markers was 1.6-fold higher in clusters 1 and 2, whereas mean expression of the panel of RecAM markers was 1.4-fold higher in clusters 3, 4, and 5 (Figure 2, B and C). This confirms that cell origin is a major determinant of AM heterogeneity.
AM populations revealed by single cell RNA-seq reflect cell origin. (A) Relative expression of Mrc1 and CD14 overlaid on tSNE plot. Cells that express both markers are turquoise. High versus low expression is defined relative to the 85th percentile. (B) Bubble plot comparing expression of resident (blue) and recruited (red) biomarkers across the 5 macrophage clusters. Bubble size is proportional to percentage of cells in a cluster expressing a gene, and color intensity is proportional to average scaled gene expression within a cluster. (C) Summary expression of 4 resident biomarkers (Mrc1, Itgax, Siglecf, and Siglec1) and 6 recruited biomarkers (Cd14, Ly6c1, Apoe, Ccr5, Mafb, and Sell) overlaid on tSNE plot.
Macrophage polarization states are not mutually exclusive and reflect cell origin. The presence of subgroups within the populations expressing RAM markers (clusters 1 and 2) and RecAM markers (clusters 3 and 4) during peak inflammation led us to question whether this heterogeneity could be ascribed to classically activated (M1) or alternatively activated (M2) polarization states (24). We first examined the expression of Nos2 and Arg1, two commonly used markers to distinguish M1 and M2 activation states (25, 26). Expression of both enzymes was observed in all cell clusters but was greatest in cluster 3 (Figure 3, A and B). Although M1 and M2 activation states and their corresponding transcriptional markers are often considered to be mutually exclusive (24, 26–29), we observed extensive coexpression of Arg1 and Nos2 within the same cells (Figure 3C), consistent with nonexclusivity of M1 and M2 programing that has been suggested by our group and others (30–33). We next examined the mean expression of a comprehensive panel of traditional M1 and M2 markers across clusters and found that clusters 1 and 2 exhibited 1.3-fold higher expression of M2 genes compared with clusters 3 and 4, while clusters 3 and 4 expressed M1 genes at a 1.2-fold higher level compared with clusters 1 and 2 (Figure 3D). Thus, cells expressing RecAM markers at peak inflammation had the highest expression of M1 genes, whereas cells expressing RAM markers during both homeostasis and inflammatory time points expressed the highest levels of M2 genes. Notably, cluster 5 exhibited relatively low expression of both M1 and M2 genes. Taken as a whole, these data show that aggregate expression of M1 and M2 genes corresponds closely to cell origin, irrespective of inflammatory time points.
Macrophage polarization states are not mutually exclusive and reflect cell origin. (A) Cells with high expression of Arg1 and/or Nos2 overlaid on tSNE plot (high versus low expression defined relative to the 85th percentile). (B) Bubble plot shows relative expression of M2 (blue) and M1 (red) markers across clusters. Bubble size is proportional to percentage of cells expressing a gene, and color intensity is proportional to average scaled gene expression within a cluster. (C) Coexpression of Arg1 and Nos2 shown as square root–transformed expression of one gene against the other. Sample color denotes cluster from Figure 1A. (D) Summary expression of genes from B overlaid on tSNE plot. (E and F) Mean normalized expression of M2 (E) and M1 (F) markers relative to homeostatic RAMs. Data are shown across days for RAMs and clusters for RecAMs. Relative-to-baseline expression obtained by subtracting median value observed in clusters 1 and 2 at day 0.
Because homeostatic RAMs (day 0) should not be activated, we next examined mean expression of the gene panels across clusters and days using homeostatic RAMs (day 0) as a reference. We found that mean expression of M2 panel genes was similar across all time points for cells expressing RAM markers and was unchanged from homeostasis (Figure 3E). Conversely, mean expression of M2 panel genes was lower in cells expressing RecAM markers. Interestingly, cells expressing either RAM or RecAM markers at day 3 showed increased expression of M1 genes compared with day 0. Thus, cells that express RAM markers at day 3 exhibit dual expression of both M1 and M2 panel genes, although expression of the latter is similar to homeostatic RAMs. Notably, there was no correlation between Arg1 expression and M2 panel genes or Nos2 expression and M1 panel genes. Together, these data demonstrate that expression of M1 and M2 genes is not mutually exclusive and that single genes cannot be used to ascribe discrete polarization states to macrophages.
Resident AMs contain a proliferative subpopulation present during health and inflammation. As a next step, we sought to define the characteristics that distinguish individual clusters. We started with clusters 1 and 2 since they contained cells from all time points and had similar expression of RAM and polarization markers. In comparison with clusters 3, 4, and 5, cluster 1 contained 314 genes with significantly elevated expression and cluster 2 contained 572 such genes. Of these differentially expressed genes (DEGs), 182 were shared between clusters 1 and 2. Pathway analysis of DEGs from cluster 1 identified pathways known to be important for RAM programing, such as PPAR signaling, lipid metabolism, and protein processing in the ER (Figure 4A and Supplemental Table 1) (34, 35). DEGs from cluster 2 were enriched for similar annotations (Figure 4A), along with multiple proliferation and cell cycle pathways (Figure 4B and Supplemental Table 2).
Resident AMs contain a proliferative subpopulation present during health and inflammation. (A and B) Mean normalized expression of genes annotated for enriched pathways indicated that they were also upregulated in cluster 1 (A) or cluster 2 (B) when compared with RecAM clusters. (C–E) Normalized expression of 3 classic markers of proliferation overlaid onto tSNE plot. (F–H) Characterization of proliferating RAMs in cluster 2 after statistically removing expression heterogeneity related to cell cycle from dataset. (F) t-SNE plot–based scaled residual expression obtained from modeling relationship between gene expression and estimated cell cycle score. Cells colored based on original clusters in Figure 1A. (G) Time course information overlaid on new t-SNE plot. (H) Distribution of cells from original proliferation cluster across new clusters inferred using cell cycle–corrected dataset. New clusters were broadly similar to original clusters and were, thus, named and colored to match cluster analogs in Figure 1A.
We next compared clusters 1 and 2. Intriguingly, cluster 1 contained no DEGs compared with cluster 2. By contrast, cluster 2 exhibited 469 upregulated DEGs, the top 3 of which were Ki67, Pclaf, and topoisomerase II α (Top2a) (Figure 4, C–E, and Supplemental Table 3). Widely recognized as markers of cell proliferation, these genes were significantly upregulated in cluster 2 compared with the other clusters in the dataset. Notably, cluster 2 contained 6.5% of cells analyzed at day 0, 6.9% at day 3, and 2.1% at day 6 (Figure 1D) These data are consistent with previously published reports of RAM proliferation based on Ki67 staining, BrdU incorporation, or [3H]thymidine labeling (10, 14, 36)
Since the vast majority of DEGs in cluster 2 were involved in regulation of the cell cycle, we sought to confirm that these genes were principally responsible for the separation of this cluster. Accordingly, the expression signature of proliferation was regressed out of the dataset, after which clustering was reperformed. As shown in Figure 4, F–H, this resulted in the disappearance of the original cluster 2 and the reassignment of most of its members to cluster 1, suggesting that proliferation genes were largely what separated cluster 2 from other cells in cluster 1. Together, these data identify, for the first time to our knowledge, a dedicated subpopulation of proliferative AMs, with an expression profile characteristic of RAMs, present during health and disease.
Although clusters 1 and 2 contained cells isolated from 3 vastly different microenvironments (i.e., days 0, 3, and 6 of inflammation), relatively little partitioning of cells was apparent on the tSNE plot (Figure 1C). However, quantitative assessment of DEGs in cluster 1 revealed upregulation of 53, 414, and 39 genes at days 0, 3, and 6 (compared with the other time points), respectively. The DEGs at day 0 were enriched in pathways for lipid metabolism and transcription (Supplemental Table 4). Cells from day 3 were the most unique, with enrichment of pathways related to proteasome function, TLR signaling, and aerobic metabolism (Supplemental Table 5). At day 6, pathways for posttranslational modifications of proteins were enriched (Supplemental Table 6). Because cluster 2 contained only 43 cells, similar analyses of DEGs over time were limited by statistical power. However, a similar trend in DEGs across time points was observed in cluster 2. Taken as a whole, these data demonstrate that cluster 1 cells maintain their RAM identify while modifying gene function to fit the changing microenviroment posed by inflammation.
Macrophage subsets are identified during inflammation. We next investigated the defining characteristics of clusters 3 and 4, which exhibited profiles typical of RecAMs and were only present during peak inflammation. Comparison of clusters 3 and 4 with other clusters yielded 574 upregulated genes. Top pathways enriched with these DEGs involved inflammation-related terms such as NF-κB signaling pathway, TLR signaling, IFN signaling, and the ER-phagosome pathway (Figure 5A and Supplemental Table 7). Pathway analysis of 543 DEGs upregulated in cluster 3 compared with cluster 4 showed that cluster 3 is distinguished by proteolysis, antigen presentation, and oxidative metabolism (Figure 5B and Supplemental Table 8). In contrast, the 43 genes with increased expression in cluster 4 compared with cluster 3 were enriched in pathways specific to proinflammatory cytokine production and cell survival (Figure 5C and Supplemental Table 9), suggesting that these cells are specialized for inflammatory signaling. Gene level analysis confirmed elevated expression of several important inflammatory cytokines in cluster 4, including Cxcl2, IL1b, and Tnf (Figure 5, D–F). Together, these data identify 2 distinct clusters of cells that express RecAM markers at the peak of inflammation: 1 defined by inflammatory cytokine production and 1 defined by protein processing and proteasome functions. Intriguingly, while RecAMs are known to undergo apoptosis (12, 13), these 2 clusters have differential expression of apoptotic genes, suggesting that they may also have different propensities for programed cell death. To our knowledge, such subsets have not been previously described.
Macrophage subsets are identified during inflammation. (A–C) Bar graphs comparing mean normalized expression. Mean normalized expression of genes annotated for enriched pathways upregulated in clusters 3 and 4 together when compared with all other clusters (A), or cluster 3 (B) or cluster 4 (C) when compared with each other. (D–F) Normalized expression of 3 inflammatory cytokines overlaid on tSNE plot. (G) Bar plots comparing mean normalized expression of genes annotated for enriched pathways upregulated in cluster 5 when compared with all other clusters.
In contrast to the 2 RecAM marker–enriched clusters present at peak inflammation, cluster 5, which contained the RecAM-like cells present at day 6, was distinguished from all other clusters by upregulation of 450 DEGs enriched in pathways involved in production of growth factors, secretory granule formation, lysosome processing, and PIK3-AKT signaling (Figure 5G and Supplemental Table 10). In addition, pathways important for actin binding and cell migration were also identified. Further analysis of gene expression variation within cluster 5 did not reveal further subclustering of cells.
Unique transcriptional programs define macrophage subsets. Having identified genes that distinguished clusters from one another, we next plotted the scaled expression of the top DEGs in each cluster across all cells to visualize the extent to which patterns of expression among these cluster-defining genes were shared or unique in respect to other clusters (Figure 6 and Supplemental Table 11). Examination of the resultant heat map confirmed that genes related to cell cycle were almost exclusively expressed in cluster 2. In comparison, the DEGs that defined cluster 1 were also expressed by cells from cluster 2. Likewise, the DEGs that defined cluster 4 were also expressed by cells cluster 3, whereas the proinflammatory DEGs that defined cluster 3 were less highly expressed by cells in cluster 4 and downregulated in other clusters. Finally, genes encoding the growth factors and reparative molecules during resolution phase RecAM in cluster 5 were slightly increased in RAM (clusters 1 and 2) but were not increased in the RecAM clusters that dominated during peak inflammation (clusters 3 and 4). As a whole, these patterns illustrate that each cluster is defined by a unique transcriptional program that imparts specific functions and divisions of labor throughout the inflammatory process.
Expression of identified gene sets distinguishes clusters. Scaled expression of top 30 differentially expressed genes from each cluster. Individual cells are represented on the horizontal axis and grouped by cluster and day of inflammation.
Gene networks from bulk RNA-seq are coexpressed within single cell RNA-seq–based clusters. Because single cell sequencing is inherently prone to dropout, or incomplete detection of genes expressed at low levels, we sought to integrate our single cell data with our previously published bulk RNA-seq of RAMs and RecAMs (14, 37). In this way, we aimed to infer a more complete transcriptional profile to our newly identified single cell clusters. In addition, by comparing the 2 datasets, we sought to determine the extent that clustering of our single cell data explained the coregulation of genes identified in our bulk dataset and validate that cell origin is a major determinant of AM programing. Finally, we assessed the technical reproducibility of data from the 2 sequencing platforms using the same model system.
We first compared the transcriptional content of the single cell dataset with our previously published bulk RNA-seq data of RAMs and RecAMs sorted from bronchoalveolar lavage (BAL) at equivalent time points following LPS-induced inflammation (14). Overall, 3,494 quality controlled (QCed) genes in the bulk dataset were not present in the QCed single cell dataset, of which 1,521 had conanoical gene names. Although the majority of these missing genes were expressed at low levels in the bulk dataset, 148 of these genes were detected at levels above the median level of expression of the bulk data. While pathway analyses of these genes did yield enrichments, several genes encoding important macrophage effector proteins were identified (Myd88 and Tlr9), suggesting that they were prone to dropout with single cell technology.
We next performed weighted gene coexpression network analysis (WGCNA) of the bulk RNA-seq data. Eleven distinct gene coexpression networks were identified based on samples collected at days 0, 3, and 6 (Supplemental Figure 2), 3 of which were upregulated in RAMs, 5 of which were upregulated in RecAMs, and 3 of which were upregulated in cells belonging to both groups (Supplemental Figure 3). Overall, a strong correlation between bulk and single cell heterogeneity was observed, with cells expressing a given network largely limited to 1 or 2 clusters.
Pathway analysis carried out on gene networks also revealed broad correspondence between bulk and single cell datasets (Figure 7, Table 1, and Supplemental Table 12), while enhancing understanding of the processes and genes that define cell clusters. For instance, 2 networks were characteristic of RecAMs at day 3: one network was enriched in inflammatory signaling pathways, glucose metabolism, and apoptosis pathways (Figure 7D), while another was enriched in proteasome activity, IFN signaling, and mitochondrial respiration (Figure 7E). Interestingly, there was no major difference in the average expression of these 2 networks between single cell clusters 3 and 4, even though pathway analysis based on DEGs from the clusters (Figure 4B) predicts that these networks would define them.
Gene networks from bulk RNA-seq are coexpressed within single cell RNA-seq–based clusters. (A–G) Seven coexpressed gene networks derived from bulk RNA-seq data exhibit peak expression within single cell–derived resident (A–C) or recruited (D–G) macrophage clusters (see also Supplemental Figure 3). For each network, mean scaled eigengene expression is shown across bulk RNA-seq resident and recruited AMs from each day (heatmaps). For comparison with the single cell dataset, relative eigengene expression of each network is overlaid onto the single cell tSNE plot. Select results from pathway analysis of each network are shown in Table 1.
Two networks were most highly expressed among RecAMs at day 6 in the bulk dataset, both of which were enriched for pathways involved in cell growth and repair (Figure 7, F–G). Both networks were highly and near exclusively expressed by cluster 5 of the single cell dataset.
Three networks were characteristic of RAMs. One network had pathways enriched in lipid metabolism and PPAR signaling and was expressed most highly among day 0 cells contained within clusters 1 and 2 (Figure 7A). Likewise, another network, enriched in pathways related to RNA modification, mitochondrial respiration, and proteasome activity, was most highly expressed in RAMs at day 3 (Figure 7B). This network appeared to be driven primarily by environment, rather than cell origin, in that its expression was more similar to RecAMs from the same time point (clusters 3 and 4) compared with RAMs at other time points. The third network, enriched in cell proliferation genes, was strongly characteristic of cluster 2 (Figure 7C).
By comparing coregulated gene networks in both bulk and single cell datasets, the relative contributions of cell origin and environment to the overall heterogeneity and discrete functions that define each AM cluster can begin to be ascertained. Further, given the high overlap of bulk RNA-seq–defined modules, which contained many genes not detected by the single cell sequencing platform, the integration of these technologies provides significantly more genic definition to the single cell dataset and validates this approach to more fully inform the biology of single cell populations.
In this study, we applied single cell RNA-seq technology to reexamine and refine our previous characterization of AM heterogeneity in a self-resolving model of lung inflammation. In doing so, we have identified 5 discrete populations of airspace macrophages that exist in the lungs during acute inflammation, each with unique transcriptional signatures and putative functions. Three populations contain cells that express RecAM markers and are only present during inflammation, whereas 2 populations that contain cells that express RAM markers are present during both inflammation and homeostasis. While this study is not sufficient to prove cell origin, together with our previous finding that embryonic versus postnatal origin dictates the majority of AM programing, it provides compelling supporting evidence. Further, we have identified additional determinants of cell heterogeneity. This includes a proliferative subset of RAMs that is present during all time points and potentially novel subsets of proinflammatory and proteolytic RecAMs present only during peak inflammation. A uniform group of reparative RecAMs present during resolution of inflammation is also identified.
The existence of a proliferative pool of RAMs is consistent with their known self-renewing properties (10, 36, 38–42). RAMs populate the airspaces during embryogenesis and maintain a steady population throughout life with little contribution from BM-derived cells (11). During acute inflammation, they continue to proliferate and remain after inflammation has resolved. To our knowledge, our data provide the first transcriptional characterization of the proliferative subset of RAMs during both health and disease. As has been previously shown, we found overall low expression of transcriptional repressors of proliferation, Mafb and Maf (cMaf), in RAMs (43). However, in contrast to macrophages in other organs, levels of Mafb and Maf were not further decreased in actively proliferating (cluster 2) versus quiescent (cluster 1) RAM subsets (43). Rather, our data demonstrate that Mafb and Maf gene expression levels are largely unchanged throughout inflammation. Notably, our data do not distinguish between a specialized stem-like population of cells or, as suggested by previous studies, a regular proportion of cells stochastically entering and leaving the cell cycle (10, 36) Because the proliferative RAMs we identified show only minor changes in expression of gene transcripts that encode cell surface proteins, studies using cell labeling techniques or proliferation reporters will likely be required to further investigate the specific stimuli that signal AMs to maintain the niche during homeostasis and inflammation.
Strikingly, despite existing in drastically different microenvironments during homeostasis, peak inflammation, and lung repair, RAMs displayed remarkably stable programing. The largest differences among RAMs were observed at peak inflammation, where proteasome and immune response genes were upregulated, echoing the concomitant but more dramatic patterns observed in RecAMs. While it is possible that direct exposure to LPS had significant effects on the transcriptional signature of RAMs at earlier time points that were not included in our dataset, our study shows that, after 3 days, any major transcriptional consequences of LPS treatment have resolved, underscoring the role of resident macrophages in maintenance of tissue homeostasis (7, 8, 44). Besides highly proliferative RAMs, no other obvious subsets of RAMs were detected. Specifically, we did not identify the subset of sessile RAMs suggested by Westphalen et. al, which may be important in immunosuppression (2). The absence of a sessile RAM subpopulation (characterized by high expression of connexins and calcium signaling molecules) may reflect the relative ineffectiveness of lavage for retrieving cells from the alveolus (45). Future comparison of cell populations isolated from lavage versus cells isolated by lung digestion will be important to expand the repertoire of AM functions during inflammation.
The finding of 2 subsets of RecAMs during peak neutrophilic inflammation has major implications for macrophage programing during inflammatory lung disease. RecAMs are thought to derive from circulating monocytes that are recruited to inflamed tissues in response to specific chemoattractant molecules. RecAMs persist during inflammation and repair, then undergo apoptosis, leaving very few BM-derived AMs once homeostasis has been restored (11–13, 46–48). Our previous bulk RNA-seq data identified RecAMs as proinflammatory and suggested that they are responsible for the majority of proinflammatory cytokines produced by macrophages during peak inflammation. However, our single cell data indicate that a distinct subset of RecAMs, cluster 4, is largely responsible for cytokine production. In comparison, cluster 3 is characterized by expression of enzymes involved in protein processing and antigen presentation, raising the possibility that cells in this cluster are programmed for functions distinct from those of cluster 4 — despite existing in a shared microenvironment. Indeed, altered ratios of constitutive versus immunoproteosomal subunit transcripts in cluster 4 compared with cluster 3 are in keeping with the emerging role of proteosomal protease activities as master regulators of the inflammatory response (49–53). Given that several of the DEGs that distinguish cluster 4 from cluster 3 are also expressed in neutrophils at steady-state (S100a9, Lcn2, Il2r) (38), we considered whether this population may represent macrophage/neutrophil complexes (i.e., doublets containing 2 cells bound together). However, we found no differences in total expressed genes between clusters 3 and 4, as would be expected in the case of cell doublets. We also considered the possibility that the cells in cluster 4 were macrophages that phagocytosed neutrophils. However, many other genes that are highly expressed in neutrophils such as Ly6G were absent. Moreover, there was no enrichment in phagocytosis pathways in cluster 4 versus cluster 3. Rather, the high expression of protein processing and proteolysis genes in cluster 3 suggest that this population may be poised for active phagocytosis. We suspect that the genes in question are truly expressed by this proinflammatory macrophage subset and that this reflects their monocytic origin. Interrogation of the ImmGen database (54) lends support for this concept and demonstrates that the genes are highly expressed by Ly6Chi BM monocytes.
In addition to differences in inflammatory programing, RecAM subclusters at day 3 may have differential cell fate. Although it is commonly believed that monocytes arrive in bulk during the initial phase of inflammation, it is currently unknown whether the RecAMs that exist at day 6 derive from those present at day 3 or whether ongoing turnover and replacement of RecAMs occurs. One possibility is that clusters 3 and 4 represent different phases of monocyte influx into the lungs. In this context, cluster 3 may contain mature RecAMs that are destined for apoptosis prior to day 6, while cluster 4 may represent more recently recruited AMs that mature to reparative RecAMs at day 6. In support of this hypothesis, cluster 3 demonstrates increased expression of proapoptotic caspase 8, while cluster 4 is notable for increased expression of antiapoptotic molecules, Bcl2L1, Ier1, mitogen-activated protein kinase 2 (MAPK2), and Mapk14 (p38) (55). Likewise, cluster 4 cells express high levels of Nrf2 and its downstream products, Slc7a11 and Slpi, genes implicated in cytoprotective and antioxidant functions (56).
Alternatively, the basis for heterogeneity within RecAMs may be explained by heterogeneity of circulating monocytes. Recent studies investigating monocyte progenitors have identified 2 distinct sources for circulating monocytes: granulocyte-monocyte progenitors (GMPs) and monocyte-DC progenitors (MDPs). GMPs produce neutrophil-like, inflammatory monocytes, while MDPs produce a subset of monocyte-derived DCs (57). While the markers used to identify these populations in the BM are not clearly confined to either clusters 3 or 4, the division of proinflammatory cytokine production and antigen-presenting functions between clusters aligns with the observed neutrophil-like GMP and DC-like MDP-derived monocyte phenotypes. Identification of markers to distinguish cells within clusters 3 and 4 will allow for further study of the source and fate of these 2 subsets.
Single cell analysis allows for direct correlation of traditional markers of macrophage programing with cell function. In our LPS model, we saw the majority of M1 marker expression present during peak inflammation, with a return to baseline expression of M1 markers during resolution of inflammation (Figure 3F). This trend is most pronounced in RecAMs but is also present in the RAM populations. Notably, we detected the expression of several M2 activation markers in homeostatic RAMs. In fact, mean expression of our panel of M2 markers was significantly higher in naive (day 0) RAMs compared with cells captured at peak inflammation or resolution time points (1-sided t test P = 4.54 × 10–21 and 2.28 × 10–4, respectively). This invalidates these transcripts as markers of cell activation and rather suggests that they represent markers of the normal process of differentiation into mature macrophages. Importantly, M2 marker expression did not correlate with expression of growth factors and molecules thought to be important for tissue repair in cluster 5. Together, these analyses complement an emerging literature that suggests that the traditional M1-versus-M2 dichotomy derived from in vitro studies is inadequate in ascribing a molecular basis for the multidimensional, dynamic programs required for specific functions of AMs in vivo (19, 20). Specifically, while Arg1 and Nos2 have been used for decades to assign phenotypes to macrophages, we show that they are coexpressed within the same clusters of cells — and even in the same individual cell, as has been suggested previously (30–33). In our study, these markers alone, or as part of a panel of markers, are neither sufficiently sensitive nor specific to functionally characterize the cells that express them.
It is important to note that our study did not include all myeloid populations present in the lung lavage. For instance, while widely accepted in the literature as sensitive and specific markers of AMs, the use of F480 and CD64 for positive selection may have excluded AMs with lower expression of these molecules (58). However, our previous work with reporters specific for RAM or BM-derived myeloid cells (12, 47) suggest that F480lo and CD64lo cells represent an insignificant portion of AMs. Likewise, the goal of this study was to focus on bona fide AMs. As such, DCs and monocytes were also excluded during FACS and were not analyzed. Our study included data from single animals at each time point and was not powered to detect macrophage heterogeneity between animals or varying levels of inflammation.
By using the same murine model of LPS-induced lung inflammation as our prior work (14), the current study allowed for direct comparison and examination of reproducibility of data between single cell and bulk RNA-seq platforms. Using WGCNA, we have illustrated a high level of technical reproducibility between the 2 techniques and have highlighted the added power of single cell analysis. While bulk RNA-seq identified RAMs and RecAMs as transcriptionally distinct cell types with different inflammatory and proliferative programs, the addition of single cell RNA-seq allowed for identification of additional subgroups of cells and assignment of putative functions to these groups. Further, sequencing of single cells allows for direct correlation of cell markers with cell function, enabling reexamination of the adequacy of traditional markers of cellular programing. In this study, application of single cell technologies to a murine model of lung inflammation has yielded the discovery of proinflammatory and proteolytic subsets of RecAM present only during peak inflammation and identification of a subset of proliferative RAMs during both health and disease. Our results confirm that AMs exist in heterogeneous populations during lung inflammation and suggest that targeting the division of labor between AM subgroups may provide therapeutic targets for inflammatory lung diseases.
Animal model
Animals. All mice used for RNA-seq were C57BL/6J males, age 10–12 weeks of age, from the Jackson Laboratory. This study was approved by and performed in accordance with the ethical standards of IACUC at National Jewish Health.
Acute lung inflammation model. LPS (E. coli O55:B5 from List Biological Laboratories) (20 μg) in 50 μl of PBS was instilled directly into the tracheas of mice sedated with isofluorane (Baxter) using a modified feeding needle (59). Mice were sacrificed using i.p. injection of pentobarbital sodium solution.
Macrophage isolation. BAL was performed with 10 serial lavages of 1 ml PBS containing 5 mM EDTA. One animal was used for each time point. BAL cells were washed twice in PBS. FcγR was blocked using anti-CD16/CD32 (eBioscience, clone 93) for 20 minutes. For day 3 sample, neutrophils were depleted over a column by binding Ly6G prior to surface staining. Cells were incubated with primary antibody on ice for 30 minutes, washed twice, and then taken to FACS. Dead cells were excluded on the basis of scatter and DAPI exclusion. Monocytes were excluded based on size and light scattering characteristics (14). Antibodies and cell stains used are as follows: anti-Ly6G (BioLegend, clone 1A8), anti-CD3 (Thermo Fischer Scientific, clone 17A2), anti-CD19 (Thermo Fischer Scientific, clone eBio1D3), anti-NK1.1 (Thermo Fischer Scientific, clone PK136), DAPI (BioLegend, catalog 422801), anti-CD64 (BD Biosciences, clone X54-5/7.1), anti-CD11c (Thermo Fischer Scientific, clone N418), anti-CD11b (Thermo Fischer Scientific, clone M1/70), and anti-F4/80 (Thermo Fischer Scientific, clone BM8). FACS was performed using a BD FACSAria Cell Sorting System with BD FACSDiva Software (BD Biosciences) on unfixed specimens.
Single cell RNA isolation and sequencing. Sorted cells were dispensed into a 5,184-nanowell version 1 microchip using the ICELL8 platform (WaferGen Biosystems) from which single cells were selected for library preparation and sequencing. Library preparation was carried out using ICell8 single-cell poly-(A)+ transcriptome amplification hands-on workflow (with Triton X-100 for cell lysis), protocol D07-000040-003 rev4. Libraries were generated for a total of 1,134 selected cells, which were then sequenced as 100 bp single-end reads on an Illumina HiSeq 2500 in 3 batches (2 flow cell lanes per run).
Data and software availability
All datasets have been deposited in the National Center for Biotechnology Information/Gene Expression Omnibus (GEO) under accession number GSE120000.
Statistics
Preprocessing of single cell RNA-seq data. Raw cDNA reads in FASTQ files were trimmed using Cutadapt (60), where 5′ and 3′ quality trimming was carried out (q < 20) and both poly A tails and reads shorter than 25 bp were removed. Trimmed reads were aligned to the GRCm38/mm10 genome with GSNAP, setting “max-mismatches=0.05” and accounting for both known Ensembl splice sites and SNPs. We quantified gene expression using HTSeq with “stranded=yes”, “mode=intersection-nonempty”, and “t=gene.” We summed the number of unique molecular identifiers (UMIs) for each gene across runs to obtain a UMI count matrix that we used for all downstream analysis.
We quantified the number of genes with expression greater than zero in each cell and then removed 222 cells expressing fewer than 2,000 genes. Although samples comprising multiple cells were removed during the cell selection stage, to safeguard against undetected doublets, we removed 7 outlier cells expressing >11,800 genes or >90,000 UMIs. We removed 3 additional cells for which <40% of original reads mapped to a unique gene-annotated region. All remaining cells exhibited unique mapping rates >50% and expressed genes of which <30% were aligned to the mitochondrial genome. Prior to downstream analysis, we also removed select mitochondrial and ribosomal genes (those beginning with Mtat, Mt-, Mrpl, Mrps, Rpl, and Rps) and very lowly expressed genes (those genes not expressed in at least 0.5% of cells).
The final QCed dataset contained 902 cells with a mean of 20,199 UMIs and 4,832 genes per cell. To account for differences in coverage across cells, we normalized UMI counts by dividing each by the sum of its cell’s UMIs, multiplying by 10,000, and then taking the log. For input into dimensionality reduction and clustering analyses and for plotting relative expression in heatmaps and bubble plots, we also fitted normalized expression of each gene to the sum of the UMI counts per cell and then mean centered (subtracted from each gene count the average expression of that gene) and scaled (divided each centered gene count by that gene’s SD) the residuals. We used the Seurat R package (61) to carry out data normalization and scaling, as well as downstream dimensionality reduction, clustering, tSNE plot overlaying, and differential expression.
Dimensionality reduction and clustering. We focused on the most informative genes for use in dimensionality reduction. To obtain these, we modeled the relationship between mean expression and the log of the ratio of expression variance to its mean (dispersion) and then selected the 5,784 genes for which fitted dispersion was >0.2 and mean expression was >0.15. We subjected these variable genes to principal component (PC) analysis and then, based on visualizing a relative leveling off of the SD across PCs, we selected the top 6 PCs for subsequent clustering and visualization. To reduce dimensionality further, we carried out the Barnes-Hut implementation of t-distributed neighborhood embedding (tSNE) based on the top 6 PCs and plotted this embedding in 2-dimensional space (perplexity, 50; number of iterations, 1,000).
We clustered single cells in an unsupervised manner based on their expression by first constructing a shared nearest neighbor graph based on k-nearest neighbors calculated from the first 6 PCs of the scaled data (k = 30). The number of clusters was then determined using a modularity function optimizer based on the Louvain algorithm (resolution, 0.465).
To remove the cell cycle signature from the dataset, we used Seurat to calculate a cell cycle score for each cell based on mean expression across a list of S and G2/M markers compiled by Tirosh et al. (62). We then performed dimensionality reduction and clustering using scaled residual expression obtained from modeling the relationship between gene expression and cell cycle score.
Plotting expression across cells. For overlaying expression onto the tSNE plot for single genes or for the average across a single panel of genes, we plotted normalized expression along a continuous color scale, with the extreme color values being set to the 5th and 95th quantile expression values. For module expression overlays in Figure 7, we plotted eigengene expression (eigenvectors from the first PC of a given network’s gene expression) rather than mean expression. For plotting the expression of 2 genes on a single plot, “high” versus “low” expression for 1 or both genes was relative to the 85th quantile of normalized expression. The bubble plot and heatmap show scaled normalized expression along a continuous color scale. We produced the heatmap in Figure 6 using Heatmap3 (63).
Differential expression analysis. Differential expression for each gene between various groups specified in the text was tested using a nonparametric Wilcoxon rank sum test. We limited each comparison to genes exhibiting both an estimated log fold change >0.25 and detectable expression in >10% of cells in 1 of the 2 clusters being compared. We corrected for multiple hypothesis testing by calculating FDR-adjusted P values. Genes were considered to be differentially expressed when FDR < 0.05. To create the heatmap in Figure 6, we carried out differential expression analysis between each of the 5 focal clusters and other individual clusters of comparative interest, based on our previous characterization of clusters using biomarkers of cell origin and state. Specifically, we compared cluster 1 with cluster 3, cluster 4, and cluster 5; cluster 2 with cluster 1; cluster 3 with cluster 1, cluster 4, and cluster 5; cluster 4 with cluster 1 and cluster 3; and cluster 5 with cluster 1, cluster 3, and cluster 4.
Functional enrichment analysis. We tested for gene overrepresentation of all target lists (e.g., from DEGs and WGCNA networks) within a panel of annotated gene databases (Gene Ontology [GO; http://amp.pharm.mssm.edu/Enrichr/geneSetLibrary?mode=text&libraryName=GO_Biological_Process_2017] Biological Process [BP], GO Molecular Function [MF; http://amp.pharm.mssm.edu/Enrichr/geneSetLibrary?mode=text&libraryName=GO_Molecular_Function_2017], GO Cellular Component [CC; http://amp.pharm.mssm.edu/Enrichr/geneSetLibrary?mode=text&libraryName=GO_Cellular_Component_2017], Kyoto Encyclopedia of Genes and Genomes [KEGG; http://amp.pharm.mssm.edu/Enrichr/geneSetLibrary?mode=text&libraryName=KEGG_2016], Reactome [http://amp.pharm.mssm.edu/Enrichr/geneSetLibrary?mode=text&libraryName=Reactome_2016]) using hypergeometric tests implemented with Enrichr (64), as automated using the python script, EnrichrAPI (https://github.com/russellstewart/enrichrAPI/tree/08e4733a0e07ac6c57c46a918547439867f8ba44; commit ID, 08e4733a0e07ac6c57c46a918547439867f8ba44). We report only terms and pathways that were enriched with FDR < 0.05. Box plots comparing expression of enriched pathways across clusters show mean expression for genes in the target list that are also annotated for the pathway being plotted.
WGCNA. We carried out WGCNA using samples from a previous bulk RNA-seq experiment on flow-sorted resident and recruited AMs that was similar in design to the current study (14). We first downloaded the raw count table from GEO and then filtered samples such that only RAMs from days 0, 3, and 6 and RecAMs from days 3 and 6 were retained (3 replicates per treatment = 15 samples total). Genes without counts >1 in at least 1 sample were removed, leaving 19,857 genes to analyze. We normalized the counts using DESeq2’s implementation of variance stabilizing transformation (VST), which accounts for heterogeneity in sequencing coverage across cells, while also minimizing mean-dependent variance (65).
We then analyzed the VST-normalized counts using the WGCNA R package (66) in order to infer distinct networks (or modules) of coexpressed genes that may share common biological functions or be regulated by the same upstream mechanisms (67). Briefly, the WGCNA method measures pairwise Pearson correlations among all expressed genes in the dataset, constructs a gene network based on these correlations, calculates topological overlap among genes in the network, and then, by carrying out hierarchical clustering of a distance metric based on the topological overlap matrix, identifies discrete clusters of genes. We set the soft-thresholding power to 3, which best maximized the number of connections in the gene network, while still meeting the scale-free network assumptions of the method, and we retained the direction of correlations among indirect connections in the network (i.e., analysis was signed). In addition, minimum network size was set to 30, maxCoreScatter was set to 0.84, minGap was set to 0.15, and cutHeight was set to 0.78 (the 90% quantile of the tree height). We then merged any networks whose eigengenes were highly correlated (>0.85).
When determining whether gene expression was significantly different in 1 group compared with another, we used the Wilcoxon rank sum test. Genes were considered differentially expressed when P values (controlled for multiple comparisons using the Benjamini–Hochberg FDR procedure) were <0.05. Similarly, for determining significance in pathway analysis, pathways were only deemed significant if the hypergeometric test FDR was <0.05 and there were at least 3 overlapping genes. For comparisons of mean expression between groups, we used 1-sided t tests, with P < 0.05 considered significant.
Study approval
Animal studies were approved by the IACUC at National Jewish Health.
KJM designed research, performed experiments, analyzed data, and wrote the manuscript. NDJ performed bioinformatics, analyzed data, and wrote the manuscript. PMH contributed to the manuscript. MS and WJJ designed research, analyzed data, and wrote the manuscript.
This work was supported by the NIH grants R35 HL140039, R01 HL130938, R01 HL114381, and K08 HL141648.
Address correspondence to: Kara J. Mould, National Jewish Health, 1400 Jackson Street, Denver, Colorado 80206, USA. Phone: 303.398.1787; Email: MouldK@NJHealth.org.
Conflict of interest: The authors have declared that no conflict of interest exists.
License: Copyright 2019, American Society for Clinical Investigation.
Reference information: JCI Insight. 2019;4(5):e126556. https://doi.org/10.1172/jci.insight.126556.