Gene expression atlas of energy balance brain regions

Energy balance is controlled by interconnected brain regions in the hypothalamus, brainstem, cortex, and limbic system. Gene expression signatures of these regions can help elucidate the pathophysiology underlying obesity. RNA sequencing was conducted on P56 C57BL/6NTac male mice and E14.5 C57BL/6NTac embryo punch biopsies in 16 obesity-relevant brain regions. The expression of 190 known obesity-associated genes (monogenic, rare, and low-frequency coding variants; GWAS; syndromic) was analyzed in each anatomical region. Genes associated with these genetic categories of obesity had localized expression patterns across brain regions. Known monogenic obesity causal genes were highly enriched in the arcuate nucleus of the hypothalamus and developing hypothalamus. The obesity-associated genes clustered into distinct “modules” of similar expression profile, and these were distinct from expression modules formed by similar analysis with genes known to be associated with other disease phenotypes (type 1 and type 2 diabetes, autism, breast cancer) in the same energy balance–relevant brain regions.


Introduction
Energy balance is controlled by the intricate interplay of gene expression in the hypothalamus, brain stem, cortex and limbic system (Figure 1A). The hypothalamus and brain stem are part of the homeostatic circuitry involved in sensing and controlling the energy status of the organism by integrating multiple peripheral metabolic inputs (homeostatic signals such as circulating metabolites, gut-derived hormones and adiposity-related signals) (1,2). Cortical and limbic brain regions form the executive and reward systems of the forebrain corticolimbic appetitive network. The executive system is involved in the conscious and voluntary decision to eat (3,4). Regions of the reward system establish the motivational (incentive salience) and pleasurable (hedonic) values of energy balance-associated stimuli and behaviors (5). Homeostatic, reward and executive regions are interconnected by extensive neuronal circuits (5). Disturbances in any of these regions or their interconnecting neurocircuitry can lead to an imbalance of food intake and energy expenditure resulting in obesity. To understand the pathogenetic mechanisms of obesity, detailed knowledge about qualitative and quantitative gene expression patterns of these brain regions is essential.
Monogenic obesities confirm the essential roles of specific genes in body weight homeostasis in mice and humans (6)(7)(8). The vast majority of human obesity is not monogenic. Many genes of small effect account for only ~10% of the apparent ~40% risk variance for obesity within a specific environment (9)(10)(11). Efforts to find the missing inheritance in less prevalent genetic variants of novel genes have led to the extensive use of whole-exome sequencing (WES) in pedigrees or association analyses of extremely obese individuals (12). The vetting of novel variants for functional relevance can be conducted in cell-based and animal transgenic systems (13,14). To assist in prioritizing genes/alleles for such resource-intensive strategies, brain regional expression patterns can be used.
Although there are several comprehensive public resources (e.g., GTExPortal, Brain Architecture Project) reporting large-scale gene expression data from many tissues, none of them allows the direct comparison of the molecular signature based on bulk RNA sequencing (bulk RNA-seq) profiling of specific brain regions involved in the regulation of energy balance. Therefore, we performed bulk RNA-seq of obesity-relevant brain regions comprising the nominal homeostatic, reward, and executive regions in both adult and embryonic mouse brains (Figure 1, A and B; Supplemental Table 1). Additional sequencing was obtained from brain regions considered not to be involved in the regulation of energy balance, as well as mouse embryonic stem cells (Supplemental  Table 1). Our study also aims to characterize the utility of this database for vetting obesity-associated genes of interest.
Thus, we mapped known obesity-associated genes, including monogenic obesity genes, rare coding variants, low-frequency coding variants, syndromic obesity genes, and variants identified in GWAS for body mass index (BMI), onto the expression profiles of the brain regions with functional roles in energy balance. Moreover, we identified specific expression patterns for obesity-associated genes in obesity-relevant brain regions compared to genes associated with other disease phenotypes: type 1 diabetes (T1D), type 2 diabetes (T2D), autism, and breast cancer. Furthermore, to facilitate usage of these data we provide access via a publicly available web portal, the Brain Energy Balance Atlas, accessible via doegelab.com, permitting the region-specific analyses of any gene of interest.

Samples cluster according to brain area specification and developmental stage.
RNA sequencing was performed on 57 samples from brain regions (regions of energy balance and additional regions) from P56 C57BL/6NTac male mice (Figure 1, A Tables 2 and 3. We performed hierarchical clustering of brain regions and mESC samples using expression profiles of all protein-coding genes. As expected, samples clustered according to brain area specification and developmental stage, with a clear separation of mESC samples (Figure 2A).
Genetic categories of obesity have specific frequency distributions and enrichment scores across brain regions mediating energy balance.
All obesity-associated genes were detected in the brain at variable levels in at least one region per gene (Supplemental Figure 4 and Supplemental Table 6). To assess whether these obesity-associated genes are enriched in any of the brain regions involved in energy balance, the frequency distribution was calculated for each obesity-associated gene, by calculating the percent a sample contributes to the total TPM for each gene. For each genetic category the data was compiled, and the average was plotted (Figure 3, A, B, C, D, E). This was repeated for all categories together (Supplemental Figure 5A).
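The frequency distribution described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' pipeline: the function name, the input layout (one replicate-averaged TPM value per gene and region) and the toy values are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, stdev


def frequency_distribution(tpm_by_gene):
    """Mean percent contribution (and standard error) of each region.

    tpm_by_gene: {gene: {region: replicate-averaged TPM}} (hypothetical layout).
    Returns {region: (mean percent over genes, standard error)}.
    """
    regions = sorted(next(iter(tpm_by_gene.values())))
    percents = {region: [] for region in regions}
    for tpm in tpm_by_gene.values():
        total = sum(tpm[region] for region in regions)
        if total == 0:
            continue  # gene not detected anywhere: skip to avoid dividing by zero
        for region in regions:
            # Percent of this gene's total TPM contributed by the region.
            percents[region].append(100.0 * tpm[region] / total)
    result = {}
    for region, values in percents.items():
        se = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        result[region] = (mean(values), se)
    return result


# Toy values for illustration only; they are not taken from the dataset.
toy = {
    "GeneA": {"ARH": 80.0, "VMH": 10.0, "ACB": 10.0},
    "GeneB": {"ARH": 30.0, "VMH": 10.0, "ACB": 10.0},
}
freq = frequency_distribution(toy)
```

Because each gene's percentages sum to 100, a gene with high absolute expression does not dominate the average for its gene list.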
Monogenic obesity genes were significantly enriched in the ARH compared to any other brain region; monogenic obesity genes also showed enrichment in the THy/PHy. Statistical comparisons between brain regions are given in Supplemental Table 7. The other genetic categories showed more equal distribution across brain regions. There were trends for relative enrichments in some regions, such as among genes associated with rare coding variants in embryonic brain, low-frequency coding variants in hypothalamic regions, GWAS genes in ARH and embryonic midbrain, and syndromic genes in hypothalamus and cerebellum (Figure 3 and Supplemental Table 7). For all obesity-associated genes as a group (n = 190), we observed a trend for enrichment in some regions, such as ARH and embryonic midbrain (Supplemental Figure 5A and Supplemental Table 8).
To better understand the genetics underlying these frequency distributions, we identified the three genes with highest percentage expression among the obesity-associated genes in the two most enriched regions, within each genetic category (Figure 4) as well as within all the obesity-associated genes (Supplemental Figure 5B). The genes in the top two enriched regions for monogenic obesity were Pomc, Pcsk1, Lepr in ARH, and Sim1, Pomc, Tub in THy/PHy. All of these genes are expected to be present in these regions and the majority of them are components of the leptin-melanocortin pathway. These results are consistent with prior studies that have classified monogenic obesity mutations into genes with roles in the hypothalamic melanocortin system of feeding regulation (e.g, Lepr, Pomc), and into genes that are essential for the development of the hypothalamus (e.g., Sim1) (16). The embryonic midbrain with Plxna3, Gpr, Plxna1 and embryonic hindbrain with Nrp2, Sema3a, Plxna3 are the two regions (and top genes) identified as most enriched for rare coding variants. Among genes associated with low-frequency coding variants we identified Entpd6, Ache, Rapgef3 as genes with highest TPM values in VMH, and Rapgef3, Zfr2, Ache in PVH. Among syndromic genes we identified Ttc8, Bbs9, Mks1 in ARH, and Inpp5e, Alms1, Mkks in CENT2 (Figure 4). The genes in the top two enriched regions for GWAS loci were Tal1, Tfap2b, Tcf7l2 in embryonic hindbrain and Pomc, Asb4, Calcr in the ARH (Figure 4). These regions and genes were also identified, when combining all genes of all genetic categories (Supplemental Figure 5B).
In conclusion, we observed that among all the analyzed genetic categories associated with obesity, monogenic obesity genes were most enriched in arcuate nucleus and embryonic hypothalamus. We also observed that for all obesity-associated genes, there was a trend for enrichment in ARH and embryonic brain regions.
Next, we performed Gene Set Enrichment Analysis (GSEA) (Supplemental Figure 6, A to F). All replicates were input, alongside the obesity genetic category gene lists. In conclusion, as shown above for the frequency distribution, GSEA determined that obesity-associated genes are enriched in the ARH and embryonic hypothalamus. In addition, GSEA identified the enrichment of obesity-associated genes in the VMH, DMH and regions of the cerebellum.

Obesity-associated genes cluster into modules.
To assess whether obesity genes from different genetic categories show similar or different expression profiles across the brain regions of energy balance, we subjected all obesity-associated genes to Weighted Correlation Network Analysis (WGCNA), obtaining 11 modules (Supplemental Figure 7). The heatmap in Figure 5A visualizes the modules, and detailed information (e.g., gene annotation) is given in Supplemental Table 9. In module 1 we observed enrichment of genes in embryonic regions. Genes in module 2 had higher expression in embryonic regions and stem cells. Module 3 featured genes with high expression in regions of the homeostatic system and cerebellum. Module 4 was characterized by genes with high expression in regions of the executive system and hypothalamic regions. In module 5 we detected enrichment of genes in regions of the cerebellum. In module 6, expression was highest in regions of the executive system, reward system, hypothalamus and cerebellum. Genes defining module 7 had high expression in regions of the embryonic brain, cerebellum, and in stem cells. In modules 8 and 9 we observed increased expression of genes in regions of embryonic brain and cerebellum; module 8 was distinguished from module 9 by increased expression in executive regions. Module 10 showed the presence of genes specifically enriched in embryonic and hypothalamic regions. In module 11 genes were more widespread among the regions, with no specific localization (Figure 5A).
To address whether there is an association between expression modules and the genetic categories of obesity evidence, for each module, the number of genes associated with each genetic category was normalized to the size of the module, and the normalized gene number was expressed as a percentage (Figure 5B). All modules are characterized by the presence of genes belonging to more than one genetic category, with modules 1 and 4 being defined by genes associated with all five genetic categories. Except for GWAS genes, which were distributed across all modules, genes from the remaining genetic categories were present only in a subset of the modules. Genes associated with monogenic obesity were primarily expressed in module 10, but also in modules 1, 4 and 8 (Figure 5B), the modules with predominance of genes in embryonic brain and hypothalamus (Figure 5A). Next, we determined whether a genetic category is over-represented in a given module.
We developed an over-representation score which indicates the deviation from an equal distribution across modules, normalizing for module size and gene list size. The deviation from the equal distribution was plotted, where a positive deflection denotes enrichment.
In order to further strengthen our findings, we utilized an independent approach to assess the expression profile of obesity-associated genes across brain regions of energy balance. All obesity-associated genes were subjected to hierarchical clustering, resulting in 11 distinct modules, each defined by more than one genetic category (Supplemental Table 5). This analysis also found the abundance of genes associated with monogenic obesity in embryonic and hypothalamic regions; the enrichment of rare coding variants in embryonic regions; and prevalence of low-frequency coding variant genes in regions of the executive system, reward system and hypothalamus (Supplemental Figure 8). Clustering into additional smaller modules does not lead to the assignment of just one genetic category per module (Supplemental Figure 9).
We then looked at changes in the composition of the modules comparing the 11 modules obtained using WGCNA clustering to the 11 modules obtained performing hierarchical clustering, observing that the majority of obesity-associated genes have the tendency to cluster in similar ways in the two distinct analyses (Supplemental Figure 10), strengthening the validity of both approaches and the biological relevance of our dataset.
In order to confirm that a genetic category is not assignable to a distinct module, we performed hierarchical clustering on all the genetic categories individually, and found that genes associated with each of the categories are distributed across all anatomic regions, without distinct preference for homeostatic, executive or reward system (Supplemental Figure 4, A to E). Thus, the genetic category per se does not determine regional gene expression profile.
Obesity-associated genes have specific enrichment scores across energy balance-relevant brain regions.
To determine whether the expression patterns of the obesity-associated genes are specific to obesity, we examined expression profiles of genes known to be associated with four other diseases: 1) autoimmune disease (T1D, n = 61) (31); 2) metabolic disease (T2D, n = 140) (32-36); 3) complex brain disorder (autism, n = 190) (37); 4) cancer (breast cancer, n = 98) (38-40) (Supplemental Table 10, 11, 12 and 13). Expression values for genes associated with obesity, T1D, T2D, autism and breast cancer are given in Supplemental Table 14. Transcripts from all genes were detected in the brain at variable levels in at least one region per gene. These diseases have some genes in common with obesity (15 genes out of 650 total genes) (Supplemental Table 15), which may be suggestive of the presence of shared pathways between diseases and/or obesity as a risk factor for the other diseases and/or vice versa.
To assess if there was region-specific enrichment for one or the other disease, we These results show that obesity-associated genes have specific enrichment across brain regions mediating energy balance.

Clustering of obesity-associated genes together with genes associated with other diseases.
To assess if genes associated with the four aforementioned diseases cluster with obesity, we compared obesity-associated genes to T1D-, T2D-, autism- and/or breast cancer-associated genes using WGCNA, resulting in 15 distinct modules (Supplemental Figure 12), each defined by admixed genes associated with all diseases, and each comprised of genes with specific regional localization. The heatmap in Figure 7A visualizes the modules, and detailed information (e.g., gene annotation) is given in Supplemental Table 16. The percentage contribution of each disease to each module, calculated by normalizing the number of genes associated with each disease to the size of the module and expressing it as a percentage, revealed that all diseases are present, at different levels, in every module. The only exceptions were T1D-associated genes, absent in modules 1, 12 and 14, and T2D-associated genes, absent in module 11 (Figure 7B).
To evaluate whether the modules were enriched for disease-specific genes, even if composed of genes from all diseases, we generated an over-representation score, for (Figure 7A), was enriched by obesity-associated genes (Figure 7C), expected to be highly present in these regions. Module 3, characterized by high expression of genes in regions of the executive system and cerebellum (Figure 7A), is enriched with autism-associated genes (Figure 7C), supporting the significant enrichment previously shown for genes associated with this disease in these regions (Figure 6 and Supplemental Figure 11D). Module 4 was enriched with obesity-associated genes (Figure 7C). This module had high expression of genes in regions of the homeostatic system and cerebellum (Figure 7A), in agreement with our previous findings (Figure 6 and Supplemental Figure 11A). Autism-associated genes were enriched in module 5 (Figure 7C), characterized by genes highly expressed in embryonic regions and mESC (Figure 7A). Module 6 was enriched with T1D-associated genes (Figure 7C). This module had increased expression of genes in cerebellum regions (Figure 7A). We observed a non-significant trend of enrichment of T1D-associated genes in the cerebellum in the GSEA analysis (Figure 6 and Supplemental Figure 11A). Obesity-associated genes were enriched in module 7 (Figure 7C). This module showed highest expression of genes in regions of the executive system and homeostatic system (Figure 7A), confirming the previously obtained results for the latter system (Figure 6 and Supplemental Figure 11D). Module 8, comprised of genes highly expressed in the executive system (Figure 7A), was enriched by autism-associated genes (Figure 7C), confirming our previous findings (Figure 6 and Supplemental Figure 11D). Module 9 was enriched by obesity-associated genes (Figure 7C), and characterized by genes with high expression in embryonic regions, cerebellum and mESC (Figure 7A). Module 10, with high expression of genes in hypothalamic regions (Figure 7A), was enriched by T2D-associated genes (Figure 7C). Module 11 was enriched by obesity-associated genes (Figure 7C). Module 12 showed enrichment of autism-associated genes (Figure 7C), with genes primarily localized in embryonic regions (Figure 7A). T2D-associated genes were enriched in module 13 (Figure 7C), characterized by high expression of genes in regions of the reward system (Figure 7A), and in module 14, defined by genes with high expression in the embryonic hindbrain (Figure 7A). Taken together, this analysis reveals that clustering of obesity-associated genes together with genes associated with other diseases results in modules with higher contribution from one or another disease.
These findings indicate that, despite some similarities in genetic signature, the diseases show clearly different gene expression patterns.
We confirmed these findings with an independent method, k-means clustering. First, we compared obesity to each of the other diseases individually. We combined obesity-associated genes with genes associated with one of the other four diseases and performed k-means clustering, resulting in four distinct clusters, each cluster defined by the genes associated with the two diseases (Supplemental Figure 13, A, B, C, D Figure 13, A, B, C, D, F, G, I; Supplemental Tables 17, 18, 19, 20).
In summary, although there is some overlap in the genetic signature between obesity, T1D, T2D, autism and breast cancer, there is a large genetic component characterized by a clear separation, suggesting an obesogenic signature of obesity-associated genes in brain regions that function in the regulation of energy balance.

Discussion
In this study, gene expression profiles of 16 energy balance-relevant brain regions were obtained from P56 mice and E14.5 embryos using bulk RNA-seq. The brain regions chosen were compiled from a comprehensive literature search of regions known to be involved in the control of energy balance, as reviewed by Caron and Richards (5). To punch the ROIs with highest possible accuracy we chose P56 as the age of adult mice to allow the use of landmarks from the mouse brain map of the Allen Brain Atlas with the punch size chosen by the size of the ROI. Canonical neuropeptides involved in the regulation of body weight (e.g., Pomc, Agrp, Oxt) and those involved in other functions that are known to map to distinct regions (e.g., Cartpt, Hcrt, Trh, Crh) show the expected region-specific expression pattern. This neuropeptidergic expression pattern supports the accuracy of the regional identifications and sample ascertainment.
As expected, known obesity-associated genes (n = 190) were enriched in at least one of the brain regions of nominal homeostatic, reward and executive circuitry. Monogenic obesity genes were expressed in specific regions of the homeostatic circuits as well as the developing hypothalamus. The arcuate nucleus was found to be the region most enriched by the 9 known human monogenic obesity genes, with highest expression of tubby-like protein (TUB) (88)(89)(90)(91)(92). Furthermore, we observed that genes carrying rare coding variants, low-frequency coding variants, genes identified in GWAS, and also genes associated with syndromic obesity, were enriched in at least one of the energy balance brain regions: rare coding variants in the embryonic hindbrain; low-frequency coding variants and GWAS genes in homeostatic regions; syndromic genes in hypothalamic regions. Interestingly, our data showed specific enrichment for syndromic and GWAS genes in regions of the cerebellum, a region whose link with obesity has not yet been well defined. For these genes we have to consider two options: 1) expression in these samples reflects additional functions of these genes, independent of their role in obesity; 2) the cerebellum truly contributes to the regulation of body weight. Experimental testing of these hypotheses is needed to understand this expression pattern.
WGCNA clustering of known obesity-associated genes revealed that these genes cluster in distinct modules, and each module is defined by more than one genetic category; similar results were obtained performing an independent hierarchical clustering analysis.
In addition, we identified modules with enrichment of obesity-associated genes in mESC or cerebellum. Modules 2 and 7 are characterized by enrichment of genes in the mESC lines and embryonic brain. Genes in these modules implicated in obesity are Gdf15 (103), Alms1 (104) and Rab23 (20). Their dysfunction during neurodevelopment could contribute to susceptibility to obesity. Modules 3, 5, 6, 8 and 9 are characterized by enrichment of genes in regions of the cerebellum. Some of the genes present in these modules have been functionally associated with obesity, including Irs1 (105), Sdccag8 (106,107), Negr1 (108)(109)(110)(111), Ksr2 (112,113), Tlr4 (114,115) and Sh2b1 (116,117). Mice with either Lepr neuron-specific or adult-onset, hypothalamus-specific ablation of Sh2b1 develop obesity, insulin resistance, and liver steatosis (118,119). It would be interesting to test whether knocking out Sh2b1 specifically in the cerebellum of mice would result in the same phenotype. Cep290 (120,121) and Inpp5e (19), associated with syndromic ciliopathies that include obesity, and Bardet-Biedl syndrome (BBS) causal genes (Arl6, Bbs1, Bbs2, Bbs5, Bbs7) (122) were also present in these modules. Whether genes with enriched expression in the cerebellum are truly causal of obesity is unclear, because the majority of them also show some enrichment in other brain regions. Some patients with BBS display characteristic structural brain abnormalities, including within the cerebellum (123,124). The literature implicating the cerebellum in weight regulation does not identify specific neuronal circuitry or molecular mechanisms for such an effect (125)(126)(127)(128)(129)(130)(131)(132). A role in anticipatory aspects of ingestive behaviors, similar to the cerebellum's classical role in motor activities (133), is an interesting possibility, the study of which may be assisted by the analytic strategies and tools developed in this project.
To address whether there is an obesogenic signature of obesity-associated genes in the brain regions of energy balance, we included in our analysis genes associated with four additional diseases: T1D, an autoimmune disease; T2D, a metabolic disease; autism, a complex neurobehavioral disorder; and breast cancer. These diseases share some gene overlap with obesity (15 genes out of 650 total genes), suggesting the presence of shared pathways between diseases and/or obesity as a risk factor for the other diseases and/or vice versa. We observed an enrichment of obesity-associated genes in regions of the homeostatic system. Genes associated with autism, a neurodevelopmental disorder, were enriched, as expected, in regions of the embryonic brain and the executive system.
Breast cancer-associated genes showed enrichment in the stem cell samples. T1D and T2D did not show region-specific enrichment. WGCNA clustering of obesity-associated genes together with T1D-, T2D-, autism- and breast cancer-associated genes across the regions of energy balance revealed that these genes cluster in distinct modules, and that each module is defined by genes associated with each disease, with few exceptions, and characterized by the prevalence of one disease over the others. Similar findings were observed in k-means clustering of genes associated with these five diseases together.
Regardless of the comparison examined, some clusters have more equal contribution from the diseases, while more than half of clusters show predominant contribution from one disease over the other. These results suggest that there is a clear separation between the diseases and thus, an obesogenic signature of obesity-associated genes in the brain regions of energy balance.
The online database and analytic strategies presented here can be utilized to vet novel obesity-candidate genes by "positioning" them within specific gene clusters and neural circuits. This information can be used to determine next steps with regard to functional analyses. More specifically, to fulfil the promise of precision medicine in obesity, we envision the following workflow: 1) clinical genetics using WES/GWAS to identify novel obesity candidate genes; 2) determination of the brain region/s of enrichment for a given novel obesity candidate gene, for prioritization, using the Brain Energy Balance Atlas portal (doegelab.com); 3) determination of the specific cell type/s expressing the candidate gene using single-cell RNA sequencing (scRNA-seq) in the determined region/s; 4) generation of induced pluripotent stem cells (iPSC) from the patient's peripheral blood mononuclear cells; 5) correction of the obesity candidate mutation using CRISPR to generate isogenic control iPSC; 6) differentiation of iPSC (mutation-carrying and isogenic control) into the cell type identified earlier by scRNA-seq; and 7) phenotypic analysis of these cells to obtain a molecular phenotype of the mutation. The ultimate goal is to utilize these patient-specific, functional, in vitro cell systems for drug screening and evaluation.
The vetting of genes with this database can also be used to relate central nervous system (CNS) region-specific gene expression and circuits to other metabolic phenotypes such as T2D. Despite evidence implicating the brain in glucose homeostasis, the brain regions involved have not all been identified and the underlying mechanisms are not fully understood (134)(135)(136). This latter use could help to deconvolute complex interactions of the CNS with seemingly remote phenotypes such as autoimmune disease and cancer.

Mice.
Bulk RNA-seq experiments were performed in C57BL/6NTac male P56 mice and E14.5 embryos harvested from timed pregnant mice. All mice were obtained from Taconic Biosciences.

Housing and diets.
Mice were housed at 22-24 °C temperature with a regular 12 h light:12 h dark cycle (lights were turned off at 7 p.m.) with no more than five adult animals per cage and ad libitum access to Purina 5058 chow diet and water.

Tissue dissection.
P56 mice or pregnant mice with E14.5 embryos were sacrificed via cervical dislocation followed by decapitation.
For P56, brains were immediately removed and embedded in O.C.T. compound (Fisher Scientific, #23730571) and placed in dry ice-cooled isopentane to flash freeze. Each brain was sectioned into 500 μm-thick coronal sections (Figure 1B, left) using a Microm HM 525 cryotome (Fisher Scientific) at -6 °C. The ROIs were micro-dissected by single or bilateral punching of the brain sections using a dissecting microscope. The diameter of the punch, either 0.5, 0.75 or 1 mm, was chosen according to the size of the region.
E14.5 embryos were embedded in O.C.T. compound and immediately placed in dry ice-cooled isopentane to flash freeze. Blocks were sectioned into 300 μm-thick sagittal sections (Figure 1B, right) as described above. The ROIs were micro-dissected by single punches (0.5-0.75 mm diameter) of the sections.

Sample description
In total, 17 adult mice (P56), 12 mouse embryos (E14.5) and 2 stem cell lines were used for bulk RNA-seq. Sample collection and preparation was performed in 9 batches within 1 month (Tables 2 and 3). Depending on the size of the brain region, the number of sections punched and pooled together varied ( Table 2).
For each sample from P56 brains, dissected tissue pieces from three or four mice were immediately pooled in lysis buffer for subsequent RNA extraction ( Table 2). The only exception (due to technical issues) was that one nucleus accumbens (ACB) sample from only one mouse was processed ( Table 2). The ROI for brain areas of energy balance  Table 1; Table   2; Supplemental Figure 1, A and B; and Supplemental Figure 2).
Mouse embryonic stem cell (mESC) samples were also included; these cells are not directly involved in the regulation of energy balance. Two mESC lines, termed clone 1 (B6-1) and clone 2 (B6-2), were derived from a C57BL/6 mouse strain (Table 1; Table 3 and Supplemental Figure 1C).

RNA extraction.
For all punches collected from P56 and E14.5 mice, total RNA was extracted using
Principal Component Analysis (PCA) was used for outlier detection.
Hierarchical clustering was used to determine how the regions cluster together (Figure 2A) and to show gene order for most heatmaps. Genes were routinely visualized with heatmaps, which were produced by averaging the TPM data across the replicates for each brain region and for mESC (replicate details are given in Tables 2 and 3). The TPM values were scaled for each gene to improve visualization. To calculate the frequency distribution, the TPM of each gene was calculated as a percentage compared to the total TPM values for all samples (brain regions and mESC, averaged across each replicate). The frequency score and standard error were calculated by compiling the percentage for all the genes in a gene list.
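As an illustration of the heatmap preparation, the sketch below averages replicates per region and then scales each gene across regions. The text does not say which scaling was applied, so the z-score used here is an assumption, as are the function names and the toy input.

```python
from statistics import mean, pstdev


def average_replicates(samples):
    """Average TPM across replicates of the same region.

    samples: list of (region, {gene: TPM}) pairs (hypothetical layout).
    Returns {region: {gene: mean TPM}}.
    """
    grouped = {}
    for region, tpm in samples:
        grouped.setdefault(region, []).append(tpm)
    return {
        region: {gene: mean(rep[gene] for rep in reps) for gene in reps[0]}
        for region, reps in grouped.items()
    }


def scale_gene(values):
    """Z-score one gene's values across regions (assumed scaling choice)."""
    mu, sd = mean(values), pstdev(values)
    # A gene with identical expression everywhere is mapped to zeros.
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


# Toy input for illustration only: two ARH replicates and one VMH sample.
samples = [
    ("ARH", {"GeneA": 10.0}),
    ("ARH", {"GeneA": 30.0}),
    ("VMH", {"GeneA": 0.0}),
]
averaged = average_replicates(samples)
scaled = scale_gene([averaged["ARH"]["GeneA"], averaged["VMH"]["GeneA"]])
```

Scaling per gene puts highly and lowly expressed genes on a comparable color range, which is why it helps visualization without changing the regional pattern of any single gene.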
Gene Set Enrichment Analysis (GSEA, v 4.1.0) (139) was used to complement the frequency distribution. All replicates were input, alongside either the obesity gene lists (monogenic, rare coding variants, low-frequency coding variants, GWAS and syndromic) or the broader gene lists (obesity, T1D, T2D, autism and breast cancer). GSEA outputs an enrichment score, which indicates whether the genes within a particular gene list were ranked highly in the sample compared to all other samples (Supplemental Figures 6 and 11), within all the genes from the gene lists input, and the enrichment score was plotted.
Weighted Gene Correlation Network Analysis (WGCNA, v1.70-3) (140) was used to see how obesity genes cluster, and how they cluster with other disease genes. This allows for determination of genes with similar expression patterns within the dataset, and also allows for consideration of how genes are expressed across the brain regions (e.g. homeostatic regions). Replicates of the same brain region were averaged, and then a soft thresholding power was chosen to give the highest separation between modules.
WGCNA outputs discrete modules and a dendrogram (Supplemental Figures 7 and 12).
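To make the role of the soft-thresholding power concrete, the sketch below computes an unsigned co-expression adjacency of the form |cor(i, j)|^β from region-averaged profiles. The network type and the power used in this study are not stated, so both the unsigned form and `beta=6` are assumptions; the gene names and values are toy data, and the actual analysis was run with the WGCNA R package rather than with this code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta):
    """Unsigned adjacency |cor|**beta for every ordered pair of distinct genes.

    profiles: {gene: [expression per region]} (hypothetical layout).
    """
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Toy profiles across four regions, for illustration only.
profiles = {
    "GeneA": [1.0, 2.0, 3.0, 4.0],
    "GeneB": [2.0, 4.0, 6.0, 8.0],
    "GeneC": [1.0, 3.0, 2.0, 4.0],
}
adj = soft_threshold_adjacency(profiles, beta=6)
```

Raising the correlation to a power shrinks moderate correlations much more than strong ones (here 0.8 becomes about 0.26 while 1.0 stays 1.0), which is what sharpens the separation between modules.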
Heatmaps were used for visualization of the modules. The order of genes on the heatmap was determined by the module number, followed by the order in which the gene appears in hierarchical clustering. To determine if a gene list is enriched within a module, we developed an over-representation score which indicates the deviation from an equal distribution across modules, normalizing for module size and gene list size. The over-representation score was calculated using the following equation: (w/((x/y)*z))-1, where w = n genes per module per gene list, x = n genes in gene list, y = n genes in all gene lists and z = n genes in module. Here (x/y)*z is the number of genes from the list expected in the module under an equal distribution. This over-representation score was plotted to show a positive deflection where a gene list is over-represented.
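The score can be reproduced with a short function. The sketch below follows the variable definitions in the text (w, x, y, z) and also computes the percent composition of a module; the function names and toy gene sets are illustrative assumptions, and y is taken as the summed size of all gene lists, which assumes the lists do not overlap.

```python
def over_representation(module_genes, gene_lists):
    """Over-representation score w / ((x / y) * z) - 1 per module and gene list.

    module_genes: {module: set of genes}; gene_lists: {list name: set of genes}.
    """
    # y: genes across all lists (lists assumed disjoint for this sketch).
    y = sum(len(genes) for genes in gene_lists.values())
    scores = {}
    for module, members in module_genes.items():
        z = len(members)  # genes in the module
        for name, genes in gene_lists.items():
            x = len(genes)            # genes in this gene list
            w = len(members & genes)  # genes of this list found in the module
            expected = (x / y) * z    # count expected under an equal distribution
            scores[(module, name)] = w / expected - 1
    return scores


def module_composition_percent(members, genes):
    """Percent of a module's genes that belong to one gene list."""
    return 100.0 * len(members & genes) / len(members)


# Toy sets for illustration only.
gene_lists = {"obesity": {"a", "b", "c", "d"}, "autism": {"e", "f", "g", "h"}}
modules = {"M1": {"a", "b", "c", "e"}, "M2": {"d", "f", "g", "h"}}
scores = over_representation(modules, gene_lists)
```

A score of 0 means the list is present at exactly the expected count; in the toy data module M1 holds three obesity genes where two are expected, giving a score of 0.5.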
K-means clustering was used to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (141). The number of clusters was decided based on the slope change on an elbow plot. Enrichment was determined using the same method used for WGCNA, where a k-means cluster is substituted for a WGCNA module.
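A minimal version of k-means with an elbow curve is sketched below, using Lloyd's algorithm in plain Python. The random initialization, the seed, the iteration cap and the toy points are assumptions for illustration; the text does not specify the implementation used.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct observations as initial centers
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members; keep empty clusters fixed.
        new_centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    labels = [min(range(k), key=lambda i: sq_dist(p, centers[i])) for p in points]
    wcss = sum(sq_dist(p, centers[label]) for p, label in zip(points, labels))
    return labels, wcss


# Toy data for illustration only: two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
elbow = {k: kmeans(points, k)[1] for k in range(1, 5)}
labels, wcss = kmeans(points, 2)
```

Plotting `elbow` against k gives the elbow plot: the within-cluster sum of squares drops sharply up to the natural number of clusters (two in the toy data) and flattens afterwards, and the change in slope marks the chosen k.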
Sequencing data is available on an interactive web portal, the Brain Energy Balance Atlas, which is accessible via doegelab.com. This portal allows users to search for expression of any annotated gene, and its differential expression profiles across the 22 brain regions and 2 mESC lines studied. Counts files were then analyzed with DESeq2 (142) (Version 1.28.1) to compare the expression pattern in each brain region to all other brain regions or to mESCs.

Statistics
Frequency distribution was expressed as mean ± SE. For each gene list, pairwise comparisons of all brain regions were evaluated using one-way ANOVA and reporting the Benjamini, Krieger and Yekutieli two step false discovery rate corrected p-value. P values less than 0.05 were considered significant.
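For readers unfamiliar with the Benjamini, Krieger and Yekutieli correction, the sketch below implements its two-stage step-up form: a first Benjamini-Hochberg pass at q/(1+q) estimates the number of true null hypotheses, and a second pass uses that estimate to relax the threshold. It returns reject/accept decisions rather than the corrected p-values reported in this study, and it is an illustration of the procedure, not the software actually used; the function names and toy p-values are assumptions.

```python
def bh_reject_count(sorted_p, q):
    """Number of rejections of the Benjamini-Hochberg step-up test at level q."""
    m = len(sorted_p)
    count = 0
    for rank, p in enumerate(sorted_p, start=1):
        if p <= rank / m * q:
            count = rank  # keep the largest rank that passes its threshold
    return count


def bky_two_stage(pvalues, q=0.05):
    """Two-stage step-up FDR procedure; returns one reject flag per p-value."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    sorted_p = [pvalues[i] for i in order]
    q1 = q / (1 + q)
    r1 = bh_reject_count(sorted_p, q1)  # stage 1
    if r1 == 0:
        n_reject = 0
    elif r1 == m:
        n_reject = m
    else:
        m0_hat = m - r1  # estimated number of true null hypotheses
        n_reject = bh_reject_count(sorted_p, q1 * m / m0_hat)  # stage 2
    reject = [False] * m
    for i in order[:n_reject]:
        reject[i] = True
    return reject


# Toy p-values for illustration only.
decisions = bky_two_stage([0.5, 0.001, 0.9, 0.002, 0.003], q=0.05)
```

Because the second stage divides by the estimated proportion of true nulls, the procedure is somewhat more powerful than a single Benjamini-Hochberg pass when many comparisons are truly different.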
For GSEA enrichment score plots, regions with a positive enrichment score and an FDR adjusted p-value < 0.05 were denoted with a * (139).
All analyses were performed in R (v 1.3.1093) or in GraphPad Prism (v 9).

Study approval
Animal care and experimental procedures were performed according to Columbia

Regions of the brain executive system (red) include the anterior cingulate area (ACA) and the frontal cortex (FRP). Regions of the brain reward system (yellow) include the nucleus accumbens (ACB), ventral tegmental area (VTA) and LHA. The regions in the developing