Multiomics of Bohring-Opitz syndrome truncating ASXL1 mutations identify canonical and noncanonical Wnt signaling dysregulation

ASXL1 (additional sex combs–like 1) plays key roles in epigenetic regulation of early developmental gene expression. De novo protein-truncating mutations in ASXL1 cause Bohring-Opitz syndrome (BOS; OMIM #605039), a rare neurodevelopmental condition characterized by severe intellectual disabilities, distinctive facial features, hypertrichosis, increased risk of Wilms tumor, and variable congenital anomalies, including heart defects and severe skeletal defects giving rise to a typical BOS posture. These BOS-causing ASXL1 variants are also high-prevalence somatic driver mutations in acute myeloid leukemia. We used primary cells from individuals with BOS (n = 18) and controls (n = 49) to dissect gene regulatory changes caused by ASXL1 mutations using comprehensive multiomics assays for chromatin accessibility (ATAC-seq), DNA methylation, histone methylation binding, and transcriptome in peripheral blood and skin fibroblasts. Our data show that regardless of cell type, ASXL1 mutations drive strong cross-tissue effects that disrupt multiple layers of the epigenome. The data showed a broad activation of canonical Wnt signaling at the transcriptional and protein levels and upregulation of VANGL2, which encodes a planar cell polarity pathway protein that acts through noncanonical Wnt signaling to direct tissue patterning and cell migration. This multiomics approach identifies the core impact of ASXL1 mutations and therapeutic targets for BOS and myeloid leukemias.

17A and 18A). We confirmed effective rRNA removal (Supplemental Figures 17B and 18B). Differential expression analysis was performed, adjusting for sample sex. Genes with a Benjamini-Hochberg adjusted p value less than 0.05 (Wald test) were classified as significantly differentially expressed, and fold changes (5) were shrunk using approximate posterior estimation for GLM coefficients. We further filtered our differentially expressed genes (DEGs) for |log2FC| ≥ 0.58, corresponding to an absolute fold change ≥ 1.5. Gene ontology (6, 7) over-enrichment tests were completed using clusterProfiler v3.12.0 (8) by submitting differentially expressed genes against all genes from the Gencode hg38 annotation, version 31. Gene ontologies were classified as significantly enriched when the Benjamini-Hochberg adjusted p value was less than 0.05 (hypergeometric test). HOMER v4.9 findMotifs.pl (9) was used to identify enrichment of motifs within the promoters of differentially expressed genes. Differentially expressed genes were compared against HOMER's default human, RefSeq-based promoter set, which yielded de novo and known motif enrichments for motif lengths of 8, 10, and 12. Sequencing tracks were visualized using Integrative Genomics Viewer (IGV 2.9.4) and Gviz.
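The two-step DEG selection described above (Benjamini-Hochberg adjusted p < 0.05, then |log2FC| ≥ 0.58) can be illustrated with a minimal sketch. It is not the DESeq2 implementation, which also applies steps such as independent filtering before adjustment. The function names, gene labels, and toy values are hypothetical.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Cutoff quoted in the text; log2(1.5) is approximately 0.585.
LOG2FC_CUTOFF = 0.58

def select_degs(genes, pvalues, log2fc, alpha=0.05, lfc_cutoff=LOG2FC_CUTOFF):
    """Keep genes with adjusted p < alpha and |log2FC| >= lfc_cutoff."""
    padj = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, q, lfc in zip(genes, padj, log2fc)
        if q < alpha and abs(lfc) >= lfc_cutoff
    ]

# Toy input (hypothetical values, not study data).
genes = ["geneA", "geneB", "geneC", "geneD"]
pvalues = [0.001, 0.02, 0.03, 0.8]
log2fc = [1.2, -0.9, 0.3, 2.0]
degs = select_degs(genes, pvalues, log2fc)
print(degs)
print(round(math.log2(1.5), 3))
```

In this toy input, geneC passes the adjusted p value filter but fails the fold-change filter, and geneD has a large fold change but is not significant, so neither is retained.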

ATACseq cont.
After purification and PCR amplification for library generation, the libraries were double-sided bead purified using AMPure XP beads (Beckman, A63881) to remove contaminating primer dimers. All libraries were quantified on the Qubit 4.0, and library quality was assessed using the Agilent High Sensitivity DNA TapeStation (Agilent, 5067-5584) before multiplexing and sequencing on a NextSeq 550 for a minimum of 40 million paired-end reads per sample with 75 bp length. Fastq files were processed through our in-house best-practice bioinformatic pipeline (Supplemental Figure 20). Read quality was assessed using FastQC. Raw reads were then aligned to the GENCODE human genome hg38 (GRCh38), version 31, using BWA-MEM (10). BAM files were then sorted, indexed, and filtered to remove chrX, chrY, and mitochondrial reads using SAMtools.

ATACseq Bioinformatic Analysis (Supplemental Figure 20).
Picard tools were then used to generate insert size histograms and remove duplicates from BAM files. Narrow peaks from each sample were called using MACS2 callpeak (11); peaks that overlapped by at least one base were then merged using BEDtools merge (12). Reads overlapping merged peaks were counted using featureCounts (4). Technical replicates clustered closely on principal component analysis and were collapsed during analysis. Peaks were annotated using annotatePeaks.pl from HOMER to the nearest gene or regulatory element, including exon, intron, promoter, 5' untranslated region (UTR), 3' UTR, and other genomic features (9). DESeq2 was used to identify differentially open peaks between disease and control samples, adjusting for sample sex. Peaks with a Benjamini-Hochberg adjusted p value less than 0.05 (Wald test) were classified as significantly differentially open, and fold changes were shrunk using approximate posterior estimation for GLM coefficients. Significant peaks were identified as promoter peaks if their distance from their respective closest gene was less than 1 kb upstream or 2 kb downstream relative to the gene transcription start site. Gene ontology over-enrichment tests were completed using clusterProfiler by submitting the closest genes to significantly differentially open peaks against all genes from the Gencode hg38 annotation, version 31. Gene ontologies were classified as significantly enriched when the Benjamini-Hochberg adjusted p value was less than 0.05 (hypergeometric test). Enriched de novo and known motifs were identified in significant peaks using findMotifsGenome.pl from HOMER (9). Sequencing tracks were visualized using Integrative Genomics Viewer (IGV 2.9.4) and Gviz.
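Two of the steps above are interval logic and can be sketched in a few lines: merging peaks that overlap by at least one base, and labeling a peak as a promoter peak when it lies less than 1 kb upstream or 2 kb downstream of the transcription start site (TSS). This is an illustrative sketch, not the BEDtools or HOMER implementation. It assumes half-open BED-style coordinates and measures distance from the peak center, and the function names are hypothetical. BEDtools merge with default settings also merges book-ended intervals, which this strict-overlap version does not.

```python
def merge_peaks(peaks):
    """Merge same-chromosome peaks that overlap by at least one base.

    peaks: iterable of (chrom, start, end) in half-open coordinates.
    """
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            # Overlaps the previous interval: extend it.
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged

def is_promoter_peak(peak_center, tss, strand, upstream=1000, downstream=2000):
    """True if the peak center is < upstream bp before or < downstream bp after the TSS.

    The offset is measured in the direction of transcription, so on the
    minus strand "upstream" means larger genomic coordinates.
    """
    offset = peak_center - tss if strand == "+" else tss - peak_center
    return -upstream < offset < downstream

# Toy intervals (hypothetical coordinates).
toy_peaks = [("chr1", 150, 300), ("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80)]
merged_peaks = merge_peaks(toy_peaks)
print(merged_peaks)
print(is_promoter_peak(9500, 10000, "+"), is_promoter_peak(11500, 10000, "-"))
```

Strand matters in the second function: a peak 1.5 kb to the left of the TSS is outside the window for a plus-strand gene but inside it for a minus-strand gene.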

DNA methylation.
We used DNA methylation data generated from BOS and control blood and fibroblasts, processed as described in (51). Using linear regression modeling, we calculated the absolute difference between the mean β values of BOS patients and controls for each CpG to obtain the delta beta (Δβ) value. We filtered for highly differentially methylated sites (|Δβ| > 5%). CpG sites were considered significant at FDR < 0.05. To identify biological mechanisms that were dysregulated, we queried these CpG sites using GREAT (13). We filtered for gene ontologies with padj < 0.05 for significance.
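The Δβ filter can be illustrated with a minimal sketch. It covers only the difference of group means and the |Δβ| threshold. The linear regression model and FDR correction used in the study are not reproduced. The CpG identifiers and β values below are hypothetical placeholders.

```python
from statistics import mean

def delta_beta(bos_betas, control_betas):
    """Difference between mean beta values (BOS minus control) for one CpG."""
    return mean(bos_betas) - mean(control_betas)

def highly_differential(sites, threshold=0.05):
    """Return {cpg_id: delta_beta} for sites with |delta beta| > threshold.

    sites: dict mapping a CpG id to (BOS beta values, control beta values).
    threshold: 0.05 corresponds to the 5% cutoff quoted in the text.
    """
    result = {}
    for cpg_id, (bos, control) in sites.items():
        db = delta_beta(bos, control)
        if abs(db) > threshold:
            result[cpg_id] = db
    return result

# Hypothetical beta values on the 0-1 scale (not study data).
toy_sites = {
    "cpg_example_1": ([0.80, 0.82, 0.78], [0.60, 0.62, 0.58]),  # gain
    "cpg_example_2": ([0.50, 0.52], [0.49, 0.51]),              # ~no change
    "cpg_example_3": ([0.20, 0.22], [0.40, 0.42]),              # loss
}
selected = highly_differential(toy_sites)
print(sorted(selected))
```

Keeping the signed difference, and taking the absolute value only for the threshold, preserves the direction of the methylation change for downstream interpretation.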

RT-qPCR.
10 ng of RNA was used per well in a 20 µL total well volume, and each sample and primer combination was run in triplicate. Machine settings were Standard Curve and Fast run mode. RNA isolated from patient blood and patient-derived fibroblast cell lines was subjected to RT-qPCR to validate RNA-seq findings. RNA isolated from transfection experiments was subjected to RT-qPCR to identify effects of the truncated ASXL1 plasmids or ASXL1 siRNAs on the cell transcriptome.
(A) An array of antibodies targeting ASXL1 was tested on whole cell lysate (20 µg) from CACO2 cells and fibroblast cells. Only antibody #ab228009 identified higher ASXL1 expression in CACO2 cells, which was expected based on ProteinAtlas data (Supplemental Table 4). (B) To see whether ASXL1 antibody #ab228009 could correctly identify exogenous and/or endogenous ASXL1, we transfected truncated ASXL1 plasmids into HEK293T cells and resolved 10 µg of nuclear extract on a western gel. ASXL1 antibody #ab228009 identified the truncated ASXL1 plasmids: ASXL1 1-479 + FLAGx3, and ASXL1 1-1304. ASXL1 (below, left) and FLAG (below, right) staining for the ASXL1 1-479 + FLAGx3 cell lysate showed complete overlap.
Table 7).

Supplemental Figure 4
Supplemental Figure 5 (panels A-C; sample labels Pt3, Pt4, Pt15, Ct1, Ct50, Ct51, Ct5, Ct8, Ct10)
Supplemental Figure 6: Sequencing Coverage and ASXL1 Read Count
(A) RNA-seq ASXL1 reads were plotted against total sequencing coverage. Blood samples (red) showed a strong correlation (R² = 0.76) between ASXL1 reads and total sequencing reads. Fibroblast samples (blue) had a higher ASXL1 read count than blood samples at the same sequencing coverage, and the same positive correlation between ASXL1 read count and sequencing coverage. (B) For each RNA-seq BOS sample, the number of reads at the BOS-causing mutation site was identified (Supplemental Table 10) and plotted against total ASXL1 reads for that sample. No strong correlation was identified in either blood (red) or fibroblast (blue) samples.
Significant DEGs from BOS RNAseq fibroblast data were integrated with significant differential chromatin accessibility peaks at the transcriptional start site from BOS ATACseq fibroblast data. DEGs that overlapped are shown with the log2FC for RNAseq (y axis) and log2FC for ATACseq (x axis).

Bohring-Opitz Syndrome (BOS) Patient Information and Assays
Bohring-Opitz Syndrome (BOS) patients included in our assays are identified with their patient IDs. We include mutation annotation, age, and sex, and outline the multi-omics assays that each patient's samples were used in.

Supplemental Table 2: Control Sample Information and Assays
Control samples used in our assays are shown with their age and sex. We outline the multiomics assays that each sample was used in.

Supplemental Table 3: BOS Patient Summary Table
We show summary data (age, sex, mutation) for BOS patients in our cohort that are included in each -omics assay.

Supplemental Table 4: Expression of Key Genes in Tissues and Cell Lines
Transcripts per million (TPM) gene expression are summarized for key genes of interest. We obtained data for gene expression in relevant tissues from GTEx and ProteinAtlas, and data for cell lines from ProteinAtlas.

Supplemental Table 5: RT-qPCR Primers
PrimeTime™ RT-qPCR Primers were purchased from IDT to assay ASXL1 expression. The predesigned primer pairs and fluorescently labeled 5′ nuclease probes that were used in our assays are listed. There are two primer sets for ASXL1, each targeting a different exonic range, and one set targeting β-actin (ACTB), which was used as a control.

Supplemental Table 6: List of Antibodies
Antibodies used in this paper are shown here with their purchasing information.

Supplemental Table 9: RNA-seq Blood QC and Mapping Statistics
BOS patient and control blood were used in RNA-seq. Here, we show FastQC and mapping statistics for each sample.

Supplemental Table 10: ASXL1 Pathogenic and Wild-type Allele Counts in BOS patients.
RNA-seq read counts for the ASXL1 gene in BOS patients were determined. Each read was designated pathogenic if it included the truncating mutation associated with that patient, and wild-type if it did not.

Supplemental Table 11: Significant Differentially Expressed Genes in BOS Fibroblasts Identified through RNA-seq
DESeq2 was used to analyze RNA-seq gene counts from our cohort of BOS and control fibroblast samples. Analyses were adjusted for sex as a covariate, based on principal component analysis. Genes were considered significant if Bonferroni adjusted p < 0.05 and are listed here with their adjusted p value, log2 fold change, and gene annotations.

Supplemental Table 12: Significant Differentially Expressed Genes in BOS Blood Identified through RNA-seq
DESeq2 was used to analyze RNA-seq gene counts from our cohort of BOS and control blood samples. Analyses were adjusted for sex as a covariate, based on principal component analysis. Genes were considered significant if Bonferroni adjusted p < 0.05 and are listed here with their adjusted p value, log2 fold change, and gene annotations.

Supplemental Table 13: Gene Ontology Analysis for BOS Blood RNA-seq
Gene ontology (GO) over-enrichment tests were performed using all significant differentially expressed genes in BOS blood samples from our RNA-seq assay. P-values shown are the probability of seeing at least x number of genes out of the total input genes annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO term. We filtered for gene ontologies with padj < 0.05 for significance.
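The p value described above is the upper tail of a hypergeometric distribution. The following sketch computes it exactly with integer arithmetic. It is a generic illustration of the over-representation test rather than the clusterProfiler code, and the function and parameter names are hypothetical.

```python
from math import comb

def enrichment_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background (whole genome annotation)
    K: background genes annotated to the GO term
    n: input genes (e.g. significant DEGs)
    k: input genes annotated to the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of drawing exactly i annotated genes, for i >= k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Small worked case: 10 background genes, 4 annotated, 3 drawn, at least 2 annotated.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = (36 + 4) / 120 = 1/3
p_small = enrichment_pvalue(k=2, n=3, K=4, N=10)
print(p_small)
```

The resulting per-term p values would then be adjusted for multiple testing across GO terms, with Benjamini-Hochberg in this study, before applying the padj < 0.05 filter.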

Supplemental Table 14: Gene Ontology Analysis for BOS Fibroblast RNA-seq
Gene ontology (GO) over-enrichment tests were performed using all significant differentially expressed genes in BOS fibroblast samples from our RNA-seq assay. P-values shown are the probability of seeing at least x number of genes out of the total input genes annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO term. We filtered for gene ontologies with padj < 0.05 for significance.

Supplemental Table 17: Unique Gene List for Differentially Expressed Genes in BOS Fibroblast ATAC-seq
Unique genes identified in BOS fibroblast ATAC-seq analysis are listed here with the corresponding number of unique differentially regulated chromatin peaks per gene at each filtering cutoff.

Supplemental Table 18: Gene Ontology Analysis for BOS Fibroblast ATAC-seq
Gene ontology (GO) over-enrichment tests were performed using all significant differentially expressed genes in BOS fibroblast samples from our ATAC-seq assay. P-values shown are the probability of seeing at least x number of genes out of the total input genes annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO term. We filtered for gene ontologies with padj < 0.05 for significance.

Supplemental Table 19: Motif Enrichment Analysis of ATACseq DEGs in BOS Fibroblast
HOMER (Hypergeometric Optimization of Motif EnRichment) was used to analyze motif enrichment of our BOS ATAC-seq fibroblast data using both known motif and de novo methods.

Supplemental Table 20: Overlapping DEGs Identified through Chromatin Accessibility and Transcriptional Dysregulation in BOS Fibroblasts
Significant differentially expressed genes (DEGs; Bonferroni adjusted p < 0.05) from our fibroblast RNA-seq (Supplemental Table 11) and fibroblast ATAC-seq (Supplemental Table 16) datasets were integrated to identify common dysregulated genes. We identified 25 DEGs common between the two assay types, with 21 of these genes dysregulated in the same direction.
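The integration step amounts to intersecting two gene lists and checking whether the sign of the log2 fold change agrees between assays. A minimal sketch follows, assuming one log2FC value per gene in each dataset (for example, a single transcription start site peak per gene for ATAC-seq). The function name, gene labels, and values are hypothetical placeholders, not study results.

```python
def integrate_degs(rna_lfc, atac_lfc):
    """Return (shared genes, genes changed in the same direction).

    rna_lfc, atac_lfc: dicts mapping gene symbol -> log2 fold change
    for significant genes in each assay.
    """
    shared = sorted(set(rna_lfc) & set(atac_lfc))
    concordant = [
        gene for gene in shared
        if (rna_lfc[gene] > 0) == (atac_lfc[gene] > 0)
    ]
    return shared, concordant

# Hypothetical placeholder genes and fold changes.
rna = {"GENE_A": 1.4, "GENE_B": -0.9, "GENE_C": 2.1, "GENE_D": 0.7}
atac = {"GENE_A": 0.8, "GENE_B": 0.6, "GENE_C": 1.1, "GENE_E": -1.0}
shared_genes, concordant_genes = integrate_degs(rna, atac)
print(len(shared_genes), len(concordant_genes))
```

When a gene has several significant peaks, a rule for choosing or summarizing them is needed before this comparison. That choice is not specified here and is left as an assumption.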

Supplemental Table 21: Significant Differentially Methylated CpG Sites in BOS Blood Identified through DNA methylation
CpG sites differing between BOS patient blood and control blood were considered significant at FDR < 0.05. Using linear regression modeling, the absolute difference between the mean β values of BOS patients and controls was calculated for each CpG to obtain the delta beta (Δβ) value. We filtered for highly differentially methylated sites (|Δβ| > 10%). Significant sites are listed here with their adjusted p value, Δβ, and gene annotations.

Supplemental Table 22: Unique Gene List for Differentially Expressed Genes in BOS Blood DNA methylation
Unique genes identified in BOS blood DNA methylation analysis are listed here with the corresponding number of unique differentially methylated CpG sites per gene at each filtering cutoff.

Supplemental Table 23: Overlapping DEGs between RNA-seq and DNA methylation analysis of BOS blood
Significant differentially expressed genes (DEGs; Bonferroni adjusted p < 0.05) from our blood RNA-seq (Supplemental Table 12) and blood DNA methylation (Supplemental Table 21) datasets were integrated to identify common dysregulated genes. We identified 341 DEGs common between the two assay types.

Supplemental Table 24: Differentially Methylated CpG Sites in BOS Fibroblast Identified through DNA methylation
CpG sites differing between BOS patient and control fibroblasts were included if their nominal p value was below 0.005. Using linear regression modeling, the absolute difference between the mean β values of BOS patients and controls was calculated for each CpG to obtain the delta beta (Δβ) value. We filtered for highly differentially methylated sites (|Δβ| > 10%). Genes are listed here with their nominal p value, Δβ, and gene annotations.

Supplemental Table 25: Gene Ontology Analysis for BOS Blood RNA-seq and DNA Methylation Integration
DEGs shared between our BOS blood RNA-seq (Supplemental Table 12) and BOS blood DNA methylation (Supplemental Table 21) datasets were analyzed using clusterProfiler v3.12.0 to identify gene set enrichments, with all genes from the Gencode hg38 annotation, version 31, as background. We filtered for gene ontologies with padj < 0.05 for significance.