Allele-specific silencing of the gain-of-function mutation in Huntington’s disease using CRISPR/Cas9

Dominant gain-of-function mechanisms in Huntington’s disease (HD) suggest that selective silencing of mutant HTT produces robust therapeutic benefits. Here, capitalizing on exonic protospacer adjacent motif–altering (PAM-altering) SNP (PAS), we developed an allele-specific CRISPR/Cas9 strategy to permanently inactivate mutant HTT through nonsense-mediated decay (NMD). Comprehensive sequence/haplotype analysis identified SNP-generated NGG PAM sites on exons of common HTT haplotypes in HD subjects, revealing a clinically relevant PAS-based mutant-specific CRISPR/Cas9 strategy. Alternative allele of rs363099 (29th exon) eliminates the NGG PAM site on the most frequent normal HTT haplotype in HD, permitting mutant-specific CRISPR/Cas9 therapeutics in a predicted ~20% of HD subjects with European ancestry. Our rs363099-based CRISPR/Cas9 showed perfect allele specificity and good targeting efficiencies in patient-derived cells. Dramatically reduced mutant HTT mRNA and complete loss of mutant protein suggest that our allele-specific CRISPR/Cas9 strategy inactivates mutant HTT through NMD. In addition, GUIDE-Seq analysis and subsequent validation experiments support high levels of on-target gene specificity. Our data demonstrate a significant target population, complete mutant specificity, decent targeting efficiency in patient-derived cells, and minimal off-target effects on protein-coding genes, proving the concept of PAS-based allele-specific NMD-CRISPR/Cas9 and supporting its therapeutic potential in HD.


Introduction
Huntington's disease (HD) is caused by an expanded CAG trinucleotide repeat in the first exon of the huntingtin gene (HTT) (1). CAG repeats greater than 35 generate characteristic motor symptoms whose onset age is inversely correlated with the length of the repeat (2). Overall, the size of the expanded CAG repeat explains approximately 60% of the variance in age at onset in a fully dominant fashion (3), with unexplained variance being associated with various genetic loci (4,5). These indicate that the rate of HD is primarily determined by the size of the CAG repeat and further modified by other genes (5). Although the cause of HD has been known for more than 25 years (1), effective treatments have not been developed yet, potentially due to the complicated underlying disease biology in HD.
Considering dominant inheritance in HD (6), the known disease-causing mutation (1), and an essential role for HTT in development (7)(8)(9), selective inactivation of mutant HTT through CRISPR-Cas9 genome editing may yield robust therapeutic benefits. Since an expanded HTT CAG repeat causes HD, one may advocate therapeutic strategies that directly target the expanded repeat. However, targeting the disease-causing mutation using CRISPR-Cas9 is challenging and not desirable in HD because of 1) the lack of NGG PAM sequence (which is required for the most commonly used Cas9 endonuclease) in the CAG repeat region, 2) potential inactivation of normal HTT, and 3) concomitant targeting of many other genes containing a CAG repeat. Considering that DNA modifications by CRISPR-Cas9 produce permanent changes, the most critical requirement of therapeutic CRISPR-Cas9 strategies for HD is allele specificity. To inactivate only mutant HTT without targeting the disease-causing mutation, we developed an allele-specific CRISPR-Cas9 strategy that selectively targets the HTT haplotype harboring an expanded CAG repeat using PAM sites generated by SNPs (10). Using the PAM-Altering SNP (PAS)-based dual gRNA CRISPR-Cas9 approach, we were able to selectively excise a genomic region including the transcription start site and an expanded CAG repeat from the mutant HTT locus, allowing cells to produce only normal HTT mRNA and protein (10). Our PAS-based haplotype-targeting CRISPR-Cas9 strategy using two gRNAs to prevent the transcription of the mutation-bearing transcript represents a highly flexible approach that can be applied to a gene of interest regardless of the location, size, and type of the disease-causing mutation (10). However, the use of two gRNAs required for genomic excision to prevent the transcription of mutant HTT can increase the possibility of off-targeting.
Here, we developed a complementary CRISPR-Cas9 strategy that uses a single gRNA to selectively inactivate the mutant HTT through nonsense-mediated decay (NMD) (11,12), and subsequently determined its applicability, allele-specificity, molecular consequences, and off-target effects to evaluate its utility in clinical applications.

PAS-based mutant-specific NMD-CRISPR-Cas9 strategies for HD.
Among other requirements, mutant HTT specificity is strongly preferred for DNA-targeting therapeutic strategies for HD as they produce permanent changes. Our complementary allele-selective CRISPR-Cas9 strategy for HD aims at inducing NMD of mutant HTT mRNA (namely NMD-CRISPR-Cas9) by targeting a mutant HTT-specific exonic PAM site that is produced by the alternative allele (Figure 1A) or the reference allele of a given PAS (Figure 1B). All cases of HD are due to an expanded CAG repeat. However, the disease-causing mutation is found on diverse haplotypes (13)(14)(15), and many HD patients carry different combinations of mutant and normal HTT haplotypes (i.e., diplotype) (16). Therefore, key steps in developing mutant-specific NMD-CRISPR-Cas9 strategies for HD are: 1) finding exonic PAS on HTT, 2) mapping exonic PAS-generated PAM sites on HTT haplotypes, and 3) identifying mutant HTT-specific PAM sites in a given diplotype. Aiming at identifying exonic PAS, we analyzed 1000 Genomes Project (KGP) data, and revealed that 91 coding SNPs on HTT alter the PAM sequence for SpCas9 (S. pyogenes Cas9; 5'-NGG-3') (Supplemental Table 1; Supplemental Figure 1). We then mapped the 91 PAS-generated NGG PAM sites on the 8 most frequent HTT haplotypes, which account for more than 80% of HD subjects with European ancestry (Supplemental Figure 2). Among the 91 PAS, 69 and 19 SNPs generate NGG PAM sites on all and none of the 8 common HTT haplotypes, respectively. Therefore, those SNP variants cannot be used for allele-specific CRISPR-Cas9 for HD subjects carrying common diplotypes (Supplemental Figure 2). However, 3 exonic PAS generate PAM sites on some of the common HTT haplotypes (Figure 2A).
For example, the reference alleles of rs1065745 and rs363099 generate NGG PAM sites on all common haplotypes except hap.04 and hap.08, respectively. In contrast, the alternative allele of rs362331 generates NGG PAM sites only on the hap.04 and hap.08 haplotypes (Supplemental Figure 2).
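The logic of a PAS screen can be illustrated with a short sketch. The Python below is not the pipeline used in this study: the function names (`pam_sites`, `classify_snp`) and the toy flanking sequences are hypothetical, and only single-nucleotide substitutions are handled. It reports NGG PAM sites, on either strand, that are present with only one of the two alleles of a SNP.

```python
def pam_sites(seq):
    """Return NGG PAM sites as (strand, start) pairs.

    `start` is the 0-based plus-strand index of the leftmost base of the
    3-nt PAM window. A minus-strand NGG reads as CCN on the plus strand.
    """
    seq = seq.upper()
    sites = set()
    for i in range(len(seq) - 2):
        if seq[i + 1:i + 3] == "GG":   # N-G-G on the plus strand
            sites.add(("+", i))
        if seq[i:i + 2] == "CC":       # C-C-N on the plus strand (minus-strand NGG)
            sites.add(("-", i))
    return sites


def classify_snp(flank5, ref, alt, flank3):
    """Compare PAM sites between the reference and alternative allele.

    Assumes `ref` and `alt` are single bases, so coordinates line up.
    """
    ref_sites = pam_sites(flank5 + ref + flank3)
    alt_sites = pam_sites(flank5 + alt + flank3)
    return {
        "ref_only": sorted(ref_sites - alt_sites),  # PAM lost with the alt allele
        "alt_only": sorted(alt_sites - ref_sites),  # PAM created by the alt allele
    }


# Toy example (hypothetical sequence, not the real rs363099 context)
result = classify_snp("AC", "G", "A", "GT")
is_pas = bool(result["ref_only"] or result["alt_only"])
```

A SNP qualifies as a PAS in this sketch when either list is non-empty; mapping such sites onto haplotypes then only requires knowing which allele each haplotype carries.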
Having identified candidate variants that might permit mutant HTT-specific NMD-CRISPR-Cas9 in HD, we then evaluated the levels of mutant specificity of those SNPs by calculating the percentages of HD subjects who carry the PAM site only on the mutant HTT. Firstly, we calculated the proportion of each diplotype in HD subjects based on our large-scale genome-wide association study data that were analyzed to identify genetic modifiers (4). In agreement with previous studies with small sample sizes (16,17), hap.01 and hap.08 are the most frequent mutant and normal haplotypes, respectively. As a result, a diplotype comprising mutant HTT on hap.01 and normal HTT on hap.08 represents the most frequent diplotype in HD, accounting for more than 8% of HD subjects with European ancestry (Figure 2B). Focusing on rs1065745, rs363099, and rs362331, we then identified diplotypes that carry NGG PAM sites selectively on the mutant HTT. HD subjects with normal HTT on hap.04 and mutant HTT on non-hap.04 carry an rs1065745-generated PAM sequence on the mutant chromosome, accounting for approximately 0.2% of European HD subjects with common diplotypes (Figure 2C, blue). HD subjects carrying mutant HTT on hap.04 or hap.08 and normal HTT on other haplotypes have an rs362331-generated NGG PAM site on the disease chromosome, accounting for approximately 1.3% of HD (Figure 2C, green). In contrast to low levels of mutant specificity for rs1065745 and rs362331, rs363099 showed a significantly higher mutant specificity. The alternative allele of rs363099 eliminates the NGG PAM site on the most common normal HTT haplotype (i.e., hap.08), and therefore, HD subjects with mutant HTT on non-hap.08 and normal HTT on hap.08 carry an NGG PAM site only on the mutant HTT (Figure 2C, red), accounting for approximately 20% of European HD subjects with common diplotypes (Figure 2B).
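The eligibility calculation described above amounts to summing the frequencies of diplotypes in which the mutant haplotype carries the PAM site and the normal haplotype does not. The following is a minimal sketch under stated assumptions: the function name and the diplotype frequencies are hypothetical toy values, not the frequencies estimated in this study.

```python
def mutant_specific_fraction(diplotype_freq, pam_haplotypes):
    """Fraction of subjects with the PAM site only on the mutant chromosome.

    diplotype_freq: {(mutant_haplotype, normal_haplotype): frequency}
    pam_haplotypes: set of haplotypes that carry the NGG PAM site
    """
    return sum(
        freq
        for (mutant_hap, normal_hap), freq in diplotype_freq.items()
        if mutant_hap in pam_haplotypes and normal_hap not in pam_haplotypes
    )


# Toy frequencies (hypothetical, for illustration only)
toy_freq = {
    ("hap.01", "hap.08"): 0.08,
    ("hap.01", "hap.01"): 0.05,
    ("hap.08", "hap.01"): 0.02,
    ("hap.02", "hap.08"): 0.03,
}
# For an rs363099-like PAS, every listed haplotype except hap.08 carries the PAM.
toy_pam_haplotypes = {"hap.01", "hap.02"}
eligible = mutant_specific_fraction(toy_freq, toy_pam_haplotypes)
```

Only the two diplotypes with normal HTT on hap.08 and mutant HTT elsewhere contribute in this toy example; a diplotype with the PAM on both chromosomes, or only on the normal one, is excluded.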
When considering 16 common HTT haplotypes, which account for more than 90% of the mutant chromosomes (16) approaches to prevent the transcription of the mutant HTT (10,19). In addition, pre-clinical efficacy of permanent suppression of mutant HTT expression in a knock-in mouse model (20), and the highest mutant specificity of rs363099 in HD subjects (this study) make a CRISPR-Cas9 strategy based on this variant relevant and significant. Therefore, we set out to determine the editing efficiency, allele specificity, and molecular consequences of a single gRNA CRISPR-Cas9 strategy utilizing the PAM site generated by rs363099. We reasoned that the levels of allele specificity would be independent of cell-types because DNA sequence provides the basis for allele specificity in our CRISPR-Cas9 strategy. Therefore, we used readily available iPSC (induced pluripotent stem cell) lines derived from HD patients carrying mutant hap.01 and normal hap.08 as a representative diplotype for subsequent molecular analyses focusing on evaluating the levels of allele specificity. As summarized in Figure 3A, our allele-specific CRISPR-Cas9 strategy was designed to selectively target the mutant HTT haplotype using the NGG PAM site on the minus strand of the disease chromosome to induce NMD of mutant HTT mRNA. The lack of PAM site at the same location on the normal chromosome is predicted to prevent the CRISPR-Cas9 genome editing in normal allele, leaving normal HTT expression intact ( Figure 3A). Indeed, our rs363099-based CRISPR-Cas9 strategy showed a perfect allele specificity in iPSCs lines derived from heterozygous HD subjects with adult onset CAG repeats, generating small indels selectively on the mutant HTT without modifying its normal counterpart (  Table 2B).
Consistent with a previous report (21), our CRISPR-Cas9 strategy predominantly produced out-of-frame indels: zero and a small number of in-frame indels were observed in the mutant alleles of iPSC-A and iPSC-B, respectively (Supplemental Table 2; Table 1). We do not think the difference in the levels of in-frame modification between iPSC-A and iPSC-B was due to the difference in CAG sizes. Rather, this might be due to higher editing efficiencies in iPSC-B, generating more diverse genome modifications at the target site.
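The distinction between in-frame and out-of-frame indels follows from the triplet reading frame: an indel shifts the frame unless its net length change is a multiple of 3. A minimal sketch is given below; the function names and read counts are hypothetical, and reads with no net length change are simply treated as unedited, which is a simplifying assumption.

```python
from collections import Counter


def frame_class(net_length_change):
    """Classify an indel by its net length change in bp (insertion > 0, deletion < 0)."""
    if net_length_change == 0:
        raise ValueError("a net change of 0 bp is not an indel")
    return "in-frame" if net_length_change % 3 == 0 else "out-of-frame"


def summarize_indels(read_counts):
    """Return the fraction of edited reads that are in-frame vs. out-of-frame.

    read_counts: {net_length_change: number_of_reads}; key 0 (unedited) is ignored.
    """
    totals = Counter()
    for change, n_reads in read_counts.items():
        if change != 0:
            totals[frame_class(change)] += n_reads
    edited = sum(totals.values())
    if edited == 0:
        return {}
    return {k: totals[k] / edited for k in ("in-frame", "out-of-frame")}


# Toy read counts (hypothetical): 900 unedited, 100 edited reads
summary = summarize_indels({0: 900, -1: 50, 1: 30, -3: 20})
```

In this toy input, the 1-bp deletions and insertions are out-of-frame and the 3-bp deletions are in-frame.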
Complete allele-specificity, good targeting efficiencies, and significant reduction in total HTT protein levels were also observed in HD patient-derived neural precursor cells (Supplemental Figure 3) and other cell-types (data not shown). Cell type-independent allele specificity is quite expected because DNA sequence provides the basis for allele discrimination in our strategy. In contrast to our primary allele-specific targeting strategy based on a PAS, we observed inactivation of normal HTT when rs363099 was targeted as part of gRNA hybridization (Supplemental Figure 4), supporting a better allele specificity for a PAS-based CRISPR-Cas9 approach at this location. Together, high levels of mutant specificity and decent editing efficiency of our rs363099-based NMD-CRISPR-Cas9 strategy observed in our transfection experiments (without selection) suggest that single SNP-based haplotype-targeting CRISPR-Cas9 approaches can selectively inactivate the disease allele without directly targeting the mutation itself.
Complete ablation of mutant HTT protein expression by allele-specific NMD-CRISPR-Cas9.
Next, we developed targeted clonal lines to unequivocally determine the immediate downstream consequences of our rs363099-based mutant-specific NMD-CRISPR-Cas9. We chose an iPSC with a juvenile onset CAG repeat (72 CAGs) in order to distinguish mutant HTT protein from its normal counterpart in immunoblot assays. Three independent clonal cell lines were established (Figure 3B); subsequent sequencing analysis confirmed premature stop codons in the 29th, 30th, and 29th exons in the iPSC-C1, iPSC-C2, and iPSC-C3 lines, respectively, predicting selective NMD of mutant HTT mRNA. Consistent with our predictions, targeted HD single cell clones showed both expanded and normal CAG repeats in DNA (Figure 3C). However, mutant HTT RT-PCR assays (Figure 3D) and MiSeq analysis of cDNA (Supplemental Figure 5) represent newly synthesized mutant HTT mRNA that has not been subjected to NMD yet.

Molecular consequences of mutant HTT-specific NMD-CRISPR-Cas9.
Having confirmed high levels of allele specificity, we characterized the molecular outcomes of our mutant-specific CRISPR-Cas9 strategy using two iPSC lines carrying adult onset CAG repeats (iPSC-A and iPSC-B carrying 42 and 46 CAGs, respectively). We performed RNAseq analysis of targeted single cell clones in order to minimize noise. Multiple independent clonal lines for experimental (Supplemental Table 3; n=12) and control group (Supplemental Table 4; n=12) were developed using our rs363099-based CRISPR-Cas9 strategy. Out-of-frame indels in targeted clones were confirmed by Sanger sequencing and MiSeq analysis of genomic DNA and further validated by MiSeq analysis of cDNA (Supplemental Table 3). Similar to targeted clonal lines with a juvenile onset CAG repeat ( Figure 3D; Supplemental Figure 5), we detected low levels of mutant HTT mRNA (Supplemental Table 3) in the absence of mutant protein (data not shown) in our targeted clones for RNAseq analysis. These data suggest that targeted clonal lines continuously produce both mutant and normal HTT mRNA, but most of the mutant HTT mRNA is quickly degraded before producing mutant HTT protein. We then performed RNAseq analysis to determine the molecular consequences of our mutant HTT-specific CRISPR-Cas9 strategy based on rs363099. Firstly, allele-specific expression (ASE) analysis of HTT focusing on sequence reads containing heterozygous coding SNPs (10 SNPs for hap.01/hap.08 diplotype including rs363099; Figure 4A) did not reveal any significant differences between empty vector controls and targeted clones regarding alleles on normal HTT ( Figure 4B; Supplemental Figure 9A). However, all 10 heterozygous SNP sites showed significantly (multiple test corrected p-value < 0.05) decreased mutant HTT levels in targeted clones ( Figure 4C; Supplemental Figure 9B). 
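The allele-specific expression comparison can be sketched in a few lines. This is an illustration under stated assumptions rather than the analysis code of this study: the function names, read counts, and p-values are hypothetical, and Bonferroni adjustment is assumed as the multiple-test correction because it is the method named in the figure legend.

```python
def mutant_allele_fraction(mutant_reads, normal_reads):
    """Fraction of reads at a heterozygous SNP that carry the mutant-haplotype allele."""
    total = mutant_reads + normal_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return mutant_reads / total


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Toy data (hypothetical): one SNP in a control clone and in a targeted clone
control_fraction = mutant_allele_fraction(mutant_reads=48, normal_reads=52)
targeted_fraction = mutant_allele_fraction(mutant_reads=10, normal_reads=90)

# Toy per-SNP p-values (hypothetical), adjusted for the number of SNP sites tested
adjusted = bonferroni([0.001, 0.02, 0.5])
significant = [p < 0.05 for p in adjusted]
```

A drop in the mutant allele fraction at every heterozygous site, with normal-allele levels unchanged, is the pattern expected if only the mutant transcript is degraded.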
Secondly, we performed genome-wide differential gene expression (DGE) analysis to identify genes significantly altered in targeted HD clones compared to EV-treated HD cells. Interestingly, the shape of the volcano plot was atypical ( Figure 4D), rather resembling that of random sample comparisons (Supplemental Figure 10). These findings indicate that expression levels of genes in targeted clones were mostly unchanged except for HTT ( Figure 4D). Since HTT was the only significantly altered gene in targeted clonal lines ( Figure 4D; a black arrow), these data also imply that the probability of recurring frameshift mutations that can alter gene expression of other protein-coding genes may be low in our CRISPR-Cas9 strategy utilizing the PAM site generated by rs363099.
Analysis of genome-wide off-target effects.
Lastly, we evaluated the off-target effects of our rs363099-based mutant HTT-specific CRISPR-Cas9 strategy. We performed the genome-wide unbiased identification of double strand breaks enabled by sequencing (GUIDE-seq) assay to identify potential off-target sites. To increase the sensitivity of off-target site detection, we used HEK293T cells, which generally exhibit high transfection efficiency and robust CRISPR-Cas9 editing. Transfection of gRNA for our NMD-CRISPR-Cas9 strategy and Cas9 without Puromycin selection resulted in approximately 60-70% on-target editing efficiency in HEK293T cells, and subsequent GUIDE-seq analysis revealed 6 potential off-target sites (Supplemental Figure 11A). Three sites located in the intergenic regions (off-target #1, #2, and #3) showed modest levels of modification compared to the levels of on-target editing in HEK293T cells (Supplemental Figure 11A). The other 3 potential off-targets located in an intron (off-target #4) or intergenic regions (off-target #5 and #6) showed low levels of genome editing in HEK293T cells (Supplemental Figure 11A). To validate those 6 potential off-target sites in HD cells, we analyzed patient-derived iPSC lines treated with empty vector or our test gRNA. MiSeq analysis of representative HD iPSC samples showed relatively modest genome editing at off-target sites #1, #2, and #3 (1-6%) and no modification at sites #4, #5, or #6 (Supplemental Figure 11B).
Although off-target editing was detected at 3 sites in the patient-derived cells, off-target genome editing at those sites may not result in functional changes because of their locations relative to genes. Our RNAseq data also showed that the expression levels of genes flanked by or harboring potential off-target sites were not altered in our targeted clones (Supplemental Table 5), suggesting that 1) targeted clonal lines were not edited at off-target sites repeatedly and/or 2) genome editing at the 6 potential off-target sites does not alter the expression of protein-coding genes.
To further evaluate off-target effects, we used an in silico off-target prediction algorithm (Cas-OFFinder). A total of 83 sites were predicted for our primary test gRNA based on rs363099 (Supplemental Table 6); none of the predicted sites had a perfect match. The site with the highest prediction score contains 1 mismatch (chr12: 92165291-92165313; located at an intergenic region), but we did not detect any modification at this location in iPSC-C cells (Supplemental Table 7). Moreover, only one predicted off-target site is located in an exon (PLXNA1; containing 2 mismatches and 1 bulge) (Supplemental Table 6). The predicted off-target sites were mapped to or are flanked by 76 unique genes. Among them, 53 genes were expressed in our control HD iPSC clones, permitting evaluation of expression levels of predicted off-targets in our RNAseq data.
Notably, genes harboring or located near the predicted off-target sites were neither significantly altered individually ( Figure 4D; red circles) nor enriched as a group in our RNAseq data ( Figure 4E). Together these data indicate that our PAS-based CRISPR-Cas9 strategy utilizing rs363099 is unlikely to alter expression levels of protein-coding genes harboring or flanking off-target sites identified by GUIDE-seq or predicted by a prediction algorithm.
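The idea behind in silico off-target prediction can be illustrated with a simplified mismatch scan. The sketch below is not Cas-OFFinder and does not reproduce its scoring: it allows mismatches only (no bulges), requires an NGG PAM, and uses a hypothetical guide and toy sequence; the function names are likewise assumptions made for illustration.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(sequence, guide, max_mismatches):
    """Find protospacers followed by NGG with at most `max_mismatches` mismatches.

    Returns (strand, start, mismatches); `start` is 0-based in the orientation
    of the scanned strand (for "-", it indexes the reverse complement).
    """
    guide = guide.upper()
    n = len(guide)
    hits = []
    for strand, seq in (("+", sequence.upper()), ("-", revcomp(sequence))):
        for i in range(len(seq) - n - 2):
            pam = seq[i + n:i + n + 3]
            if pam[1:] != "GG":
                continue
            mm = count_mismatches(seq[i:i + n], guide)
            if mm <= max_mismatches:
                hits.append((strand, i, mm))
    return hits


# Toy example (hypothetical guide and sequences)
toy_guide = "GATTACAGATTACAGATTAC"
perfect_site = "CA" + toy_guide + "TGG" + "AT"
one_mismatch_site = "CA" + "GATTACAGATAACAGATTAC" + "AGG"
hits_perfect = find_candidate_sites(perfect_site, toy_guide, max_mismatches=3)
hits_mismatch = find_candidate_sites(one_mismatch_site, toy_guide, max_mismatches=3)
```

Candidate sites found this way are predictions only; as in the analysis above, they still need to be checked experimentally, for example by sequencing the site in edited cells.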

Discussion
Causative mutations of Mendelian disorders are highly sought-after because subsequent revelation of underlying disease mechanisms has been thought to lead to cures. However, detailed mechanistic studies rarely have produced effective therapeutics. For example, in several dominant disorders (e.g., some forms of amyotrophic lateral sclerosis, Alzheimer's disease, and HD), the disease-producing genetic defects have been known for more than 20 years (22)(23)(24)(25)(26), but no effective intervention has yet been developed. This suggests that defining drug targets in genetic diseases through mechanism-focused studies is challenging. Nevertheless, with the evolution of various gene targeting approaches, the disease-causing gene itself is recognized as the best therapeutic target even without a full understanding of its biological functions.
Various gene targeting technologies have been developed. Gene knockdown or knockout approaches can be broadly grouped based on the target. RNA interference (RNAi) and antisense oligonucleotide (ASO) interact with RNA to produce reversible knockdown of the target, offering a versatile means of gene targeting.
However, these approaches may show high levels of off-target effects (27)(28)(29)(30)(31) and require repeated treatments (32)(33)(34). In contrast, zinc finger and CRISPR-Cas approaches aim at producing changes in the target DNA to produce irreversible knockout effects. Although they provide overall high levels of on-target gene specificity, delivery to target tissue is a major challenge to overcome to apply these powerful tools in humans (27).
Furthermore, a phase 1-2a trial showed dose-dependent reduction of mutant HTT protein by non-allele-specific ASO in humans (41), supporting the feasibility of gene knockdown approaches. Although promising, non-allele-specific RNA-lowering approaches have limitations including difficulty in adequately maintaining the levels of mutant HTT mRNA. Importantly, despite therapeutic efficacies of non-allele-specific HTT lowering approaches in preclinical studies, a phase 3 trial to test a non-allele-specific ASO in HD showed a lack of clinical benefit (42,43). Considering these, alternative allele-specific DNA targeting approaches may produce robust therapeutic benefits (32,44-48) because they may overcome limitations of non-allele-specific RNA targeting approaches (27).
Targeted disruption of Htt causes embryo lethality (7)(8)(9) and other deficits in mice (49)(50)(51). Also, insufficient HTT levels due to compound heterozygous mutations are associated with developmental problems in humans (52,53). However, one copy of Htt (i.e., heterozygous knockout) is sufficient to support the survival of mice (7,9), and individuals with one functional copy of HTT do not present HD symptoms or developmental problems (52)(53)(54). Given HD is caused by a dominant gain-of-function mutation (24), these observations suggest that selective inactivation of mutant HTT gene may produce significant clinical benefits without side effects. Considering genome engineering produces permanent changes, high levels of allele specificity are strongly preferred for any therapeutic DNA-targeting strategy for HD. We thus focused on developing allele-specific CRISPR-Cas9 strategies capitalizing on genetic variations that generate or eliminate a PAM site. We conceived two PAS-based allele-specific CRISPR-Cas9 strategies to selectively inactivate the mutant HTT in a given HD subject. Previously, using two gRNAs, we simultaneously targeted two mutant-specific PAM sites that encompass the transcription start site and an expanded CAG repeat of the mutant HTT to prevent the transcription of the mutant allele by genomic deletion (namely, Transcription Prevention-CRISPR-Cas9) (10).
Similar approaches have been tested in patient-derived fibroblasts and mouse models of HD (19,20).  (55). These observations support the value of our allele-specific DNA targeting strategy as the means for early treatment because it may be applied to pre-symptomatic mutation carriers without producing significant adverse effects.
Since the difference in DNA sequence between mutant and normal HTT in a given HD subject serves as the basis for our personalized mutant-specific NMD-CRISPR-Cas9 strategy, the proportion of HD subjects who are eligible for a CRISPR-Cas9 strategy based on a particular PAS is determined by the frequency of the PAM-generating allele of the target variant on the mutant HTT. This study tested the concept of mutant-specific NMD-CRISPR-Cas9 based on a PAS rs363099, which can be applied to approximately 20% of HD subjects with European ancestry. In contrast, a CRISPR-Cas9 strategy directly targeting the CAG repeat may be appealing because it can be applied to 100% of HD subjects. However, such a strategy is technically challenging and may generate adverse outcomes due to 1) the lack of a robust PAM site for the commonly used SpCas9, 2) decreased allele selectivity, and 3) lower on-target gene specificity. Although not applicable to all individuals with the target disease, personalized medicine is an important and relevant direction of health care because it may provide increased safety and efficacy on an individual level. In light of this view, our PAS-based CRISPR-Cas9 represents one of the personalized HD therapeutic strategies that permit high levels of efficacy and safety owing to high levels of mutant HTT specificity. We reason that identification of DNA variants that are compatible with diverse engineered Cas9 variants with different PAM specificities (56-62) will significantly increase the applicability of PAS-based mutant-specific NMD-CRISPR-Cas strategies for HD.
The identification of a PAS (i.e., rs363099) that has never been targeted in HD by CRISPR-Cas9 represents a novel discovery. In addition, our data demonstrating good targeting efficiency, high levels of allele-specificity, and minimal impacts on other protein-coding genes in patient-derived iPSC lines support the clinical relevance of the mutant-specific CRISPR-Cas9 strategy using rs363099. We also evaluated the feasibility of our allele-specific CRISPR-Cas9 strategy in neurons using differentiated neuronal cells from iPSC- Due to the lack of reproducible and relevant phenotypes in HD patient-derived iPSCs (70,71) and iPSC-derived neurons, functional assessments of our rs363099-based mutant-specific NMD-CRISPR strategy in these cell types were impractical. For example, differences in neuronal induction (72)(73)(74), levels of nestin (75,76), action potential (71,72,77), HTT protein aggregates (71,72,78), and CAG repeat instability (71,72,79,80) in HD neurons were inconsistent. Also, it was not feasible to determine targeting efficiencies and impacts on behavioral phenotypes of our strategy in vivo because mouse models that specifically permit the evaluation of our rs363099-based NMD-CRISPR-Cas9 do not exist. The Hu97/18 mouse model is heterozygous at rs363099 (81), and therefore appears to be suitable for testing our mutant-specific NMD- Although a non-allele-specific HTT-lowering ASO did not generate significant benefits in the first phase 3 trial, strengths and advantages of such approaches remain important. Identification of improved targets and optimized clinical trial designs may prove the clinical efficacies of non-allele-specific lowering approaches in HD. Alternatively, numerous PAS on the human genome permit widespread application of allele-specific CRISPR-Cas9 (84). 
Our PAS-based haplotype-targeting approach will be especially powerful when dealing with single gene disorders caused by dominant gain-of-function mutations because it can inactivate the disease-causing mutation regardless of its location, type, or size (10). Areas of CRISPR-Cas9 application are broadening beyond labs quickly. In our study focused on developing allele-specific DNA targeting strategies using patient-derived iPSC as a model system, the levels of gene editing were somewhat modest. In support, challenges in genome editing in iPSC have been observed in the field (85)(86)(87). Despite modest editing in iPSC models, allele-specific CRISPR-Cas9 strategies may generate significant clinical benefits when combined with efficient delivery methods as supported by robust genome editing in post-mitotic neurons (66-68) and significant benefits in mouse models of HD (20,88). Of note, our data showing on-target gene specificity, mutant allele selectivity, and robust outcomes on the mutant allele addressed critical requirements for safe and effective CRISPR-Cas9 therapeutics. Still, our study was not able to determine the impacts of an allele-specific NMD-CRISPR-Cas9 strategy in mice due to the lack of appropriate pre-clinical models for HD with correct genomic context, representing major limitations and weakness of our current data set. In order to advance our alternative therapeutic strategy, it will be critically important to demonstrate efficacy and safety in the appropriate pre-clinical models of HD. Nevertheless, our PAS-based haplotype-targeting CRISPR-Cas9 strategy targeting the root cause of the disease selectively and permanently may overcome key limitations of other gene lowering approaches and therefore has the potential of being tested in CRISPR-Cas9 intervention trials for HD and others.

Methods
Identification of exonic PAM-Altering SNPs on HTT to design allele-specific CRISPR-Cas9 strategies for HD.
Detailed methods for identifying PAM-Altering SNPs (PAS) on HTT were described previously (10). In this study, we focused on revealing PAS on coding exons of HTT (RefSeq, NM_002111) whose reference or alternative alleles generate or eliminate the PAM sequence for wild-type SpCas9 (i.e., NGG). For this, we analyzed the 1000 Genomes Project (KGP) data (phase 3), identifying 157 PAS residing on the exons of HTT.
Subsequently, we excluded PAS on untranslated regions, revealing 91 PAS on the coding sequence (CDS) of HTT that may allow mutant HTT-specific nonsense-mediated decay through CRISPR-Cas9 (Supplemental Table 1
Mapping PAS-generated NGG PAM sites on the common HTT haplotypes.
Aiming at developing mutant-specific CRISPR-Cas9 strategies that do not directly target the expanded CAG repeat but rather the mutant HTT haplotype carrying an expanded CAG repeat, we identified PAS-generated NGG PAM sites on common HTT haplotypes in HD using phased KGP genotype data. We determined the HTT haplotype of each chromosome in the phased KGP data set using our haplotype definitions (14), focusing on the 8 common haplotypes that together account for more than 80% of the mutant chromosomes in HD subjects with European ancestry (14,17,89 Figure 2). We then performed a pairwise comparison to reveal polymorphic or haplotype-specific PAS-generated PAM sites.
respectively. iPSC-A and iPSC-B were mainly used to determine allele specificities and molecular outcomes of our mutant-specific CRISPR-Cas9 strategy. To determine the impacts of allele-specific CRISPR-Cas9 on HTT
In order to determine the allele-specificity of our CRISPR-Cas9 strategy using rs363099, iPSC lines derived from 2 independent HD subjects carrying hap.01 / hap.08 diplotype were transfected with plasmids for SpCas9 and our primary test gRNA, followed by MiSeq analysis of bulk DNA without selection. We averaged 3 independent CRISPR-Cas9 transfection-MiSeq analysis data for each iPSC line to summarize overall targeting efficiency and allele specificity. Alleles of MiSeq reads were based on the allele at rs363099. Percentage values are relative to all mutant or all normal allele counts. # represent the percentage sequence reads with inframe indels.
Each boxplot shows the maximum, upper quarter, median, lower quarter, and minimum based on n=12 independent clones/group.
Circles represent outliers defined by a standard interquartile outlier detection method. Student t-test (2-tailed) was performed separately to determine the statistical significance of each site.
None of them were significant by a Bonferroni multiple test corrected p-value. NS, not significant (multiple test corrected p-value > 0.05).