Resource and Technical Advance | AIDS/HIV | Therapeutics | Free access | 10.1172/jci.insight.130153
1Ragon Institute of MGH, MIT and Harvard, Cambridge, Massachusetts, USA.
2Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
3AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, USA.
4Vaccine Research Center, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, Maryland, USA.
5Department of Microbiology, Immunology and Tropical Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, District of Columbia, USA.
6Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, Maryland, USA.
7Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA.
Address correspondence to: Boris Julg, Ragon Institute of MGH, MIT and Harvard, 400 Technology Square, Cambridge, Massachusetts 02139, USA. Phone: 857.268.7088; Email: bjulg@mgh.harvard.edu.
Authors: Yu W, Su D, Torabi J, Fennessey C, Shiakolas A, Lynch R, Chun T, Doria-Rose N, Alter G, Seaman M, Keele B, Lauffenburger D, Julg B.
Published September 5, 2019
Broadly neutralizing antibodies (bNAbs) against HIV-1 are under evaluation for both prevention and therapy. HIV-1 sequence diversity observed in most HIV-infected individuals and archived variations in critical bNAb epitopes present a major challenge for the clinical application of bNAbs, as preexisting resistant viral strains can emerge, resulting in bNAb failure to control HIV. To identify viral resistance in patients prior to antibody therapy and to guide the selection of effective bNAb combination regimens, we developed what we believe to be a novel Bayesian machine-learning model that uses HIV-1 envelope protein sequences and, most importantly, approximated glycan occupancy information as variables to quantitatively predict the half-maximal inhibitory concentrations (IC50) of 126 neutralizing antibodies against a variety of cross-clade viruses. We then applied this model to peripheral blood mononuclear cell–derived proviral Env sequences from 25 HIV-1–infected individuals, mapping the landscape of neutralization resistance within each individual’s reservoir, and determined the predicted ideal bNAb combination to achieve 100% neutralization at IC50 values <1 μg/ml. Furthermore, predicted cellular viral reservoir neutralization signatures of individuals before an analytical antiretroviral treatment interruption were consistent with the measured neutralization susceptibilities of the respective plasma rebound viruses, validating our model as a potentially novel tool to facilitate the advancement of bNAbs into the clinic.
HIV-1, the cause of the global AIDS epidemic, continues to be a major public health concern. While the development of antiretroviral therapy (ART) has revolutionized HIV care, it is now clear that life-long ART neither eliminates HIV nor completely reconstitutes the immune system; in addition, it can have deleterious long-term side effects (1, 2). Efforts to eradicate HIV infection, or at least induce states of viral control and drug-free remission, are therefore needed. The recent identification of highly potent and broadly neutralizing antibodies (bNAbs) against HIV-1 has led to a renaissance of antibodies as therapeutic and prophylactic tools. Initial preclinical and clinical studies with passive administration of bNAbs have shown great promise that these monoclonals could play a revolutionary role in HIV care (3–9). While these antibodies are able to temporarily reduce plasma viral loads in viremic HIV-1–infected individuals (3, 5, 10) or delay viral rebound once ART is stopped, recent studies showed that neutralization-resistant viral strains were often selected and emerged, ultimately resulting in virological failure of bNAb treatment (11–13). Single bNAb administration therefore exerted strong selective pressure without eliminating the HIV-1 latent reservoirs, owing to preexisting viral epitope variants with resistance to the selected bNAbs. Identifying viral resistance in patients prior to antibody treatment is therefore critical to rationally design bNAb combination regimens while taking advantage of the increasing number of available bNAbs, thereby avoiding the risk of treatment failure or development of novel resistance mutations.
So far, quantifying archived viral susceptibility to bNAbs prior to antibody treatment depends on inducing viral outgrowth from stimulated resting CD4+ T cells and measuring neutralization sensitivity of these viruses in standard TZM-bl assays (12). This approach, however, is laborious, time-consuming, and expensive, and it fails to capture the whole spectrum of minor viral variants that could emerge, as not every replication-competent provirus can and will be induced for outgrowth under in vitro conditions (14). Importantly, the median frequency of cells carrying intact proviruses is more than 60-fold higher than what is detected by the standard quantitative viral outgrowth assay (QVOA) (15). Alternative approaches that more comprehensively characterize the landscape of neutralization-susceptible proviruses or even identify novel or unexpected mutations that could interfere with antibody recognition are therefore needed.
In the HIV treatment field, genotyping of plasma virus and cellular proviral DNA (GenoSure Archive) for the detection of antiretroviral drug resistance is routinely performed in clinical practice prior to ART initiation in order to choose optimal antiretroviral drug regimens. This viral sequence-based approach uses an extensive database of matched HIV phenotype and genotype samples to predict viral resistance to antiretroviral drugs (Monogram Biosciences, http://www.monogrambio.com). Mimicking this approach, we developed what we believe to be a novel computational model for the prediction of viral neutralization sensitivity based on HIV-1 envelope (Env) protein sequences. When combined with single-genome sequencing of proviral DNA from peripheral blood mononuclear cells (PBMCs), it allows the rapid and accurate mapping of the HIV-1 reservoir for preexisting resistance to current anti-HIV antibodies, with the capacity to be readily adapted to future bNAbs, therefore facilitating the selection of an effective bNAb regimen.
The HIV-1 Env trimer is coated by a glycan shield consisting of approximately 90 N-linked oligosaccharides, which represent half the mass of Env and play a crucial role in bNAb-epitope interactions by altering the antibody accessibility of the underlying protein (16, 17). High-resolution imaging approaches have revealed that bNAb and glycan interactions are common across all bNAb classes, and previous analyses mapping antibody binding and neutralization activity found that both depended not only on sequence, but also on glycan occupancy (18, 19). Computational approaches for prediction of neutralization sensitivity based on Env sequences have been reported previously (20–22). Information on glycans and glycan occupancy, however, has not been included in these models, which therefore exclude a critical component. Our previous work characterized site-specific N-linked glycan occupancy profiles by conducting a mass spectrometric N-linked glycoproteomic analysis from a set of 149 selected Env proteins, including monomer gp120s and trimers (19). A total of 2,891 potential N-linked glycosylation sites (PNGSs) were measured among those proteins, and distinct sequence usage signatures surrounding PNGSs were associated with different degrees of glycan occupancy (19), allowing us to estimate glycan occupancy based on protein sequence composition.
In this report, we describe a computational framework that uses both HIV Env protein sequence and glycan occupancy as variables to quantitatively predict the half-maximal inhibitory concentrations (IC50) for 126 neutralizing antibodies against a variety of cross-clade viruses. We then applied this model to proviral Env sequences from 25 HIV-1–infected individuals, mapping the landscape of neutralization resistance within each individual’s reservoir against bNAbs that are currently under clinical development. Using this information, we determined the predicted ideal bNAb combination to achieve 100% neutralization at IC50 values <1 μg/ml for each individual. Finally, we applied the prediction model to map the viral reservoir of individuals before an analytical ART interruption (11) and correlated the reservoir signatures with the neutralization susceptibilities of the rebound plasma viruses.
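The combination-selection step can be illustrated with a minimal sketch. It assumes, purely for illustration, that a proviral sequence counts as neutralized by a combination when at least one antibody in it has a predicted IC50 below the 1 μg/ml threshold; the text does not state the exact rule used. The function names and the example values are hypothetical.

```python
from itertools import combinations

def coverage(pred_ic50, combo, threshold=1.0):
    """Fraction of sequences with predicted IC50 < threshold for at least
    one antibody in combo (assumed coverage rule, not taken from the paper)."""
    covered = sum(
        1 for seq_pred in pred_ic50.values()
        if any(seq_pred[ab] < threshold for ab in combo)
    )
    return covered / len(pred_ic50)

def smallest_full_combos(pred_ic50, antibodies, threshold=1.0):
    """Return all combinations of the smallest size that cover 100% of the
    sequences, or an empty list if no combination does."""
    for size in range(1, len(antibodies) + 1):
        hits = [c for c in combinations(antibodies, size)
                if coverage(pred_ic50, c, threshold) == 1.0]
        if hits:
            return hits
    return []

# Hypothetical predicted IC50 values (ug/ml) for one participant's sequences.
example = {
    "seq1": {"VRC01": 0.4, "PGT121": 30.0, "10E8": 2.0},
    "seq2": {"VRC01": 5.0, "PGT121": 0.05, "10E8": 3.0},
    "seq3": {"VRC01": 0.8, "PGT121": 0.2, "10E8": 0.5},
}
best = smallest_full_combos(example, ["VRC01", "PGT121", "10E8"])
```

In this toy example no single antibody covers all three sequences, so the search moves on to pairs. An exhaustive search of this kind is feasible for the 10 clinical-stage bNAbs considered here, since there are only 1,023 non-empty subsets.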
Combining Env sequence and glycan occupancy to build the neutralization prediction models. As the presence or absence of particular N-linked glycans on the HIV-1 Env protein has a significant effect on antibody-epitope interactions, the computational framework for the predictive model was constructed in 2 steps: (a) estimating the glycan occupancy of PNGSs based on the neighboring sequences and their sequence-derived information and (b) predicting antibody neutralization activities by combining Env sequence information and the estimated glycan occupancy. We initially collected full-length Env protein sequences from the Los Alamos HIV database and from GenBank, paired with all available mAb neutralization sensitivity values retrieved from the Los Alamos “Compile, Analyze and Tally NAb Panels” (CATNAP) database (23), a web platform that comprehensively assembles published neutralization information. To ensure sufficient statistical power, only mAbs with available neutralization values against more than 70 viruses/Env sequences were considered for the model construction. In total, data for 126 mAbs were included in the models with 71–717 available paired Env sequence-neutralization values per mAb. These antibodies represented all 5 currently identified bNAb epitope classes, covering specificities for the V2 apex, V3 loop, CD4-binding site (CD4bs), gp120-gp41 interface, and the membrane proximal external region (MPER) (Supplemental Table 1; supplemental material available online with this article; https://doi.org/10.1172/jci.insight.130153DS1).
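As an illustration of the inclusion criterion, the following sketch keeps only mAbs with neutralization values against more than 70 distinct viruses. The record layout `(mab, virus_id, ic50)` and the function name are assumptions made for this example and do not reflect the actual CATNAP export format; when a virus appears more than once for the same mAb, the sketch simply keeps the last value, which is likewise an assumption.

```python
from collections import defaultdict

MIN_VIRUSES = 70  # the text keeps mAbs tested against more than 70 viruses

def select_mabs(records, min_viruses=MIN_VIRUSES):
    """records: iterable of (mab, virus_id, ic50) tuples (hypothetical layout).
    Returns {mab: {virus_id: ic50}} for mAbs with > min_viruses distinct viruses."""
    by_mab = defaultdict(dict)
    for mab, virus, ic50 in records:
        by_mab[mab][virus] = ic50  # last value wins for repeated pairs
    return {mab: vals for mab, vals in by_mab.items() if len(vals) > min_viruses}

# Toy records: mAb_A has 71 paired values, mAb_B has exactly 70.
records = [("mAb_A", f"virus{i}", 0.5) for i in range(71)]
records += [("mAb_B", f"virus{i}", 2.0) for i in range(70)]
kept = select_mabs(records)
```

With the strict "more than 70" cutoff, the smallest admissible panel has 71 paired values, matching the lower end of the 71–717 range reported above.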
To deconvolute the corresponding glycan occupancy at each PNGS from the collected Env panels, we implemented a supervised machine-learning approach to build a predictive glycan regression model that defined the relationship between the glycan occupancy of a given PNGS and its neighboring sequence compositions. From the site-specific glycan occupancy profiles obtained from a set of 149 selected Env proteins, including monomer gp120s and trimers (19), a high-quality set of 766 PNGSs was selected, where glycan occupancy was widely distributed from 0% to 100%. The sequence-derived features for individual PNGSs were generated from a sequence window between position –30 and 30, where the Asn of the PNGS was at position 0. The protein residue composition and the residue’s hydrophobicity, flexibility, polarity, and β-turn values at each position were then extracted. The predictive glycan model was then constructed by a 2-step process: (a) feature reduction by Elastic-Net regularization and (b) model regression by a support vector regression (SVR) model. The prediction performance of glycan occupancy based on the sequence-derived features was assessed by repeated 10-fold cross-validation (10-fold CV). (The original data set is randomly partitioned into 10 sub–data sets; 9 sub–data sets are used for model training and the remaining 1 is used for model validation, rotating so that each sub–data set serves once for validation. The CV process is then repeated a number of times.) The predictive glycan model showed robust glycan occupancy prediction, with R² = 0.68 (P < 0.001) and area under the curve = 0.96 (Supplemental Figure 1), suggesting that glycan occupancy can be predicted by the sequence signatures.
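The window extraction and the repeated 10-fold partitioning described above can be sketched as follows. The sketch assumes the standard N-X-S/T sequon definition of a PNGS (X being any residue except Pro), pads windows that run past the sequence ends with a gap character, and uses a hypothetical number of repeats, since none of these details are specified in the text. It does not implement the Elastic-Net or SVR steps.

```python
import random

def find_pngs(seq):
    """0-based indices of the Asn in N-X-S/T sequons where X is not Pro."""
    return [i for i in range(len(seq) - 2)
            if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST"]

def window(seq, pos, flank=30, pad="-"):
    """Residues from pos-flank to pos+flank (Asn at the center), padded
    with `pad` where the window extends beyond the sequence (assumption)."""
    return "".join(seq[j] if 0 <= j < len(seq) else pad
                   for j in range(pos - flank, pos + flank + 1))

def repeated_kfold(n_items, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) for every fold of every repeat.
    The number of repeats is a hypothetical choice."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_items))
        rng.shuffle(idx)
        folds = [idx[f::k] for f in range(k)]
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, folds[f]

# Toy sequence: N at index 7 is followed by Pro, so it is not a sequon.
toy = "AANGSAANPTAANAT"
sites = find_pngs(toy)
windows = [window(toy, s) for s in sites]
```

Each 61-character window would then be converted into per-position residue and physicochemical features before feature reduction and regression.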
With this rich protein sequence and predicted glycan occupancy data set, a Bayesian Markov Chain Monte Carlo–SVR (MCMC-SVR) computational framework was used to build the neutralization prediction model (19) for each individual mAb. The models first defined the mAb-binding signatures by down-selecting the sets of key protein residues and PNGSs associated with the mAb neutralization sensitivities, and then IC50 values were estimated based on these determinant features. Overall model performance was consistently robust across all 126 tested mAbs from all 5 epitope-target classes, with an exceptional average correlation (R² = 0.9). To evaluate whether glycan information does indeed contribute to prediction accuracy, we compared a model version that solely used Env sequences with the model that integrated both glycans and sequences. Both models were based on the exact same number of features to ensure a fair comparison. In general, the integrated model greatly improved the prediction resolution for most mAbs, as determined by goodness of fit by R² value (Figure 1). In particular, the inclusion of glycan features into the model increased the neutralization prediction accuracy for certain mAbs by up to 30%, namely PGDM1400, PGT121, PGT128, PGT135, CH01, VRC38.01, and IOMA. As expected, neutralization prediction for bNAbs against the V2 apex and the V3 loop benefited the most from inclusion of the glycan features, followed by CD4bs and gp120-gp41 interface mAbs. In contrast, the glycan features showed less effect on the neutralization prediction of mAbs targeting MPER (Figure 1). These observations were consistent with the molecular understanding of mAb-epitope interaction. For instance, nearly all mAbs specific for the V2 apex and V3 loop directly target the glycans within their epitopes, whereas antibody-vulnerable glycan sites are unlikely to be recognized by the MPER mAbs, such as 2F5 and 10E8.
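The comparison between the sequence-only and the integrated model can be illustrated with a goodness-of-fit calculation on log2(IC50) values. The sketch below assumes that R² denotes the coefficient of determination, 1 − SS_res/SS_tot; the text does not state which definition was used, and the measured and predicted values are invented for illustration only.

```python
import math

def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured IC50 values (ug/ml), compared on a log2 scale.
measured = [math.log2(v) for v in [0.05, 0.5, 2.0, 25.0]]

# Hypothetical predictions: the sequence-only model misses by larger margins.
pred_seq_only = [m + d for m, d in zip(measured, [1.5, -1.0, 1.0, -1.5])]
pred_integrated = [m + d for m, d in zip(measured, [0.3, -0.2, 0.2, -0.3])]

r2_seq_only = r_squared(measured, pred_seq_only)
r2_integrated = r_squared(measured, pred_integrated)
```

Holding the number of features equal in both models, as described above, ensures that a difference in such a score reflects the information content of the glycan features and not model size.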
Figure 1. Prediction model performance for selected mAb neutralization sensitivities (log2 IC50) representing relevant envelope broadly neutralizing antibody target sites (V2 apex, V3 loop, CD4-binding site, gp120/gp41 interface, and MPER). Scatter plots illustrate the correlation between measured log2(IC50) using neutralization data from the Los Alamos “Compile, Analyze and Tally NAb Panels” (CATNAP) database and log2(IC50) values predicted by our Bayesian machine-learning model. For each antibody, log2(IC50) prediction based on HIV-1 envelope (Env) sequence alone or using sequence and approximated glycan occupancy information is shown. Overall, 71–717 paired Env sequence-neutralization values were available per mAb. The Spearman’s rho and its 2-sided P value are labeled.
Mapping the HIV reservoir for neutralization susceptibility. There is significant diversity within the HIV-1 reservoir (24), and virologic failure of bNAb treatment due to the rapid selection of neutralization-resistant viral variants has been reported in animal models and early-phase clinical trials (3, 5–7, 10, 11). We performed bulk viral outgrowth cultures with PBMCs from 25 HIV-infected individuals, of which 23 had undetectable HIV RNA levels on ART (median 8 years on ART, range 1–18 years) and 2 had low-level viremia in the absence of ART at the time of sampling (Table 1). All were US residents at the time of PBMC collection and were therefore likely to be infected with clade B viruses. The outgrown culture isolates were screened for sensitivity to PGT121, PGDM1400, 3BNC117, and VRC01 in a TZM-bl cell neutralization assay. Similar to published data (12), substantial heterogeneity in neutralization sensitivities of the outgrown viruses was observed, with some showing reduced sensitivity or resistance against multiple bNAbs (394747, 588552, 842272), while others were sensitive across all tested antibodies (190487, 237983, 387879, 486632, etc.) (Figure 2). We therefore sought to map the landscape of neutralization sensitivity of archived viruses in the HIV-1 reservoir by applying the neutralization prediction model to proviral Env sequences derived from PBMCs of these individuals. In total, 727 single-genome full-length Env sequences were obtained from DNA isolated from primary PBMCs with a mean of 29 sequences per individual (range 8–83 sequences).
Figure 2. IC50 values (μg/mL) for outgrown viruses, as determined by TZM-bl neutralization assay. PBMCs from 25 HIV-infected individuals were stimulated and infectious culture supernatants were tested for neutralization sensitivity against the broadly neutralizing antibodies (bNAb) VRC01, 3BNC117, PGT121, and PGDM1400 in a TZM-bl neutralization assay using increasing amounts of antibody. IC50 values ≤0.1 μg/mL are highlighted in red, values between 0.1 and 1 μg/mL are highlighted in orange, values between 1 and 10 μg/mL are highlighted in yellow, values between 10 and 25 μg/mL are highlighted in green, and values >25 μg/mL are highlighted in gray.
To calculate neutralization sensitivities based on the sequences, we focused on 10 bNAbs that are currently in clinical development, specifically the CD4bs antibodies VRC01, 3BNC117, N6, and VRC07-523; the V3 loop glycan antibodies PGT121 and 10-1074; the V2 apex glycan antibodies PGDM1400, PG9, and CAP256-VRC26.25; and the MPER antibody 10E8. As expected, we observed substantial heterogeneity of the predicted neutralization sensitivities, as determined by IC50 values (in μg/ml), across bNAbs but also across sequences for each single bNAb (Figure 3). Results were reflective of the potency and breadth of the bNAb, e.g., we predicted that the CD4bs antibodies 3BNC117, N6, and VRC07-523 would neutralize 92%, 95%, and 95% of all Env sequences across all patients at an IC50 <1 μg/ml, respectively, while VRC01 neutralized 86% of all Env sequences at IC50 <2 μg/ml. The V3 mAbs 10-1074 and PGT121 were predicted to neutralize 63% and 70% of sequences at IC50 values <1 μg/ml and 40% and 55% of sequences at IC50 values <0.1 μg/ml, consistent with their increased potency. The V2 apex antibodies required IC50 values of ≥1.4 and ≥11.6 μg/ml (PGDM1400 and CAP256-VRC26.25, respectively) to neutralize at least 50% of the viruses, most likely reflective of their limited activity against clade B. The MPER antibody 10E8 achieved 91% Env coverage at IC50 <1 μg/ml and covered 43% of sequences at IC50 values of <0.1 μg/ml.
Figure 3. Predicted log2(IC50) values for proviral HIV-1 envelope sequences from all 25 HIV-infected study participants. In total, 727 single-genome full-length envelope (Env) sequences were obtained from DNA isolated from primary PBMCs with a mean of 29 sequences per individual (range 8–83). Shown are box-and-whisker plots demonstrating the breadth of neutralization sensitivities against multiple classes of broadly neutralizing antibodies (bNAb) and for each participant’s proviral sequences. Each dot represents the predicted IC50 value for an Env sequence.
Among antibodies of the same specificity, predicted neutralization susceptibility of sequences was consistent, e.g., for the V3 glycan–dependent antibodies, predicted IC50 values for PGT121 and 10-1074 correlated strongly (Spearman r = 0.81, P < 0.0001) (Supplemental Figure 2A). For the other specificities, cross-class correlation was less robust (Supplemental Figure 2B). Specifically for the CD4bs antibodies, overall correlation of predicted IC50 values was more heterogeneous, e.g., VRC01, 3BNC117, and VRC07-523 correlated weakly (Spearman r ranging from 0.16 to 0.43), and there was a clear disconnect between N6 and VRC07-523 susceptibilities, for which predicted IC50 values did not correlate (R = 0.03, P = 0.34) (Supplemental Figure 2, C and D), potentially related to the specific mode of recognition of N6, permitting it to avoid steric clashes with glycans in the V5 region of Env (25). These data are consistent with the differing epitope-binding patterns of the CD4bs antibodies (26), resulting in neutralization susceptibility signatures that are bNAb specific.
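For readers who wish to reproduce this kind of pairwise comparison on their own predicted values, a minimal Spearman rank correlation can be written as follows. This is a generic textbook implementation (Pearson correlation of average ranks, with ties sharing their mean rank), not the statistical software used in this study, and the example values are hypothetical. P values are not computed.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical predicted IC50 values (ug/ml) for the same four sequences.
ic50_ab1 = [0.1, 0.5, 2.0, 30.0]
ic50_ab2 = [0.2, 0.4, 5.0, 50.0]
rho = spearman(ic50_ab1, ic50_ab2)
```

Because the statistic depends only on ranks, it gives the same result whether IC50 values are supplied on the original or on the log2 scale.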
Reservoir diversity of neutralization sensitivity. Within each individual’s reservoir, we observed significant proviral sequence diversity (mean diversity of all 25 patients was 2.0%; range 0.1%–3.9%) (Supplemental Figure 3). As a result, predicted IC50 values for any 1 tested bNAb were quite diverse, ranging from sensitive to resistant (Figure 3 and Figure 4). For the CD4bs antibodies N6 and VRC07-523, respectively, sequences with predicted IC50 values <1 μg/ml were found in all patients, but at the same time, 28% and 20% of patients also harbored sequences with predicted IC50 values >1 μg/ml. For 3BNC117 and VRC01, respectively, 4% and 16% of patients had only sequences with predicted IC50 values >1 μg/ml, while in 24% and 72% of patients, sequences with predicted IC50 values both <1 μg/ml and >1 μg/ml were found. For the V3 glycan antibodies PGT121 and 10-1074, respectively, 20% and 24% of patients had only sequences with predicted IC50 values >1 μg/ml, while 60% and 28% of patients had sequences with predicted IC50 values both <1 μg/ml and >1 μg/ml. For the V2 apex antibodies, 20% of patients for PGDM1400 and 56% of patients for CAP256-VRC26.25 had only sequences with predicted IC50 values >1 μg/ml, while 68% and 36% of patients were found to have sequences with IC50 values both <1 μg/ml and >1 μg/ml. For the MPER antibody 10E8, 4% of patients carried only sequences with IC50 values >1 μg/ml, while sequences with IC50 values both <1 μg/ml and >1 μg/ml were found in 20% of patients. These data suggest that only a minority of the viral reservoir is fully sensitive to a given bNAb and that coexistence of sensitive and resistant viruses is frequently found; this is consistent with prior data demonstrating significant sequence variability in key contact sites of bNAbs in plasma viruses, with resistant and sensitive viruses existing in parallel (6).
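The per-participant breakdown reported above can be reproduced from a table of predicted IC50 values with a few lines of code. The following is a minimal sketch, not the authors' analysis script: the function names, the category labels, and the toy cohort are hypothetical, and a value exactly equal to the 1 μg/ml cutoff is treated as not sensitive, which is an assumption the text does not specify.

```python
from typing import Dict, List

CUTOFF_UG_ML = 1.0  # cutoff used in the text (IC50 = 1 ug/ml)


def classify_reservoir(ic50s: List[float], cutoff: float = CUTOFF_UG_ML) -> str:
    """Label one participant's sequences for one bNAb.

    Assumption: an IC50 exactly at the cutoff counts as not sensitive.
    """
    if not ic50s:
        raise ValueError("participant has no sequences")
    n_sensitive = sum(1 for value in ic50s if value < cutoff)
    if n_sensitive == len(ic50s):
        return "sensitive_only"
    if n_sensitive == 0:
        return "resistant_only"
    return "mixed"


def summarize(cohort: Dict[str, List[float]],
              cutoff: float = CUTOFF_UG_ML) -> Dict[str, float]:
    """Percent of participants in each category for one bNAb."""
    counts = {"sensitive_only": 0, "resistant_only": 0, "mixed": 0}
    for ic50s in cohort.values():
        counts[classify_reservoir(ic50s, cutoff)] += 1
    n = len(cohort)
    return {label: 100.0 * count / n for label, count in counts.items()}


# Hypothetical predicted IC50 values (ug/ml) for one bNAb in 4 participants.
toy_cohort = {
    "P1": [0.1, 0.5],
    "P2": [2.0, 30.0],
    "P3": [0.2, 5.0],
    "P4": [0.05, 12.0],
}
toy_summary = summarize(toy_cohort)
```

In this toy cohort, one participant carries only sensitive sequences, one carries only resistant sequences, and two carry both.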
Figure 4. Maximum likelihood phylogenetic trees of proviral HIV-1 envelope sequences from 4 randomly selected study participants. For participant 239667, proviral sequences from 2 time points (blue and green boxes) are shown. For some sequences (identified by numbers), predicted IC50 values (in μg/mL) for the bNAbs 3BNC117, VRC01, PGT121, and PGDM1400 are shown in the chart to demonstrate the heterogeneity in neutralization susceptibility across sequences and broadly neutralizing antibodies (bNAbs). IC50 values ≤0.1 μg/mL are highlighted in red, values between 0.1 and 1 μg/mL are highlighted in orange, values between 1 and 10 μg/mL are highlighted in yellow, values between 10 and 50 μg/mL are highlighted in green, and values >50 μg/mL are highlighted in white. Genetic distance scale bars are shown for each tree.
We were next interested to determine if the neutralization susceptibility pattern would be consistent over time. In 3 individuals, we determined predicted IC50 values for an additional time point (i.e., T1 or T2) (3 months to 1 year after the first time point), and for 2 individuals we analyzed a total of 3 time points that were up to 8 years apart (Figure 5). For patient 239667 the proviral diversity in Env was 3.5% (T1) and 3.3% (T2); for patient 387879 it was 1.8% (T1) and 2.0% (T2); for patient 913820 it was 1.3% (T1) and 2.1% (T2); for patient 821034 it was 2.9% (T1), 2.7% (T2), and 1.8% (T3); and for patient 975240 it was 1.1% (T1), 1.6% (T2), and 0.5% (T3). Overall, we observed a heterogeneous spread of predicted IC50 values; however, the range of IC50 values between time points did not differ significantly in the majority of cases. Only for some bNAb susceptibilities, i.e., for 3BNC117 in patient 239667 (unpaired t test with Welch correction, P < 0.05) or for VRC01 and PGT121 in patient 975240 (1-way ANOVA with Bonferroni correction, P value ranging from < 0.0001 to < 0.05, respectively), were there significant differences in the IC50 ranges between time points (Figure 5). The predicted neutralization sensitivities toward other bNAbs, however, did not seem to be affected. These data suggest that Env sequences, and therefore the respective sensitivities of reservoir viruses to neutralizing antibodies, are relatively stable if plasma virus is continuously suppressed with ART, as previously reported (27).
Figure 5. Predicted IC50 values (μg/mL) based on proviral HIV-1 envelope sequences from multiple time points per individual. For 3 of the 25 study participants, proviral sequences from a second time point (3 months to 1 year after the first time point) were available, and for 2 study participants 3 time points up to 8 years apart were available. Shown are predicted IC50 values for the broadly neutralizing antibodies (bNAbs) 3BNC117, VRC01, PGT121, and PGDM1400. Each dot represents the predicted IC50 value for an envelope (Env) sequence. P values were calculated by 2-sided Wilcoxon rank-sum test and multiple testing corrected by the Benjamini and Hochberg method. *P < 0.05; **P < 0.01; ***P < 0.001.
Optimizing and validating bNAb combinations for therapy. It has become increasingly clear that antibody combinations will be needed if bNAbs are considered for therapy or prevention, both to overcome preexisting resistance and to prevent the development of novel escape mutations due to insufficient pressure on the virus. Our reservoir susceptibility data clearly demonstrate the vast heterogeneity in Env neutralization sensitivities, and we therefore determined which bNAb combination would have the highest predicted antiviral activity against all viral variants found in a given patient’s reservoir. Focusing on our selected 10 bNAbs, we tested all double and triple combinations of bNAbs targeting different bNAb classes to maximally cover distinct epitopes. To estimate combined IC50s, the individual bNAb IC50 values were first predicted. Because the selected bNAbs recognize distinct epitopes, the combination effect of bNAbs can be assumed to be additive (28, 29), and an additive-effect model was used to calculate the combined IC50 based on the following formula: α = 1/(1/x + 1/y), where α denotes the combined IC50 and x and y the IC50s of the single bNAbs. We again used the arbitrary cutoff (IC50 = 1 μg/ml) to calculate total coverage of combined bNAbs. Overall, in each of the 25 study participants, a combination of 2 bNAbs was sufficient to achieve 100% neutralization coverage of all Envs that had been obtained from the given individual. Furthermore, in most cases multiple bNAb combinations achieved this goal, although with differing neutralization potencies; but for certain combinations, no or very limited viral coverage was predicted (Table 2). When focusing on the top 2-bNAb combination for each individual, based on the lowest IC50 concentration, IC50 values ranged from 0.0006 to 0.205 μg/ml, with a mean of 0.034 μg/ml (Table 2). 
The combination of CD4bs and V3 glycan antibody was the most frequently selected combination (36% of all combinations), followed by CD4bs and 10E8 (24%). While the addition of a third bNAb did result in a calculated increase of approximately 20% in potency, the nominal effect on IC50 values was rather moderate (mean IC50 reduction from 0.039 to 0.030 μg/ml).
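To make the additive-effect calculation concrete, the sketch below applies α = 1/(1/x + 1/y) to each sequence and counts the fraction of sequences whose combined IC50 falls below the 1 μg/ml cutoff. It is an illustration under stated assumptions rather than the analysis code used for Table 2: the extension to three antibodies as 1/(1/x + 1/y + 1/z) is the natural generalization but is assumed here, and the function names, antibody labels, and IC50 values are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def combined_ic50(ic50s: Sequence[float]) -> float:
    """Additive model: alpha = 1 / sum(1 / x_i) over the single-bNAb IC50s."""
    return 1.0 / sum(1.0 / value for value in ic50s)


def coverage(per_bnab: Dict[str, List[float]], combo: Sequence[str],
             cutoff: float = 1.0) -> float:
    """Fraction of sequences with combined IC50 below the cutoff.

    per_bnab maps a bNAb name to predicted IC50s, one per Env sequence,
    in the same sequence order for every bNAb.
    """
    n_sequences = len(per_bnab[combo[0]])
    covered = 0
    for i in range(n_sequences):
        if combined_ic50([per_bnab[name][i] for name in combo]) < cutoff:
            covered += 1
    return covered / n_sequences


def rank_pairs(per_bnab: Dict[str, List[float]], bnab_class: Dict[str, str],
               cutoff: float = 1.0) -> List[Tuple[Tuple[str, str], float]]:
    """Rank all pairs of bNAbs from different classes by coverage."""
    ranked = []
    for a, b in combinations(sorted(per_bnab), 2):
        if bnab_class[a] == bnab_class[b]:
            continue  # only combine antibodies that target distinct epitopes
        ranked.append(((a, b), coverage(per_bnab, (a, b), cutoff)))
    return sorted(ranked, key=lambda item: -item[1])


# Hypothetical predicted IC50s (ug/ml) for 3 sequences from one participant.
toy_ic50 = {
    "abA": [0.5, 4.0, 50.0],
    "abB": [10.0, 1.5, 50.0],
    "abC": [20.0, 0.2, 0.3],
}
toy_class = {"abA": "CD4bs", "abB": "CD4bs", "abC": "V3"}
toy_ranking = rank_pairs(toy_ic50, toy_class)
```

In the toy data, the two same-class antibodies are never paired, and the pair abA plus abC covers all three sequences because each sequence is sensitive to at least one of the two.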
Table 2. Examples of broadly neutralizing antibody combinations and their respective predicted neutralization coverage for each patient’s proviral HIV-1 envelope sequences
Building on these in vitro data, we next analyzed a cohort of 9 HIV-infected individuals who had participated in the NIH 15-I-0140 trial (NCT02471326). In this study, participants were given multiple infusions of VRC01 before and during an analytical antiretroviral treatment interruption (11). During the study, all participants experienced viral rebound despite VRC01 plasma levels above 50 μg/ml. Analysis of neutralization sensitivity of autologous viruses before and after VRC01 infusion confirmed the emergence of HIV isolates that were less sensitive or even highly resistant to VRC01, suggesting rapid selection of archived resistant viral variants. We generated 199 single-genome Env sequences from isolated CD4+ T cells obtained prior to the VRC01 infusions. Again, Env sequences were distinct and heterogeneous. The average proviral diversity of Env for these individuals was 2.2% (range 0.8%–3.9%). In all except one participant (N08), sequences with predicted IC50 values >1 μg/ml were detected, with a mean IC50 of 1.47 μg/ml (range from 0.06 to 11 μg/ml) across all available sequences, confirming preexistence of sequences with reduced neutralization sensitivity to VRC01 (Figure 6). Interestingly, predicted sensitivities to the other tested CD4bs antibodies were substantially higher, with mean IC50s of 0.12, 0.21, and 0.53 μg/ml for 3BNC117, VRC07-523, and N6, respectively. We then focused on the rebound viruses of 6 individuals for whom sequence data were available and measured neutralization sensitivity to VRC01 of pseudoviruses generated from selected viral isolates in a TZM-bl assay (Table 3). Overall, the predicted IC50 values based on the rebound virus Env sequences against 4 bNAbs, including VRC01, 3BNC117, PGT121, and 10-1074, correlated well with the measured IC50 values of the corresponding pseudoviruses, as determined in a TZM-bl assay (Figure 7), confirming the validity of our prediction model. 
When comparing the resistance pattern of the rebound viruses with the predicted sensitivity signatures of the reservoir viruses, we found that in 4 of the 6 study participants (N02, N03, N04, and N08), the rebound viruses had neutralization characteristics that were consistent with and within the predicted sensitivity ranges determined by our proviral sequence prediction approach (Table 3). In the remaining 2 individuals (N06 and N09), rebound viruses contained highly resistant strains that had not been detected within the available proviral sequences. The latter might be due to the limited number of single virus sequences that we obtained for this study; deep-sequencing strategies to identify low-frequency variants will most likely be necessary in the future to generate a complete profile of the cellular reservoir’s bNAb susceptibility. Alternatively, rebound viruses in these 2 individuals might be the result of recombination of latent viruses, as previously described (30).
Figure 6. Predicted log2(IC50) values for proviral HIV-1 envelope sequences from 9 participants of the NIH 15-I-0140 trial. A total of 199 single-genome envelope (Env) sequences were obtained from isolated CD4+ T cells from 9 participants of the NIH 15-I-0140 trial prior to VRC01 administration and prior to analytical antiretroviral treatment interruption (ATI). Shown are box-and-whisker plots demonstrating the breadth of predicted neutralization sensitivities against multiple classes of broadly neutralizing antibodies (bNAbs) for each participant’s proviral sequences. Each dot represents the predicted IC50 value for an Env sequence.
Figure 7. Comparison between IC50 prediction and measured IC50 values using isolated plasma HIV-1 envelope sequences from 6 participants of the NIH 15-I-0140 trial following viral rebound during ATI. Twenty-four pseudoviruses were generated based on plasma virus envelope (Env) sequences. IC50 values were predicted based on the Env sequences, and neutralization of the respective pseudoviruses by the broadly neutralizing antibodies (bNAbs) 3BNC117, VRC01, 10-1074, and PGT121 was measured using a TZM-bl neutralization assay. The regression lines with 95% confidence interval boundaries are indicated in black and blue dotted lines. The Spearman’s rho and its 2-sided P value are labeled.
Table 3. Measured VRC01 neutralization susceptibility of HIV-1 pseudoviruses and the range of predicted IC50 values
In this report, we describe an analytical method, mirroring the genotyping approach used for antiretroviral drug resistance testing, to map viral susceptibility to binding and neutralization by bNAbs and guide the selection of efficacious bNAb combinations in a personalized way. Specifically, this approach combines HIV-1 single-genome Env sequencing from plasma virus or proviral DNA from the cellular reservoir with a potentially new computational model that also incorporates glycan occupancy information for the prediction of bNAb binding and viral neutralization sensitivity to rapidly and accurately diagnose preexisting resistance to anti-HIV–neutralizing antibodies, therefore facilitating the selection of effective bNAb regimens.
Computational modeling of antibody neutralization activity has been mainly used for vaccine design to develop immunogens for elicitation of bNAbs, and these models have primarily focused on Env protein genetic information. Buiu et al. presented a trained neural network approach to directly predict IC50 values of given antibodies against HIV-1, based on the Env sequence information (31), and Hake et al. built binary classifiers with nonlinear support vector machines and string kernels to distinguish between HIV-1 resistance and susceptibility to a bNAb based on different Env amino acid (AA) sequences (20). More recently, Magaret et al. used 2 machine-learning approaches, based on a set of predefined AA sequence features to predict several TZM-bl neutralization assay outcomes for the CD4bs antibody VRC01, including virus’s resistance versus sensitivity status, logIC50/80 values, and estimated neutralization slope of the dose-response curves (22). Bricault et al. used random forest for IC50 regression predictions that included information on AA, PNGS, clade, and variable loop characteristics, demonstrating that the model accuracy using all the information was superior compared with model accuracy including single parameters, i.e., contact-region-only signatures (21). All these approaches primarily rely on sequence information, and although the latter model included PNGS as a variable, information on glycan occupancy at N-linked glycan sites that can positively or negatively influence antibody binding was not included. Env glycans, however, influence antibody access to the underlying protein (16, 32–34), and addition or removal of a particular glycan at one site may have a substantial effect on epitope availability, antibody binding, and consequently neutralization. 
Indeed, adding glycan information to our Bayesian MCMC approach by incorporating sequence-based glycan occupancy approximations substantially enhanced the prediction accuracy of the model (R2 = 0.9), allowing rigorous prediction of antibody binding and neutralization potency across multiple antibody epitope specificities and demonstrating the robustness of the model. Moreover, using the predicted neutralization data, we were able to identify bNAb combinations that would theoretically cover the viral sequence diversity within an individual’s reservoir using similar models, as previously reported (28, 29). Specifically, Wagh et al. had recently reported a mathematical modeling approach, the Bliss-Hill model, to accurately predict neutralization titers of bNAb combinations against HIV-1 clade C viruses (28). As this model requires both IC50 and IC80 values, our bNAb combination analysis instead adapted the additive model described by Kong et al., which had performed well in predicting potency of bNAb combinations using experimental data from single bNAbs (29). It should be emphasized that the optimal bNAb combinations predicted with our model are highly person specific, based purely on the available Env sequences obtained from the individual’s reservoir, and that these combinations would not guarantee effectiveness against other HIV-1 isolates and clades, e.g., in the setting of a superinfection.
There are several potential limitations to this approach. Only a small fraction (~2%) of infected CD4+ T cells carry replication-competent proviruses (35) that are inducible and able to produce plasma viremia and infect other target cells. Many Env sequences that we included in our analysis might therefore belong to defective proviruses, and their relevance for cure strategies is uncertain. Recent data, however, suggest that defective proviruses may produce HIV-1 antigens and perturb adaptive immunity in vivo, potentially contributing to the persistent immune activation observed in patients on ART (36, 37). With the advances in full-length proviral sequencing technology (38, 39), we might be able to focus our analysis in the near future on proviruses that have been determined to be intact and therefore eliminate noise in the neutralization prediction by excluding irrelevant Env sequences that belong to defective proviruses. Furthermore, mapping the HIV reservoir at a single time point likely will not capture the entire viral diversity that is present in an individual, and potential ongoing viral replication could lead to a change of the neutralization susceptibility pattern over time. It is, however, to some degree reassuring that longitudinal studies examining reactivatable latent viruses in patients under ART (27) have not revealed any evidence of viral evolution in the blood compartment, and a large proportion (>50%) of the inducible, replication-competent HIV reservoir derives from CD4+ T cells carrying identical proviral sequences (40). Moreover, Lorenzi et al. have examined a large number of independent isolates of replication-competent virus from treated patients and found that >50% share sequence identity in the env gene with other isolates from the same patients (41). 
With adequate deep-sequencing approaches of the HIV reservoir producing significantly more single-genome sequences than what we present here, minor viral variants could be identified and included in the neutralization sensitivity mapping. It needs to be determined if all, even very rare, sequences need to be considered, as they could theoretically generate systemic viral rebound following ART interruption, or if a cut-off frequency of viral variants might be feasible.
Although the model was developed using cross-clade Env sequences from the Los Alamos HIV database and from GenBank, paired with mAb neutralization sensitivity values from the CATNAP database, the available data in these databases are biased toward clade B. Training the model in the future with more non–clade B sequence/neutralization data could potentially strengthen its prediction accuracy against such clades. Nonetheless, the current model detected known differences in clade-specific bNAb susceptibilities, as seen by the relatively high predicted IC50 values for the V2 bNAbs (PGDM1400, CAP256-VRC26.25, etc.) that are intrinsically less active against clade B (Figure 3). Finally, our approach requires advanced sequencing and data analysis capabilities that might not be readily available everywhere. Next-generation sequencing technology, however, is becoming more widely available, and the analytical algorithms can be formalized to facilitate access to a broader user base. Alternatively, a reference laboratory system could simplify access to reservoir bNAb susceptibility testing in the future.
In summary, sequence-based approaches to map the cellular reservoir for viral susceptibilities to antibody binding and neutralization are promising; they might play a critical role in HIV prevention but are foremost promising for therapy strategies. Given the importance of glycans in the viral Env-antibody interaction, information on glycan occupancy should be a crucial component of prediction algorithms. When it is included, as we demonstrate here, the prediction accuracy and reliability of such a model support its application for the selection of bNAb regimens, thereby supporting the advancement of bNAbs into the clinic.
Study participants. Twenty-five HIV-1–infected adults ≥18 years old (23 on suppressive ART, with plasma HIV RNA <50 copies/mL, and 2 off ART, with plasma HIV RNA <1000 copies/mL) were included in this study. For the NIH study (NIH 15-I-0140), 9 HIV-1–infected individuals, ages ≥18 years old, with a CD4+ T cell count of ≥450 cells/mm3 and an undetectable plasma viral load (<50 copies/mL) for at least 3 years on continuous ART had been previously enrolled (11).
Data set collection and preprocessing. To develop a neutralization prediction model that identified the relationship between HIV Env sequence and mAb neutralization sensitivity, all available full-length Env protein sequences, spanning all different clades and neutralization tiers, were first collected from the Los Alamos HIV database and GenBank. The neutralization sensitivity fingerprints of currently available anti-HIV monoclonal neutralizing antibodies were obtained from CATNAP (23), a web platform that comprehensively collects published neutralization data for Env panels. When neutralization values for the same mAb against the same Env panel had been reported in multiple studies, the geometric median of those values was used.
To accurately align the collected sequences, Clustal Omega (42), a multiple sequence alignment program that uses a hidden Markov model to help guide alignments, was used to align the sequences against the curated HIV Env alignments downloaded from the Los Alamos HIV database (43) (http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html). The positions of the protein residues for each aligned sequence were mapped back to the HXB2 sequence. The PNGSs were also identified in each sequence based on the consensus sequon sequence (N-X-S/T, X≠P). Additionally, to reduce the computational burden and to limit the effect of inconsistent alignments in the highly variable V1 and V2 regions of Env on model prediction, aligned residue positions at which more than 10% of the collected sequences had a gap were removed from later analysis.
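The sequon rule (N-X-S/T, X≠P) translates directly into a short pattern match. The sketch below is illustrative only and is not the authors' script: the function name is hypothetical, alignment gaps are assumed to be written as "-" and are stripped before scanning, positions are reported 0-based on the ungapped sequence, and the mapping to HXB2 numbering described above is omitted.

```python
import re
from typing import List

# N, followed by any residue except P, followed by S or T.
# The lookahead keeps the match zero-width after N so that
# overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_pngs(sequence: str) -> List[int]:
    """Return 0-based positions of the Asn in each N-X-S/T (X != P) sequon.

    Assumption: gaps are '-' characters; they are removed first so that a
    gap is never mistaken for the X residue.
    """
    ungapped = sequence.replace("-", "").upper()
    return [match.start() for match in SEQUON.finditer(ungapped)]


# Toy peptide: sequons at N1 (NGS), N7 (NAT), and N10 (NNS);
# N4 is followed by P and N11 has no residue two positions downstream.
toy_positions = find_pngs("ANGSNPTNATNNS")
```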
Glycan occupancy prediction for N-linked glycosylation sites. Our previous report has shown that the presence of a PNGS is not sufficient to infer the presence of the associated glycan and that the sequence signature around the PNGS is associated with different degrees of site-specific glycan occupancy (19). Therefore, a predictive model modified from Li et al. (44) was developed to predict glycan occupancy based on the protein sequence neighboring the PNGS. To build the model, the site-specific glycan occupancy profiles defined by mass spectrometry from a set of 149 selected Env proteins, including monomeric gp120s and trimers, were collected from our previous study. Only the sites where the total spectral counts were larger than 300 were used in model training to avoid instrumental noise due to low spectral signals; a total of 766 PNGSs were therefore collected, with glycan occupancy ranging between 0 and 1.
Initially, the neighboring protein sequences were extracted using a local sequence window between positions –N and N, where the Asn of the PNGS was at position 0 and N = 30. The extracted protein residues were encoded as a 1,220-dimensional vector (61 positions × 20 AA types), where every single residue was converted from a categorical AA into a binary vector. In addition to protein residue information, hydrophobicity, flexibility, polarity, and β-turn values of individual residues at every position were extracted from the AAindex database (45). Elastic Net regularization was used for feature reduction, and the α in Elastic Net optimization, controlling the weight of the L1 norm and L2 norm, was determined by 10-fold cross validation. The selected features used in the machine-learning algorithms can be accessed in Supplemental Data Set 1. To benchmark the prediction performance, several supervised machine-learning algorithms were tested, including SVR, k-nearest neighbors, random forest, and neural network. SVR showed the most robust prediction, outperforming the other algorithms. The optimal hyperparameters for SVR were determined via grid search in the parameter space based on 10-fold cross validation, where kernel = radial basis function, gamma = 2.28 × 10^−6, Cost = 1 × 10^5, epsilon_SVR = 9, epsilon = 0.001. The model prediction performance in each step was evaluated by R2 and mean squared error from a repeated 10-fold CV framework.
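The window encoding described above (61 positions × 20 AA types = 1,220 binary features) can be sketched as follows. This is a minimal illustration, not the published implementation: the function name and the alphabetical residue ordering are hypothetical, and positions that fall outside the sequence, or that hold a nonstandard residue, are assumed to be encoded as all zeros, which the text does not specify. The AAindex physicochemical features are not included.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed order)
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
HALF_WINDOW = 30  # N = 30, giving positions -30..+30 around the Asn


def encode_window(sequence: str, asn_index: int,
                  half_window: int = HALF_WINDOW) -> List[int]:
    """One-hot encode the residues around a PNGS Asn.

    Returns a flat vector of length (2 * half_window + 1) * 20.
    Assumption: out-of-range positions and nonstandard residues are
    encoded as an all-zero block.
    """
    if sequence[asn_index] != "N":
        raise ValueError("asn_index must point at an Asn (N)")
    vector: List[int] = []
    for pos in range(asn_index - half_window, asn_index + half_window + 1):
        block = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AA_INDEX.get(sequence[pos])
            if idx is not None:
                block[idx] = 1
        vector.extend(block)
    return vector


# Toy 61-residue sequence with the sequon Asn exactly in the middle.
toy_sequence = "A" * 30 + "N" + "GT" + "C" * 28
toy_vector = encode_window(toy_sequence, asn_index=30)
```

Each block of 20 entries holds at most one 1, so the number of nonzero features equals the number of window positions occupied by a standard residue.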
The final SVR predictive model with the optimal parameters was trained on the complete glycan occupancy data set (766 PNGSs) with the selected features. The model was then used to predict the glycan occupancy of individual PNGSs from the collected HIV Env sequences based on the corresponding sequence features. The predicted occupancy served as part of the input variables for the mAb neutralization predictive models described below.
mAb neutralization sensitivity prediction. The Bayesian MCMC support vector machine regression model developed in the previous study (19) was used for feature reduction and mAb neutralization sensitivity prediction. Bayesian MCMC-SVR integrates 2 computational methods. First, a Bayesian MCMC with the biased random walk Metropolis-Hastings algorithm (46, 47) is used to approximate the high-dimensional posterior distribution within the data to identify an optimal feature set that best fits the model predictions, and second, an SVR algorithm that implements support vector machines (SVMs) (48) is employed as a supervised learning model to predict neutralization sensitivity based on the selected feature profiles.
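For readers unfamiliar with Metropolis-Hastings sampling over feature subsets, the following toy sketch shows the general idea. It is not the Bayesian MCMC-SVR code from Yu et al. (19): it uses a plain symmetric single-flip proposal rather than the biased random walk named above, and the scoring function is a stand-in for a log-posterior that, in the real method, would come from model fit. All names and parameter values are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple


def mh_feature_search(n_features: int,
                      score: Callable[[List[bool]], float],
                      n_steps: int = 2000,
                      temperature: float = 0.2,
                      seed: int = 0) -> Tuple[List[bool], float]:
    """Random-walk Metropolis-Hastings over binary feature-inclusion masks.

    score(mask) plays the role of an (unnormalized) log-posterior.
    The proposal flips one randomly chosen feature, which is symmetric,
    so the acceptance probability is min(1, exp((s_new - s_old) / T)).
    Returns the best mask visited and its score.
    """
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = score(current)
    best, best_score = list(current), current_score
    for _ in range(n_steps):
        proposal = list(current)
        j = rng.randrange(n_features)
        proposal[j] = not proposal[j]
        proposal_score = score(proposal)
        delta = (proposal_score - current_score) / temperature
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score


# Toy score: reward 3 "informative" features, penalize all others.
INFORMATIVE = {0, 2, 5}


def toy_score(mask: List[bool]) -> float:
    gain = sum(1.0 for i in INFORMATIVE if mask[i])
    penalty = sum(0.5 for i, used in enumerate(mask)
                  if used and i not in INFORMATIVE)
    return gain - penalty


best_mask, best_value = mh_feature_search(8, toy_score)
selected = [i for i, used in enumerate(best_mask) if used]
```

On this small toy problem the chain quickly settles on the three informative features; in practice the score would be far more expensive to evaluate and the chain would need convergence diagnostics.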
The feature matrix for model learning contained 2 parts. (a) The sequence features included every position of the aligned Env protein sequence, which was converted from a categorical AA residue into a binary vector. (b) The glycan features included the PNGSs from the sequence, which were given the predicted glycan occupancy values from 0 (not occupied) to 1 (fully occupied) estimated by the glycan predictive model in the previous section. The mAb neutralization values (IC50) for model regression were log2 transformed to improve the normality of the data distribution. To determine the optimal feature set of the key protein residues and PNGSs, the statistical significance was calculated for individual features based on a permutation test that estimated the null distribution. Details of model construction, statistical calculation, and model benchmarking can be found in the Materials and methods section in Yu et al. (19). A predictive model was built for each individual mAb neutralization profile available in CATNAP with more than 70 Env panels. A total of 126 mAb models were constructed in this report (Supplemental Table 1), and the optimal hyperparameters for every mAb model were determined via grid search in the parameter space based on 10-fold cross validation. The optimal feature sets and hyperparameters of individual models can be accessed in Supplemental Data Set 2.
Custom Python scripts were written to process the Env sequence and neutralization data sets. The code for Bayesian MCMC-SVR can be accessed in Yu et al. (19) and requires the Matlab LIBSVM package. The algorithm was run on Linux high-performance computing clusters.
Virus culture. Autologous virus was retrieved from HIV-1–infected individuals as previously described (49). Briefly, CD4+ T cells were purified from freshly isolated PBMCs by negative selection using the Rosette Sep CD4+ T cell enrichment cocktail (Stemcell Technologies), depleting CD8+ T cells, NK cells, B cells, macrophages, monocytes, and dendritic cells. CD4+ T cells were then stimulated in IL-2 (50 units/ml) containing T cell medium in the presence of a bispecific anti-CD3/anti-CD8 mAb, which selectively activates CD4+ T-lymphocytes, while simultaneously depleting all remaining CD8+ T cells (50). CD4+ T cell blasts, generated from HIV negative donors, were added every 7 days to maintain the cultures and provide additional targets for viral outgrowth. Every 2–3 days, p24 was measured in culture supernatants using a p24 based ELISA (PerkinElmer Life Sciences Inc., following the manufacturer’s protocol).
mAbs. VRC01 was provided by the Vaccine Research Center (VRC) at NIH. 3BNC117 and 10-1074 were acquired from the NIH AIDS Reagent Program. PGT121 and PGDM1400 were provided by the Virology and Vaccine Research Center at Beth Israel Deaconess Medical Center. The same set of mAbs was used in Figure 2 and Figure 7.
Neutralization assays. Neutralization of autologous, PBMC-derived infectious isolates by VRC01, 3BNC117, PGDM1400, and PGT121 was assessed as previously described (51). TCID50s were determined for all HIV-1–containing supernatants by serial dilution and infection of TZM-bl cells. For neutralization assays, mAb samples were tested using a primary concentration of 25 μg/ml with a 5-fold dilution series. The viral isolates and mAbs were incubated in a 96-well plate at 37°C for 1 hour before addition of TZM-bl cells. After 2 days of incubation, cells were lysed and the viral infectivity was quantified by measuring luciferase activity using a Victor3 luminometer (PerkinElmer). Neutralization curves were fit by nonlinear regression using a 5-parameter Hill slope equation. The IC50 values were reported as the antibody concentrations required to suppress infection by 50%.
Single-genome amplification and analysis. The Env gene was sequenced using single-genome amplification as described previously (52). Briefly, viral RNA was isolated from blood plasma using the QIAamp Viral RNA Mini kit (Qiagen) and subjected to cDNA synthesis using the primer Env3outR1 (TTGCTACTTGTGATTGCTCCATGT). Newly synthesized cDNA was diluted to single-copy concentration and amplified by nested PCR using the primer sets Env3outR1 and Env5outF1 (TAGAGCCCTGGAAGCATCCAGGAAG) and Env3inR2 (GTCTCGAGATACTGCTCCCACCC) and Env5inF2 (TTAGGCATCTCCTATGGCAGGAAGAAG). Resultant amplicons were sequenced using Sanger-based sequencing. In total, 727 sequences were obtained from the original 25 patients, and an additional 199 Env sequences were obtained from VRC study participants. The mean pairwise diversity measurements were obtained using DIVEIN (53). Phylogenetic trees were generated using the Tamura-Nei neighbor joining method. All 926 sequences were deposited in GenBank.
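As an illustration of what a mean pairwise diversity figure summarizes, the sketch below averages the uncorrected proportion of differing sites (p-distance) over all pairs of aligned sequences. It is a simplified stand-in and not a reimplementation of DIVEIN, whose estimates may rest on a different, model-based distance; the function names, the handling of gaps (sites with a gap in either sequence are skipped), and the toy alignment are assumptions.

```python
from itertools import combinations
from typing import List


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites with a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_diversity(aligned: List[str]) -> float:
    """Mean p-distance over all sequence pairs, expressed in percent."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least 2 sequences")
    return 100.0 * sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment: 3 sequences of 4 sites, one sequence differs at 1 site.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
toy_diversity = mean_pairwise_diversity(toy_alignment)
```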
Statistics. Spearman’s rank analysis (2 tailed) was used to test for correlations between variables. A 2-sided Wilcoxon rank-sum test was used for pairwise analyses comparing antibody neutralization responses across the groups. Comparisons between more than 2 groups were performed with 1-way ANOVA. A P value of less than 0.05 was considered significant. The Benjamini and Hochberg method was used to calculate FDR-adjusted P values for multiple-comparison post hoc correction.
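The Benjamini and Hochberg adjustment mentioned above follows a simple step-up rule: sort the P values, scale the i-th smallest by m/i, and enforce monotonicity from the largest rank downward. The sketch below is a generic textbook implementation for illustration, not the statistical software used in this study; the function name and example P values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return FDR-adjusted P values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # ascending P values
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest P value down so each adjusted value is the
    # minimum of its own scaled value and all larger-ranked ones.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values from 4 pairwise comparisons.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

For these four values the adjusted P values are 0.02, 0.04, 0.04, and 0.02, so all four comparisons would remain below the 0.05 threshold.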
Study approval. All study participants provided written informed consent prior to study participation in accordance with the Declaration of Helsinki, and studies were reviewed by the institutional review boards of each participating institution (Partners Human Research Committee, Somerville, Massachusetts, USA, and National Institute of Allergy and Infectious Diseases Institutional Review Board).
CMF and BFK performed single-genome amplification and HIV-1 sequencing. DS and JT performed viral outgrowth assays. MSS performed viral neutralization assays. RL, AS, and NDR generated pseudoviruses and neutralization data. TWC contributed PBMC samples. WHY, GA, and DAL developed the models. WHY, GA, DAL, and BJ conceived and designed the study and analyzed and interpreted the data. DAL and BJ supervised the study.
BJ is supported by NIH grants AI106408 and AI138790, and by the Ragon Institute of MGH, MIT, and Harvard. DAL is supported by NIH grants AI142790, CA210180, CA215798, CA217377, and DK108056. RL is supported by NIH grant AI140788. GA is supported by NIH grants AI060354, AI080289, AI128751, AI142790, and AI129797. BJ and GA are associate members of the I4C-HIV Martin Delaney Collaboratory (UM1AI126603). The authors gratefully acknowledge John Mascola from the NIH Vaccine Research Center for providing mAbs. This project has been funded in part with federal funds from the National Cancer Institute, NIH, under contract HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US government.
Conflict of interest: BJ has received research support from Gilead Sciences. GA has received research support from GlaxoSmithKline, Pfizer, Merck & Co., Janssen Pharmaceuticals, Sanofi, Gilead Sciences, and Bristol-Myers Squibb. DAL has received research support from Janssen Pharmaceuticals, Boehringer Ingelheim, and Merrimack Pharmaceuticals and is an equity shareholder of Torque Therapeutics. BJ, GA, DAL, and AS all hold patents or have filed patent applications that are outlined in the supplement.
Copyright: © 2019, American Society for Clinical Investigation.
Reference information: JCI Insight. 2019;4(17):e130153. https://doi.org/10.1172/jci.insight.130153.