Title:
HIV-1 IMMUNOGENS AND BROADLY NEUTRALIZING HIV-1 ANTIBODIES
Document Type and Number:
WIPO Patent Application WO/2015/048610
Kind Code:
A1
Abstract:
The present invention relates, in general, to HIV-1 and, in particular, to broadly neutralizing anti-HIV-1 antibodies (and fragments thereof) and to methods of using same to inhibit HIV-1 infection in a subject (e.g., a human). The invention further relates to compositions comprising such antibodies (or fragments). The invention also relates to methods of using the antibodies (and fragments thereof) as templates for vaccine design. The invention further relates to HIV-1 immunogens and to methods of using such immunogens to induce the production of broadly neutralizing HIV-1 antibodies in a subject (e.g., a human).

Inventors:
HAYNES BARTON F (US)
HAHN BEATRICE H (US)
SHAW GEORGE M (US)
KORBER BETTE T (US)
HRABER PETER (US)
LIAO HUA-XIN (US)
Application Number:
PCT/US2014/057955
Publication Date:
April 02, 2015
Filing Date:
September 29, 2014
Assignee:
UNIV DUKE (US)
LOS ALAMOS NAT SECURITY LLC (US)
UNIV PENNSYLVANIA (US)
International Classes:
C07K14/16; A61K35/76; C12N15/49; C12N15/63
Foreign References:
US20130251726A1 2013-09-26
US20120282264A1 2012-11-08
US20070292390A1 2007-12-20
US20080279879A1 2008-11-13
Other References:
SCHEID ET AL.: "Broad diversity of neutralizing antibodies isolated from memory B cells in HIV-infected individuals", NATURE, vol. 458, no. 7238, 2009, pages 636 - 640
Attorney, Agent or Firm:
SADOFF, B., J. (901 North Glebe Road 11th Floo, Arlington VA, US)
Claims:
WHAT IS CLAIMED IS:

1. A composition comprising an HIV-1 envelope protein as set forth in Figure 26, or immunogenic subunit thereof, and a carrier.

2. A composition comprising a nucleic acid encoding the envelope protein of claim 1.

3. The composition according to claim 1 wherein said composition comprises the gp120 subunit or gp140 subunit of an HIV-1 envelope protein from a sequence set forth in Figure 26.

4. The composition according to claim 1 wherein said composition further comprises an adjuvant.

5. A nucleotide sequence encoding an HIV-1 envelope protein, as set forth in Figure 26 or immunogenic subunit thereof, wherein said nucleotide sequence is present in a vector.

6. The nucleotide sequence of claim 5 wherein the vector is a viral vector or mycobacterial vector.

7. A composition comprising the nucleotide sequence according to claim 5 and an adjuvant.

8. A method of inducing an immune response comprising administering to a mammal in need thereof the composition of claim 1 or 5 in an amount sufficient to effect the induction.

9. The method according to claim 8 wherein said mammal is a human.

10. The method of claim 8, wherein the composition comprises CH694TF HIV-1 envelope or a nucleotide sequence encoding the same administered as a prime.

11. The method of claim 8, wherein the composition comprises a selection of CH694 envelopes of Figure 26 or nucleic acid sequences encoding the same administered as a boost.

12. An isolated antibody having the binding specificity of DH175, or antigen binding subunit thereof.

13. The antibody according to claim 12 wherein said antibody, or said subunit thereof, comprises the CDRs as set forth in Figure 27.

14. The antibody according to claim 12 wherein said antibody, or said subunit thereof, comprises a heavy or light chain amino acid sequence set forth in Figure 27.

15. The antibody according to claim 12 wherein said antibody is DH175.

16. A method of preventing or treating HIV-1 comprising administering to a subject in need thereof said antibody, or said subunit thereof, according to claim 12 in an amount sufficient to treat or prevent HIV-1 infection.

17. The method according to claim 16 wherein said subject is a human.

Description:
HIV-1 IMMUNOGENS AND BROADLY NEUTRALIZING HIV-1 ANTIBODIES

[1] This application claims the benefit of U.S. Application Ser. No. 61/883,561, filed September 27, 2013, the entire content of which application is herein incorporated by reference.

[2] This invention was made with government support under Grants AI067854 and AI100645 awarded by the National Institutes of Health. The government has certain rights in the invention.

TECHNICAL FIELD

[3] The present invention relates, in general, to HIV-1 and, in particular, to broadly neutralizing anti-HIV-1 antibodies (and fragments thereof) and to methods of using same to inhibit HIV-1 infection in a subject (e.g., a human). The invention further relates to compositions comprising such antibodies (or fragments). The invention also relates to methods of using the antibodies (and fragments thereof) as templates for vaccine design. The invention further relates to HIV-1 immunogens and to methods of using such immunogens to induce the production of broadly neutralizing HIV-1 antibodies in a subject (e.g., a human).

BACKGROUND

[4] Induction of HIV-1 envelope (Env) broadly neutralizing antibodies (BnAbs) is a key goal of HIV-1 vaccine development. BnAbs can target conserved regions that include conformational glycans, the gp41 membrane proximal region, the V1/V2 region, glycan-associated C3/V3 on gp120, and the CD4 binding site (CD4bs) (Walker et al, Science 326:285-289 (2009), Walker et al, Nature 477:466-470 (2011), Burton et al, Science 337:183-186 (2012), Kwong and Mascola, Immunity 37:412-425 (2012), Wu et al, Science 329:856-861 (2010), Wu et al, Science 333:1593-1602 (2011), Zhou et al, Science 329:811-817 (2010), Sattentau and McMichael, F1000 Biol. Rep. 2:60 (2010), Stamatatos, Curr. Opin. Immunol. 24:316-323 (2012)). Most mature BnAbs have one or more unusual features (long heavy chain third complementarity determining regions [HCDR3s], polyreactivity for non-HIV-1 antigens, and high levels of somatic mutation), suggesting substantial barriers to their elicitation (Kwong and Mascola, Immunity 37:412-425 (2012), Haynes et al, Science 308:1906-1908 (2005), Haynes et al, Nat. Biotechnol. 30:423-433 (2012), Mouquet and Nussenzweig, Cell Mol. Life Sci. 69:1435-1445 (2012), Scheid et al, Nature 458:636-640 (2009)). In particular, CD4bs BnAbs have extremely high levels of somatic mutation, suggesting complex or prolonged maturation pathways (Kwong and Mascola, Immunity 37:412-425 (2012), Wu et al, Science 329:856-861 (2010), Wu et al, Science 333:1593-1602 (2011), Zhou et al, Science 329:811-817 (2010)). Moreover, it has been difficult to find Envs that bind with high affinity to BnAb germline or unmutated common ancestors (UCAs), a trait that would be desirable for candidate immunogens for induction of BnAbs (Zhou et al, Science 329:811-817 (2010), Chen et al, AIDS Res. Human Retrovirol. 23:11 (2008), Dimitrov, MAbs 2:347-356 (2010), Ma et al, PLoS Pathog. 7:e1002200 (2011), Pancera et al, J. Virol. 84:8098-8110 (2010), Xiao et al, Biochem. Biophys. Res. Commun. 390:404-409 (2009)). Whereas it has been found that Envs bind to UCAs of BnAbs targeting the gp41 membrane proximal region (Ma et al, PLoS Pathog. 7:e1002200 (2011), Alam et al, J. Virol. 85:11725-11731 (2011)), and to UCAs of some V1/V2 BnAbs (Bonsignori et al, J. Virol. 85:9998-10009 (2011)), to date, heterologous Envs have not been identified that bind the UCAs of CD4bs BnAb lineages (Zhou et al, Science 329:811-817 (2010), Xiao et al, Biochem. Biophys. Res. Commun. 390:404-409 (2009), Mouquet et al, Nature 467:591-595 (2010), Scheid et al, Science 333:1633-1637 (2011), Hoot et al, PLoS Pathog. 9:e1003106 (2013)), although Envs that bind CD4bs BnAb UCAs should exist (Hoot et al, PLoS Pathog. 9:e1003106 (2013)).

[5] Eighty percent of heterosexual HIV-1 infections are established by one transmitted/founder (T/F) virus (Keele et al, Proc. Natl. Acad. Sci. USA 105:7552-7557 (2008)). The initial neutralizing antibody response to this virus arises approximately 3 months after transmission and is strain-specific (Richman et al, Proc. Natl. Acad. Sci. USA 100:4144-4149 (2003), Corti et al, PLoS One 5:e8805 (2010)). This antibody response to the T/F virus drives viral escape, such that virus mutants become resistant to neutralization by autologous plasma (Richman et al, Proc. Natl. Acad. Sci. USA 100:4144-4149 (2003), Corti et al, PLoS One 5:e8805 (2010)). This antibody-virus race leads to poor or restricted specificities of neutralizing antibodies in ~80% of patients; however, in ~20% of patients, evolved variants of the T/F virus induce antibodies with considerable neutralization breadth, i.e., BnAbs (Walker et al, Nature 477:466-470 (2011), Bonsignori et al, J. Virol. 85:9998-10009 (2011), Corti et al, PLoS One 5:e8805 (2010), Gray et al, J. Virol. 85:4828-4840 (2011), Klein et al, J. Exp. Med. 209:1469-1479 (2012), Lynch et al, J. Virol. 86:7588-7595 (2012), Moore et al, Curr. Opin. HIV AIDS 4:358-363 (2009), Moore et al, J. Virol. 85:3128-3141 (2011), Tomaras et al, J. Virol. 85:11502-11519 (2011)).

[6] There are a number of potential molecular routes by which antibodies to HIV-1 may evolve and, indeed, types of antibodies with different neutralizing specificities may follow different routes (Wu et al, Science 333:1593-1602 (2011), Haynes et al, Nat. Biotechnol. 30:423-433 (2012), Dimitrov, MAbs 2:347-356 (2010), Liao et al, J. Exp. Med. 208:2237-2249 (2011)). Because the initial autologous neutralizing antibody response is specific for the T/F virus (Moore et al, Curr. Opin. HIV AIDS 4:358-363 (2009)), some T/F Envs might be predisposed to binding the germline or unmutated common ancestor (UCA) of the observed BnAb in those rare patients that make BnAbs. Thus, although neutralizing breadth generally is not observed until chronic infection, a precise understanding of the interplay between virus evolution and maturing BnAb lineages in early infection may provide insight into events that ultimately lead to BnAb development. BnAbs studied to date have only been isolated from individuals who were sampled during chronic infection (Walker et al, Science 326:285-289 (2009), Burton et al, Science 337:183-186 (2012), Kwong and Mascola, Immunity 37:412-425 (2012), Wu et al, Science 329:856-861 (2010), Wu et al, Science 333:1593-1602 (2011), Zhou et al, Science 329:811-817 (2010), Bonsignori et al, J. Virol. 85:9998-10009 (2011), Corti et al, PLoS One 5:e8805 (2010), Klein et al, J. Exp. Med. 209:1469-1479 (2012)). Thus, the evolutionary trajectories of virus and antibody from the time of virus transmission through the development of broad neutralization remain unknown.

[7] Vaccine strategies have been proposed that begin by targeting unmutated common ancestors (UCAs), the putative naive B cell receptors of BnAbs, with relevant Env immunogens to trigger antibody lineages with potential ultimately to develop breadth (Wu et al, Science 333:1593-1602 (2011), Haynes et al, Nat. Biotechnol. 30:423-433 (2012), Scheid et al, Nature 458:636-640 (2009), Chen et al, AIDS Res. Human Retrovirol. 23:11 (2008), Dimitrov, MAbs 2:347-356 (2010), Ma et al, PLoS Pathog. 7:e1002200 (2011), Xiao et al, Biochem. Biophys. Res. Commun. 390:404-409 (2009), Alam et al, J. Virol. 85:11725-11731 (2011), Mouquet et al, Nature 467:591-595 (2010)). This would be followed by vaccination with Envs specifically selected to stimulate somatic mutation pathways that give rise to BnAbs. Both aspects of this strategy have proved challenging due to lack of knowledge of specific Envs capable of interacting with UCAs and early intermediate (I) antibodies of BnAbs.

[8] The present invention results, at least in part, from studies that resulted in the isolation of antibody DH175 from an African patient, CH0694, who was followed from early acute HIV-1 infection phase to over 4 years post-transmission. During this period CH0694 developed plasma HIV-1 neutralization breadth. Isolation of this antibody makes it possible to study virus/antibody co-evolution.

SUMMARY OF THE INVENTION

[9] The invention is directed towards selection of envelopes and recombinant bnAbs from individual CH0694.

[10] In general, the present invention relates to HIV-1. More specifically, the invention relates to broadly neutralizing anti-HIV-1 antibodies (and fragments thereof) and to methods of using same to inhibit HIV-1 infection in a subject (e.g., a human). The invention further relates to compositions comprising such antibodies (or fragments). The invention also relates to methods of using the antibodies (and fragments thereof) as templates for vaccine design. The invention further relates to HIV-1 immunogens and to methods of using such immunogens to induce the production of broadly neutralizing HIV-1 antibodies in a subject (e.g., a human).

[11] In certain aspects the invention provides HIV-1 envelope protein as set forth in Figure 26, or immunogenic subunit thereof. In certain aspects the envelope proteins are recombinant.

[12] In certain aspects the invention provides immunogenic compositions comprising an HIV-1 envelope protein as set forth in Figure 26, or immunogenic subunit thereof, and a carrier. In certain embodiments, the compositions comprise a nucleic acid encoding the HIV-1 envelope protein of Figure 26, or any of the variants described therein. In non-limiting embodiments, the compositions comprise the gp120 subunit or gp140 subunit of an HIV-1 envelope protein from a sequence set forth in Figure 26. In certain embodiments the gp120 protein is of the gp120deltaN design as described herein. In certain embodiments, the compositions further comprise an adjuvant.

[13] In certain aspects the invention provides a construct comprising a nucleotide sequence encoding an HIV-1 envelope protein, as set forth in Figure 26 or immunogenic subunit thereof, wherein said nucleotide sequence is present in a vector. In certain embodiments the vector is any suitable vector, for example but not limited to a viral vector or mycobacterial vector. In certain embodiments, the vector is an adenoviral vector or a pox virus vector. The invention also provides immunogenic compositions comprising the construct described herein and a carrier.

[14] In certain aspects the invention provides methods of inducing an immune response comprising administering to a mammal in need thereof the composition of claim 1 or 5 in an amount sufficient to effect the induction. In certain embodiments of the methods the composition is administered by injection, intrarectally or vaginally. In certain embodiments, the subject is a human.

[15] In certain embodiments of the methods and compositions of the invention, the composition comprises CH694TF HIV-1 envelope, which can be administered as a prime.

[16] In certain embodiments of the methods and compositions of the invention the composition comprises a selection of CH694 envelopes, which can be administered as a boost. The envelopes described herein can be grouped in various selections such that the antigenic diversity of the CH0694 envelopes is represented in the immunization methods (swarm immunization). Non-limiting embodiments of various swarms selected from a group of one hundred and one CH0694 envelopes are described herein. The invention contemplates other combinations of CH0694 envelopes as described herein such that the antigenic diversity of the HIV-1 envelope responsible for the induction of the DH175 lineage is represented.

[17] In certain aspects the invention provides an isolated antibody having the binding specificity of DH175, or antigen binding subunit thereof. The antibody is recombinantly produced. In certain embodiments, the antibody comprises any one of the VH and VL CDRs as set forth in Figure 27. In certain embodiments, the antibody comprises all of the VH and VL CDRs as set forth in Figure 27. In certain embodiments, the antibody comprises a heavy or light chain amino acid sequence set forth in Figure 27. In certain embodiments, the antibody is DH175.

[18] In certain aspects the invention provides methods of preventing or treating HIV-1 comprising administering to a subject in need thereof the antibody of the invention, or a subunit thereof in an amount sufficient to effect said prevention or treatment.

[19] Objects and advantages of the present invention will be clear from the description that follows.

BRIEF DESCRIPTION OF THE DRAWINGS

[20] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

[21] Figure 1. Profile of HIV-1 infection in CHAVI 703010694 (CH0694) over four years of clinical follow-up, as viral load (red), CD4+ T cell count (blue), heterologous neutralization potency (black lines), and breadth (pink). Vertical lines depict samples from which half-genome sequencing was available. Neutralization results are from 16 viruses (6535.3, QH0692.42, SC422661.8, PVO.4, AC10.0.29, RHPA4259.7, Du156.12, Du172.17, Du422.1, ZM197M.PB7, ZM214M.PL15, CAP45.2.00.G3, Q23.17, Q842.d12, Q259.d2.17, Q769.d22).

[22] Figure 2. Within-host evolution from a single transmitted/founder virus. The 3-prime half-genome nucleotide sequences are shown as both tree (right) and pixel plot (left). Sequence variation is rendered as a phylogeny paired with a pixel plot. In the pixel plot, only sites that differ from the T/F are shown, in blue for mutations or grey for indels. Sequences (rows) follow the ordering of leaves in the tree. Numbering is provided for HXB2 nucleotide sites (bottom) and the CH0694 alignment (top). Grey boxes highlight Env landmarks (V1-V5). The tree was obtained by PhyML 3 with GTR+Γ4+I. Leaf colors depict when each sequence was sampled. A genomic map is aligned below the pixel plot.

[23] Figure 3. Dynamics of variant frequencies in 53 early polymorphic Env sites. Sites are ordered by when the T/F form (dashed lines) becomes less common than a variant form. In the legend below each plot, variants are ordered by their total frequency over time. All plots are scaled as in the top-left panel. Each label gives the alignment position, followed by a slash, then the T/F amino acid, the HXB2 position, and the most selected amino acid. Figure 8 continues with 53 additional sites, which were added because they might be of interest; at each of these sites, a mutation becomes more frequent than the T/F amino acid at one or more time points. Sites in the V3-glycan epitope, the V2-glycan epitope, and CD4 contacts are noted.

[24] Figure 4. Sequence logos depict selected variant frequencies with letter heights. Outlined letters depict the transmitted/founder sequence and filled letters depict mutations. For clarity, only weeks 5-101 are depicted; all 26 samples are shown in Figure 7. Sites are ordered by the same criterion as in Figure 2 and Table 1. Colors indicate amino-acid charge (blue for negative, red for positive). Indels (gaps) appear as blank white space.

[25] Figure 5. Variant frequencies hierarchically clustered by the "complete" hierarchical method, with the addition of indel variants. For clarity, only variants with over 50% frequency are depicted, and linked V2 insertion sites 167/-140S, 168/-140T, 169/-140S, 170/-140I are excluded. Neutralization data were normalized and scaled from 0 (black, no neutralization) to 1 (white, highest neutralization). Pink rectangles indicate missing data.

[26] Figure 6. Diversity of 100 selected isolates (red names) represents within-host diversity. Paired Env tree (right) and pixel plots of Env (left) and selected sites (center). Thin horizontal lines indicate the selected isolates. In the pixel plots, red sites show non-synonymous substitutions in the Env translation, blue sites nucleotide changes, and grey sites indels. In the center panel, red lettering marks the 12 sites added to the initial 48, which are thought to contact V2 or V3 epitopes, or the CD4 binding site. Grey sites are indels, and whole codons are depicted in the columns: the thicker orange stripe indicates that the amino acid has changed, and the thin red/purple stripe indicates the base(s) that change within the codon. For clarity, only 720 Env sequences are depicted, which do not contain stops or incomplete codons, and which do not match the T/F Env. This tree is a pruned version of the tree obtained from PhyML with all Env sequences and HIVw+Γ4+I.

[27] Figure 7. Sequence logos depict variant frequencies with letter heights. Here sites are ordered by when the T/F form becomes less common than a new variant. Colors indicate amino-acid charge (blue for negative, red for positive). Indels (gaps) appear as grey boxes.

[28] Figure 8. Dynamics of variant frequencies in 53 polymorphic Env sites that emerge later. This figure continues from the 53 earlier polymorphic sites in Figure 3. Sites are ordered by when T/F form (dashed lines) becomes less common than a variant form. In the legend below each plot, variants are ordered by their total frequency over time. All plots are scaled as in the top-left panel.
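The ordering criterion stated for Figures 3 and 8 (sites ordered by when the T/F form becomes less common than a variant form) can be illustrated with a minimal sketch. This is not the code used to prepare the figures; the function names, the data layout (per-site lists of sampled weeks with amino-acid frequencies) and the toy frequencies below are assumptions made only for illustration.

```python
# Hypothetical sketch: order sites by the first sampled week at which the
# transmitted/founder (T/F) residue is less frequent than some variant.
# The data layout and the toy numbers are illustrative assumptions.

def first_crossover_week(samples, tf_residue):
    """Return the first week where any variant outnumbers the T/F residue,
    or None if the T/F form is never overtaken."""
    for week, freqs in sorted(samples, key=lambda s: s[0]):
        tf_freq = freqs.get(tf_residue, 0.0)
        if any(f > tf_freq for aa, f in freqs.items() if aa != tf_residue):
            return week
    return None


def order_sites(site_data):
    """Sort site labels by crossover week; never-overtaken sites go last."""
    def key(item):
        label, (tf_residue, samples) = item
        week = first_crossover_week(samples, tf_residue)
        return (week is None, week if week is not None else 0, label)
    return [label for label, _ in sorted(site_data.items(), key=key)]


# Toy input: site label -> (T/F residue, [(week, {residue: frequency}), ...])
toy_sites = {
    "site_A": ("N", [(5, {"N": 1.0}), (20, {"N": 0.4, "D": 0.6})]),
    "site_B": ("K", [(5, {"K": 0.3, "R": 0.7}), (20, {"R": 1.0})]),
    "site_C": ("T", [(5, {"T": 1.0}), (20, {"T": 0.9, "A": 0.1})]),
}

ordered = order_sites(toy_sites)
print(ordered)
```

In this toy input, the site whose T/F residue is overtaken earliest is listed first, and a site whose T/F residue is never overtaken is listed last.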

[29] Figure 9. Longitudinal CH0694 Neutralization Potency and Breadth

[30] Figure 10 shows plasma neutralization fingerprinting of CH694 individual. Plasma antibody mapping studies identified: N332-sensitive plasma antibodies; 160-sensitive plasma antibodies; CD4bs RSC3-binding plasma antibodies; and gp120 Env-depletable plasma neutralizing activity (consensus C Env).

[31] Figure 11 shows that 22,800 IgG+ memory B cells were isolated from peripheral blood (week 192 post-transmission) and cultured at limiting dilution. Cell culture supernatants were screened for consensus C neutralization, differential binding of 1086C and 63521 V1/V2 proteins vs their N156Q/N160Q mutants, and RSC3 protein binding.

[32] Figure 12 shows a summary of some of the characteristics of DH175: Primary Culture supernatant neutralized 82% of Consensus C pseudovirus infectivity in Tzm-bl assay; Frequency of ConC neutralizing ab: 1/22800 = 0.004%.
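The frequency stated for Figure 12 follows directly from the cell count given for Figure 11. The short check below is only an arithmetic illustration; the variable names are assumptions.

```python
# Frequency of consensus C neutralizing antibody among the screened memory
# B cells, using the counts stated in the text (1 hit among 22,800 cells).
positive_cultures = 1
cells_screened = 22_800

frequency_percent = 100 * positive_cultures / cells_screened
print(f"{frequency_percent:.3f}%")  # prints 0.004%
```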

[33] Figures 13A and 13B show the ability of DH175 to neutralize CH0694 T/F virus and capture infectious virions.

[34] Figures 14A and 14B show DH175 binding to CH0694 T/F gp120 and gp140 Env.

[35] Figure 15 shows DH175 Heterologous Neutralization Profile. DH175 is a broadly neutralizing antibody that neutralizes certain tier-2 viruses from multiple clades.

[36] Figure 16 shows DH175 Neutralization Fingerprinting.

[37] Figure 17 shows Effect of N332A Mutation on DH175 Neutralizing Activity. The effect of N332A on DH175 neutralizing activity was strain-specific.

[38] Figure 18 shows that DH175 uses a VH4-34 Gene Segment. VH4-34 is recognized by the anti-idiotype antibody 9G4, which defines a subpopulation of autoreactive antibodies overrepresented in SLE subjects.

[39] Figure 19 shows that DH175 Binds to Anti-idiotype 9G4 Ab.

[40] Figures 20A and 20B show DH175 Autoreactivity.

[41] Figure 21 shows DH175 cross reactivity with human proteins.

[42] Figure 22 shows that DH175 binds to Man9 glycans.

[43] Figure 23 shows the effect of mutations of other N residues and glycosylation sites on DH175 neutralization: effect of N234A and T236A mutations.

[44] Figure 24 shows the effect of mutations of other N residues and glycosylation sites on DH175 neutralization: effect of N234A, N276A and T278A.

[45] Figure 25 shows the effect of mutations of other N residues and glycosylation sites on DH175 neutralization: effect of N295 and N160.

[46] Figure 26. Amino acid sequences of a selection of Envelopes from CH0694 as gp160s (SEQ ID NOs: __ to __, in order of appearance). The amino acids between "can" and "vpv" (both designated with small letters in the listing) are deleted in the designs of gp120deltaN proteins.

[47] Figures 27A and 27B show nucleic acid (Fig. 27A, CDRs are underlined) and amino acid (Fig. 27B, CDRs are highlighted) VH and VL sequences of DH175. SEQ ID NOs: __, __, __, and __ (in order of appearance).

[48] Figure 28. Selected envelopes from CH0694. Week 005.20.3 is referred to as T/F.

DETAILED DESCRIPTION OF THE INVENTION

[49] The present invention results, at least in part, from studies that resulted in the isolation of antibody DH175 from an African patient, CH0694, who was followed from the early acute HIV-1 infection phase to over 4 years post transmission. During this period, CH0694 developed plasma HIV-1 neutralization breadth. The invention relates to the Env gene sequences shown in the sequence listing and the encoded amino acid sequences (Figure 26), and to the use thereof as immunogens. The envelopes to be used as immunogens in accordance with the invention can be, for example, expressed as full gp140, gp145 with transmembrane portions, gp120s, gp120 resurfaced core proteins, gp120 outer domain constructs, or other minimal gp120 constructs with portions of the DH175 contacts. The invention also relates to the DH175 antibody, fragments thereof, and to methods of using same as detailed below.

[50] The envelopes to be used as immunogens in accordance with the invention can be expressed as full gp140, gp145 with transmembrane portions, gp120s, gp120 resurfaced core proteins, gp120 outer domain constructs, or other minimal gp120 constructs with portions of the DH175 contacts.

[51] In accordance with the invention, immunization regimens can include sequential immunizations of Env constructs selected from those encoded by the sequences described herein, or can involve prime and boosts of combinations of Envs, or the administration of "swarms" of such sequences. Immunogenic fragments/subunits can also be used, as can encoding nucleic acid sequences. Alternatively, the transmitted founder virus Env constructs can be used as primes, followed by a boost with the transmitted founder Env and sequential additions of Envs from progressively later times after transmission in patient CH0694. Further, repetitive immunization can be effected with "swarms" of CH0694 Envs (for example, including various combinations of the nucleic acid sequences in Fig. 10 and encoded proteins) ranging from, for example, 2 to 40 Envs. (See Example 3.)

[52] In one embodiment, the present invention relates to a method of activating an appropriate naive B cell response in a subject (e.g., a human) by administering the CH0694 T/F Env or Env subunits that can include the gp145 with a transmembrane portion, gp41 and gp120, an uncleaved gp140, a cleaved gp140, a gp120, a gp120 subunit such as a resurfaced core (Wu X, Science 329:856-61 (2010)), an outer domain, or a minimum epitope expressing only the contact points of DH175 with Env (the minimal epitope to avoid dominant Env non-neutralizing epitopes), followed by boosting with representatives of subsequently evolved CH0694 Env variants (e.g., see Fig. 26) either given in combination to mimic the high diversity observed in vivo during affinity maturation, or in series, using vaccine immunogens specifically selected to trigger the appropriate maturation pathway by high affinity binding to UCA and antibody intermediates (Haynes et al, Nat. Biotechnol. 30:423-433 (2012)). DNA, RNA, protein or vectored immunogens can be used alone or in combination. In one embodiment of the invention, transmitted founder virus envelope is administered to the subject (e.g., human) as the priming envelope and then one or more of the sequential envelopes disclosed herein is administered as a boost in an amount and under conditions such that BnAbs are produced in the subject (e.g., human). By way of non-limiting examples, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or more envelopes, e.g. one hundred envelopes (or subunits thereof), can be used with one prime and multiple boosts.

[53] The invention provides various methods to choose a subset of viral variants, including but not limited to envelopes, to investigate the role of antigenic diversity in serial samples. In other aspects, the invention provides compositions comprising viral variants, for example but not limited to envelopes, selected based on various criteria as described herein to be used as immunogens.

[54] In other aspects, the invention provides immunization strategies using the selections of immunogens to induce cross-reactive neutralizing antibodies. In certain aspects, the immunization strategies as described herein are referred to as "swarm" immunizations to reflect that multiple envelopes are used to induce immune responses. The multiple envelopes in a swarm could be combined in various immunization protocols of priming and boosting.

[55] Sequences/Clones

[56] Described herein are nucleic acid and amino acid sequences of HIV-1 envelopes. In certain embodiments, the described HIV-1 envelope sequences are gp160s. In certain embodiments, the described HIV-1 envelope sequences are gp120s. Other sequences, for example but not limited to gp140s, both cleaved and uncleaved, gp150s, and gp41s, are readily derived from the nucleic acid and amino acid gp160 sequences as shown in the sequence listing. In certain embodiments the nucleic acid sequences are codon optimized for optimal expression in a host cell, for example a mammalian cell, a rBCG cell or any other suitable expression system.

[57] In certain embodiments, the envelope design in accordance with the present invention involves deletion of residues (e.g., 5-11, 5, 6, 7, 8, 9, 10, or 11 amino acids) at the N-terminus. For delta N-terminal design, amino acid residues ranging from 4 residues or even fewer to 14 residues or even more are deleted. These residues are between the maturation site (signal peptide, usually ending with CX, where X can be any amino acid) and "VPVXXXX...". In the case of the CH0694w101.20.20 gp120 delta design, as an example, fourteen amino acids were deleted (bold underlined):

MRARGIQRNYOHWWTWGILGFWMLMICNAGLEOOORWVTVYYGVPVWKEAKTTLFCASDAKAYEKE... (rest of envelope sequence is indicated as "...").
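The delta N design described above (removal of the residues lying between the end of the signal peptide and the "VPV" motif) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used to make the constructs: the function name is hypothetical, the signal peptide length is supplied by the caller, and the toy sequence is invented rather than taken from Figure 26.

```python
# Hypothetical sketch of the delta N design: remove the residues that lie
# between the end of the signal peptide and the "VPV" motif.
# The toy sequence is invented for illustration; it is not from Figure 26.

def delete_n_terminal_residues(sequence, signal_peptide_length, motif="VPV"):
    """Return (trimmed_sequence, deleted_segment).

    signal_peptide_length is supplied by the caller (an assumption of this
    sketch); the motif is searched for downstream of the signal peptide.
    """
    motif_start = sequence.find(motif, signal_peptide_length)
    if motif_start == -1:
        raise ValueError(f"motif {motif!r} not found after the signal peptide")
    deleted = sequence[signal_peptide_length:motif_start]
    trimmed = sequence[:signal_peptide_length] + sequence[motif_start:]
    return trimmed, deleted


# Toy example: a signal peptide ending in "CS" (a "CX" ending), twelve
# N-terminal residues to delete, and a remainder that starts with "VPV".
toy_signal = "MKLLGFLLCS"
toy_n_term = "AAEKLWVTVYYG"
toy_rest = "VPVWKEATT"
toy_env = toy_signal + toy_n_term + toy_rest

trimmed, deleted = delete_n_terminal_residues(toy_env, len(toy_signal))
print(trimmed)
print(deleted, len(deleted))
```

Passing the signal peptide length explicitly keeps the sketch independent of any signal peptide prediction method, which the text does not specify.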

[58] In certain embodiments, the invention relates generally to an immunogen, gp160, gp120 or gp140, without an N-terminal Herpes Simplex gD tag substituted for amino acids of the N-terminus of gp120, with an HIV leader sequence (or other leader sequence), and without the original about 4 to about 25, for example 11 or 14, amino acids of the N-terminus of the envelope (e.g. gp120). See WO2013/006688, e.g. at pages 10-12, the contents of which publication are hereby incorporated by reference in their entirety.

[59] The general strategy of deletion of N-terminal amino acids of envelopes results in proteins, for example gp120s, expressed in mammalian cells that are primarily monomeric, as opposed to dimeric, and, therefore, solves the production and scalability problem of commercial gp120 Env vaccine production. In other embodiments, the amino acid deletions at the N-terminus result in increased immunogenicity of the envelopes.

[60] In certain embodiments, the invention provides envelope sequences, amino acid sequences and the corresponding nucleic acids, in which the V3 loop is substituted with the following V3 loop sequence: TRPNNNTRKSIRIGPGQTFYATGDIIGNIRQAH. This substitution of the V3 loop reduces product cleavage and improves protein yield during recombinant protein production in CHO cells.

[61] In certain aspects, the invention provides compositions and methods which use a selection of sequential CH0694 Envs, as gp120s, gp140s (cleaved and uncleaved) and gp160s, as proteins, DNAs, RNAs, or any combination thereof, administered as primes and boosts to elicit an immune response. Sequential CH0694 Envs as proteins would be co-administered with nucleic acid vectors containing Envs to amplify antibody induction.

[62] In certain embodiments the invention provides immunogens and compositions which include immunogens as trimers. In certain embodiments, the immunogens include a trimerization domain which is not derived from the HIV-1 envelope.

[63] In certain aspects the invention provides compositions and methods of Env genetic immunization either alone or with Env proteins to recreate the swarms of evolved viruses that have led to bnAb induction. Nucleotide-based vaccines offer a flexible vector format to immunize against virtually any protein antigen. Currently, two types of genetic vaccination are available for testing: DNAs and mRNAs.

[64] In certain aspects the invention contemplates using immunogenic compositions wherein immunogens are delivered as DNA. See Graham BS, Enama ME, Nason MC, Gordon IJ, Peel SA, et al. (2013) DNA Vaccine Delivered by a Needle-Free Injection Device Improves Potency of Priming for Antibody and CD8+ T-Cell Responses after rAd5 Boost in a Randomized Clinical Trial. PLoS ONE 8(4): e59340, page 9. Various technologies for delivery of nucleic acids, as DNA and/or RNA, so as to elicit an immune response, both T-cell and humoral, are known in the art and are under development. In certain embodiments, DNA can be delivered as naked DNA. In certain embodiments, DNA is formulated for delivery by a gene gun. In certain embodiments, DNA is administered by electroporation, or by a needle-free injection technology, for example but not limited to the Biojector® device. In certain embodiments, the DNA is inserted in vectors. The DNA is delivered using a suitable vector for expression in mammalian cells. In certain embodiments the nucleic acids encoding the envelopes are optimized for expression. In certain embodiments DNA is optimized, e.g., codon optimized, for expression. In certain embodiments the nucleic acids are optimized for expression in vectors and/or in mammalian cells. In non-limiting embodiments these are bacterially derived vectors, adenovirus based vectors, rAdenovirus (Barouch DH, et al. Nature Med. 16: 319-23, 2010), recombinant mycobacteria (i.e., rBCG or M. smegmatis) (Yu, JS et al. Clinical Vaccine Immunol. 14: 886-093, 2007; ibid 13: 1204-11, 2006), and recombinant vaccinia type of vectors (Santra S. Nature Med. 16: 324-8, 2010), for example but not limited to ALVAC, replicating (Kibler KV et al., PLoS One 6: e25674, 2011 Nov 9) and non-replicating (Perreau M et al. J. Virology 85: 9854-62, 2011) NYVAC, modified vaccinia Ankara (MVA), adeno-associated virus, Venezuelan equine encephalitis (VEE) replicons, Herpes Simplex Virus vectors, and other suitable vectors.

[65] In certain aspects the invention contemplates using immunogenic compositions wherein immunogens are delivered as DNA or RNA in suitable formulations. Various technologies contemplate using DNA or RNA, or complexes of nucleic acid molecules and other entities, in immunization. In certain embodiments, DNA or RNA is administered as nanoparticles consisting of low dose antigen-encoding DNA formulated with a block copolymer (amphiphilic block copolymer 704). See Cany et al., Journal of Hepatology 2011, vol. 54, 115-121; Arnaoty et al., Chapter 17 in Yves Bigot (ed.), Mobile Genetic Elements: Protocols and Genomic Applications, Methods in Molecular Biology, vol. 859, pp. 293-305 (2012); Arnaoty et al. (2013) Mol Genet Genomics. 2013 Aug;288(7-8):347-63. Nanocarrier technologies called Nanotaxi® for delivery of immunogenic macromolecules (DNA, RNA, protein) are under development. See www.incellart.com/en/research-and-development/technologies.html.

[66] In certain aspects the invention contemplates using immunogenic compositions wherein immunogens are delivered as recombinant proteins. Various methods for production and purification of recombinant proteins suitable for use in immunization are known in the art.

[67] The immunogenic envelopes can also be administered as a protein boost in combination with a variety of nucleic acid envelope primes (e.g., HIV-1 Envs delivered as DNA expressed in viral or bacterial vectors).

[68] Dosing of proteins and nucleic acids can be readily determined by a skilled artisan. A single dose of nucleic acid can range from a few nanograms (ng) to a few micrograms (μg) or milligrams of a single immunogenic nucleic acid. Recombinant protein dose can range from a few micrograms (μg) to a few hundred micrograms, or milligrams, of a single immunogenic polypeptide.

[69] Administration: The compositions can be formulated with appropriate carriers using known techniques to yield compositions suitable for various routes of administration. In certain embodiments the compositions are delivered via intramuscular (IM), subcutaneous, intravenous, nasal, or mucosal routes.

[70] The compositions can be formulated with appropriate carriers and adjuvants using known techniques to yield compositions suitable for immunization. The compositions can include an adjuvant, such as, for example but not limited to, alum, poly IC, MF-59 or other squalene-based adjuvant, AS01B, or other liposomal based adjuvant suitable for protein or nucleic acid immunization. In certain embodiments, TLR agonists are used as adjuvants. In some embodiments, the TLR agonist is a TLR4 agonist, such as but not limited to GLA/SE. In other embodiments, adjuvants which break immune tolerance are included in the immunogenic compositions. In some embodiments the adjuvant is a TLR7 or a TLR7/8 agonist, or a TLR9 agonist, or a combination thereof. See PCT/US2013/029164.

[71] The present invention includes the specific envelope proteins disclosed herein (e.g., those encoded by the sequences in Fig. 26) and nucleic acids comprising nucleotide sequences encoding same. Examples include the nucleic acid sequences shown in Fig. 26 and referred to in Fig. 28 and the encoded amino acid sequences. The envelope proteins (and subunits) can be expressed, for example, in 293T cells, 293F cells or CHO cells (Liao et al, Virology 353:268-82 (2006)). As indicated above, the envelope proteins can be expressed, for example, as gp120 or gp140 proteins and portions of the envelope proteins can be used as immunogens such as the resurfaced core protein design (RSC) (Wu et al, Science 329:856-861 (2010)); another possible design is an outer domain design (Lynch et al, J. Virol. 86:7588-95 (2012)). The invention includes immunogenic fragments/subunits of the envelope sequences disclosed herein, including fragments at least 6, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320 or more amino acids in length, as well as nucleic acids comprising nucleotide sequences encoding such fragments and vectors containing same.

[72] In other embodiments, the invention provides variants of the sequences encoded by the sequences in Fig. 26, including variants that comprise a mutation which repairs a trypsin cleavage site, thereby preventing protein clipping during Env protein production in a cell line, e.g., a CHO cell line.

[73] The envelopes (immunogens) can be formulated with appropriate carriers using standard techniques to yield compositions suitable for administration. The compositions can include an adjuvant, such as, for example, alum, poly IC, MF-59 or other squalene-based adjuvant, AS01B or other liposomal based adjuvant suitable for protein immunization.

[74] As indicated above, nucleic acid sequences (e.g., DNA sequences) encoding the immunogens can also be administered to a subject (e.g., a human) under conditions such that the immunogen is expressed in vivo and BnAbs are produced. The DNA can be present as an insert in a vector, such as a rAdenoviral (Barouch, et al. Nature Med. 16: 319-23 (2010)), recombinant mycobacterial (i.e., BCG or M. smegmatis) (Yu et al. Clinical Vaccine Immunol. 14: 886-093 (2007); ibid 13: 1204-11 (2006)), or recombinant vaccinia type of vector (Santra S. Nature Med. 16: 324-8 (2010)).

[75] Immunogens of the invention, and nucleic acids (e.g., DNAs) encoding same, are suitable for use in generating an immune response (e.g., BnAbs) in a patient (e.g., a human patient) to HIV-1. The mode of administration of the immunogen, or encoding sequence, can vary with the particular immunogen, the patient and the effect sought, as can the dose administered. Typically, the administration route is intramuscular or subcutaneous injection (intravenous and intraperitoneal can also be used). Additionally, the formulations can be administered via the intranasal route, or intrarectally or vaginally as a suppository-like vehicle. Optimum dosing regimens can be readily determined by one skilled in the art. The immunogens (and nucleic acids encoding same) are suitable for use prophylactically; however, their administration to infected individuals may reduce viral load.

[76] The present invention further relates to the DH175 antibody, and fragments thereof (e.g., antigen-binding fragments), and to methods of using same to inhibit HIV-1 infection in a subject (e.g., a human). The invention also relates to nucleic acids comprising nucleotide sequences encoding DH175 and fragments thereof.

[77] Recently, a method of making HIV vaccine immunogens based on their ability to bind to early members of a BnAb clonal lineage was proposed (US Prov. Appln. 61/542,469, filed October 3, 2011). This method is termed B cell lineage immunogen design (Haynes et al. Nature Biotech. 30: 423-433 (2012)). This method is based on the use of clonal lineage antibody members as templates for design of HIV envelope proteins that bind well to lineage members. This method is based on the principle that those antigens that bind best to naive BnAb B cell receptors (the unmutated ancestors of mature BnAbs) will be the best immunogens for driving such a clonal lineage. Thus, mature antibodies are isolated, their intermediate ancestor and unmutated ancestor precursors inferred, and the clonal lineage tree reconstructed by Bayesian probability statistics and maximum likelihood analysis, and then the tree antibodies are made by recombinant techniques (Haynes et al, Nature Biotech. 30:423-433 (2012)). Then, by screening Envs, or by solving antibody and Env structures and then rational design of Envs that optimally bind to clonal tree members, immunogens are designed and produced for vaccination studies (Haynes et al, Nature Biotech. 30:423-433 (2012)).

[78] Exemplary antibodies of the invention for therapeutic use include those comprising variable heavy (VH) and light (VL) chain amino acid sequences selected from those shown in Example 1. In accordance with the methods of the present invention, either the intact antibody or a fragment thereof can be used. Single chain Fv, bispecific antibodies for T cell engagement, and chimeric antigen receptors can also be used (Chow et al, Adv. Exp. Biol. Med. 746:121-41 (2012)). That is, for example, intact antibody, a Fab fragment, a diabody, or a bispecific whole antibody can be used to inhibit HIV-1 infection in a subject (e.g., a human). A bispecific F(ab)2 can also be used with one arm a targeting molecule like CD3 to deliver it to T cells and the other arm the arm of the native antibody (Chow et al, Adv. Exp. Biol. Med. 746:121-41 (2012)). Toxins that can be bound to the antibodies or antibody fragments described herein include unbound antibody, radioisotopes, biological toxins, boronated dendrimers, and immunoliposomes (Chow et al, Adv. Exp. Biol. Med. 746:121-41 (2012)). Toxins can be conjugated to the antibody or antibody fragment using methods well known in the art (Chow et al, Adv. Exp. Biol. Med. 746:121-41 (2012)). The invention also includes variants of the antibodies (and fragments) disclosed herein, including variants that retain the ability to bind to recombinant Env protein, and methods of using same to, for example, reduce HIV-1 infection risk. Combinations of the antibodies, or fragments thereof, disclosed herein can also be used in the methods of the invention.

[79] The antibodies, and fragments thereof, described above can be formulated as a composition (e.g., a pharmaceutical composition). Suitable compositions can comprise the BnAb (or antibody fragment) dissolved or dispersed in a pharmaceutically acceptable carrier (e.g., an aqueous medium). The compositions can be sterile and can be in an injectable form (e.g., a form suitable for intravenous injection). The antibodies (and fragments thereof) can also be formulated as a composition appropriate for topical administration to the skin or mucosa. Such compositions can take the form of liquids, ointments, creams, gels and pastes. The antibodies (and fragments thereof) can also be formulated as a composition appropriate for intranasal administration. The antibodies (and fragments thereof) can be formulated so as to be administered as a post-coital douche or with a condom. Standard formulation techniques can be used in preparing suitable compositions.

[80] The BnAbs (and fragments thereof) described herein have utility, for example, in settings including the following:

i) in the setting of anticipated known exposure to HIV-1 infection, the antibodies described herein (or fragments thereof) can be administered prophylactically (e.g., IV, topically or intranasally) as a microbicide,

ii) in the setting of known or suspected exposure, such as occurs in the setting of rape victims, or commercial sex workers, or in any homosexual or heterosexual transmission without condom protection, the antibodies described herein (or fragments thereof) can be administered as post-exposure prophylaxis, e.g., IV or topically, and

iii) in the setting of Acute HIV infection (AHI), the antibodies described herein (or fragments thereof) can be administered as a treatment for AHI to control the initial viral load or for the elimination of virus-infected CD4 T cells.

[81] Suitable dose ranges can depend on the antibody (or fragment) and on the nature of the formulation and route of administration. Optimum doses can be determined by one skilled in the art without undue experimentation. Doses of antibodies in the range of 1-50 mg/kg can be used.

[82] In accordance with the invention, the BnAbs (or antibody fragments) described herein can be administered prior to contact of the subject or the subject's immune system/cells with HIV-1 or within about 48 hours of such contact. Administration within this time frame can maximize inhibition of infection of vulnerable cells of the subject with HIV-1.

[83] Antibodies of the invention and fragments thereof can be produced recombinantly using nucleic acids comprising nucleotide sequences encoding VH and VL sequences selected from those shown in Example 1.

[84] Certain aspects of the invention can be described in greater detail in the non-limiting Examples that follow. (See also Provisional Appln. 61/613,222, filed March 20, 2012, Provisional Application No. 61/700,234, filed September 12, 2012, Provisional Application No. 61/700,252, filed September 12, 2012, Provisional Application No. 61/708,466, filed October 1, 2012, Provisional Application No. 61/764,421, filed February 13, 2013, Provisional Application No. 61/542,469, filed October 3, 2011, Provisional Application No. 61/708,413, filed October 1, 2012, Provisional Application No. 61/708,503, filed October 1, 2012, Provisional Application No. 61/806,717, filed March 29, 2013, U.S. Application No. 13/314,712, filed December 8, 2011, PCT/US2012/000442, filed October 3, 2012, PCT/US2013/00210, filed September 12, 2013 and PCT/US2013/059515, filed September 12, 2013, the entire contents of each of which are incorporated herein by reference.)

[85] This application includes a sequence listing filed electronically concurrently herewith.

EXAMPLE 1

[86] This example (Figures 9-21) shows isolation of broadly neutralizing antibody DH175 from a Malawian chronically HIV-1 infected subject with plasma neutralization breadth.

[87] CH0694 is an African female followed in the CHAVI001 protocol from the early acute HIV-1 infection phase to over 4 years post-transmission. During this period, CH0694 developed plasma HIV-1 neutralization breadth. This subject was selected for virus/antibody co-evolution studies.

[88] The goal was to isolate broadly neutralizing antibodies (i.e., consensus C neutralization) from memory B cells using the memory B cell culture system for antibody/virus co-evolution studies. Cell culture supernatants were screened for consensus C neutralization, differential binding of 1086C and 63521 V1/V2 tags vs N156Q/N160Q mutants and RSC3 protein binding.

[89] DH175 is a broadly neutralizing antibody that neutralizes selected tier-2 viruses from multiple clades. DH175 does not appear to recapitulate the observed CH0694 serum neutralization breadth in the TZM-bl assay.

[90] DH175 does not bind to: i) RSC3, ii) linear peptides from Consensus A, B, C, D and M, CRF1 and CRF2; and AE.A244, AE.TH023, B.MN, C.1086, C.TV1 and C.ZM651 HIV-1 strains, iii) Con6, 92TH023, A.9004SS, BaL gp120 Envs and A1.con.env01; B.con.env03, conS, VRC A, VRC B and VRC C gp140 Envs, or iv) B.MN gp41. The DH175 epitope is likely gp120 conformational.

[91] DH175 does not bind to: linear peptides from Consensus A, B, C, D and M, CRF1 and CRF2; and AE.A244, AE.TH023, B.MN, C.1086, C.TV1 and C.ZM651 HIV-1 strains; Con6, 92TH023, A.9004SS, BaL gp120 Envs and A1.con.env01; B.con.env03, conS, VRC A, VRC B and VRC C gp140 Envs; B.MN gp41; RSC3. DH175 bound weakly to consensus C gp120 and binding was abrogated by N332A. The DH175 epitope is likely gp120 conformational and the current hypothesis is that it is a PGT-like antibody.

[92] DH175 neutralizes consensus C (primary culture conC neutralization = 82%). DH175 binds to consensus C gp120 (end-point titer = 25 μg/ml). DH175 binding to consensus C gp120 was abrogated by the N332A mutation in the gp120 Env.

[93] DH175 does not require N332 to neutralize. Provisional neutralization fingerprinting indicates that DH175 clusters more closely to PGT antibodies than to other known BnAb specificities but it is relatively distal from other PGT antibodies. Figures 22-25 show that DH175 binds to Man9 (Figure 22), and that antibody binding depends on the presence of glycans at positions N332 and N234.

[94] DH175 is a 9G4+ antibody. 9G4+ antibodies were originally identified as SLE-induced autoreactive antibodies in SLE subjects (faulty GC censorship of autoreactive B cells). They are intrinsically autoreactive against N-acetyllactosamine moieties and also cross-react with glycolipids including LPS and gangliosides. 9G4+ plasma antibodies have been identified in chronically HIV-1 infected subjects and found to correlate with HIV-1 plasma neutralization breadth.

[95] The sequences of the heavy and light chain (V(D)J rearrangement) of antibody DH175 from the CHAVI subject CH0694 are shown in Figure 27 (CDRs underlined and bolded).

[96] In summary, using the memory B cell culture system, antibody DH175 was isolated from a chronically HIV-1 infected broad neutralizer. With the isolation of autologous virus quasi-species across multiple time points, and the definition of the viral evolution and sites of immune pressure, DH175 will allow the study of virus/antibody co-evolution in a second broad neutralizer.

[97] DH175 is a polyreactive neutralizing antibody that recognizes a conformational HIV-1 epitope. DH175 cross-reacts with lectins. Neutralization fingerprinting provisionally positions DH175 more closely to PGT-like BnAbs than to other known BnAb specificities.

[98] DH175 is a 9G4+ antibody, a common autoreactive idiotype associated with SLE. DH175 may have been generated from the same pool of B cells that also generate autoreactive antibodies during SLE.

[99] Unlike traditional antibody affinity maturation for non-HIV-1 antigens where affinity maturation is associated with loss of polyreactivity, observations suggest that HIV-1 neutralization breadth is frequently associated with the acquisition of autoreactivity.

[100] The DH175 epitope will be further mapped by making mutations using the 694 T/F virus. Clonally related antibodies will be isolated as well as new bnAbs from CH694 from multiple time points (memory B cell cultures, antigen-specific sorts and Illumina pyro sequencing). The contribution of DH175 and DH175 clonally-related antibodies to autologous viral escape will be evaluated.

EXAMPLE 2

[101] From over 1000 half-genome SGA (single genome amplification) sequences in CH0694, identification of at most 100 is sought for pseudovirus construction, for subsequent use in neutralization assays, and 12 for infectious molecular clones. These will be prioritized by genetic analyses that suggest the sets would be representative of evolving forms, and have the potential to yield insight about virus coevolution with adaptive immunity for the intermediate-ancestor vaccine-design strategy.

[102] Choosing 100 representative sequences from 1103 sequences over 26 time-points might be done at random, with roughly 4 sequences per time-point. To the extent that sequence diversity approximates shape-space diversity, and often it reflects immune selection, it is anticipated that it would be advantageous to more systematically define an informative set for baseline analysis. A simple alternative to random selection would select both the most representative and most evolved sequences per time-point, with preference for any monotypic sequences, and include representatives from multiple clades where they persist (post-week 101, henceforth designated "w101", where the lineage diversification begins to resolve into distinct clades that are maintained over time). By testing both the average and leading sequences per time-point, this would enable characterizing the progression of antibody neutralization to understand development of neutralization potency in autologous sera.
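The simple alternative described above, taking both the most representative and the most evolved sequence at each time-point, can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used: it assumes pre-aligned sequences of equal length, reads "most representative" as nearest to the per-time-point consensus and "most evolved" as farthest from the T/F sequence by Hamming distance, and uses hypothetical sequence names and toy data.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatched columns between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def consensus(seqs):
    """Most common character in each alignment column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def pick_per_timepoint(samples, tf):
    """Return {timepoint: (most_representative, most_evolved)} sequence names.

    samples: {timepoint: {name: aligned_sequence}}
    tf: aligned transmitted/founder (T/F) sequence
    Ties are broken by sequence name so the result is deterministic.
    """
    picks = {}
    for tp, seqs in samples.items():
        cons = consensus(list(seqs.values()))
        representative = min(seqs, key=lambda n: (hamming(seqs[n], cons), n))
        evolved = max(seqs, key=lambda n: (hamming(seqs[n], tf), n))
        picks[tp] = (representative, evolved)
    return picks


# Toy data (hypothetical names and residues, for illustration only).
TF = "ACDEF"
SAMPLES = {
    "w005": {"s1": "ACDEF", "s2": "ACDEF", "s3": "ACNEF"},
    "w041": {"s4": "ACNEF", "s5": "ACNEF", "s6": "GCNKF"},
}
picks = pick_per_timepoint(SAMPLES, TF)
```

A real selection would additionally weigh monotypic sequences and clade membership, which this sketch does not attempt.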

[103] A still more elaborate alternative favors sequences that contain mutations away from the T/F form that eventually reach high prevalence ("fixation"). Such sites reveal diverse dynamics: varied rates of spread, reversion to the T/F form, and stable persistence of multiple variants at a site. Insertions and deletions further complicate this approach. Up to w101, a partially stable progression of mutations can be seen, as described below. Thus, another alternative is to isolate the first instances of such mutations, even when rare, to evaluate whether partial neutralization resistance is conferred in the earliest forms, as a new addition to accumulating changes relative to the T/F.
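Locating the first instances of mutations that later reach high prevalence can be sketched along these lines. The sketch assumes aligned amino acid sequences keyed by integer week, treats a frequency of at least 0.95 within any one time-point as "high prevalence" (an illustrative choice of threshold), and uses hypothetical sequence names; indels and recombination are ignored.

```python
from collections import Counter


def first_instances(samples, tf, threshold=0.95):
    """Map (1-based column, residue) -> (first_week, [sequence names]).

    Only non-T/F residues that reach `threshold` frequency in at least one
    time-point are reported, together with the earliest week in which they
    were sampled and the sequences that carried them in that week.
    samples: {week: {name: aligned_sequence}}
    """
    # Pass 1: which non-T/F residues ever reach high prevalence?
    high = set()
    for seqs in samples.values():
        total = len(seqs)
        for col, tf_aa in enumerate(tf):
            counts = Counter(seq[col] for seq in seqs.values())
            for aa, count in counts.items():
                if aa != tf_aa and count / total >= threshold:
                    high.add((col, aa))

    # Pass 2: earliest week (and carriers) for each such residue.
    firsts = {}
    for week in sorted(samples):
        for name in sorted(samples[week]):
            seq = samples[week][name]
            for col, aa in high:
                if seq[col] != aa:
                    continue
                key = (col + 1, aa)
                if key not in firsts:
                    firsts[key] = (week, [name])
                elif firsts[key][0] == week:
                    firsts[key][1].append(name)
    return firsts


# Toy data: N at column 1 is rare at week 5 and fixed by week 41.
TOY_TF = "DKG"
TOY = {
    5: {"a": "DKG", "b": "DKG", "c": "NKG"},
    41: {"d": "NKG", "e": "NKG"},
    64: {"f": "NEG", "g": "NKG"},
}
result = first_instances(TOY, TOY_TF)
```

In the toy data the E at column 2 never exceeds 50% and is therefore not reported, while the rare early carrier of N at column 1 is.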

[104] Not all mutations occur in isolation; they are sampled together with the cumulative progressive mutations that occur with them in vivo. After w101, understanding the progression of escape is complicated by the emergence of distinct clades marking diverse viral lineages.

[105] Finally, a third point of interest is to attempt to include variants in epitope regions for antibody specificities that are evident in the immune sera (e.g., V2-glycan (PG9-like) and V3-glycan (PGT-like) antibodies).

[106] Thirty-five sequences with deletions over 30 nt long, which are presumably not viable, were excluded. Escaping Env sites with over 95% non-T/F form in at least one time-point were identified (Table 1). A five-site linked insertion into V2 is treated as one variant. The 95% threshold was used just to sift out the single sites that ranked the highest in terms of potential for positive selection in the whole of Env; certainly other sites may be relevant to antibody escape that are missed by this criterion.
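The two filtering steps in this paragraph can be sketched as below. This is a simplified illustration under assumptions that are not in the source: a "deletion" is approximated by the longest run of gap characters in an aligned nucleotide sequence (which does not distinguish true deletions from alignment columns opened by insertions in other sequences), and site screening is done on aligned amino acid sequences with hypothetical names.

```python
import re


def longest_gap_run(aligned_nt):
    """Length of the longest run of '-' in an aligned nucleotide sequence."""
    return max((len(run) for run in re.findall(r"-+", aligned_nt)), default=0)


def drop_long_deletions(seqs, max_gap=30):
    """Keep only sequences whose longest gap run is at most `max_gap` nt."""
    return {n: s for n, s in seqs.items() if longest_gap_run(s) <= max_gap}


def escaping_sites(samples, tf, threshold=0.95):
    """1-based alignment positions whose non-T/F frequency exceeds
    `threshold` in at least one time-point.

    samples: {timepoint: {name: aligned_amino_acid_sequence}}
    """
    hits = set()
    for seqs in samples.values():
        total = len(seqs)
        if total == 0:
            continue
        for col, tf_aa in enumerate(tf):
            non_tf = sum(seq[col] != tf_aa for seq in seqs.values())
            if non_tf / total > threshold:
                hits.add(col + 1)
    return sorted(hits)


# Toy data (hypothetical).
nt = {"x": "AC" + "-" * 31 + "GT", "y": "ACG---T"}
kept = drop_long_deletions(nt)
sites = escaping_sites(
    {"w005": {"a": "DKG", "b": "DKG"}, "w041": {"c": "NKG", "d": "NEG"}},
    "DKG",
)
```

Lowering `threshold` admits more sites, consistent with the caveat above that the 95% criterion misses some positions relevant to escape.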

[107] Frequencies of T/F, dominant, and other escape forms per site over time were computed. Amino acid frequencies over time were plotted to summarize the diverse dynamics across sites (Figs. 3 & 8). There are many ways for a T/F amino acid to be lost: complete change to another amino acid (e.g., Fig. 3, 722/D616N, row 1, col 2), serial substitutions (e.g., Fig. 3, 330/D280H V3, row 1, col 3), and complex replacements with multiple variants (e.g., Fig. 3, 229/E185K, row 3, col 3). Variant frequencies per time-point were clustered with hierarchical clustering, to identify positions that vary and the relative progression of these variants. A heatmap depicts the clusters (Fig. 5). A larger graph tracking all variant amino acids at each position is included in Figs. 7 and 8.
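Clustering variant frequency trajectories can be illustrated with a small agglomerative procedure. The source does not state the distance measure or linkage rule used, so Euclidean distance, average linkage and the distance cutoff below are assumptions chosen for illustration, and the trajectories are made-up toy values.

```python
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length frequency trajectories."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def cluster_trajectories(traj, cutoff):
    """Average-linkage agglomerative clustering of trajectories.

    traj: {variant: [frequency at each time-point]}
    Merging stops once the two closest clusters are farther apart than
    `cutoff`, so the number of clusters depends on that threshold.
    """
    clusters = [[name] for name in sorted(traj)]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclidean(traj[a], traj[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > cutoff:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Toy trajectories: two variants that sweep early, two that sweep late.
TRAJ = {
    "early1": [0.0, 1.0, 1.0, 1.0],
    "early2": [0.1, 0.9, 1.0, 1.0],
    "late1": [0.0, 0.0, 0.2, 1.0],
    "late2": [0.0, 0.0, 0.1, 0.9],
}
groups = cluster_trajectories(TRAJ, cutoff=0.5)
```

With a cutoff of 0.5 the toy data separates into an early and a late group; a very small cutoff leaves every variant alone and a very large one merges everything.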

[108] Pixel plots paired with trees were used to review the distribution of polymorphisms per sample. Trees were inferred with PhyML version 3 with the HIVw substitution model.

[109] Distinct epochs before w101 (Greek letters correspond to clusters in Figure 5). The number of clusters varies with a threshold distance. Lacking some objectively defined cutoff, the focus was on cluster patterns that were amenable to interpretation.

[110] Variant nomenclature is as follows: Aln position/ (slash) TF_AA, HXB2 site, Esc_AA.

[111] Here, bold indicates simplest 0->1 constant frequency "turnover" replacements,

underline indicates presence of multiple non-T/F forms to at least 50%,

italics indicate a site that reverts to the T/F form, and

*asterisks indicate sites where dominant amino acids or indels change in time.

α: Seven sites change within the first year:

1. 722/D616N
2. 330/D280H* (this is replaced by D280N at w101).
3. 53/A50T (appears very early, fades in prevalence w122-w194, then returns).
4. 550/K460E (yields at w122)
5. 950/G835E
6. 236/D189N*, which adds a tandem PNG site, i.e., NNSS.
7. 519/N442S*

β: Seven sites change between w041 and w064:

8. 525/D448N, which adds a PNG site
9. 663/R557K
10. 483/D411N
11. 284/K234N
12. 343/G293E
13. 755/K648E
14. 624/M518V

γ: Nine sites change at w064-w101, and tend to persist:

15. 619/A513V
16. 724/S618T
17. 469/G399E
18. 325/K275E
19. 773/R665K
20. 330/D280N*
21. 854/I746V
22. 886/I778L
23. 185/Y151K

δ: Six sites change by w101 but do not persist, either reverting to T/F or being replaced by another mutation. Here, sampling will be shifted to the sequences nearest consensus and most divergent within a time-point.
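The variant labels used in the list above pack four fields into one string (alignment position, T/F amino acid, HXB2 site, escape amino acid). A minimal parser, assuming single-letter residues with "-" standing for an indel and an optional trailing asterisk, might look like this; the `Variant` type and function name are hypothetical.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    aln_pos: int    # position in the alignment
    tf_aa: str      # amino acid in the T/F form ('-' for a gap)
    hxb2_site: int  # corresponding HXB2 site
    esc_aa: str     # escape amino acid ('-' for a deletion)


_LABEL = re.compile(r"(\d+)/([A-Z-])(\d+)([A-Z-])")


def parse_variant(label):
    """Parse a label such as '722/D616N' or '330/D280H*' into its fields."""
    match = _LABEL.fullmatch(label.strip().rstrip("*"))
    if match is None:
        raise ValueError(f"unrecognized variant label: {label!r}")
    aln, tf_aa, site, esc_aa = match.groups()
    return Variant(int(aln), tf_aa, int(site), esc_aa)


example = parse_variant("722/D616N")
```

Labels for insertions relative to the T/F form, such as 142/-134T, parse the same way, with "-" in the T/F field.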

[112] Considerably more diversity evolves after w122 (ε-η), yielding greater complexity in the phylogeny that persists through the end of the sampling period. Rather than single escapes emerging sequentially, diverse new variants appear together in the same sample. Contributions to this complexity include length polymorphisms from indels, greater intervals between sampling, and recombination. Some 16 mutations emerge slowly (a few having first been present but rare very early), until they prevail after w165. Another 5 mutations never become abundant but contribute to diversity in these chronic samples.

[113] Changing the criterion to include variants that reach less than 50% frequency preserves the overall clustering pattern, and adds more rare variants to late-emerging cluster η.

[114] Table 2 lists 100 candidate isolates for pseudoviruses without stop codons or incomplete codons, which represent the within-host virus diversity. The overall 3-prime half-genome nucleotide-sequence phylogeny (Fig. 2) resembles the tree obtained from only the Env translation (Fig. 6). One third of these candidates were sampled within the first two years of infection. If autologous neutralization and virus escape follow the ladder-like progression of emerging variants, this would provide confirmation that they are produced serially by escape from immune selection.
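The isolate counts stated here can be checked against the entries of Table 2. The snippet below transcribes the table's isolate identifiers as strings (the period labels are shortened paraphrases of the table's group headings) and tallies them per sampling period; it is a bookkeeping aid only.

```python
# Isolate identifiers per sampling week, transcribed from Table 2.
TABLE_2 = {
    "before ~6 months": {
        "w005": "20.3, 20.14", "w006": "10.3, 10.6",
        "w007": "15.1, 15.48", "w009": "15.4, 15.32",
        "w013": "30.2, 30.16", "w017": "4.30, 6.2, 6.19, 6.20",
        "w020": "6.10, 6.55", "w028": "10.2, 10.48",
    },
    "6 to 24 months": {
        "w041": "40.72, 40.82",
        "w053": "70.32, 70.38, 70.43",
        "w064": "10.1, 10.4, 10.25",
        "w101": "10.5, 20.6, 20.20, 20.30, 20.45, 20.48, 30.7",
    },
    "years 2 through 3": {
        "w122": "4.11, 4.20, 4.21, 4.3, 5.61, 5.79, 10.84, 10.90, 81.26",
        "w124": "6.9, 6.15, 6.16, 6.20, 6.29, 6.37",
        "w126": "20.35, 20.41, 20.43, 20.51",
        "w130": "3.3, 3.4, 10.21, 10.36, 10.63",
        "w134": "10.7, 20.58, 20.60, 20.65, 20.75",
        "w138": "10.18, 10.47, 10.50, 10.57",
        "w146": "3.2, 3.7, 15.41, 15.53",
    },
    "after year 3": {
        "w165": "10.14, 10.21, 10.30, 10.32, 10.37, 10.40, 10.43, 10.47",
        "w194": "5.11, 5.12, 5.31, 5.39, 5.41",
        "w215": "30.11, 30.21, 30.35, 30.36, 30.40, 30.60, 30.65",
        "w221": "30.23, 40.51, 40.63",
        "w223": "5.13, 5.24, 5.25, 5.61, 5.62, 5.69, 10.82",
    },
}


def count_isolates(weeks):
    """Number of comma-separated isolate identifiers across the given weeks."""
    return sum(len(ids.split(",")) for ids in weeks.values())


counts = {period: count_isolates(weeks) for period, weeks in TABLE_2.items()}
total = sum(counts.values())
first_two_years = counts["before ~6 months"] + counts["6 to 24 months"]
```

The per-period tallies match the n values given in Table 2, and the first two groups together account for 33 of the 100 candidates, i.e., the one third mentioned above.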

[115] The remaining variants represent a series of bifurcations on the phylogeny. Fig. 6 shows that the variants depicted in Fig. 3, based on the 95% loss of the T/F form, are accumulating through this set, and the changes among these sites are well represented by the Envs suggested for experimental testing. It is unclear what drives the bifurcating lineages, and why there seem to be two or more dominant clades on this part of the tree for any given sample time-point. One possibility is a division of the virus population into escape-variant subpopulations recognized by antibody lineages with different specificities. Recombination may also contribute to structuring the tree this way.

[116] Several of the PG9-like V2 glycan and PGT V3 glycan epitope and signature positions are among the set highlighted in Fig. 3. Another 12 sites were included that showed some polymorphisms but not 95% displacement of the T/F virus, in these and CD4 binding-site contact regions. These sites are depicted in Figures 6, 7, and 8. They do not emerge until very late in the study sampling period, presumably owing to their association with expanding heterologous neutralization breadth in mature antibodies.

[117] Mutations that occur in the LTR are of particular interest (in view of interest in CTL activity, expression levels and IFN sensitivity). (See Example 4.)

Table 1. CH0694 variants at sites with less than 5% T/F frequency in at least one sample.

Rank   Variant      Aln pos  HXB2  T/F aa  Esc aa  non-T/F % (1)
1      722/D616N    722      616   D       N       94.3
2      330/D280N    330      280   D       N       93.7
3      525/D448N    525      448   D       N       86.5
4      663/R557K    663      557   R       K       86.4
5      550/K460E    550      460   K       E       85.6
6      950/G835E    950      835   G       E       83.8
7      483/D411N    483      411   D       N       82.0
8      284/K234N    284      234   K       N       81.9
9      53/A50T      53       50    A       T       81.0
10     624/M518V    624      518   M       V       79.7
11     755/K648E    755      648   K       E       79.2
12     236/D189N    236      189   D       N       78.9
13     343/G293E    343      293   G       E       78.9
14     229/E185-    229      185   E       -       77.2
15     568/T465-    568      465   T       -       76.3
16     619/A513V    619      513   A       V       72.9
17     751/D644E    751      644   D       E       72.3
18     387/E336A    387      336   E       A       71.8
19     724/S618T    724      618   S       T       71.5
20     773/R665K    773      665   R       K       70.4
21     469/G399E    469      399   G       E       69.8
22     325/K275E    325      275   K       E       69.4
23     185/Y151K    185      151   Y       K       68.2
24     519/N442S    519      442   N       S       67.5
25     854/I746V    854      746   I       V       65.8
26     449/Y396D    449      396   Y       D       65.7
27     485/T413N    485      413   T       N       64.5
28     886/I778L    886      778   I       L       64.1
29     552/E462G    552      462   E       G       62.6
30     175/S145F    175      145   S       F       62.2
31     34/K33Q      34       33    K       Q       61.6
32     142/-134T    142      134   -       T       56.5
33     138/K131T    138      131   K       T       55.5
34     397/R346K    397      346   R       K       55.3
35     31/V30G      31       30    V       G       53.8
36     3/V3A        3        3     V       A       50.4
37     157/N135-    157      135   N       -       50.3
38     156/-134S    156      134   -       S       50.0
39     413/I360R    413      360   I       R       48.9
40     143/-134A    143      134   -       A       47.6
41     747/G640D    747      640   G       D       47.3
42 (2) 166/-140N    166      140   -       N       46.0
47     165/-140N    165      140   -       N       45.9
48     135/E130N    135      130   E       N       45.5
49     621/L515M    621      515   L       M       43.5
50     331/T281A    331      281   T       A       31.9
51     369/A319T    369      319   A       T       31.0
52     604/E500K    604      500   E       K       28.3

(1) Cumulative percentage of non-T/F variants over sampling period, used for ranking.

(2) Not shown: linked V1 insertion sites 167/-140S, 168/-140T, 169/-140S, 170/-140I.

Table 2. Isolate candidates for pseudovirus construction & autologous NAb assay. (1)

n=18 from before roughly 6 months post-infection, capturing the earliest emergent forms with substitutions that rapidly replaced the T/F virus:

w005: 20.3, 20.14
w006: 10.3, 10.6
w007: 15.1, 15.48
w009: 15.4, 15.32
w013: 30.2, 30.16
w017: 4.30, 6.2, 6.19, 6.20
w020: 6.10, 6.55
w028: 10.2, 10.48

n=15 from 6 to 24 months post-infection:

w041: 40.72, 40.82
w053: 70.32, 70.38, 70.43
w064: 10.1, 10.4, 10.25
w101: 10.5, 20.6, 20.20, 20.30, 20.45, 20.48, 30.7

n=37 from years 2 through 3:

w122: 4.11, 4.20, 4.21, 4.3, 5.61, 5.79, 10.84, 10.90, 81.26
w124: 6.9, 6.15, 6.16, 6.20, 6.29, 6.37
w126: 20.35, 20.41, 20.43, 20.51
w130: 3.3, 3.4, 10.21, 10.36, 10.63
w134: 10.7, 20.58, 20.60, 20.65, 20.75
w138: 10.18, 10.47, 10.50, 10.57
w146: 3.2, 3.7, 15.41, 15.53

n=30 from after year 3:

w165: 10.14, 10.21, 10.30, 10.32, 10.37, 10.40, 10.43, 10.47
w194: 5.11, 5.12, 5.31, 5.39, 5.41
w215: 30.11, 30.21, 30.35, 30.36, 30.40, 30.60, 30.65
w221: 30.23, 40.51, 40.63
w223: 5.13, 5.24, 5.25, 5.61, 5.62, 5.69, 10.82

(1) Env sequences containing premature stops or incomplete codons were not considered.

EXAMPLE 3

[118] Provided herein are non-limiting examples of combinations of antigens derived from CH694 envelope sequences for a swarm immunization. The selection includes priming with a virus which binds to the UCA, for example a T/F virus or another early virus envelope.

[119] Non-limiting embodiments of envelopes selected for swarm vaccination are shown as the selections described below. A skilled artisan would appreciate that a vaccination protocol can include a sequential immunization starting with the "prime" envelope(s) and followed by sequential boosts, which include individual envelopes or combinations of envelopes. In another vaccination protocol, the sequential immunization starts with the "prime" envelope(s) and is followed with boosts of cumulative prime and/or boost envelopes. In certain embodiments, there is some variance in the immunization regimen; in some embodiments, the selection of HIV-1 envelopes may be grouped in various combinations of primes and boosts, either as nucleic acids, proteins, or combinations thereof.
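The difference between the two protocols described above, boosts given on their own versus cumulative boosts, can be made concrete with a small sketch. The envelope names are hypothetical placeholders, and the cumulative variant shown is only one reading of "cumulative prime and/or boost envelopes" (each administration repeats everything given before and adds the next group).

```python
def sequential_schedule(prime, boosts):
    """Prime first, then each boost group administered on its own."""
    return [list(prime)] + [list(group) for group in boosts]


def cumulative_schedule(prime, boosts):
    """Prime first; each later administration contains all envelopes
    given so far plus the next boost group."""
    schedule = [list(prime)]
    for group in boosts:
        schedule.append(schedule[-1] + list(group))
    return schedule


# Hypothetical envelope names, for illustration only.
PRIME = ["Env_TF"]
BOOSTS = [["Env_A"], ["Env_B", "Env_C"]]

sequential = sequential_schedule(PRIME, BOOSTS)
cumulative = cumulative_schedule(PRIME, BOOSTS)
```

Each inner list is one administration; the number of boosts and the intervals between them are left to the practitioner, as the text notes.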

[120] In certain embodiments the immunization includes a prime administered as DNA, and MVA boosts. See Goepfert, et al. 2014; "Specificity and 6-Month Durability of Immune Responses Induced by DNA and Recombinant Modified Vaccinia Ankara Vaccines Expressing HIV-1 Virus-Like Particles" J Infect Dis. 2014 Feb 9. [Epub ahead of print].

[121] Immunization protocols contemplated by the invention include envelope sequences as described herein, including but not limited to nucleic acid and/or amino acid sequences of gp160s, gp150s, cleaved and uncleaved gp140s, gp120s, gp41s, N-terminal deletion variants as described herein, cleavage resistant variants as described herein, or codon optimized sequences thereof. A skilled artisan can readily modify the gp160 and gp120 sequences described herein to obtain these envelope variants. The swarm immunization protocols can be administered in any subject, for example monkeys, mice, guinea pigs, or human subjects.

[122] In non-limiting embodiments, the immunization includes a nucleic acid administered as DNA, for example in a modified vaccinia Ankara (MVA) vector. In non-limiting embodiments, the nucleic acids encode gp160 envelopes. In other embodiments, the nucleic acids encode gp120 envelopes. In other embodiments, the boost comprises a recombinant gp120 envelope. The vaccination protocols include envelopes formulated in a suitable carrier and/or adjuvant, for example but not limited to alum or GLA/SE. In certain embodiments the immunizations include a prime, as a nucleic acid or a recombinant protein, followed by a boost, as a nucleic acid or a recombinant protein. A skilled artisan can readily determine the number of boosts and intervals between boosts. In certain embodiments, the immunization methods can include agents which modulate host immune tolerance responses.

[123] As indicated above, the invention includes the use of "swarm" vaccines as DNA constructs. Envs selected for reagent design as described in Example 2 also represent a rational and representative down-selection for "swarm design" for vaccination, to mimic consequences of evolution in CH0694 for generation of antibodies with greater breadth. It is contemplated that earlier forms of the antibodies in the lineage select for resistant forms of the virus, and that exposure to the resistant forms enables the antibody lineage to evolve to tolerate the resistant mutation. For example, 10-20, or more DNA encoded Envs can be included in a single vaccination.

[124] In the down-selection, the amino acid sites where the transmitted founder (TF) amino acid falls below 5% of the population at one or more time points were selected; generally these mutations go to fixation, or at least the loss of the TF form becomes fixed, although sometimes the TF form is replaced by several different amino acids with differing frequencies over time. These sites are prime candidates for contributing to immune escape, as they are likely to be under very strong positive selective pressure. In addition to these sites, mutations thought to impact V2-glycan antibodies (PG-9 like) and V3-glycan antibodies (PGT-like) through signature analysis, as well as CD4 contact sites, were also tracked in Env, even if the mutations that accrued never fully replaced the TF form.
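The 5% criterion can be illustrated with a minimal sketch. It assumes the sequences from each time point are already aligned to the TF sequence and have equal length; the function name, the toy sequences and the week labels are hypothetical and are not taken from the CH0694 data.

```python
from collections import Counter

def tf_loss_sites(tf_seq, samples_by_week, threshold=0.05):
    """Return 0-based alignment columns where the T/F residue falls below
    `threshold` frequency in at least one sampled time point."""
    sites = set()
    for week, seqs in samples_by_week.items():
        if not seqs:
            continue  # nothing sampled at this time point
        for col, tf_res in enumerate(tf_seq):
            counts = Counter(s[col] for s in seqs)
            if counts[tf_res] / len(seqs) < threshold:
                sites.add(col)
    return sorted(sites)

# Toy alignment (hypothetical): column 1 loses the T/F residue K by "w101".
toy_tf = "MKNT"
toy_samples = {
    "w005": ["MKNT", "MKNT", "MKNT"],
    "w101": ["MRNT", "MRNT", "MRDT"],
}
print(tf_loss_sites(toy_tf, toy_samples))
```

Column 2 is not reported in the toy case because the T/F residue N is still present in two of the three week-101 sequences.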

[125] All of these mutations accrue sequentially throughout the time period the serum antibodies are gaining potency and breadth, and one hundred Env isolates were selected that capture their acquisition (Fig. 6).

[126] Prior to week 101 in subject CH0694, the sequential emergence of the mutations away from the TF can be seen, and these mutations can be tracked through the phylogeny, but between week 64 and week 101 the virus begins to diverge markedly (Fig. 2). It is only after this initial divergence that neutralizing breadth begins to become apparent in subject CH0694's sera (much as in CH0505). There is a natural building of diversity prior to the beginning of breadth, and an expansion of the diversity as breadth increases.

[127] Swarms of these representative variants can be used in vaccines delivered sequentially, to mimic natural infection in CH0694, with a feasible number of variants included in a swarm. For example: start with the T/F alone, then follow up with 5 serial vaccinations, each including 20 variants from the 100 representative forms (e.g., start with the 20 earliest or closest to T/F and the ancestral root of the phylogeny, and then sequentially include later time points and sequences that have evolved further from the T/F, as the vaccination series progresses). An advantage of this strategy is that each variant remains relatively close to a variant the vaccinated individual has seen before, so it is expected that the epitope modifications gradually introduced would not be so drastic as to completely abolish antibody binding but could provide gradual accumulation of differences in different parts of the epitope, allowing antibodies to evolve to develop high affinities to the gradually altered epitope forms.
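The example schedule (T/F alone, then 5 serial vaccinations of 20 variants each) can be sketched as a simple partition of an ordered variant list. This is an illustration only: the function name and the placeholder variant names are hypothetical, and the ordering (earliest or closest to T/F first) is assumed to have been done beforehand.

```python
def swarm_schedule(tf_name, ordered_variants, n_boosts=5):
    """Return a list of immunizations: the T/F prime alone, followed by
    `n_boosts` equal-sized swarms taken in order from `ordered_variants`."""
    if len(ordered_variants) % n_boosts:
        raise ValueError("variants do not divide evenly into the boosts")
    size = len(ordered_variants) // n_boosts
    schedule = [[tf_name]]
    for i in range(n_boosts):
        schedule.append(ordered_variants[i * size:(i + 1) * size])
    return schedule

# Placeholder names standing in for the 100 representative Envs,
# assumed to be sorted from earliest/closest-to-T/F to latest.
placeholder_variants = [f"env_{i:03d}" for i in range(1, 101)]
plan = swarm_schedule("T/F", placeholder_variants)
print([len(step) for step in plan])
```

With 100 variants and 5 boosts, the plan holds one prime followed by five swarms of 20.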

[128] An example of the foregoing strategy is set forth below (see Example 2 and Figure 26 for sequence and designation details):

[129] Table 2 in Example 1 provides another non-limiting embodiment of selection of CH0694 envelopes for swarm immunization. In a non-limiting embodiment, the prime is CH0694 T/F.

[130] Below is a non-limiting example of a selection of groups of envelopes for sequential immunization protocol using a swarm of CH694 envelopes.

[131] Group 1: TF -> week 009, very limited diversification that precedes autologous neutralization

[132] Group 2: Autologous neutralization -> beginning of breadth, steady slow diversification, w013-w101

[133] Group 3: Modest heterologous neutralization, consistent new level of diversity, w122-w146

[134] Group 4: Increase in viral diversity, increase in breadth, w165-w223.

[135] In certain embodiments, CMV-Env DNA is synthesized.

[136] Group 1: Prime: T/F (w005.20.3) or another week 005 sequence that has the amino acid sequence of the TF virus.

n=18 from before roughly 6-months post-infection, capturing the earliest emergent forms with substitutions that rapidly replaced the T/F virus:

w005: 20.13, 20.14

w006: 10.3, 10.6

w007: 15.1, 15.48

w009: 15.4, 15.32

[137] Group 2:

w013: 30.2, 30.16

w017: 4.30, 6.2, 6.19, 6.20

w020: 6.10, 6.55

w028: 10.2, 10.48

w041: 40.72, 40.82

w053: 70.32, 70.38, 70.43

w064: 10.1, 10.4, 10.25

w101: 10.5, 20.6, 20.20, 20.30, 20.45, 20.48, 30.7

[138] Group 3:

n=37 from years 2 through 3:

w122: 4.11, 4.20, 4.21, 4.3, 5.61, 5.79, 10.84, 10.90, 81.26

w124: 6.9, 6.15, 6.16, 6.20, 6.29, 6.37

w126: 20.35, 20.41, 20.43, 20.51

w130: 3.3, 3.4, 10.21, 10.36, 10.63

w134: 10.7, 20.58, 20.60, 20.65, 20.75

w138: 10.18, 10.47, 10.50, 10.57

w146: 3.2, 3.7, 15.41, 15.53

[139] Group 4:

n=30 from after year 3:

w165: 10.14, 10.21, 10.30, 10.32, 10.37, 10.40, 10.43, 10.47

w194: 5.11, 5.12, 5.31, 5.39, 5.41

w215: 30.11, 30.21, 30.35, 30.36, 30.40, 30.60, 30.65

w221: 30.23, 40.51, 40.63

w223: 5.13, 5.24, 5.25, 5.61, 5.62, 5.69, 10.82
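As a quick consistency check on the Group 1 to Group 4 listings, the sketch below tallies how many isolates fall into each group. The per-week counts were tallied by hand from the listings above, the week ranges follow the group definitions, and the names used in the code are hypothetical.

```python
# Number of Env isolates listed per sampling week, tallied from the
# Group 1-4 listings (week number -> isolate count).
ISOLATES_PER_WEEK = {
    5: 2, 6: 2, 7: 2, 9: 2,
    13: 2, 17: 4, 20: 2, 28: 2, 41: 2, 53: 3, 64: 3, 101: 7,
    122: 9, 124: 6, 126: 4, 130: 5, 134: 5, 138: 4, 146: 4,
    165: 8, 194: 5, 215: 7, 221: 3, 223: 7,
}

# Inclusive week ranges taken from the Group 1-4 definitions.
GROUP_WEEKS = {1: (5, 9), 2: (13, 101), 3: (122, 146), 4: (165, 223)}

def group_sizes(per_week, group_weeks):
    """Sum the listed isolates whose sampling week falls in each group."""
    sizes = {}
    for group, (first, last) in group_weeks.items():
        sizes[group] = sum(n for week, n in per_week.items()
                           if first <= week <= last)
    return sizes

sizes = group_sizes(ISOLATES_PER_WEEK, GROUP_WEEKS)
print(sizes, sum(sizes.values()))
```

On these tallies the four groups hold 8, 25, 37 and 30 isolates, 100 in total.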

[140] In another non-limiting example, the CH694 T/F envelope is used as a prime, and a selection of 100 envelopes from CH694 (plasma with PGT-like and V1V2 bnAbs) could be divided into four groups for sequential immunization as follows:

[141] Group 1. 100 CH694 Envs as DNAs and as proteins divided into 4 sequential immunogens. Envs as DNAs are gp120s and as proteins are gp120s. This study is to recreate, as closely as is practical, the evolution of viruses over the time of bnAb development in this person, for induction of PGT and V1V2 bnAbs.

[142] Part 1- 25 Envs as DNAs electroporated in the thigh, and 25 Envs as proteins IM in contralateral thigh as a prime

[143] Part 2- 25 next Envs as DNAs electroporated in the thigh and 25 next Envs as proteins IM in contralateral thigh as first boost

[144] Part 3- 25 next Envs as DNAs electroporated in the thigh and 25 next Envs as proteins IM in contralateral thigh as first boost

[145] Part 4- 25 next Envs as DNAs electroporated in the thigh and 25 next Envs as proteins IM in contralateral thigh as first boost

[146] Group 2. 100 CH694 Envs as DNAs and as proteins (test the effect of having gp41 in the immunogen, and the effect on quality and quantity of Abs of having gp41 in the vaccine).

[147] Immunization 1- immunize prime with transmitted founder Env gp160 as DNA and protein as gp140C

[148] Immunization 2 (next boost) IM Immunize with 3 variants from Part 1 above in addition to TF as DNA gp160 and protein as gp140C

[149] Immunization 3 (next boost) IM Immunize with 3 variants from Part 2 above in addition to TF as DNA gp160 and protein as gp140C

[150] Immunization 4 (next boost) IM Immunize with 3 variants from Part 3 above in addition to TF as DNA gp160 and protein as gp140C

[151] Immunization 5 (next boost) IM Immunize with 3 variants from Part 4 above in addition to TF as DNA gp160 and protein as gp140C

[152] Adjuvant could be IDRI's GLA (TLR4), Quil A and liposome adjuvant (similar to AS01B).

EXAMPLE 4

[153] LTR mutations: There are 4 positions that have a 95% loss of the T/F base at at least one time point in subject 0694. These 4 are surrounded by additional mutations that come and go, as is often seen with CTL escape, but these 4 represent the most dramatic changes over time within such local clusters of mutations.

[154] They are readily visible in the two accompanying CTL figures.

Alignment Position: HXB2 Position: 0694 T/F LTR position

Global frequency, database

115 T->A A 115 115

A: 63.92% (163) T: 35.29% (90) G: 0.78% (2)

380 C->T T 357 360

T: 98.89% (268) other: 1.11% (3)

484 G->A->T T 440 441

T: 99.65% (288) G: 0.35% (1)

548 G->A A 504 505

A: 71.15% (254) G: 24.93% (89) other: 3.92% (14)
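The percentages in the global-frequency rows follow directly from the counts given in parentheses. A minimal sketch of that arithmetic, using the counts for the first position (the function name is hypothetical):

```python
def base_frequencies(counts):
    """Convert raw base counts into percentages rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}

# Counts reported for alignment position 115: A 163, T 90, G 2.
freq_115 = base_frequencies({"A": 163, "T": 90, "G": 2})
print(freq_115)
```

The same function applied to the other three rows reproduces their reported percentages from the listed counts.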

[155] In each case they evolve towards the global consensus. The transmitted founder forms at positions 380 and 484 are both very rare in the data; those at 115 and 548 are less common but not that rare. Only 115 overlaps with the Nef reading frame, shown in blue below. It would be interesting to compare a TF without these mutations, and one with only these mutations, as an IMC to see if they impact expression. The first natural strain to arise that carries all 4 might also be of interest; it is w101.10.6. (Its reading frames are intact, see the end of the file.)

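0694TFe   TGGAAGGGTTAATTTACTCCAAGAAAAGGCAAGAGATCCTTGATTTGTGGGTCTATCACA 60
w101.10.6 ............................................................ 60

0694TFe   CACAAGGCTTCTTCCCTGATTGGCAAAACTACACACCGGGACCAGGGGTCAGATTTCCAC 120
w101.10.6 ......................................................A..... 120

0694TFe   TGACCTTTGGATGGTGCTTCAAGCTAGTGCCAGTCAGCCCAGAGGAAGTAGAAGAGGCCA 180
w101.10.6 ............................................................ 180

0694TFe   ATAAAGGAGAAAACAACTGTCTACTGCACCCTGGGAGCCTGCATGGAATGGAGGATGAAC 240
w101.10.6 ............................................................ 240

0694TFe   ACAGAGAAGTATTAAGATGGAAGTTTGACAGTCAACTAGCACGCAGACACCTGGCCCGCG 300
w101.10.6 ............................................................ 300

0694TFe   AACAACATCCGGAGTATTACAAAGACTGCTGACACAGAAGGGACTTTCCGCTGGGACTTC 360
w101.10.6 ...........................................................T 360

0694TFe   CCACTAGGGGCGTTCCAGGGGAGTGGTCTGGGCGGGACTTGGGAGTGGCCAGCCCTCAGA 420
w101.10.6 ............................................................ 420

0694TFe   TGCTGCATATAAGCAGCTGCGTTTCGCCTGTACTGGGTCTCTCTAGGTAGACCAGATCTG 480
w101.10.6 ....................T....................................... 480

0694TFe   AGCCTGGGAGCTCTCTGGCTATCTGGGGAACCCACTGCT 519
w101.10.6 ........................A.............. 519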

If you wanted to study if these mutations might impact CTL escape in the LTR, as George has been studying, you could use the following for peptide design:

115

AAAACTACACACCGGGACCAGGGGTCAGATTTCCACTGACCTTTGGATGGTGCTTCAAGCTA

AAAACTACACACCGGGACCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTTCAAGCTA

115 RF1:

K T T H R D Q G S D F H *

K T T H R D Q G S D I H *

115 RF2: silent substitution - so NOT CTL selection in this frame

K L H T G T R G Q I S T D L W M V L Q A

K L H T G T R G Q I S T D L W M V L Q A

115 RF3: This is in Nef and in the LTR both:

Q N Y T P G P G V R F P L T F G W C F K L

Q N Y T P G P G V R Y P L T F G W C F K L

357

TGACACAGAAGGGACTTTCCGCTGGGACTTCCCACTAGGGGCGTTCCAGGGGAGTGGTCTG

TGACACAGAAGGGACTTTCCGCTGGGACTTTCCACTAGGGGCGTTCCAGGGGAGTGGTCTG

357 RF1: silent substitution - so NOT CTL selection in this frame

D T E G T F R W D F P L G A F Q G S G L

D T E G T F R W D F P L G A F Q G S G L

357 RF2:

L T Q K G L S A G T S H *

L T Q K G L S A G T F H *

357 RF3:

* H R R D F P L G L P T R G V P G E W S G

* H R R D F P L G L S T R G V P G E W S G

440:

AGCCCTCAGATGCTGCATATAAGCAGCTGCGTTTCGCCTGTACTGGGTCTCTCTAGGTAGAC

AGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTCGCCTGTACTGGGTCTCTCTAGGTAGAC

440 RF1:

* A A A F R L Y W V S L G R

* A A A F R L Y W V S L G R

440 RF2:

Q P S D A A Y K Q L R F A C T G S L *

Q P S D A A Y K Q L L F A C T G S L *

440 RF3:

S P Q M L H I S S C V S P V L G L S R *

S P Q M L H I S S C F S P V L G L S R *

504:

GATCTGAGCCTGGGAGCTCTCTGGCTATCTGGGGAACCCACTGCT

GATCTGAGCCTGGGAGCTCTCTGGCTATCTAGGGAACCCACTGCT

504 RF1:

D L S L G A L W L S G E P T A

D L S L G A L W L S R E P T A

504 RF2: silent substitution - so NOT CTL selection in this frame

* A W E L S G Y L G N P L

* A W E L S G Y L G N P L

504 RF3: early stop codon introduced:

R S E P G S S L A I W G T H C

R S E P G S S L A I *

115 Rev Comp: Overlapping with Nef ORF

CTAGCTTGAAGCACCATCCAAAGGTCAGTGGAAATCTGACCCCTGGTCCCGGTGTGTAGTTTT

CTAGCTTGAAGCACCATCCAAAGGTCAGTGGATATCTGACCCCTGGTCCCGGTGTGTAGTTTT

115 Rev Comp RF1:

* S T I Q R S V E I *

* S T I Q R S V D I *

115 Rev Comp RF2:

* L E A P S K G Q W K S D P W S R C V V L

* L E A P S K G Q W I S D P W S R C V V L

115 Rev Comp RF3:

T S L K H H P K V S G Y L T P G P G V *

T S L K H H P K V S G N L T P G P G V *

357 Rev Comp

CCAGACCACTCCCCTGGAACGCCCCTAGTGGGAAGTCCCAGCGGAAAGTCCCTTCTGTGTCAG

CCAGACCACTCCCCTGGAACGCCCCTAGTGGAAAGTCCCAGCGGAAAGTCCCTTCTGTGTCAG

357 Rev Comp RF1:

* W E V P A E S P F C V S

* W K V P A E S P F C V S

357 Rev Comp RF2: silent substitution - so NOT CTL selection in this frame

P R P L P W N A P S G K S Q R K V P S V S

P R P L P W N A P S G K S Q R K V P S V S

357 Rev Comp RF3:

P D H S P G T P L V G S P S G K S L L C Q

P D H S P G T P L V E S P S G K S L L C Q

440 Rev Comp

GGTCTACCTAGAGAGACCCAGTACAGGCGAAACGCAGCTGCTTATATGCAGCATCTGAGGGCT

GGTCTACCTAGAGAGACCCAGTACAGGCGAAAAGCAGCTGCTTATATGCAGCATCTGAGGGCT

440 Rev Comp RF1:

* R D P V Q A K R S C L Y A A S E G

* R D P V Q A K S S C L Y A A S E G

440 Rev Comp RF2:

G L P R E T Q Y R R N A A A Y M Q H L R A

G L P R E T Q Y R R K A A A Y M Q H L R A

440 Rev Comp RF3:

V Y L E R P S T G E T Q L L I C S I *

V Y L E R P S T G E K Q L L I C S I *

504 Rev Comp

AGCAGTGGGTTCCCCAGATAGCCAGAGAGCTCCCAGGCTCAGATCTG

AGCAGTGGGTTCCCTAGATAGCCAGAGAGCTCCCAGGCTCAGATCTG

504 Rev Comp RF1: too short

S G F P R *

S G F P R *

504 Rev Comp RF2:

A V G S P D S Q R A P R L R S G

A V G S L D S Q R A P R L R S G

504 Rev Comp RF3:

Q W V P Q I A R E L P G S D L V

Q W V P *

[156] To test whether the 4 mutations that are most strongly selected in the LTR are CTL susceptible, you would need to test the 18 peptides listed below. This fully scans the regions bounding these mutations in each of the 6 reading frames, including the reverse complement. Included are all peptides bounding these sites i) that don't run into a stop codon, and ii) where the change that goes to fixation is not silent in that frame. Some are short because they run up against a stop codon.

115 RF1: KTTHRDQGSDFH KTTHRDQGSDIH

115 RF3*: NYTPGPGVRFPLTFGWCFK NYTPGPGVRYPLTFGWCFK

357 RF2: LTQKGLSAGTSH LTQKGLSAGTFH

357 RF3: HRRDFPLGLPTRGVPGEWS HRRDFPLGLSTRGVPGEWS

440 RF2: QPSDAAYKQLRFACTGSL QPSDAAYKQLLFACTGSL

440 RF3: PQMLHISSCVSPVLGLSR PQMLHISSCFSPVLGLSR

504 RF1: DLSLGALWLSGEPTA DLSLGALWLSREPTA

504 RF3: RSEPGSSLAIWGTHC RSEPGSSLAI**

115-RC-RF1: STIQRSVEI STIQRSVDI

115-RC-RF2: LEAPSKGQWKSDPWSRCVV LEAPSKGQWISDPWSRCVV

115-RC-RF3: LKHHPKVSGYLTPGPGV LKHHPKVSGNLTPGPGV

357-RC-RF1: WEVPAESPFCVS WKVPAESPFCVS

357-RC-RF3: DHSPGTPLVGSPSGKSLLC DHSPGTPLVESPSGKSLLC

440-RC-RF1: RDPVQAKRSCLYAASEG RDPVQAKSSCLYAASEG

440-RC-RF2: LPRETQYRRNAAAYMQHLR LPRETQYRRKAAAYMQHLR

440-RC-RF3: YLERPSTGETQLLICSI YLERPSTGEKQLLICSI

504-RC-RF2: AVGSPDSQRAPRLRSG AVGSLDSQRAPRLRSG

504-RC-RF3: QWVPQIARELPGSDLV

* This is in Nef and in the LTR both

** introduces a stop codon.
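The six-frame scan described above can be sketched for one window. The example below translates the T/F and mutant sequences given for position 115 in all three forward and all three reverse-complement frames, and keeps only the frames in which the substitution is not silent. The function names are hypothetical, the frame offsets are counted from the start of the window (so they do not correspond to the RF1 to RF3 labels used above), and translation runs through stop codons rather than trimming peptides at them.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq, offset):
    """Translate every complete codon starting at `offset`."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(offset, len(seq) - 2, 3))

def non_silent_frames(tf, mut):
    """Map (strand, offset) -> (T/F peptide, mutant peptide) for each of the
    six frames in which the two translations differ."""
    hits = {}
    strands = {"fwd": (tf, mut),
               "rc": (reverse_complement(tf), reverse_complement(mut))}
    for strand, (a, b) in strands.items():
        for offset in range(3):
            pep_a, pep_b = translate(a, offset), translate(b, offset)
            if pep_a != pep_b:
                hits[(strand, offset)] = (pep_a, pep_b)
    return hits

# Window around position 115, T/F first and mutant second, as given above.
TF_115 = "AAAACTACACACCGGGACCAGGGGTCAGATTTCCACTGACCTTTGGATGGTGCTTCAAGCTA"
MUT_115 = "AAAACTACACACCGGGACCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTTCAAGCTA"

hits = non_silent_frames(TF_115, MUT_115)
for (strand, offset), (tf_pep, mut_pep) in sorted(hits.items()):
    print(strand, offset, tf_pep, mut_pep)
```

For this window the scan reports five non-silent frames (two forward, three reverse complement), which matches the five position-115 entries in the peptide list; the remaining forward frame is the silent one.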

[157] The HLA for this individual is:

A*30:01

A*74:01/02/03/09

B*42:02

B*15:10

Cw*17:01/02/03/04

Cw*03:04/06/09/19/23/24/26/32/37/44/46/48/54/57/63/64/72/73/74/77/78

[158] IEDB predicts a number of possible HLA-binding peptides within these fragments; because epitope prediction and processing are uncertain in such cases, it would be best simply to test them for T-cell activity.

[159] CH0694.3.w101.10.6-env has many differences relative to the TF virus, as do all w102 Env, but it has all 4 changes of interest in the LTR and no others within the LTR, so it could be an interesting virus for comparing phenotypes.

>CH0694.3.w101.10.6-env

MRVRGIPRN-YQHWWTWGILGFWMLMICNAG-EQ--RWVTVYYGVPVWKE

AKTTLFCASDAKAYEKEAHNVWATHACVPTDPSPQELV-LENVTENFNMW

K DMVEQMHEDI ISLWDQSL-KPCVKLTPLCVTLECTKANFT

NSTSS NNSTYS NNTM-K-EEMKNCSFNATTEI

RDKQ-KKMYAL- FYKLDIVPL KEDKNNSNNSNKYILINCNTST

IAQACPKISFDPIPIHYCAPAGYAILKCNNKTFNGTGPCSNVSTVQCTHG

IKPVVSTQLLLNGSLAEEEIIIRSENLTDNTKTIIVHLNESVEIKCTRPG

NNTRQ--SVRIGPGQTFYATGDIIGDIRQAHCNISEGKWNATLLKVRE-K

LAEHF-PNKTI -RFNSSAGGDLEITTHTFICGGEFFYCNTSGLFNRTYYA

N ANETAK--YVN STNGN ITLQCRIKQF

INMWQRVGRAMYAPPIAGSITCRSNITGLL LTRDGG E NGT ETFRPGGGDMRDNWRSELYKYKVVEIKPL-GI

APTEAK-RRVVEREK-RAVGLGAVLLGFLGAAGSTMGAASITLTVQARQL

LSGIVQQQSNLLKAIEAQHHLLQLTVWGIKQLQARVLAIERYLKDQQLLG

LWGCSGKLICTTTVPWNSSWSNKTLEEIWGNMTWMQW-DREVSNYTDIIY

ELLEESQ-NQQEKNEQDLLALDKWNSLWNWFDITNWLWYIKIFIMIVGGL

IGLRIVFAILSLVNRVRQGYSPLSLQTLIPNQRGPDRPGGIEEEGGE DR

GRSVRLVSGFLALA DDLRSLCLFSYHRLRDLLLILARAVELLGHSILRS

LQRGWEVLKYLGNLVQY GLELKKSAISLLDTIAIAVAEGTDRIIE

F-CRAIYNIPTRI -RRGFEAALQ*

>CH0694.3.w101.10.6-env.not-TF

P G-.Q

T

T

- S Y.. K

. NNSN

N

E.. .. N E.K

G R

N .A.E N.N

....S N.. E

.G

.... V.... V...

.. K..

N.T... D...

E... E.. K....

. V. F

>CH0694.3.w101.10.6-nef

MGNKWSK G PTIRERMRQTDPAAEGVGAASQDLDRRGALTTSNTAQ

TNAACAWLEAQEEEGEVGFPVRPQVPLRPMTFKAAFDLSFFLKEKGGLEG

LIYSKKRQEILDLWVYHTQGFFPDWQNYTPGPGVRYPLTFGWCFKLVPVS

PEEVEEANKGENNCLLHPGSLHGMEDEHREVLRWKFDSQLARRHLAREQH

PEYYKDC*

>CH0694.3.w101.10.6-nef.not-TF

I ... D....

E... ... R ... F.. . F

.. Y

>CH0694.3.w101.10.6-pol

PI KGPAKLLWKGEGAVVIQDNSDIKVVPRR-KAKI IRDYGKQMAGADCV AGRQDEDQ

>CH0694.3.w101.10.6-pol.not-TF

>CH0694.3.w101.10.6-rev

MAGRSGDSDAALLQAVRTIKILYQSNPYPKPEGTRQARRNRRRRWRKRQR

QIRALSERILSTCLGRPTEPVPFQLPPIERLAINSSESSGTSGTQHSQEP

AEGVGSP*VSGKPCAVLGSGTKKE

>CH0694.3.w101.10.6-rev.not-TF

T

.. RA S

>CH0694.3.w101.10.6-tat

MDPVDPNLEPWNHPGSQPNTACNKCYCKRCCYHCSVCFLTKGLGISYGRK

KRRQRRSSPPSSEDHQNPLSKQPLSQTRGDPTGPEESKKKVEKKTEADPC

A*

>CH0694.3.w101.10.6-tat.not-TF

K s

c

>CH0694.3.w101.10.6-vif

MENKWQGLIVWQVDRMRIRTWNSLVKHHMYVSKKTTG FYRHHYESRHPR

VSSEVHIPIGDARLVIITYWGLQTGERDWHLGNGVSIEWRLRRYSTQVDP

GLADQLIHMYYFDCFADSAIRRAILGEIVSPRCEYPAGHNQVGSLQYLAL

TALIKPK-KRKPPLPSVRK-LVEDR NKPQKTRGRRGNHTMNGH*

>CH0694.3.w101.10.6-vif.not-TF

I N E

R

>CH0694.3.w101.10.6-vpr

MEQAPEDQGPQREPYNEWALELLEELKQEAVRHFPRP LHGLGQYIYETY

GDTWTGVEAIIRVLQQLLFIH-FRIGCRHSRIGILRQRRARNGSSRS *

>CH0694.3.w101.10.6-vpr.not-TF

R

>CH0694.3.w101.10.6-vpu

HEMIDFLARVDYRLGVGALIVALILAI I WTIAYLEYRKVVRQRKIDWLI

ERIRERAEDSGNESEGDTEE-LSTLVDMGHLRLLDVNDL*

>CH0694.3.w101.10.6-vpu.not-TF

.... D V.R
