

Title:
POLYPEPTIDE CONSTRUCTS WITH NOVEL BINDING AFFINITY AND USES THEREOF
Document Type and Number:
WIPO Patent Application WO/2022/187367
Kind Code:
A1
Abstract:
The present disclosure relates generally to polypeptide constructs, and particularly relates to T-cell receptor (TCR) constructs having binding affinity for a specific cognate antigen. The disclosure also provides compositions and methods useful for producing such constructs as well as methods for the diagnosis, prevention, and/or treatment of conditions associated with cells expressing the cognate antigen recognized by the polypeptide constructs.

Inventors:
CHIOU SHIN-HENG (US)
TSENG DIANE (US)
MACKALL CRYSTAL L (US)
DAVIS MARK M (US)
Application Number:
PCT/US2022/018529
Publication Date:
September 09, 2022
Filing Date:
March 02, 2022
Assignee:
UNIV LELAND STANFORD JUNIOR (US)
International Classes:
C07K14/725; A61K38/03; A61P35/00; C07K7/08; C07K16/28
Domestic Patent References:
WO2019075385A1 2019-04-18
Foreign References:
US20150104441A1 2015-04-16
Attorney, Agent or Firm:
GOTTFRIED, Lynn F et al. (US)
Claims:
CLAIMS

WHAT IS CLAIMED IS:

1. A construct comprising at least one complementarity determining region (CDR) having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106.

2. The construct of claim 1, wherein the at least one CDR has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-56.

3. The construct of claim 1, wherein the at least one CDR has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 57-106.

4. The construct of any one of claims 1 to 3, wherein the construct is a single-chain construct or a double-chain construct.

5. The construct of any one of claims 1 to 4, wherein the construct is selected from the group consisting of: (a) a T cell receptor (TCR); (b) an antibody; and (c) a functional derivative or fragment of (a) or (b).

6. The construct of any one of claims 1 to 5, wherein the construct is a TCR construct comprising a TCR alpha chain and a TCR beta chain operably linked to each other.

7. The construct of any one of claims 1 to 6, wherein the construct is a TCR construct comprising in its beta chain a CDR3 having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-106.

8. The construct of claim 7, wherein the construct further comprises in its alpha chain a CDR3α sequence.

9. The construct of any one of claims 1 to 8, wherein the construct is an antibody construct selected from the group consisting of an antigen-binding fragment (Fab), a single-chain variable fragment (scFv), a nanobody, a single domain antibody (sdAb), a VH domain, a VL domain, a VHH domain, a diabody, or a functional fragment of any thereof.

10. A recombinant nucleic acid comprising a nucleic acid sequence encoding a construct according to any one of claims 1 to 9.

11. The nucleic acid of claim 10, wherein the nucleic acid sequence is operably linked to a heterologous nucleic acid sequence.

12. The nucleic acid of any one of claims 10 to 11, wherein the nucleic acid molecule is further configured as an expression cassette or a vector.

13. The nucleic acid of claim 12, wherein the vector is a plasmid vector or a viral vector.

14. The nucleic acid of claim 13, wherein the viral vector is derived from a lentivirus, an adenovirus, an adeno-associated virus, a baculovirus, or a retrovirus.

15. An engineered cell comprising: a) a construct according to any one of claims 1 to 9; and/or b) a recombinant nucleic acid according to any one of claims 10 to 14.

16. The engineered cell of claim 15, wherein the engineered cell is a eukaryotic cell.

17. The engineered cell of claim 16, wherein the eukaryotic cell is a mammalian cell.

18. The engineered cell of claim 17, wherein the mammalian cell is a human cell.

19. The engineered cell of any one of claims 15 to 18, wherein the cell is an immune cell.

20. The engineered cell of claim 19, wherein the immune cell is a B cell, a monocyte, a natural killer (NK) cell, a natural killer T (NKT) cell, a basophil, an eosinophil, a neutrophil, a dendritic cell, a macrophage, a regulatory T cell, a helper T cell (TH), a cytotoxic T cell (TCTL), a memory T cell, a gamma delta (γδ) T cell, another T cell, a hematopoietic stem cell, or a hematopoietic stem cell progenitor.

21. The engineered cell of claim 20, wherein the immune cell is a lymphocyte.

22. The engineered cell of claim 21, wherein the lymphocyte is a T lymphocyte or a T lymphocyte progenitor.

23. The engineered cell of claim 22, wherein the T lymphocyte is a CD4+ T cell or a CD8+ T cell.

24. The engineered cell of any one of claims 22 to 23, wherein the T lymphocyte is a CD8+ T cytotoxic lymphocyte cell selected from the group consisting of naive CD8+ T cells, central memory CD8+ T cells, effector memory CD8+ T cells, effector CD8+ T cells, CD8+ stem memory T cells, and bulk CD8+ T cells.

25. The engineered cell of any one of claims 22 to 23, wherein the T lymphocyte is a CD4+ T helper lymphocyte cell selected from the group consisting of naive CD4+ T cells, central memory CD4+ T cells, effector memory CD4+ T cells, effector CD4+ T cells, CD4+ stem memory T cells, and bulk CD4+ T cells.

26. A method for making an engineered cell, comprising: a) providing a host cell capable of protein expression; and b) transducing the provided host cell with a recombinant nucleic acid according to any one of claims 10 to 14 to produce an engineered cell.

27. An engineered cell produced by a method according to claim 26.

28. A cell culture comprising at least one engineered cell of any one of claims 15-25 and 27, and a culture medium.

29. A pharmaceutical composition comprising a pharmaceutically acceptable carrier and: a) a construct according to any one of claims 1 to 9; b) a recombinant nucleic acid according to any one of claims 10 to 14; and/or c) an engineered cell according to any one of claims 15-25 and 27.

30. The pharmaceutical composition of claim 29, wherein the composition comprises a recombinant nucleic acid according to any one of claims 10 to 14, and a pharmaceutically acceptable carrier.

31. The pharmaceutical composition of claim 30, wherein the recombinant nucleic acid is encapsulated in a viral capsid or a lipid nanoparticle.

32. The pharmaceutical composition of claim 29, wherein the composition comprises an engineered cell according to any one of claims 15-25 and 27.

33. A method for the prevention and/or treatment of a condition in a subject in need thereof, the method comprising administering to the subject a composition comprising: a) a construct according to any one of claims 1 to 9; b) a recombinant nucleic acid according to any one of claims 10 to 14; c) an engineered cell according to any one of claims 15-25 and 27; and/or d) a pharmaceutical composition according to any one of claims 29 to 32.

34. The method of claim 33, wherein the condition is associated with an immune checkpoint blockade.

35. The method of any one of claims 33 to 34, wherein the method is for a checkpoint blockade immunotherapy.

36. The method of claim 35, wherein the checkpoint blockade immunotherapy is an anti-PD1 checkpoint therapy or an anti-PD-L1 checkpoint therapy.

37. The method of any one of claims 33 to 36, wherein the condition is associated with a lung cancer selected from the group consisting of adenocarcinoma, squamous cell carcinoma, small cell carcinoma, non-small cell carcinoma, adenosquamous carcinoma, small cell lung cancer, large cell carcinoma, neuroendocrine cancers of the lung, non-small cell lung cancer (NSCLC), undifferentiated non-small cell carcinoma, non-small cell carcinoma not otherwise specified, pulmonary squamous cell carcinoma, broncho-alveolar carcinoma, sarcomatoid carcinoma, pleomorphic carcinoma, carcinosarcoma, pulmonary blastoma, metastatic carcinoma of unknown primary, primary pulmonary lymphoepithelioma-like carcinoma, and benign neoplasms of the lung.

38. The method of claim 37, wherein the lung cancer is a NSCLC selected from the group consisting of squamous cell carcinoma, adenocarcinoma, large cell carcinoma, carcinoid tumor, pleomorphic, salivary gland cancer, adenosquamous, sarcomatoid, and unclassified carcinomas.

39. The method of claim 38, wherein the NSCLC comprises stage I NSCLC or stage II NSCLC.

40. The method of any one of claims 37 to 39, wherein the cancer is a non-metastatic cancer, a metastatic cancer, a multiply drug resistant cancer, or a recurrent cancer.

41. The method of claim 40, wherein the administered composition inhibits tumor growth or metastasis of the cancer in the subject.

42. The method of claim 41, wherein the condition is a malignancy associated with a viral infection.

43. The method of claim 42, wherein the condition is a malignancy associated with an infection by Epstein-Barr virus (EBV).

44. The method of claim 43, wherein the malignancy is associated with an EBV infection and is selected from the group consisting of Hodgkin lymphoma, Burkitt lymphoma, diffuse large B cell lymphoma, nasopharyngeal carcinoma, gastric carcinoma, post-transplant lymphoproliferative disease, B lymphoproliferative disease, T/NK lymphoproliferative disease, T/NK lymphomas/leukemias, leiomyosarcomas, and lymphoepithelioma-like carcinomas.

45. The method of any one of claims 33 to 44, wherein the subject is a mammal.

46. The method of claim 45, wherein the mammal is a human.

47. The method of any one of claims 33 to 46, wherein the composition is administered to the subject individually as a first therapy or in combination with at least one additional therapy.

48. The method of claim 47, wherein the at least one additional therapy is selected from the group consisting of chemotherapy, radiotherapy, immunotherapy, hormonal therapy, toxin therapy, targeted therapy, and surgery.

49. The method of any one of claims 47 to 48, wherein the at least one additional therapy is selected from the group consisting of an anti-CTLA4 antibody, an anti-PD-1 antibody, an anti-PD-L1 antibody, an anti-CD20 antibody, an anti-CD40 antibody, an anti-DR5 antibody, an anti-CD1d antibody, an anti-TIM3 antibody, an anti-SLAMF7 antibody, an anti-KIR receptor antibody, an anti-OX40 antibody, an anti-HER2 antibody, an anti-ErbB-2 antibody, an anti-EGFR antibody, cetuximab, rituximab, trastuzumab, pembrolizumab, radiotherapy, single dose radiation, fractionated radiation, focal radiation, whole organ radiation, IL-12, IFNα, GM-CSF, a chimeric antigen receptor, adoptively transferred T cells, an anti-cancer vaccine, and an oncolytic virus.

50. The method of any one of claims 47 to 49, wherein the first therapy and the at least one additional therapy are administered concomitantly.

51. The method of any one of claims 47 to 50, wherein the first therapy is administered at the same time as the at least one additional therapy.

52. The method of any one of claims 47 to 50, wherein the first therapy and the at least one additional therapy are administered sequentially.

53. The method of claim 52, wherein the first therapy is administered before the at least one additional therapy.

54. The method of claim 52, wherein the first therapy is administered after the at least one additional therapy.

55. The method of claim 52, wherein the first therapy is administered before and/or after the at least one additional therapy.

56. The method of any one of claims 47 to 55, wherein the first therapy and the at least one additional therapy are administered in rotation.

57. The method of any one of claims 47 to 48, wherein the first therapy and the at least one additional therapy are administered together in a single formulation.

58. A kit for the diagnosis, prevention, and/or treatment of a condition in a subject in need thereof, the kit comprising: a) a construct according to any one of claims 1 to 9; b) a recombinant nucleic acid according to any one of claims 10 to 14; c) an engineered cell according to any one of claims 15-25 and 27; and/or d) a pharmaceutical composition according to any one of claims 29 to 32.

59. A method for obtaining a construct according to claim 1, the method comprising: a) identifying a plurality of T cell receptors (TCRs) associated with a health condition; b) determining a sequence of a CDR3 present in each of the identified TCRs; and c) making a construct comprising a CDR3 sequence determined in (b).

60. The method of claim 59, further comprising identifying one or more cognate antigens commonly recognized by the CDR3 sequences.

Description:
POLYPEPTIDE CONSTRUCTS WITH NOVEL BINDING AFFINITY AND USES THEREOF

CROSS-REFERENCE TO RELATED APPLICATIONS

[001] This application claims priority to U.S. Provisional Patent Application No. 63/156,026, filed March 3, 2021, the disclosure of which is incorporated by reference herein in its entirety, including any drawings.

STATEMENT REGARDING FEDERALLY SPONSORED R&D

[002] This invention was made with government support under grant no. U54 CA232568-01 awarded by The National Cancer Institute. The government has certain rights in the invention.

FIELD

[003] The present disclosure relates generally to the field of immunology, and particularly relates to polypeptide constructs having binding affinity for a specific antigen. The disclosure also provides compositions and methods useful for producing such constructs as well as methods for the diagnosis, prevention, and/or treatment of health conditions associated with cells expressing the cognate antigen recognized by the polypeptide constructs.

BACKGROUND

[004] In recent years, the wide use of immune checkpoint blockade and T cell-based immunotherapies to treat patients with solid tumors has created a need for a deeper understanding of T cell specificities in cancer. For example, T-cell receptors (TCRs) have emerged as a promising approach for immunotherapy and have made headlines in clinical trials conducted by a number of pharmaceutical and biotechnology companies. TCRs have been shown to have therapeutic and diagnostic potential and can be modified similarly to antibody molecules. In particular, the affinity of TCRs for a specific antigen makes them valuable for various therapeutic strategies, including adoptive immunotherapy.

[005] However, despite the widespread use of immunotherapies for treating cancer, general understanding of T cell specificities in cancer is limited. For example, antigen specificity is the key determinant of T cell function, but challenges posed by TCR diversity and human leukocyte antigen (HLA) allele polymorphism have been major obstacles to understanding the full scope of antigens recognized by tumor-infiltrating T cells. In addition, the specificities of the vast majority of tumor-infiltrating T cells remain unknown across all solid tumors despite the availability of advanced technologies for profiling T cell states and repertoires using single-cell sequencing techniques. This is largely due to the absence of tools for analyzing diverse TCR repertoires in the context of highly polymorphic HLA alleles. For example, while next-generation sequencing technologies have made the sequencing of large numbers of TCRs relatively straightforward and inexpensive, a major problem revolves around how these very large repertoires can be analyzed. This is because there can be hundreds or thousands of possible TCR sequences for the same peptide-MHC specificity.

[006] Accordingly, uncovering the specificities of tumor-infiltrating T cells is important for understanding how T cell-intrinsic factors shape tumor-immune system interactions and impact therapies aimed at harnessing T cell responses against cancer.

SUMMARY

[007] The present disclosure relates generally to the field of immunology. More particularly, provided herein are novel polypeptide constructs having binding affinity for a specific antigen. The disclosure also provides compositions and methods useful for producing such polypeptide constructs as well as methods for the diagnosis, prevention, and/or treatment of conditions associated with cells expressing the cognate antigen recognized by the polypeptide constructs. In particular, also provided are recombinant cells such as lymphocyte T cells that have been engineered to express a polypeptide construct as disclosed herein and are directed against a cell of interest such as a cancer cell.

[008] In one aspect, provided herein are various constructs including at least one complementarity determining region (CDR) having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106.
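The percent sequence identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only: this section does not specify an alignment method, so the global (Needleman-Wunsch) alignment, the scoring values, the choice of alignment length as the denominator, and the function names `percent_identity` and `meets_threshold` are all assumptions made for illustration. The two sequences are those given for SEQ ID NO: 64 and SEQ ID NO: 18 in the description of FIG. 2C.

```python
def percent_identity(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a global alignment of a and b.

    Assumption: identity = identical columns / total alignment columns.
    Other conventions (e.g. dividing by the shorter length) give other values.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal alignment, counting identical columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * identical / columns


def meets_threshold(candidate: str, reference: str, threshold: float = 70.0) -> bool:
    """True if candidate has at least `threshold` percent identity to reference."""
    return percent_identity(candidate, reference) >= threshold


seq_64 = "CASSTGDSNQPQHF"  # SEQ ID NO: 64, as given in the description of FIG. 2C
seq_18 = "CASSARTGELFF"    # SEQ ID NO: 18, as given in the description of FIG. 2C
print(round(percent_identity(seq_64, seq_18), 1))
```

Because the result depends on the alignment parameters and on the denominator chosen, any practical comparison against the recited thresholds would need to state the method used.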

[009] Non-limiting exemplary embodiments of the disclosed constructs can include one or more of the following features. In some embodiments, the at least one CDR has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-56. In some embodiments, the at least one CDR has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 18. In some embodiments, the at least one CDR has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 57-106. In some embodiments, the at least one CDR has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 64. In some embodiments, the construct is a single-chain construct or a double-chain construct. In some embodiments, the construct is selected from the group consisting of: (a) a T cell receptor (TCR); (b) an antibody; and (c) a functional derivative or fragment of (a) or (b). In some embodiments, the construct is a TCR construct including a TCR alpha chain and a TCR beta chain operably linked to each other. In some embodiments, the construct is a TCR construct including in its beta chain a CDR3 having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-106. In some embodiments, the CDR3 has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-56. In some embodiments, the CDR3 has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 18. In some embodiments, the CDR3 has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 57-106. In some embodiments, the CDR3 has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 64. In some embodiments, the construct further includes in its alpha chain a CDR3α sequence.

[0010] In some embodiments, the construct disclosed herein is an antibody construct selected from the group consisting of an antigen-binding fragment (Fab), a single-chain variable fragment (scFv), a nanobody, a single domain antibody (sdAb), a VH domain, a VL domain, a VHH domain, a diabody, or a functional fragment of any thereof.

[0011] In another aspect, provided herein are recombinant nucleic acids, wherein the nucleic acids include a nucleic acid sequence encoding a construct of the disclosure.

[0012] Non-limiting exemplary embodiments of the disclosed nucleic acids can include one or more of the following features. In some embodiments, the nucleic acid sequence is operably linked to a heterologous nucleic acid sequence. In some embodiments, the nucleic acid molecule is further configured as an expression cassette or an expression vector. In some embodiments, the vector is a plasmid vector or a viral vector. In some embodiments, the viral vector is derived from a lentivirus, an adenovirus, an adeno-associated virus, a baculovirus, or a retrovirus.

[0013] In another aspect, some embodiments of the disclosure relate to engineered cells that include one or more of: (a) a construct of the disclosure and/or (b) a recombinant nucleic acid of the disclosure. Non-limiting exemplary embodiments of the disclosed cells can include one or more of the following features. In some embodiments, the engineered cell is a eukaryotic cell. In some embodiments, the eukaryotic cell is a mammalian cell. In some embodiments, the mammalian cell is a human cell. In some embodiments, the cell is an immune cell. In some embodiments, the immune cell is a B cell, a monocyte, a natural killer (NK) cell, a natural killer T (NKT) cell, a basophil, an eosinophil, a neutrophil, a dendritic cell, a macrophage, a regulatory T cell, a helper T cell (TH), a cytotoxic T cell (TCTL), a memory T cell, a gamma delta (γδ) T cell, another T cell, a hematopoietic stem cell, or a hematopoietic stem cell progenitor.

[0014] In some embodiments, the immune cell is a lymphocyte. In some embodiments, the lymphocyte is a T lymphocyte or a T lymphocyte progenitor. In some embodiments, the T lymphocyte is a CD4+ T cell or a CD8+ T cell. In some embodiments, the T lymphocyte is a CD8+ T cytotoxic lymphocyte cell selected from the group consisting of naive CD8+ T cells, central memory CD8+ T cells, effector memory CD8+ T cells, effector CD8+ T cells, CD8+ stem memory T cells, and bulk CD8+ T cells. In some embodiments, the T lymphocyte is a CD4+ T helper lymphocyte cell selected from the group consisting of naive CD4+ T cells, central memory CD4+ T cells, effector memory CD4+ T cells, effector CD4+ T cells, CD4+ stem memory T cells, and bulk CD4+ T cells.

[0015] In a related aspect, some embodiments of the disclosure relate to cell cultures that include at least one engineered cell of the disclosure and a culture medium.

[0016] In another aspect, some embodiments disclosed herein relate to methods for making an engineered cell, wherein the method includes (a) providing a host cell capable of protein expression; and (b) transducing the provided host cell with a recombinant nucleic acid of the disclosure to produce an engineered cell. Accordingly, in a related aspect, also provided herein are engineered cells produced by the methods of the disclosure. In a further related aspect, some embodiments of the disclosure relate to cell cultures that include at least one engineered cell of the disclosure and a culture medium.

[0017] In one aspect, provided herein are various pharmaceutical compositions, wherein the pharmaceutical compositions include a pharmaceutically acceptable carrier and one or more of: (a) a construct of the disclosure; (b) a recombinant nucleic acid of the disclosure; and/or (c) an engineered cell of the disclosure.

[0018] Non-limiting exemplary embodiments of the disclosed pharmaceutical compositions can include one or more of the following features. In some embodiments, the composition includes a recombinant nucleic acid of the disclosure and a pharmaceutically acceptable carrier. In some embodiments, the recombinant nucleic acid is encapsulated in a viral capsid or a lipid nanoparticle. In some embodiments, the composition includes an engineered cell of the disclosure and a pharmaceutically acceptable carrier.

[0019] In another aspect, some embodiments of the disclosure relate to methods for the prevention and/or treatment of a condition in a subject in need thereof, wherein the methods include administering to the subject a composition including one or more of: (a) a construct of the disclosure; (b) a recombinant nucleic acid of the disclosure; (c) an engineered cell of the disclosure; and (d) a pharmaceutical composition of the disclosure.

[0020] Non-limiting exemplary embodiments of the disclosed methods for preventing and/or treating a condition in a subject in need thereof can include one or more of the following features. In some embodiments, the condition is associated with an immune checkpoint blockade. In some embodiments, the method is for a checkpoint blockade immunotherapy. In some embodiments, the checkpoint blockade immunotherapy is an anti-PD1 checkpoint therapy or an anti-PD-L1 checkpoint therapy. In some embodiments, the condition is associated with a lung cancer selected from the group consisting of adenocarcinoma, squamous cell carcinoma, small cell carcinoma, non-small cell carcinoma, adenosquamous carcinoma, small cell lung cancer, large cell carcinoma, neuroendocrine cancers of the lung, non-small cell lung cancer (NSCLC), undifferentiated non-small cell carcinoma, non-small cell carcinoma not otherwise specified, pulmonary squamous cell carcinoma, broncho-alveolar carcinoma, sarcomatoid carcinoma, pleomorphic carcinoma, carcinosarcoma, pulmonary blastoma, metastatic carcinoma of unknown primary, primary pulmonary lymphoepithelioma-like carcinoma, and benign neoplasms of the lung. In some embodiments, the lung cancer is a NSCLC selected from the group consisting of squamous cell carcinoma, adenocarcinoma, large cell carcinoma, carcinoid tumor, pleomorphic, salivary gland cancer, adenosquamous, sarcomatoid, and unclassified carcinomas. In some embodiments, the NSCLC includes stage I NSCLC or stage II NSCLC. In some embodiments, the cancer is a non-metastatic cancer, a metastatic cancer, a multiply drug resistant cancer, or a recurrent cancer. In some embodiments, the administered composition inhibits tumor growth or metastasis of the cancer in the subject.

[0021] In some embodiments, provided herein are methods for preventing and/or treating a condition in a subject in need thereof, wherein the condition is a malignancy associated with a viral infection. In some embodiments, the condition is a malignancy associated with an infection by Epstein-Barr virus (EBV). In some embodiments, the malignancy is associated with an EBV infection and is selected from the group consisting of Hodgkin lymphoma, Burkitt lymphoma, diffuse large B cell lymphoma, nasopharyngeal carcinoma, gastric carcinoma, post-transplant lymphoproliferative disease, B lymphoproliferative disease, T/NK lymphoproliferative disease, T/NK lymphomas/leukemias, leiomyosarcomas, and lymphoepithelioma-like carcinomas.

[0022] In some embodiments of the methods for preventing and/or treating a condition in a subject in need thereof, the subject is a mammal. In some embodiments, the mammal is a human. In some embodiments, the composition is administered to the subject individually as a first therapy (monotherapy) or in combination with at least one additional therapy. In some embodiments, the at least one additional therapy is selected from the group consisting of chemotherapy, radiotherapy, immunotherapy, hormonal therapy, toxin therapy, targeted therapy, and surgery. In some embodiments, the at least one additional therapy is selected from the group consisting of an anti-CTLA4 antibody, an anti-PD-1 antibody, an anti-PD-L1 antibody, an anti-CD20 antibody, an anti-CD40 antibody, an anti-DR5 antibody, an anti-CD1d antibody, an anti-TIM3 antibody, an anti-SLAMF7 antibody, an anti-KIR receptor antibody, an anti-OX40 antibody, an anti-HER2 antibody, an anti-ErbB-2 antibody, an anti-EGFR antibody, cetuximab, rituximab, trastuzumab, pembrolizumab, radiotherapy, single dose radiation, fractionated radiation, focal radiation, whole organ radiation, IL-12, IFNα, GM-CSF, a chimeric antigen receptor, adoptively transferred T cells, an anti-cancer vaccine, and an oncolytic virus. In some embodiments, the first therapy and the at least one additional therapy are administered concomitantly. In some embodiments, the first therapy is administered at the same time as the at least one additional therapy. In some embodiments, the first therapy and the at least one additional therapy are administered sequentially. In some embodiments, the first therapy is administered before the at least one additional therapy. In some embodiments, the first therapy is administered after the at least one additional therapy. In some embodiments, the first therapy is administered before and/or after the at least one additional therapy. In some embodiments, the first therapy and the at least one additional therapy are administered in rotation. In some embodiments, the first therapy and the at least one additional therapy are administered together in a single formulation.

[0023] In another aspect, some embodiments of the disclosure relate to kits for the practice of the methods disclosed herein. Some embodiments relate to kits for methods of the diagnosis, prevention, and/or treatment of a condition in a subject in need thereof, wherein the kits include one or more of: a construct of the disclosure; a recombinant nucleic acid of the disclosure; an engineered cell of the disclosure; and a pharmaceutical composition of the disclosure.

[0024] In another aspect, provided herein is the use of one or more of: a construct of the disclosure; a recombinant nucleic acid of the disclosure; an engineered cell of the disclosure; and a pharmaceutical composition, for the prevention and/or treatment of a condition. In some embodiments, the condition is a proliferative disorder. In some embodiments, the proliferative disorder is a cancer. In some embodiments, the condition is a malignancy associated with an infection. In some embodiments, the infection is a bacterial infection or viral infection.

[0025] In another aspect, provided herein is the use of one or more of: a construct of the disclosure, a recombinant nucleic acid of the disclosure, an engineered cell of the disclosure, or a pharmaceutical composition of the disclosure, in the manufacture of a medicament for the treatment of a health condition. In some embodiments, the condition is a proliferative disorder. In some embodiments, the proliferative disorder is a cancer. In some embodiments, the condition is a malignancy associated with an infection. In some embodiments, the infection is a bacterial infection or viral infection.

[0026] In yet another aspect, provided herein are various methods for obtaining a construct as disclosed herein, wherein the methods include (a) identifying a plurality of T cell receptors (TCRs) associated with a health condition; (b) determining a sequence of a CDR3 present in each of the identified TCRs; and (c) making a construct including a CDR3 sequence determined in (b), wherein the construct is capable of binding to the one or more cognate antigens. In some embodiments, the condition is a proliferative disease. In some embodiments, the method further includes identifying one or more antigens commonly recognized by the CDR3 sequences.

[0027] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative embodiments and features described herein, further aspects, embodiments, objects and features of the disclosure will become fully apparent from the drawings and the detailed description and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0028] FIGS. 1A-1C schematically summarize the results of experiments performed to establish specificity groups with TCR CDR3β sequences from lung cancer patients. FIG. 1A: Schematic of the four steps involved in TCR specificity inference with the GLIPH2 algorithm (Grouping of Lymphocyte Interactions by Paratope Hotspots). Step 1 involves acquisition of T cell receptor CDR3β sequences. In this step, 778,938 CDR3β sequences from the MDACC cohort were used as input for GLIPH2 analysis. Step 2 involves discovery of short sequence motifs within CDR3β sequences from multiple patients. These shared motifs are predicted to be involved in the direct engagement with antigenic peptides loaded on HLA molecules. In this step, 66,094 specificity groups with multiple criteria were established (see also FIG. 3A). Step 3 involves establishment of 4,226 clonally expanded specificity groups. Multiple cutoffs are used for the inference of a given specificity group, including a) the enrichment of Vβ genes, b) minimum numbers of distinct CDR3β sequences = 3, c) minimum numbers of patients = 3, d) enrichment of clonally expanded CDR3β clonotypes, and e) enrichment of HLA alleles. Step 4 involves establishment of 435 clonally expanded, tumor-enriched specificity groups. FIG. 1B: Relevance of tumor-enriched specificity groups in lung cancer. The most expanded CDR3β sequences from tumors belonged to the 435 tumor-enriched specificity groups, whereas those from lung tissues of healthy donors and COPD patients did not. The trend was validated with tumors of a second NSCLC cohort (the TRACERx consortium, n=202, validation). ***, p < 0.001; *, p < 0.05 by paired t test. NS, not significantly different. FIG. 1C: Network analysis of 396 specificity groups annotated with CDR3β sequences from HLA tetramers with influenza virus (Flu, red), Epstein-Barr virus (EBV, green), and cytomegalovirus (CMV, blue) antigens. Each dot is a specificity group; edges indicate the presence of identical CDR3β sequence(s) across two specificity groups.
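
As a minimal sketch of how the two numeric cutoffs named in Step 3 (at least 3 distinct CDR3β sequences and at least 3 patients) could be applied, the following Python snippet filters candidate specificity groups. The SpecificityGroup class, the function name, and the example motifs and sequences are hypothetical and are not taken from the GLIPH2 implementation; the enrichment criteria a), d), and e) are statistical tests and are not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class SpecificityGroup:
    """Hypothetical container: a motif plus the CDR3β sequences seen per patient."""
    motif: str
    cdr3_by_patient: dict = field(default_factory=dict)  # patient ID -> set of CDR3β strings


def passes_size_cutoffs(group, min_cdr3=3, min_patients=3):
    """Apply cutoffs b) and c): distinct CDR3β count and contributing-patient count."""
    distinct_cdr3 = set()
    contributing_patients = 0
    for sequences in group.cdr3_by_patient.values():
        if sequences:
            contributing_patients += 1
            distinct_cdr3.update(sequences)
    return len(distinct_cdr3) >= min_cdr3 and contributing_patients >= min_patients


# Made-up example data, for illustration only (not sequences from the disclosure).
groups = [
    SpecificityGroup("MOTIF_A", {"P1": {"CASSAAAF"}, "P2": {"CASSAAGF"}, "P3": {"CASSAATF"}}),
    SpecificityGroup("MOTIF_B", {"P1": {"CASSLLAF", "CASSLLGF", "CASSLLTF"}}),
]
kept = [g.motif for g in groups if passes_size_cutoffs(g)]
print(kept)
```

In this toy input the second group has three distinct sequences but only one contributing patient, so it fails cutoff c) and is dropped.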

[0029] FIGS. 2A-2C schematically summarize the results of experiments performed to illustrate virus-specific CD8+ T cell clones expanded in patients responding to anti-PD-1 treatment. FIG. 2A: Comparisons of pre- and post-treatment CDR3 clonal frequencies in the peripheral blood of patient M1 (left) and M2 (right). CDR3 clones inferred to recognize viral antigens are highlighted. FIG. 2B: Specificity groups containing expanded CDR3 clones post-treatment (column 5, CDR3 sequence) from patients M1 or M2 (column 6, Patient ID) that are annotated with viral tetramer CDR3 sequences (columns 2-4, antigen and HLA alleles of the tetramers). Enrichment of the A*02:01 or B*35:01 allele is shown (last 2 columns, p values from the hypergeometric tests are shown). CDR3α/β sequences of the two EBV-related expanded clones from patient M2 are shown at the bottom. FIG. 2C: TCR27-Jurkat cell line (CDR3: CASSTGDSNQPQHF; SEQ ID NO: 64, top panels) and TCR28-Jurkat cell line (CDR3: CASSARTGELFF; SEQ ID NO: 18, bottom panels) were created and tested for their reactivities to the predicted EBV antigens in the context of B*35 as shown in FIG. 2B. TCR27- and TCR28-Jurkat cells were co-cultured with T2-B*35 cells pulsed with indicated peptides (above each plot). Level of activation was quantified with CD69 expression. In these experiments, the control peptide had the following sequence: LPFDFTPGY (SEQ ID NO: 107).

[0030] FIG. 3A: Specificity inference pipeline, which schematically summarizes the data availability for the 178 HLA-typed NSCLC patients from the MD Anderson Cancer Center (MDACC). FIG. 3B: Low percentages of TCR clonotypes from adjacent lung tissues are grouped into tumor-enriched specificity groups. The percentages of the top 20 most expanded CDR3β clonotypes from the adjacent lung tissues of patients belonging to the MDACC NSCLC cohort (n=178 samples, left) and the TRACERx cohort (n=63 samples, right) were quantified for those that belonged to the 435 tumor-enriched specificity groups as in FIG. 1B (% grouped, log10-converted). The same analysis was performed on the remainder of the CDR3β clonotypes (Non-exp, non-expanded). ND, no statistically significant difference was found. FIG. 3C: The 71 clonally expanded specificity groups annotated and colored with 10 indicated tetramers are shown in the network. FIG. 3D schematically summarizes the in silico validation of TCR specificity groups using HLA tetramer sequences. Left to right, network analysis of 71 clonally expanded specificity groups colored as in FIG. 1C is shown; the two large Flu-related communities (red) are circled and the CDR3β members of the specificity groups are highlighted with the previously reported short motifs “RS” and “GxY” highlighted in red font; heatmap showing distinct CDR3β members (columns) of the Flu-related (with the “RS” motif) specificity groups (rows) and the levels of shared CDR3β members between specificity groups within the circled community; table showing an example of the “SIRSS%E” specificity group containing the short “RS” motif (bold) that is annotated with 5 Flu-specific tetramer sequences (bottom). The counts of distinct CDR3β members from tumor and the Vβ gene usage are shown (top). FIG. 3E: 394 specificity groups annotated with indicated tetramers (key) were organized into distinct communities through shared CDR3β sequence(s) as in FIG. 1C. Thickness of edge represents numbers of shared CDR3β sequence(s) between any two connected nodes. FIG. 3F: Community plot as in FIG. 3E. Color of edge represents shared CDR3β sequence(s) between specificity groups with identical (red) or distinct (blue) specificities defined by tetramer-derived sequences (labeled with distinct colors, E). 588 of all (n=634) connections (edges) are labeled in red (92.74%). FIG. 3G: 71 clonally expanded specificity groups as in FIG. 3C. Color of edge represents shared CDR3β sequence(s) between specificity groups with identical (red) or distinct (blue) specificities defined by tetramer-derived sequences (FIG. 3C). 92 of all (n=92) connections (edges) are labeled in red (100%).

[0031] FIGS. 4A-4B schematically summarize results showing that CDR3β sequences inferred to recognize CMV, Flu, and EBV do not differ in their distribution between tumor and uninvolved lung. (A) Volcano plots showing the relative distributions of CDR3β sequences with inferred specificities to CMV (blue), Flu (red), or EBV (green) across the tumor (T) and uninvolved lung (N) by comparing multiple patients with Poisson test. The y-axis shows the negative log10-converted p values of the Poisson test and the x-axis shows the log2-converted fold-difference between tumor and uninvolved (adjacent) lung (T/N). (B) Total frequencies of clonotypes in tumor (right) or uninvolved lung (left) that are inferred to recognize antigens from EBV, CMV, or Flu by GLIPH2. Each dot is a patient and total frequencies are shown as log10-converted values (first and third quartiles show 25th and 75th percentiles, respectively).
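
The axis transformations described for the volcano plots can be sketched as follows. This is an illustrative Python snippet only: the function name and the input values are hypothetical, and the p value is taken as given rather than computed, since the exact form of the Poisson test is not specified in this passage.

```python
import math


def volcano_point(tumor_value, lung_value, p_value):
    """Return (x, y) = (log2(T/N), -log10(p)) for one plotted point."""
    if tumor_value <= 0 or lung_value <= 0:
        raise ValueError("both values must be positive to take a log ratio")
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    x = math.log2(tumor_value / lung_value)  # fold-difference between tumor and lung
    y = -math.log10(p_value)                 # smaller p values plot higher
    return x, y


# Hypothetical inputs: a 4-fold tumor enrichment with p = 0.001.
x, y = volcano_point(tumor_value=4.0, lung_value=1.0, p_value=0.001)
print(round(x, 3), round(y, 3))
```

Under this convention a point with equal tumor and lung values sits at x = 0, and points further from zero on the x-axis are more skewed toward one tissue.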

[0032] FIG. 5 is a schematic of the combined single-cell TCR-Seq and single-cell RNA-Seq (scRNA-seq) procedures. CD45+ CD3+ T cells were sorted from single-cell suspensions of lung tumor samples from patients with NSCLC at Stanford. Single-cell TCR-Seq was performed using nested multiplexed PCR as previously described (Han et al., 2014. Nat. Biotechnol. 32, 684). Single-cell RNA-seq was performed according to previous methods (Picelli et al., 2014. Nat. Protoc. 9, 171) with modifications as detailed in the methods. TCR repertoires were integrated from the single-cell TCR-Seq pipeline and from the scRNA-seq data with reconstruction using the TraCeR algorithm (Stubbington et al., 2016. Nat. Methods 13, 329) for GLIPH2 analysis.

[0033] FIG. 6A shows CT scan images of pre- and post-treatment from NSCLC patient M1 (top panels) and M2 (bottom panels) treated with anti-PD-1 therapy. Tumors are highlighted with red arrowheads. FIG. 6B: T2 (174 x CEM.T2) cells were transduced with lentiviral vector encoding the full-length coding sequence of WT human HLA-B*35:01. Cells were selected with puromycin and the surface B*35 expression was quantified by FACS with or without the control peptide “LPFDFTPGY” (SEQ ID NO: 107) reported previously (Takamiya et al., 1994. Int Immunol. Vol. 6, 255). FIG. 6C is a volcano plot showing the comparison of the 66,094 shared specificity groups between tumor (T) and the adjacent lung (N) by Poisson test. The y-axis represents the negative log10-converted p values of the Poisson tests and the x-axis represents the log2-converted fold difference between tumor and the adjacent lung (T/N). Dot size represents levels of clonal expansion. Specificity groups annotated with pathogen-related tetramer CDR3 sequences as in FIG. 2B (n=11) are highlighted according to the respective CDR3 sequences of the expanded clones (5th column, FIG. 2B).

DETAILED DESCRIPTION OF THE DISCLOSURE

[0034] The present disclosure generally relates to, inter alia, compositions and methods for the diagnosis, prevention, and/or treatment of health conditions. More particularly, provided herein are novel polypeptide constructs having binding affinity for a specific cognate antigen. The disclosure also provides compositions and methods useful for producing such polypeptide constructs as well as methods for the diagnosis, prevention, and/or treatment of conditions associated with cells expressing the cognate antigen recognized by the polypeptide constructs. In particular, also provided are recombinant cells such as lymphocyte T cells that have been engineered to express a polypeptide construct as disclosed herein and are directed against a cell of interest such as a cancer cell. As will be discussed more thoroughly below, the present disclosure describes an approach that combines bioinformatics and antigen screening to identify novel shared tumor antigens in lung cancer. In some embodiments, the disclosed approach implements an improved version of the algorithm GLIPH (Grouping of Lymphocyte Interactions with Paratope Hotspots), GLIPH2 ([23] and [24]), to infer the T cell specificities for shared antigens at a global level. Using TCR repertoires from 178 HLA-typed lung cancer patients, GLIPH2 identified over 400 specificity groups inferred to recognize shared tumor antigens in defined HLA contexts. Subsequent analyses were then performed on those with inferred HLA-B*35 restrictions, which informed the prioritization of two particular specificity groups, TCR27 and TCR28. As described in greater detail below, additional analyses revealed that the specificity group TCR27 carries the following motifs for antigen identification: “STGD%NQP”, “%TGDSNQP”, “ST%DSNQP”, “STG%SNQP”, and “S%GDSNQP” where “%” denotes the amino acid that varied (Gee et al., 2018). Non-limiting exemplary CDR3 sequences of the TCR27 specificity group include, for example, those provided in the Sequence Listing as SEQ ID NOs: 57-106. The specificity group TCR28 carries the following motifs: “SARTG%”, “S%RTGE”, “SAR%GE”, “SA%TGE”, and “SART%E”. Non-limiting exemplary CDR3 sequences of the TCR28 specificity group include, for example, those provided in the Sequence Listing as SEQ ID NOs: 1-56.
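
The motif notation above, in which “%” stands for a single variable amino acid, can be illustrated with a short Python sketch. It assumes, for illustration only, that a motif may occur anywhere within a CDR3 sequence and that “%” matches exactly one residue written as an uppercase letter; the function names are hypothetical and this is not the GLIPH2 matching procedure. The two CDR3 sequences are the TCR27 and TCR28 examples given for FIG. 2C.

```python
import re

TCR27_MOTIFS = ["STGD%NQP", "%TGDSNQP", "ST%DSNQP", "STG%SNQP", "S%GDSNQP"]
TCR28_MOTIFS = ["SARTG%", "S%RTGE", "SAR%GE", "SA%TGE", "SART%E"]


def motif_to_regex(motif):
    """Translate a motif into a regex; '%' becomes one arbitrary uppercase letter."""
    return re.compile("".join("[A-Z]" if ch == "%" else re.escape(ch) for ch in motif))


def carries_motif(cdr3, motif):
    """True if the motif occurs anywhere in the CDR3 string (an assumption of this sketch)."""
    return motif_to_regex(motif).search(cdr3) is not None


tcr27_cdr3 = "CASSTGDSNQPQHF"  # SEQ ID NO: 64
tcr28_cdr3 = "CASSARTGELFF"    # SEQ ID NO: 18

print(all(carries_motif(tcr27_cdr3, m) for m in TCR27_MOTIFS))
print(all(carries_motif(tcr28_cdr3, m) for m in TCR28_MOTIFS))
print(any(carries_motif(tcr27_cdr3, m) for m in TCR28_MOTIFS))
```

Each motif in a group differs from the others only in which position is left variable, so a single CDR3 sequence can carry all five motifs of its group.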

[0035] As discussed above, the wide use of immune checkpoint blockade and T cell-based immunotherapies to treat patients with solid tumors requires a deeper understanding of the T cell specificities in cancer. However, the specificities of the vast majority of tumor-infiltrating T cells remain unknown across all solid tumors despite the availability of advanced technologies for profiling T cell states and repertoires using single-cell sequencing techniques. The handful of specificities of tumor-infiltrating T cells that have been previously described include T cells recognizing mutated antigens, non-mutated (shared) antigens, and viral antigens. In the era of immune checkpoint blockade, there has been a recent focus on mutated antigens (e.g., neoantigens). As neoantigens represent a type of “altered self” antigen, T cells recognizing this class of antigens have been shown to exhibit an activated phenotype and respond vigorously in tumors. Non-mutated tumor antigens include differentiation antigens (e.g., melanoma-associated antigens) that are expressed in normal tissue counterparts, or self-antigens where expression is restricted to immune-privileged sites, germline tissue, or embryos. There have been numerous examples targeting these types of tumor antigen with adoptive T cell therapies. In addition, T cells with specificities for viruses (such as HPV, EBV, and Merkel cell polyomavirus) have also been a focus of investigation for virus-associated cancers.

[0036] In contrast, the role of other types of T cell specificities in solid tumors remains elusive. For example, numerous reports have described the existence of virus-specific T cells in tumors, such as T cells specific for influenza virus (Flu) or cytomegalovirus (CMV) in lung cancer. Without direct evidence of such viruses playing a role in the oncogenesis of lung cancer or other solid tumors, they have largely been presumed to be irrelevant to the tumor immune response and are often referred to as “bystander cells”. As described in greater detail below, experimental data described herein have identified a class of specific CD8+ T cells and their cross-reactive antigens from cancer cells and pathogens. This finding is consistent with the hypothesis that maintaining a broad T cell repertoire to defend against viruses and other pathogens may rely on cross-reactivity. T cells specific to self-antigens have been detected in the peripheral blood of healthy individuals, pruned but not clonally deleted in the thymus, potentially to avoid immunologic “blind spots” to viruses and other pathogens. Because cancer cells histologically resemble their tissue of origin and can express self-antigens, experiments have been designed to investigate the possibility that some tumor-infiltrating T cells are indeed specific to ubiquitously expressed, non-mutated self-antigens. Comprehensive profiling and deep characterization of T cell specificities within the tumor microenvironment provide a fundamental understanding of the T cell response beyond phenotypic characterization and shed important insight on how the immune system recognizes tumors, normal tissues, and pathogens.

[0037] That the specificities of the vast majority of tumor-infiltrating T cells remain unknown is largely due to the absence of tools for analyzing diverse TCR repertoires in the context of highly polymorphic human leukocyte antigen (HLA) alleles. For example, while next-generation sequencing technologies have made the sequencing of large numbers of TCRs relatively straightforward and inexpensive, a major problem revolves around how these very large repertoires can be analyzed.

[0038] This is because there can be hundreds or thousands of possible TCR sequences for the same peptide-MHC specificity. GLIPH algorithms (Glanville et al., 2018), and more recently an improved version (GLIPH2; Huang et al., 2020), have been previously developed to systemically profile antigen specificities of T cells and to allow inferences of T cell specificity solely based on the CDR3β sequences. These algorithms analyze large numbers of sequences quickly and parse them into TCR specificity groups (a.k.a. specificity groups) that can predict the likely MHC allele restriction. As described in greater detail below, GLIPH2 was used to analyze 778,938 distinct TCR CDR3 sequences (referred to as CDR3 sequences) from 178 HLA-typed, non-small cell lung cancer (NSCLC) patients with surgically resectable disease. A total of 4,300 high-confidence specificity groups were initially derived. Of those, 449 were found enriched in tumor compared to uninvolved lung tissue. It was also found that up to 35% of all tumor-infiltrating T cell repertoires within a patient were inferred to have shared antigen specificities. Subsequently, select specificity groups were validated by identifying novel clonotypes predicted to recognize known viral antigens in given HLA contexts and experimentally confirming these predictions. Next, two specificity groups were prioritized that were preferentially enriched in tumor and inferred to recognize antigen in the context of HLA-B*35. Phenotypically, these cross-reactive CD8+ T cells adopted an effector cell state, expressing some genes found on activated NK cells, and did not express exhaustion markers PD-1 or CD39. In summary, the experimental data described herein offer direct evidence that the T cells infiltrating tumors may cross-react to recognize tumor antigens and pathogen-derived antigens.

[0039] As described in greater detail below, the experimental data disclosed herein establishes a novel approach for discovering shared tumor antigens and the T cells that recognize them. In particular, some experimental data presented herein illustrates EBV-specific CDR3 sequences that were clonally expanded in patients who had clinical responses to anti-PD-1 treatment. This suggests that pathogen cross-reactivity may be an important feature in the interaction between neoplasia and T cell immunity. Overall, the data disclosed herein illustrates a generalizable approach to comprehensively analyze shared T cell specificities in human cancer and identify specific antigens using a yeast display library. This data not only serves as a resource for further T cell studies in lung cancer but can also explain why some apparently “random” virus-specific T cells might congregate in the tumor microenvironment and suggests a way in which this might contribute to neoplasia.

[0040] A non-limiting workflow for the approach for discovering novel shared tumor antigens in a target cancer, e.g., lung cancer, generally begins with comprehensive profiling of the T cell specificity landscape in human lung cancer. The bioinformatics tool GLIPH2 was used to profile 778,938 CDR3 sequences from 178 patients and establish 449 tumor-enriched specificity groups. Two such TCRs with inferred specificity in the context of HLA-B*35 were identified. The platform for T cell antigen identification as disclosed herein brings together two technologies. First, the GLIPH2 algorithm performs unbiased inferences of global T cell specificities with accurate predictions of HLA restriction. More information regarding the GLIPH2 algorithm can be found in Huang et al., Nat Biotechnol, 2020, the content of which is expressly incorporated by reference. The inferences of shared specificity and HLA context are used to prioritize disease-relevant TCR candidates for downstream antigen discovery. Second, the rich diversity of yeast display libraries greatly facilitates antigen identification and allows for discovery of cross-reactive antigens. Unlike other MHC/peptide libraries built in mammalian cells, the yeast display libraries used in the experiments described below incorporate more than 10^8 randomly permutated peptide sequences. Previously, the uncertainty of HLA restriction limited the success of antigen identification using the yeast display libraries. The studies described herein overcome this limitation by using the GLIPH2 algorithm to infer the correct HLA context of the candidate TCR prior to screening the yeast library for its antigens.

[0041] As discussed above, uncovering the specificities of tumor-infiltrating T cells is important for understanding how T cell-intrinsic factors shape tumor-immune system interactions and impact therapies aimed at harnessing T cell responses against cancer. Complementing the current understanding of T cell exhaustion as a mechanism of tumor immune evasion, the studies described herein demonstrate that T cell specificities for self-antigens also play a role. Without being bound to any particular theory, it is believed that T cell specificity for self-antigens partly explains why previous studies observed low reactivities of tumor-infiltrating T cells to autologous tumor.

[0042] In addition, the concept that immunologic exposure to environmental pathogens may influence the immune response to tumors has been previously theorized, although its mechanism is poorly understood. As early as the late 19th century, William Coley pioneered a mixed bacterial vaccine termed Coley’s toxin for the treatment of cancer patients with some successes. In the modern era, Bacillus Calmette-Guerin (BCG) is routinely used as an immunotherapy for early-stage bladder cancer. Recently the gut microbiome has been shown to be a key determinant of immunotherapy responses in cancer. In pancreatic cancer, a unique microbiome has been observed in patients with the longest survival after surgery. While the mechanism of action of these various examples could involve cell types of the innate immune system, cross-reactive T cells recognizing both tumor and pathogens might be playing an essential role. Furthermore, as the lungs are exposed to respiratory pathogens, it is contemplated that the cross talk between these antigens and tumor antigens is particularly important for understanding the adaptive immune responses to lung cancer.

[0043] Experimental results described herein have demonstrated that the categorization of T cell specificities in tumors as tumor-specific or as pathogen-specific bystanders does not fully capture all possibilities for T cell antigen recognition. As described in greater detail below, T cells in tumors can also be cross-reactive to both tumor antigens and pathogen-derived antigens, a finding that offers a more nuanced understanding of T cell specificity in tumors. The disclosed approach for finding this particular class of TCRs also demonstrates a novel methodology for discovering additional tumor antigens. This is because a deeper understanding of how cross-reactive T cells recognize tumor antigens and pathogen-derived antigens can inform advancements in cellular therapies, checkpoint therapies, and vaccination strategies against cancer. The experimental data disclosed herein indicates that an individual’s encounters with environmental pathogens may shape the adaptive immune response against cancer, a concept that can be harnessed for improving immunotherapies for patients.

DEFINITIONS

[0044] Unless otherwise defined, all terms of art, notations and other scientific terms or terminology used herein are intended to have the meanings commonly understood by those of skill in the art to which this disclosure pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art. Many of the techniques and procedures described or referenced herein are well understood and commonly employed using conventional methodology by those skilled in the art.

[0045] The singular form “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a cell” includes one or more cells, including mixtures thereof. “A and/or B” is used herein to include all of the following alternatives: “A”, “B”, “A or B”, and “A and B”.

[0046] The term “about”, as used herein, has its ordinary meaning of approximately. If the degree of approximation is not otherwise clear from the context, “about” means either within plus or minus 10% of the provided value, or rounded to the nearest significant figure, in all cases inclusive of the provided value. Where ranges are provided, they are inclusive of the boundary values.

[0047] The terms “administration” and “administering”, as used herein, refer to the delivery of a bioactive composition or formulation by an administration route including, but not limited to, oral, intravenous, intra-arterial, intramuscular, intraperitoneal, subcutaneous, and topical administration, or combinations thereof. The term includes, but is not limited to, administering by a medical professional and self-administering.

[0048] The terms “cell”, “cell culture”, “cell line” refer not only to the particular subject cell, cell culture, or cell line but also to the progeny or potential progeny of such a cell, cell culture, or cell line, without regard to the number of transfers or passages in culture. It should be understood that not all progeny are exactly identical to the parental cell. This is because certain modifications may occur in succeeding generations due to either mutation (e.g., deliberate or inadvertent mutations) or environmental influences (e.g., methylation or other epigenetic modifications), such that progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein, so long as the progeny retain the same functionality as that of the original cell, cell culture, or cell line.

[0049] The term “effective”, “therapeutically effective”, or “pharmaceutically effective” amount or number of a subject construct, nucleic acid, cell, or composition of the disclosure generally refers to an amount or number sufficient for a construct, nucleic acid, cell, or composition to accomplish a stated purpose relative to the absence of the composition (e.g., achieve the effect for which it is administered, prevent or treat a disease, inhibit a microbial infection, or reduce one or more symptoms of a health condition). An example of an effective amount or number is an amount or number sufficient to contribute to the treatment, prevention, or reduction of a symptom or symptoms of a disease, which could also be referred to as a therapeutically effective amount. A “reduction” of a symptom(s) generally means decreasing of the severity or frequency of the symptom(s), or elimination of the symptom(s). The exact amount or number of a construct, nucleic acid, cell, or composition will depend on the purpose of the treatment, and can be ascertained by one skilled in the art using known techniques (see, e.g., Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992); Lloyd, The Art, Science and Technology of Pharmaceutical Compounding (1999); Pickar, Dosage Calculations (1999); and Remington: The Science and Practice of Pharmacy, 20th Edition, 2003, Gennaro, Ed., Lippincott, Williams & Wilkins).

[0050] The term “operably linked”, as used herein, denotes a physical or functional linkage between two or more elements, e.g., polypeptide sequences or polynucleotide sequences, which permits them to operate in their intended fashion. For example, the term “operably linked” when used in context of the orthogonal DNA target sequences described herein or the promoter sequence in a nucleic acid construct, or in an engineered response element, means that the orthogonal DNA target sequences and the promoters are in-frame with, and at a proper spatial distance from, a polynucleotide of interest coding for a protein or an RNA to permit the effects of the respective binding by transcription factors or RNA polymerase on transcription. It should be understood that operably linked elements may be contiguous or non-contiguous.

[0051] In the context of polypeptide constructs, “operably linked” refers to a physical linkage (e.g., directly or indirectly linked) between amino acid sequences (e.g., different segments, portions, or domains) to provide for a described activity of the constructs. In the present disclosure, regions or domains of the constructs of the disclosure may be operably linked to retain proper folding, processing, targeting, expression, binding, and other functional properties of the constructs in the cell. Unless stated otherwise, the segments, portions, and domains of the constructs of the disclosure are operably linked to each other. Operably linked segments, portions, and domains of the constructs disclosed herein may be contiguous or non-contiguous (e.g., linked to one another through a linker).

[0052] The term “percent identity,” as used herein in the context of two or more nucleic acids or proteins, refers to two or more sequences or subsequences that are the same or have a specified percentage of nucleotides or amino acids that are the same (e.g., about 60% sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default parameters described below, or by manual alignment and visual inspection. See, e.g., the NCBI web site at ncbi.nlm.nih.gov/BLAST. Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a sequence. This definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. Sequence identity can be calculated using published techniques and widely available computer programs, such as the GCG program package (Devereux et al., Nucleic Acids Res. 12:387, 1984), BLASTP, BLASTN, FASTA (Altschul et al., J Mol Biol 215:403, 1990). Sequence identity can be measured using sequence analysis software such as the Sequence Analysis Software Package of the Genetics Computer Group at the University of Wisconsin Biotechnology Center (1710 University Avenue, Madison, Wis. 53705), with the default parameters thereof.
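
For orientation only, the arithmetic behind a percent-identity figure can be sketched in Python for two sequences that have already been aligned to equal length. This is a simplified illustration and not the BLAST or GCG procedure: the function name is hypothetical, the alignment step itself is assumed to have been done elsewhere, and dividing by the full alignment length (gap columns included) is one convention among several. The second sequence in the example is a made-up single-residue variant of the TCR28 CDR3 sequence, used purely for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same non-gap residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


reference = "CASSARTGELFF"  # SEQ ID NO: 18
variant = "CASSARTGEQFF"    # hypothetical variant with one substitution

identity = percent_identity(reference, variant)
print(round(identity, 2))
print(identity >= 90.0)
```

Here 11 of 12 aligned positions agree, so the variant would meet a 90% threshold but not a 95% threshold under this convention.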

[0053] The term “pharmaceutically acceptable excipient” as used herein refers to any suitable substance that provides a pharmaceutically acceptable carrier, additive or diluent for administration of a compound(s) of interest to a subject. As such, “pharmaceutically acceptable excipient” can encompass substances referred to as pharmaceutically acceptable diluents, pharmaceutically acceptable additives, and pharmaceutically acceptable carriers. As used herein, the term “pharmaceutically acceptable carrier” includes, but is not limited to, saline, solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds (e.g., antibiotics and additional therapeutic agents) can also be incorporated into the compositions.

[0054] As used herein, a “subject” or an “individual” includes animals, such as humans (e.g., human individuals) and non-human animals. In some embodiments, a “subject” or “individual” is a patient under the care of a physician. Thus, the subject can be a human patient or an individual who has, is at risk of having, or is suspected of having a disease of interest (e.g., cancer) and/or one or more symptoms of the disease. The subject can also be an individual who is diagnosed with a risk of the condition of interest at the time of diagnosis or later. The term “non-human animals” includes all vertebrates, e.g., mammals, e.g., rodents, e.g., mice, non-human primates, and other mammals, such as, e.g., sheep, dogs, cows, chickens, and non-mammals, such as amphibians, reptiles, etc.

[0055] The term “vector” is used herein to refer to a nucleic acid molecule or sequence capable of transferring or transporting another nucleic acid molecule. The transferred nucleic acid molecule is generally linked to, e.g., inserted into, the vector nucleic acid molecule. Generally, a vector is capable of replication when associated with the proper control elements. The term “vector” includes cloning vectors and expression vectors, as well as viral vectors and integrating vectors. An “expression vector” is a vector that includes a regulatory region, thereby capable of expressing DNA sequences and fragments in vitro and/or in vivo. A vector may include sequences that direct autonomous replication in a cell, or may include sequences sufficient to allow integration into host cell DNA. Useful vectors include, for example, plasmids (e.g., DNA plasmids or RNA plasmids), transposons, cosmids, bacterial artificial chromosomes, and viral vectors. Useful viral vectors include, e.g., replication defective retroviruses and lentiviruses. In some embodiments, a vector is a gene delivery vector. In some embodiments, a vector is used as a gene delivery vehicle to transfer a gene into a cell.

[0056] It is understood that aspects and embodiments of the disclosure described herein include “comprising”, “consisting”, and “consisting essentially of” aspects and embodiments. As used herein, “comprising” is synonymous with “including”, “containing”, or “characterized by”, and is inclusive or open-ended and does not exclude additional, unrecited elements or method steps. As used herein, “consisting of” excludes any elements, steps, or ingredients not specified in the claimed composition or method. As used herein, “consisting essentially of” does not exclude materials or steps that do not materially affect the basic and novel characteristics of the claimed composition or method. Any recitation herein of the term “comprising”, particularly in a description of components of a composition or in a description of steps of a method, is understood to encompass those compositions and methods consisting essentially of and consisting of the recited components or steps.

[0057] Where a range of values is provided, it is understood by one having ordinary skill in the art that all ranges disclosed herein encompass any and all possible sub-ranges and combinations of sub-ranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as “up to”, “at least”, “greater than”, “less than”, and the like include the number recited and refer to ranges which can be subsequently broken down into sub-ranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member. Thus, for example, a group having 1-3 articles refers to groups having 1, 2, or 3 articles. Similarly, a group having 1-5 articles refers to groups having 1, 2, 3, 4, or 5 articles, and so forth. Certain ranges are presented herein with numerical values being preceded by the term “about.” The term “about” is used herein to provide literal support for the exact number that it precedes, as well as a number that is near to or approximately the number that the term precedes. In determining whether a number is near to or approximately a specifically recited number, the near or approximating unrecited number may be a number which, in the context in which it is presented, provides the substantial equivalent of the specifically recited number.

[0058] Headings, e.g., (a), (b), (i) etc., are presented merely for ease of reading the specification and claims. The use of headings in the specification or claims does not require the steps or elements be performed in alphabetical or numerical order or the order in which they are presented.

[0059] It is appreciated that certain features of the disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the disclosure, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. All combinations of the embodiments pertaining to the disclosure are specifically embraced by the present disclosure and are disclosed herein just as if each and every combination was individually and explicitly disclosed. In addition, all sub-combinations of the various embodiments and elements thereof are also specifically embraced by the present disclosure and are disclosed herein just as if each and every such sub-combination was individually and explicitly disclosed herein.

T CELL RECEPTORS

[0060] A TCR is a heterodimeric cell surface protein of the immunoglobulin superfamily, which is associated with invariant proteins of the CD3 complex involved in mediating signal transduction. TCRs and antibodies are molecules that have evolved to recognize different classes of antigens (ligands). TCRs are antigen-specific molecules that are responsible for recognizing antigenic peptides presented in the context of a product of the major histocompatibility complex (MHC) on the surface of antigen presenting cells (APCs) or any nucleated cell (e.g., all human cells in the body, except red blood cells). In contrast, antibodies generally recognize soluble or cell-surface antigens, and do not require presentation of the antigen by an MHC. This system endows T cells, via their TCRs, with the potential ability to recognize the entire array of intracellular antigens expressed by a cell (including viral and bacterial proteins) that are processed intracellularly into short peptides, bound to an intracellular MHC molecule, and delivered to the surface as a peptide-MHC complex (pepMHC). This system allows virtually any foreign protein (e.g., mutated cancer antigen or virus protein) or aberrantly expressed protein to serve as a target for T cells.

[0061] Generally, TCRs exist in αβ and γδ forms, which are structurally similar but have quite distinct anatomical locations and probably functions. The extracellular portion of native heterodimeric αβ TCR generally consists of two polypeptides, an α chain and a β chain, each of which has a membrane-proximal constant domain, and a membrane-distal variable domain. Each of the constant and variable domains includes an intra-chain disulfide bond. The variable domains contain the highly polymorphic loops analogous to the complementarity determining regions (CDRs) of antibodies, embedded in a framework sequence, one being the hyper-variable region named CDR3. There are several types of alpha chain variable (Vα) regions and several types of beta chain variable (Vβ) regions distinguished by their framework, CDR1 and CDR2 sequences, and by a partly defined CDR3 sequence. The use of TCR gene therapy overcomes a number of current hurdles. For example, it allows equipping patients' own T cells with desired specificities and generation of sufficient numbers of T cells in a short period of time, avoiding their exhaustion. In addition, the TCR can be transduced into central memory T cells or T cells with stem cell characteristics, which may ensure better persistence and function upon transfer. Furthermore, TCR-engineered T cells can be infused into cancer patients rendered lymphopenic by chemotherapy or irradiation, allowing efficient engraftment but inhibiting immune suppression.

Compositions of the disclosure

[0062] As described in greater detail below, one aspect of the present disclosure relates to novel polypeptide constructs having binding affinity for a specific cognate antigen. Also provided are recombinant nucleic acids encoding such polypeptide constructs, as well as recombinant cells that have been engineered to express a polypeptide construct as disclosed herein and are directed against a cell of interest such as a cancer cell.

A. Constructs of the disclosure

[0063] In one aspect, provided herein are various constructs including at least one complementary determining region (CDR) having at least 70% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106.

[0064] Non-limiting exemplary embodiments of the disclosed constructs can include one or more of the following features. In some embodiments, the constructs include at least one, at least two, or at least three CDRs having at least 70% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106. In some embodiments, the constructs include at least one, at least two, or at least three CDRs having at least 70% sequence identity to the sequence of SEQ ID NO: 6. In some embodiments, the constructs include at least one CDR having at least 70%, for example at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106. In some embodiments, the constructs include at least one CDR having at least 70%, for example at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 100% sequence identity to the sequence of SEQ ID NO: 6.
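As a rough illustration of how a percent-identity threshold such as the 70% figure above might be checked, the following Python sketch aligns two peptide strings with a simple global (Needleman-Wunsch style) alignment and reports identical positions as a percentage of the alignment length. The scoring values, the function names, and the two example strings are assumptions chosen for illustration; they are not sequences or a method taken from this disclosure, and established alignment tools with their own scoring conventions would normally be used in practice.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


# Hypothetical CDR3-like strings, invented for this example only.
reference = "CASSLGQAYEQYF"
candidate = "CASSLGTAYEQYF"
meets_threshold = percent_identity(reference, candidate) >= 70.0
```

Dividing by the alignment length is only one of several conventions for the denominator; the choice changes the reported percentage whenever the alignment contains gaps.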

[0065] The CDR sequence of the constructs disclosed herein may be modified, e.g., mutated. Non-limiting examples of modifications of the CDR sequence include a substitution, a deletion, an addition, or an insertion of no more than five, no more than four, no more than three, no more than two, or no more than one amino acid residue. In some embodiments, the at least one CDR of the constructs disclosed herein includes a sequence having 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106, wherein at least 1, at least 2, at least 3, at least 4, or at least 5 amino acid residues in the sequence are substituted by a different amino acid residue. In some embodiments, the at least one CDR of the constructs disclosed herein includes a sequence having 100% identity to the sequence of SEQ ID NO: 6, wherein at least 1, at least 2, at least 3, at least 4, or at least 5 amino acid residues in the sequence are substituted by a different amino acid residue. In some embodiments, the at least one CDR includes a sequence having 100% identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106, wherein one, two, three, four, or five of the amino acid residues in the sequence are substituted by a different amino acid residue. In some embodiments, the at least one CDR includes a sequence having 100% identity to the sequence of SEQ ID NO: 18, wherein one, two, three, four, or five of the amino acid residues in the sequence are substituted by a different amino acid residue.
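The substitution limits described above can be pictured with a short Python sketch that counts substituted positions between a reference peptide and an equal-length variant, and that enumerates every single-residue substitution of a reference. The function names and the example strings are hypothetical and are not sequences from this disclosure; deletions, additions, and insertions are deliberately left out of this simplified view.

```python
# The twenty standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_substitutions(reference, variant):
    """Number of positions at which two equal-length peptides differ."""
    if len(reference) != len(variant):
        raise ValueError("substitution counting assumes equal-length sequences")
    return sum(r != v for r, v in zip(reference, variant))


def within_substitution_limit(reference, variant, max_substitutions=5):
    """True if the variant differs from the reference at no more than the limit."""
    return count_substitutions(reference, variant) <= max_substitutions


def single_substitution_variants(reference):
    """Yield every peptide that differs from the reference at exactly one position."""
    for position, original in enumerate(reference):
        for residue in AMINO_ACIDS:
            if residue != original:
                yield reference[:position] + residue + reference[position + 1:]


# Hypothetical CDR3-like strings, invented for this example only.
reference = "CASSLGQAYEQYF"
variant = "CASSLGTAYEQYF"
is_allowed = within_substitution_limit(reference, variant)
```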

[0066] One of ordinary skill in the art will understand that binding affinity can generally be used as a measure of the strength of a non-covalent interaction between two molecules, e.g., an antibody or functional fragment thereof and an antigen. In some cases, binding affinity can be used to describe monovalent interactions (intrinsic activity). Binding affinity between two molecules may be quantified by determination of the dissociation constant (K_D). In turn, K_D can be determined by measurement of the kinetics of complex formation and dissociation using, e.g., the surface plasmon resonance (SPR) method (Biacore). The rate constants corresponding to the association and the dissociation of a monovalent complex are referred to as the association rate constant k_a (or k_on) and the dissociation rate constant k_d (or k_off), respectively. K_D is related to k_a and k_d through the equation K_D = k_d / k_a. The value of the dissociation constant can be determined directly by well-known methods, and can be computed even for complex mixtures by methods such as those set forth in Caceci et al. (1984, Byte 9: 340-362). For example, the K_D may be established using a double-filter nitrocellulose filter binding assay such as that disclosed by Wong & Lohman (1993, Proc. Natl. Acad. Sci. USA 90: 5428-5432). Other standard assays to evaluate the binding ability of engineered antibodies of the present disclosure towards target antigens are known in the art, including for example, ELISAs, Western blots, RIAs, and flow cytometry analysis, and other assays exemplified elsewhere herein. The binding kinetics and binding affinity of the antibody also can be assessed by standard assays known in the art, such as Surface Plasmon Resonance (SPR), e.g., by using a Biacore™ system, or KinExA. In some embodiments, the binding affinity of a construct as disclosed herein for a target antigen can be assessed by the Scatchard method described by Frankel et al., Mol. Immunol., 16: 101-106, 1979.
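The relationship between the dissociation constant and the two rate constants can be worked through numerically. The Python sketch below assumes the association rate constant is expressed in 1/(M·s) and the dissociation rate constant in 1/s, so that their ratio is in molar units; the function name and the numerical values are illustrative assumptions and are not measurements from this disclosure.

```python
def dissociation_constant(k_a, k_d):
    """Return K_D = k_d / k_a for a monovalent interaction.

    k_a: association rate constant, assumed here to be in 1/(M*s)
    k_d: dissociation rate constant, assumed here to be in 1/s
    """
    if k_a <= 0:
        raise ValueError("k_a must be positive")
    if k_d < 0:
        raise ValueError("k_d must not be negative")
    return k_d / k_a


# Illustrative values only: fast association, slow dissociation.
k_a = 1.0e5   # 1/(M*s)
k_d = 1.0e-3  # 1/s
K_D = dissociation_constant(k_a, k_d)  # roughly 1e-8 M, i.e. about 10 nM
```

A smaller value of the ratio corresponds to a tighter interaction, since the complex dissociates slowly relative to how quickly it forms.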

[0067] In some embodiments, the construct of the disclosure can be (a) a TCR; (b) an antibody; or (c) a functional derivative or fragment of (a) or (b). One skilled in the art will readily understand that the term “functional fragment thereof” or “functional derivative thereof” refers to a molecule having quantitative and/or qualitative biological activity in common with the wild-type molecule from which the fragment or derivative was derived. For example, a functional fragment or a functional derivative of an antibody is one which retains essentially the same ability to bind to the same epitope as the antibody from which the functional fragment or functional derivative was derived. For instance, an antibody capable of binding to an epitope may be truncated at the N-terminus and/or C-terminus, and the retention of its epitope binding activity assessed using assays known to those of skill in the art.

[0068] In some embodiments, the construct is a TCR construct including a TCR alpha chain and a TCR beta chain operably linked to each other. In some embodiments, the TCR alpha chain and the TCR beta chain are covalently linked to each other. In some embodiments, the TCR alpha chain and the TCR beta chain are linked to each other in a non-covalent fashion. In some embodiments, the TCR alpha chain and the TCR beta chain are covalently linked to each other via a polypeptide linker. In some embodiments, the polypeptide linker is a cleavable linker. In some embodiments, the polypeptide linker includes an autoproteolytic peptide. In some embodiments, the autoproteolytic peptide includes one or more autoproteolytic cleavage sites derived from calcium-dependent serine endoprotease (furin), a porcine teschovirus-1 2A (P2A), a foot-and-mouth disease virus (FMDV) 2A (F2A), an Equine Rhinitis A Virus (ERAV) 2A (E2A), a Thosea asigna virus 2A (T2A), a cytoplasmic polyhedrosis virus 2A (BmCPV2A), a Flacherie Virus 2A (BmIFV2A), or a combination thereof. In some embodiments, the TCR alpha chain and the TCR beta chain are covalently linked to each other via a P2A cleavage site. The present disclosure provides both single-chain TCR constructs and multiple-chain TCR constructs. In some embodiments, the TCR constructs of the disclosure may be provided as single chain α or β, or γ and δ, molecules, or alternatively as double chain constructs composed of both the α and β chain, or γ and δ chain.
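To make the idea of two chains joined through a P2A site more concrete, the Python sketch below concatenates two placeholder chains around a 2A peptide and then separates the result at the position where 2A peptides are commonly described as splitting, between the final glycine and proline. The P2A string is a commonly cited sequence and is an assumption here, not a sequence taken from this disclosure; the chain strings and function names are likewise hypothetical placeholders.

```python
# Commonly cited porcine teschovirus-1 2A peptide (assumed for illustration).
P2A = "ATNFSLLKQAGDVEENPGP"


def link_with_2a(first_chain, second_chain, peptide=P2A):
    """Join two polypeptide chains into one open reading frame product via a 2A peptide."""
    return first_chain + peptide + second_chain


def separate_at_2a(polyprotein, peptide=P2A):
    """Split a 2A-linked polyprotein between the last Gly and the final Pro of the peptide."""
    start = polyprotein.find(peptide)
    if start < 0:
        raise ValueError("2A peptide not found in polyprotein")
    cut = start + len(peptide) - 1  # index of the final proline
    return polyprotein[:cut], polyprotein[cut:]


# Placeholder chain strings, not real TCR chains.
alpha_chain = "MKTLAGV"
beta_chain = "MGSRLLC"

fused = link_with_2a(alpha_chain, beta_chain)
upstream, downstream = separate_at_2a(fused)
# upstream keeps the 2A residues up to the glycine; downstream starts with proline.
```

In this simplified picture the upstream product retains most of the 2A residues at its C-terminus and the downstream product gains a single N-terminal proline, which is one reason the order of the two chains in such a design can matter.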

[0069] In some embodiments, the TCR construct of the disclosure may be provided as a single-chain TCR (scTCR). A scTCR can include a polypeptide of a variable region of a first TCR chain (e.g., an alpha chain) and a polypeptide of an entire (full-length) second TCR chain (e.g., a beta chain), or vice versa. In some embodiments, the polypeptides are directly linked to one another. In some embodiments, the scTCR can optionally include one or more linkers which join the two or more polypeptides together. In some embodiments, the linker can be a synthetic compound linker such as, for example, a chemical cross-linking agent. Non-limiting examples of suitable cross-linking agents that are available on the market include N-hydroxysuccinimide (NHS), disuccinimidylsuberate (DSS), bis(sulfosuccinimidyl)suberate (BS3), dithiobis(succinimidylpropionate) (DSP), dithiobis(sulfosuccinimidylpropionate) (DTSSP), ethyleneglycol bis(succinimidylsuccinate) (EGS), ethyleneglycol bis(sulfosuccinimidylsuccinate) (sulfo-EGS), disuccinimidyl tartrate (DST), disulfosuccinimidyl tartrate (sulfo-DST), bis[2-(succinimidooxycarbonyloxy)ethyl]sulfone (BSOCOES), and bis[2-(sulfosuccinimidooxycarbonyloxy)ethyl]sulfone (sulfo-BSOCOES).

[0070] In some embodiments, the linker can be a peptide linker, which joins together two single chains, as described herein. In some embodiments, the length and amino acid composition of the peptide linker sequence can be optimized to vary the orientation and/or proximity of the polypeptides relative to one another to achieve a desired activity of the constructs (e.g., TCR constructs) as disclosed herein.

[0071] The construct according to the present disclosure can also be provided in the form of a multimeric complex, including at least two scTCR molecules, wherein the scTCR molecules are each fused to at least one biotin moiety, or other interconnecting molecule/linker, and wherein the scTCRs are interconnected by biotin-streptavidin interaction to allow the formation of said multimeric complex. Similar approaches known in the art for the generation of multimeric TCRs are also contemplated and included in this disclosure. Accordingly, also provided are multimeric complexes of a higher order, comprising more than two scTCRs of the disclosure.

[0072] Suitable methods of making fusion polypeptides are known in the art, and include, for example, recombinant methods. In some embodiments, the constructs, TCRs (and functional fragments and functional derivatives thereof), and polypeptides of the disclosure may be expressed as a single protein including a linker peptide linking the α chain and the β chain, and/or linking the γ chain and the δ chain. In this regard, the constructs, TCRs (and functional fragments and functional derivatives thereof), and polypeptides of the disclosure include the amino acid sequences of the variable regions of the TCR of the disclosure and can further include a linker peptide. In some embodiments, the linker peptide may advantageously facilitate the expression of a construct or a TCR (including functional fragments and functional derivatives thereof) in a host cell. In principle, the linker peptide may comprise any suitable amino acid sequence. Linker sequences for single chain TCR constructs are well known in the art. In some embodiments, such a single chain construct can further comprise one, or two, constant domain sequences. Upon expression of the construct including the linker peptide by a host cell, the linker peptide may also be cleaved, resulting in separated α and β chains, and separated γ and δ chains.

[0073] In some embodiments, the TCR constructs of the disclosure include at least one TCR α or γ and/or TCR β or δ variable domain. Generally, they include both a TCR α variable domain and a TCR β variable domain, or alternatively both a TCR γ variable domain and a TCR δ variable domain. In some embodiments, the TCR constructs include αβ/γδ heterodimers or may be in single chain format. In some embodiments, for use in adoptive therapy, an αβ or γδ heterodimeric TCR may, for example, be transfected as full length chains having both cytoplasmic and transmembrane domains. If desired, an introduced disulfide bond between residues of the respective constant domains can be present.

[0074] In some embodiments, the TCR constructs of the disclosure are provided as single chain α or β, or γ and δ, molecules, or alternatively as double chain constructs composed of both the α and β chain, or γ and δ chain. Accordingly, in some embodiments, the TCR construct is a single-chain TCR construct including in its beta chain a CDR3β having at least 70%, for example, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 100% sequence identity to a sequence selected from SEQ ID NOs: 1-106. In additional or alternative embodiments, the TCR construct may further include a CDR1 and/or a CDR2 domain sequence. In some embodiments, the TCR constructs of the disclosure include at least one, preferably all three CDR sequences CDR1, CDR2 and CDR3.

[0075] In some embodiments, the TCR constructs of the disclosure are provided as double-chain constructs composed of both the α and β chain, or γ and δ chain. Accordingly, in some embodiments, the TCR constructs of the disclosure are provided as double-chain constructs comprising both the α and β chain, wherein the beta chain includes a CDR3β having at least 70%, for example, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 100% sequence identity to a sequence selected from SEQ ID NOs: 1-106. In some embodiments, the TCR constructs further include in the alpha chain a CDR3α sequence.

[0076] As outlined above, in some embodiments, the construct of the disclosure can be provided in the framework of an antibody construct or a functional fragment thereof, which specifically binds to the antigens described herein. The antibody construct can be any type of immunoglobulin that is known in the art. For instance, the antibody construct can be of any isotype, e.g., IgA, IgD, IgE, IgG, IgM, etc. The antibody construct can be monoclonal or polyclonal. The antibody construct can be a naturally-occurring antibody, e.g., an antibody isolated and/or purified from a mammal, e.g., human cell. Alternatively, the antibody construct can be a genetically-engineered antibody, e.g., a humanized antibody or a chimeric antibody. The antibody construct can be in monomeric or polymeric form. In some embodiments, the construct disclosed herein is an antibody construct selected from the group consisting of an antigen-binding fragment (Fab), a single-chain variable fragment (scFv), a nanobody, a single domain antibody (sdAb), a VH domain, a VL domain, a VHH domain, a diabody, or a functional fragment of any thereof.

B. Nucleic acids

[0077] In one aspect, provided herein are various nucleic acid molecules including nucleotide sequences encoding the constructs of the disclosure, including expression cassettes, and expression vectors containing these nucleic acid molecules operably linked to heterologous nucleic acid sequences such as, for example, regulatory sequences which allow in vivo expression of the constructs in a host cell or ex vivo cell-free expression system.

[0078] The terms “nucleic acid molecule” and “polynucleotide” are used interchangeably herein, and refer to both RNA and DNA molecules, including nucleic acid molecules comprising cDNA, genomic DNA, synthetic DNA, and DNA or RNA molecules containing nucleic acid analogs. A nucleic acid molecule can be double-stranded or single-stranded (e.g., a sense strand or an antisense strand). A nucleic acid molecule may contain unconventional or modified nucleotides. The terms “polynucleotide sequence” and “nucleic acid sequence” as used herein interchangeably refer to the sequence of a polynucleotide molecule. The polynucleotide and polypeptide sequences disclosed herein are shown using standard letter abbreviations for nucleotide bases and amino acids as set forth in 37 CFR §1.822, which incorporates by reference WIPO Standard ST.25 (1998), Appendix 2, Tables 1-6.

[0079] Nucleic acid molecules of the present disclosure can be nucleic acid molecules of any length, including nucleic acid molecules that are generally between about 0.5 Kb and about 50 Kb, for example between about 0.5 Kb and about 20 Kb, between about 1 Kb and about 15 Kb, between about 2 Kb and about 10 Kb, or between about 5 Kb and about 25 Kb, for example between about 10 Kb and about 15 Kb, between about 15 Kb and about 20 Kb, between about 5 Kb and about 20 Kb, between about 5 Kb and about 10 Kb, or between about 10 Kb and about 25 Kb.

In some embodiments, the nucleic acid molecules of the disclosure are between about 1.5 Kb and about 50 Kb, between about 5 Kb and about 40 Kb, between about 5 Kb and about 30 Kb, between about 5 Kb and about 20 Kb, or between about 10 Kb and about 50 Kb, for example between about 15 Kb and about 30 Kb, between about 20 Kb and about 50 Kb, between about 20 Kb and about 40 Kb, between about 5 Kb and about 25 Kb, or between about 30 Kb and about 50 Kb.

[0080] In some embodiments disclosed herein, the nucleic acid molecules of the disclosure include a nucleotide sequence encoding a construct including at least one complementary determining region (CDR) having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-106. In some embodiments, the construct is a single-chain construct or a double-chain construct. In some embodiments, the construct is selected from the group consisting of: (a) a TCR; (b) an antibody; and (c) a functional derivative or fragment of (a) or (b). In some embodiments, the construct is a TCR construct including a TCR alpha chain and a TCR beta chain operably linked to each other. In some embodiments, the construct is a TCR construct including in its beta chain a CDR3 having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-106. In some embodiments, the construct further includes in its alpha chain a CDR3α sequence.

[0081] In some embodiments, the nucleotide sequence is incorporated into an expression cassette or an expression vector. It will be understood that an expression cassette generally includes a construct of genetic material that contains coding sequences and enough regulatory information to direct proper transcription and/or translation of the coding sequences in a recipient cell, in vivo and/or ex vivo. Generally, the expression cassette may be inserted into a vector for targeting to a desired host cell and/or into an individual. As such, in some embodiments, an expression cassette of the disclosure includes a coding sequence for the construct as disclosed herein, which is operably linked to expression control elements, such as a promoter, and optionally, any or a combination of other nucleic acid sequences that affect the transcription or translation of the coding sequence.

[0082] In some embodiments, the nucleotide sequence is incorporated into an expression vector. It will be understood by one skilled in the art that the term “vector” generally refers to a recombinant polynucleotide construct designed for transfer between host cells, and that may be used for the purpose of transformation, e.g., the introduction of heterologous DNA into a host cell. As such, in some embodiments, the vector can be a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about the replication of the inserted segment. In some embodiments, the expression vector can be an integrating vector.

[0083] In some embodiments, the expression vector can be a viral vector. As will be appreciated by one of skill in the art, the term “viral vector” is widely used to refer either to a nucleic acid molecule (e.g., a transfer plasmid) that includes virus-derived nucleic acid elements that typically facilitate transfer of the nucleic acid molecule or integration into the genome of a cell, or to a viral particle that mediates nucleic acid transfer. Viral particles will typically include various viral components and sometimes also host cell components in addition to nucleic acid(s). The term viral vector may refer either to a virus or viral particle capable of transferring a nucleic acid into a cell or to the transferred nucleic acid itself. Viral vectors and transfer plasmids contain structural and/or functional genetic elements that are primarily derived from a virus. In some embodiments, the viral vector is a baculoviral vector, a retroviral vector, or a lentiviral vector. The term “retroviral vector” refers to a viral vector or plasmid containing structural and functional genetic elements, or portions thereof, that are primarily derived from a retrovirus. The term “lentiviral vector” refers to a viral vector or plasmid containing structural and functional genetic elements, or portions thereof, including LTRs that are primarily derived from a lentivirus, which is a genus of retrovirus.

[0084] Accordingly, also provided herein are vectors, plasmids, or viruses containing one or more of the nucleic acid molecules encoding any of the constructs disclosed herein. The nucleic acid molecules can be contained within a vector that is capable of directing their expression in, for example, a cell that has been transformed/transduced with the vector. Suitable vectors for use in eukaryotic and prokaryotic cells are known in the art and are commercially available, or readily prepared by a skilled artisan.

[0085] DNA vectors can be introduced into eukaryotic cells via conventional transformation or transfection techniques, such as calcium phosphate transfection, DEAE-dextran mediated transfection, microinjection, cationic lipid-mediated transfection, electroporation, transduction, scrape loading, ballistic introduction, nucleoporation, hydrodynamic shock, and infection. Suitable methods for transforming or transfecting cells can be found in Sambrook et al. (2012, supra) and other standard molecular biology laboratory manuals.

[0086] Viral vectors that can be used in the disclosure include, for example, baculoviral vectors, retrovirus vectors, adenovirus vectors, and adeno-associated virus vectors, lentivirus vectors, herpes virus, simian virus 40 (SV40), and bovine papilloma virus vectors (see, for example, Gluzman (Ed.), Eukaryotic Viral Vectors, CSH Laboratory Press, Cold Spring Harbor, N.Y.). For example, a chimeric receptor as disclosed herein can be produced in a eukaryotic cell, such as a mammalian cell (e.g., COS cells, NIH 3T3 cells, or HeLa cells). These cells are available from many sources, including the American Type Culture Collection (Manassas, VA). In selecting an expression system, care should be taken to ensure that the components are compatible with one another. Artisans of ordinary skill are able to make such a determination. Furthermore, if guidance is required in selecting an expression system, skilled artisans may consult P. Jones, “Vectors: Cloning Applications”, John Wiley and Sons, New York, N.Y., 2009.

[0087] The nucleic acid molecules provided can contain naturally occurring sequences, or sequences that differ from those that occur naturally, but, due to the degeneracy of the genetic code, encode the same polypeptide, e.g., antibody. These nucleic acid molecules can consist of RNA or DNA (for example, genomic DNA, cDNA, or synthetic DNA, such as that produced by phosphoramidite-based synthesis), or combinations or modifications of the nucleotides within these types of nucleic acids. In addition, the nucleic acid molecules can be double-stranded or single-stranded (e.g., either a sense or an antisense strand).
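The degeneracy of the genetic code mentioned above can be shown with a small Python sketch in which two different DNA sequences translate to the same peptide. The sketch uses the standard genetic code; the function name and the two short DNA strings are illustrative assumptions and do not encode any sequence from this disclosure.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding-strand DNA sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two different nucleotide sequences, invented for this example only.
sequence_one = "TGTGCCAGCAGC"
sequence_two = "TGCGCTTCTTCA"
same_peptide = translate(sequence_one) == translate(sequence_two)
```

Both strings translate to the four-residue peptide CASS even though they differ at several nucleotide positions, which is the sense in which distinct nucleic acid molecules can encode the same polypeptide.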

[0088] The nucleic acid molecules are not limited to sequences that encode polypeptides (e.g., antibodies); some or all of the non-coding sequences that lie upstream or downstream from a coding sequence (e.g., the coding sequence of a chimeric receptor) can also be included. Those of ordinary skill in the art of molecular biology are familiar with routine procedures for isolating nucleic acid molecules. They can, for example, be generated by treatment of genomic DNA with restriction endonucleases, or by performance of the polymerase chain reaction (PCR). In the event the nucleic acid molecule is a ribonucleic acid (RNA), molecules can be produced, for example, by in vitro transcription.

[0089] In another aspect, provided herein are cell cultures including at least one engineered cell as disclosed herein, and a culture medium. Generally, the culture medium can be any suitable culture medium for culturing the cells described herein. Techniques for transforming a wide variety of the above-mentioned cells and species are known in the art and described in the technical and scientific literature. Accordingly, cell cultures including at least one engineered cell as disclosed herein are also within the scope of this application. Methods and systems suitable for generating and maintaining cell cultures are known in the art.

C. Engineered cells and cell cultures

[0090] The recombinant nucleic acids of the present disclosure can be introduced into a cell, such as, for example, a human T lymphocyte, to produce an engineered cell containing the nucleic acid molecule. Accordingly, some embodiments of the disclosure relate to methods for making an engineered cell, including (a) providing a host cell capable of protein expression; and (b) transducing the provided host cell with a recombinant nucleic acid of the disclosure to produce an engineered cell. Introduction of the nucleic acid molecules of the disclosure into cells can be achieved by methods known to those skilled in the art such as, for example, viral infection, transfection, conjugation, protoplast fusion, lipofection, electroporation, nucleofection, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct micro-injection, nanoparticle-mediated nucleic acid delivery, and the like.

[0091] Accordingly, in some embodiments, the nucleic acid molecules can be introduced into a host cell by viral or non-viral delivery vehicles known in the art to produce an engineered cell. For example, the nucleic acid molecule can be stably integrated in the engineered cell’s genome, or can be episomally replicating, or present in the engineered cell as a mini-circle expression vector for transient expression. Accordingly, in some embodiments, the nucleic acid molecule is maintained and replicated in the recombinant host cell as an episomal unit. In some embodiments, the nucleic acid molecule is present in the engineered cell as a mini-circle expression vector for transient expression. In some embodiments, the nucleic acid molecule is stably integrated into the genome of the engineered cell. Stable integration can be achieved using classical random genomic recombination techniques or with more precise techniques such as guide RNA-directed CRISPR/Cas9 genome editing, or DNA-guided endonuclease genome editing with NgAgo (Natronobacterium gregoryi Argonaute), or TALENs genome editing (transcription activator-like effector nucleases).

[0092] The nucleic acid molecules can be encapsulated in a viral capsid or a lipid nanoparticle, or can be delivered by viral or non-viral delivery means and methods known in the art, such as electroporation. For example, introduction of nucleic acids into cells may be achieved by viral transduction. In a non-limiting example, baculovirus or adeno-associated virus (AAV) can be engineered to deliver nucleic acids to target cells via viral transduction. Several AAV serotypes have been described, and all of the known serotypes can infect cells from multiple diverse tissue types. AAV is capable of transducing a wide range of species and tissues in vivo with no evidence of toxicity, and it generates relatively mild innate and adaptive immune responses.

[0093] Lentiviral-derived vector systems are also useful for nucleic acid delivery and gene therapy via viral transduction. Lentiviral vectors offer several attractive properties as gene-delivery vehicles, including: (i) sustained gene delivery through stable vector integration into host genome; (ii) the capability of infecting both dividing and non-dividing cells; (iii) broad tissue tropisms, including important gene- and cell-therapy-target cell types; (iv) no expression of viral proteins after vector transduction; (v) the ability to deliver complex genetic elements, such as polycistronic or intron-containing sequences; (vi) a potentially safer integration site profile; and (vii) a relatively easy system for vector manipulation and production.

[0094] In some embodiments, host cells can be genetically engineered (e.g., transduced or transformed or transfected) with, for example, a vector construct of the present disclosure that can be, for example, a viral vector or a vector for homologous recombination that includes nucleic acid sequences homologous to a portion of the genome of the host cell, or can be an expression vector for the expression of the polypeptides of interest. Host cells can be either untransformed cells or cells that have already been transfected with at least one nucleic acid molecule.

[0095] In some embodiments, the engineered cell is a prokaryotic cell or a eukaryotic cell. In some embodiments, the cell is in vivo. In some embodiments, the cell is ex vivo. In some embodiments, the cell is in vitro. In some embodiments, the engineered cell is a eukaryotic cell. In some embodiments, the engineered cell is an animal cell. In some embodiments, the animal cell is a mammalian cell. In some embodiments, the animal cell is a human cell. In some embodiments, the cell is a non-human primate cell. In some embodiments, the engineered cell is an immune system cell, e.g., a B cell, a monocyte, an NK cell, a natural killer T (NKT) cell, a basophil, an eosinophil, a neutrophil, a dendritic cell, a macrophage, a regulatory T cell, a helper T cell (TH), a cytotoxic T cell (TCTL), a memory T cell, a gamma delta (γδ) T cell, another T cell, a hematopoietic stem cell, or a hematopoietic stem cell progenitor.

[0096] In some embodiments, the immune system cell is a lymphocyte. In some embodiments, the lymphocyte is a T lymphocyte. In some embodiments, the lymphocyte is a T lymphocyte progenitor. In some embodiments, the T lymphocyte is a CD4+ T cell or a CD8+ T cell. In some embodiments, the T lymphocyte is a CD8+ T cytotoxic lymphocyte cell. Non-limiting examples of CD8+ T cytotoxic lymphocyte cell suitable for the compositions and methods disclosed herein include naive CD8+ T cells, central memory CD8+ T cells, effector memory CD8+ T cells, effector CD8+ T cells, CD8+ stem memory T cells, and bulk CD8+ T cells. In some embodiments, the T lymphocyte is a CD4+ T helper lymphocyte cell. Suitable CD4+ T helper lymphocyte cells include, but are not limited to, naive CD4+ T cells, central memory CD4+ T cells, effector memory CD4+ T cells, effector CD4+ T cells, CD4+ stem memory T cells, and bulk CD4+ T cells.

[0097] As outlined above, some embodiments of the disclosure relate to various methods for making an engineered cell of the disclosure, wherein the methods include: (a) providing a host cell capable of protein expression; and (b) transducing the provided host cell with a recombinant nucleic acid of the disclosure to produce an engineered cell. Non-limiting exemplary embodiments of the disclosed methods for making an engineered cell can further include one or more of the following features. In some embodiments, the cell is obtained by leukapheresis performed on a sample obtained from a subject, and the cell is transduced ex vivo. In some embodiments, the recombinant nucleic acid is encapsulated in a viral capsid or a lipid nanoparticle. In some embodiments, the methods further include isolating and/or purifying the produced cells. Accordingly, the engineered cells produced by the methods disclosed herein are also within the scope of the disclosure.

[0098] In another aspect, provided herein are cell cultures including at least one engineered cell as disclosed herein, and a culture medium. Generally, the culture medium can be any suitable culture medium for culturing the cells described herein. Techniques for transforming a wide variety of the above-mentioned cells and species are known in the art and described in the technical and scientific literature. Accordingly, cell cultures including at least one engineered cell as disclosed herein are also within the scope of this application. Methods and systems suitable for generating and maintaining cell cultures are known in the art.

E. Pharmaceutical compositions

[0099] The constructs, nucleic acids, engineered cells, and/or cell cultures of the disclosure can be incorporated into compositions, including pharmaceutical compositions. Such compositions generally include one or more of the constructs, nucleic acids, engineered cells, and/or cell cultures as provided and described herein, and a pharmaceutically acceptable excipient, e.g., a carrier. In some embodiments, the pharmaceutical compositions of the disclosure are formulated for treating, preventing, or ameliorating a disease such as cancer, or for reducing or delaying the onset of the disease.

[00100] Accordingly, one aspect of the present disclosure relates to pharmaceutical compositions that include a pharmaceutically acceptable carrier and one or more of the following: (a) a construct of the disclosure; (b) a recombinant nucleic acid of the disclosure; and (c) an engineered cell of the disclosure. In some embodiments, the pharmaceutical compositions include (a) a construct of the disclosure and (b) a pharmaceutically acceptable carrier. In some embodiments, the pharmaceutical compositions include (a) a recombinant nucleic acid of the disclosure and (b) a pharmaceutically acceptable carrier. In some embodiments, the recombinant nucleic acid is encapsulated in a viral capsid or a lipid nanoparticle. In some embodiments, the pharmaceutical compositions of the disclosure include (a) an engineered cell of the disclosure and (b) a pharmaceutically acceptable carrier.

[00101] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.), or phosphate buffered saline (PBS). In all cases, the composition should be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion, and by the use of surfactants, e.g., sodium dodecyl sulfate. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will generally be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, and/or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[00102] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle, which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

METHODS OF THE DISCLOSURE

[00103] Administration of any one of the therapeutic compositions described herein, e.g., constructs, nucleic acids, engineered cells, and pharmaceutical compositions, can be used to treat subjects in the treatment of relevant diseases, such as cancers, immune diseases, and chronic infections. In some embodiments, the constructs, nucleic acids, engineered cells, and pharmaceutical compositions as described herein can be incorporated into therapeutic agents for use in methods of preventing and/or treating a subject who has, who is suspected of having, or who may be at high risk for developing one or more health conditions, such as proliferative disorders or microbial infections. Exemplary proliferative disorders can include, without limitation, angiogenic diseases, metastatic diseases, tumorigenic diseases, neoplastic diseases, and cancers. In some embodiments, the proliferative disorder is a cancer.

[00104] Accordingly, in one aspect, some embodiments of the disclosure relate to methods for the prevention and/or treatment of a condition in a subject in need thereof, wherein the methods include administering to the subject a composition including one or more of: (a) a construct of the disclosure; (b) a recombinant nucleic acid of the disclosure; (c) an engineered cell of the disclosure; and (d) a pharmaceutical composition of the disclosure. In some embodiments, the composition includes a therapeutically effective amount or number of: (a) a construct of the disclosure; (b) a recombinant nucleic acid of the disclosure; (c) an engineered cell of the disclosure; and/or (d) a pharmaceutical composition of the disclosure.

[00105] In some embodiments, the disclosed pharmaceutical composition is formulated to be compatible with its intended route of administration. The recombinant polypeptides of the disclosure may be given orally or by inhalation, but it is more likely that they will be administered through a parenteral route. Examples of parenteral routes of administration include, for example, intravenous, intradermal, subcutaneous, transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, citrates or phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as mono- and/or di-basic sodium phosphate, hydrochloric acid or sodium hydroxide (e.g., to a pH of about 7.2-7.8, e.g., 7.5). The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[00106] Dosage, toxicity and therapeutic efficacy of such subject recombinant polypeptides of the disclosure can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that exhibit high therapeutic indices are generally suitable. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
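
As a purely illustrative aid, the therapeutic index described above can be computed as a simple ratio. The following Python sketch is not part of the disclosed methods; the function name `therapeutic_index` and the numeric values are hypothetical and were chosen only to show the arithmetic.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e. the ratio LD50/ED50.

    Both doses must be expressed in the same units (for example mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, used only to illustrate the calculation.
example_index = therapeutic_index(ld50=50.0, ed50=0.5)
print(example_index)  # 100.0
```

A larger ratio indicates a wider margin between the dose that is therapeutically effective and the dose that is lethal in 50% of the population.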

[00107] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the disclosure, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (e.g., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[00108] The therapeutically effective amount of a subject recombinant polypeptide of the disclosure (e.g., an effective dosage) depends on the polypeptide selected. For instance, single dose amounts in the range of approximately 0.001 to 0.1 mg/kg of patient body weight can be administered; in some embodiments, about 0.005, 0.01, or 0.05 mg/kg may be administered. In some embodiments, 600,000 IU/kg is administered (IU can be determined by a lymphocyte proliferation bioassay and is expressed in International Units (IU)). The compositions can be administered from one or more times per day to one or more times per week, including once every other day. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of the subject recombinant polypeptides of the disclosure can include a single treatment or can include a series of treatments. In some embodiments, the compositions are administered every 8 hours for five days, followed by a rest period of 2 to 14 days, e.g., 9 days, followed by an additional five days of administration every 8 hours.

Administration of engineered cells to a subject

[00109] In some embodiments, the methods of treatment as disclosed herein involve administering an effective amount or number of the engineered cells to a subject in need of such treatment. This administering step can be accomplished using any method of implantation delivery in the art. For example, the engineered cells can be infused directly in the individual’s bloodstream or otherwise administered to the individual.

[00110] In some embodiments, the methods disclosed herein include administering, which term is used interchangeably with the terms “introducing,” “implanting,” and “transplanting,” engineered cells into a subject, by a method or route that results in at least partial localization of the introduced cells at a desired site such that a desired effect(s) is/are produced. The engineered cells or their differentiated progeny can be administered by any appropriate route that results in delivery to a desired location in the individual where at least a portion of the administered cells or components of the cells remain viable. The period of viability of the cells after administration to a subject can be as short as a few hours, e.g., twenty-four hours, to a few days, to as long as several years, or even the lifetime of the individual, i.e., long-term engraftment.

[00111] When provided prophylactically, the engineered cells described herein can be administered to a subject in advance of any symptom of a disease or condition to be treated. Accordingly, in some embodiments the prophylactic administration of an engineered cell population prevents the occurrence of symptoms of the disease or condition.

[00112] When provided therapeutically in some embodiments, engineered cells are provided at (or after) the onset of a symptom or indication of a disease or condition, e.g., upon the onset of disease or condition.

[00113] For use in the various embodiments described herein, an effective amount or number of engineered cells as disclosed herein can be at least 10^2 cells, at least 5 × 10^2 cells, at least 10^3 cells, at least 5 × 10^3 cells, at least 10^4 cells, at least 5 × 10^4 cells, at least 10^5 cells, at least 2 × 10^5 cells, at least 3 × 10^5 cells, at least 4 × 10^5 cells, at least 5 × 10^5 cells, at least 6 × 10^5 cells, at least 7 × 10^5 cells, at least 8 × 10^5 cells, at least 9 × 10^5 cells, at least 1 × 10^6 cells, at least 2 × 10^6 cells, at least 3 × 10^6 cells, at least 4 × 10^6 cells, at least 5 × 10^6 cells, at least 6 × 10^6 cells, at least 7 × 10^6 cells, at least 8 × 10^6 cells, at least 9 × 10^6 cells, or multiples thereof. The engineered cells can be derived from one or more donors or can be obtained from an autologous source. In some embodiments, the engineered cells are expanded in culture prior to administration to a subject in need of a treatment.

[00114] In some embodiments, the delivery of an engineered cell composition (e.g., a composition including a plurality of engineered cells according to any of the cells described herein) into an individual by a method or route results in at least partial localization of the cell composition at a desired site. A composition including engineered cells can be administered by any appropriate route that results in effective treatment in the individual, e.g., administration results in delivery to a desired location in the individual where at least a portion of the composition delivered, e.g., at least 1 × 10^3 cells, is delivered to the desired site for a period of time. Modes of administration include injection, infusion, and instillation. “Injection” includes, without limitation, intravenous, intramuscular, intra-arterial, intrathecal, intraventricular, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, transtracheal, subcutaneous, subcuticular, intraarticular, subcapsular, subarachnoid, intraspinal, intracerebrospinal, and intrasternal injection and infusion. In some embodiments, the route is intravenous. For the delivery of cells, delivery by injection or infusion is a standard mode of administration.

[00115] In some embodiments, the engineered cells are administered systemically, e.g., via infusion or injection. For example, a population of engineered cells is administered other than directly into a target site, tissue, or organ, such that it enters the individual’s circulatory system and, thus, is subject to metabolism and other similar biological processes.

[00116] The efficacy of a treatment including any of the compositions provided herein for the treatment of a disease or condition can be determined by a skilled clinician. However, one skilled in the art will appreciate that a treatment is considered effective if any one or all of the signs or symptoms or markers of disease are improved or ameliorated. Efficacy can also be measured by failure of a subject to worsen as assessed by decreased hospitalization or need for medical interventions (e.g., progression of the disease is halted or at least slowed). Methods of measuring these indicators are known to those of skill in the art and/or described herein. Treatment includes any treatment of a disease in a subject or an animal (some non-limiting examples include a human, or a mammal) and includes: (1) inhibiting the disease, e.g., arresting, or slowing the progression of symptoms; or (2) relieving the disease, e.g., causing regression of symptoms; and (3) preventing or reducing the likelihood of the development of symptoms.

[00117] As discussed above, a therapeutically effective number of engineered cells refers to a number of engineered cells that is sufficient to provide a therapeutic benefit in the treatment or management of a disease, e.g., cancer, or to delay or minimize one or more symptoms associated with the disease when administered to a subject, such as one who has, is suspected of having, or is at risk for the disease. In some embodiments, an effective number includes a number sufficient to prevent or delay the development of a symptom of the disease, alter the course of a symptom of the disease (for example but not limited to, slow the progression of a symptom of the disease), or reverse a symptom of the disease. In some embodiments, an effective number includes a number sufficient to inhibit tumor growth or metastasis of a cancer in the individual. In some embodiments, an effective number includes a number sufficient to increase cytokine production, or to inhibit (e.g., kill) a cancer cell or an infected cell.

[00118] In some embodiments of the disclosed methods, the individual is a mammal. In some embodiments, the mammal is a human. In some embodiments, the individual has or is suspected of having a condition associated with a proliferative disorder or disease, such as a cancer. The term cancer generally refers to the presence of cells possessing characteristics typical of cancer-causing cells, such as uncontrolled proliferation, immortality, metastatic potential, rapid growth and proliferation rate, and certain characteristic morphological features. Cancer cells are often observed aggregated into a tumor, but such cells can exist alone within an animal subject, or can be a non-tumorigenic cancer cell, such as a leukemia cell. Thus, the term “cancer” can encompass reference to a solid tumor, a soft tissue tumor, or a metastatic lesion. As used herein, the term “cancer” includes premalignant, as well as malignant cancers. In some embodiments, the cancer is a solid tumor, a soft tissue tumor, or a metastatic lesion.

[00119] Examples of conditions suitable for being treated by the compositions and methods of the disclosure include those associated with cancers, autoimmune diseases, inflammatory diseases, and infectious diseases. In some embodiments, the proliferative disorder is a cancer. Examples of cancers that can be suitably diagnosed, prevented, and/or treated by the compositions and methods of the disclosure include lung cancers. In principle, there are no particular limitations in regard to the lung cancers that can be suitably diagnosed, prevented, and/or treated by the compositions and methods disclosed herein. Examples of suitable lung cancers include adenocarcinoma, squamous cell carcinoma, small cell carcinoma, non-small cell carcinoma, adenosquamous carcinoma, small cell lung cancer, large cell carcinoma, neuroendocrine cancers of the lung, and non-small cell lung cancer (NSCLC). Additional lung cancers that can be suitably diagnosed, prevented, and/or treated by the compositions and methods disclosed herein include, but are not limited to, undifferentiated non-small cell carcinoma, non-small cell carcinoma not otherwise specified, pulmonary squamous cell carcinoma, broncho-alveolar carcinoma, sarcomatoid carcinoma, pleomorphic carcinoma, carcinosarcoma, pulmonary blastoma, metastatic carcinoma of unknown primary, primary pulmonary lymphoepithelioma-like carcinoma, and benign neoplasms of the lung. In some embodiments, the cancer is NSCLC. In some embodiments, the lung cancer is a NSCLC selected from the group consisting of squamous cell carcinoma, adenocarcinoma, large cell carcinoma, carcinoid tumor, pleomorphic, salivary gland cancer, adenosquamous, sarcomatoid, and unclassified carcinomas. In some embodiments, the NSCLC is squamous cell carcinoma. In some embodiments, the NSCLC is adenocarcinoma. In some embodiments, the NSCLC is large cell carcinoma. In some embodiments, the NSCLC includes stage I NSCLC. In some embodiments, the NSCLC includes stage II NSCLC.

[00120] In some embodiments, the cancer is a multiply drug resistant cancer or a recurrent cancer. It is contemplated that the compositions and methods disclosed here are suitable for both non-metastatic cancers and metastatic cancers. Accordingly, in some embodiments, the cancer is a non-metastatic cancer. In some other embodiments, the cancer is a metastatic cancer. In some embodiments, the composition administered to the subject inhibits metastasis of the cancer in the subject. In some embodiments, the administered composition inhibits tumor growth in the subject.

[00121] In another aspect, provided herein are methods for assisting in the prevention and/or treatment of a condition in a subject in need thereof, the methods including the steps of administering to the subject a first therapy including one or more constructs, recombinant nucleic acids, engineered cells, or pharmaceutical compositions as disclosed herein, and administering to the subject at least one additional therapy, wherein the first therapy and the at least one additional therapy together prevent and/or treat the condition in the subject. In some embodiments, the methods include administering to the subject a first therapy including an effective number of the engineered cells as disclosed herein, wherein the engineered cells treat the condition.

[00122] Additional examples of conditions suitable for being treated by the compositions and methods of the disclosure include those associated with an immune checkpoint blockade. In these instances, the method of the disclosure may be for an immunotherapy of a checkpoint blockade. In some embodiments, the checkpoint blockade immunotherapy involves using one or more inhibitors of a checkpoint receptor such as, for example, PD-1/PD-L1, CTLA-4, IDO, TIM3, LAG3, TIGIT, BTLA, VISTA, ICOS, KIRs and CD39. In some embodiments, the checkpoint receptor is an inhibitory checkpoint receptor selected from the group consisting of PD-1, CTLA-4, A2AR, B7-H3, B7-H4, BTLA, CD5, CD132, IDO, KIR, LAG3, TIM-3, TIGIT, and VISTA. In some embodiments, the checkpoint receptor is a stimulatory checkpoint receptor selected from the group consisting of CD27, CD28, CD40, OX40, GITR, ICOS, and CD137. In some embodiments, the checkpoint blockade immunotherapy includes an anti-PD-1 checkpoint therapy. In some embodiments, the checkpoint blockade immunotherapy includes an anti-PD-L1 checkpoint therapy.

[00123] As described in greater detail below, various constructs of the disclosure are capable of binding antigens derived from viral pathogens. In some embodiments of the disclosure, provided herein are methods for the diagnosis, prevention, and/or treatment of a malignancy associated with a microbial infection. In some embodiments, the malignancy is associated with a bacterial infection.

[00124] In some embodiments, provided herein are methods for the diagnosis, prevention, and/or treatment of a malignancy associated with a viral infection. In some embodiments, the malignancy is associated with an infection by Epstein-Barr virus (EBV), which was originally discovered through its association with Burkitt lymphoma, but has since been linked to a remarkably wide range of lymphoproliferative lesions and malignant lymphomas of B-, T- and NK-cell origin. Examples of EBV-associated malignancies that can suitably be diagnosed, prevented, and/or treated by using the compositions and methods disclosed herein include Hodgkin lymphoma, Burkitt lymphoma, diffuse large B cell lymphoma, nasopharyngeal carcinoma, gastric carcinoma, post-transplant lymphoproliferative disease, and B lymphoproliferative disease. Additional EBV-associated malignancies that can suitably be diagnosed, prevented, and/or treated by using the compositions and methods disclosed herein include, but are not limited to, T-cell lymphoproliferative disease, NK-cell lymphoproliferative disease, NK-cell lymphomas, T-cell lymphomas, T-cell leukemias, leiomyosarcomas, and lymphoepithelioma-like carcinomas.

Additional therapies

[00125] As discussed above, some embodiments of the disclosure provide methods for the prevention or treatment of a condition in a subject, wherein the methods include administering a composition as disclosed herein to the subject as a single therapy (e.g., monotherapy). In addition, in some embodiments of the disclosure, the composition is administered to the subject individually as a first therapy or in combination with at least one additional therapy, e.g., at least one, two, three, four, or five additional therapies. Suitable therapies to be administered in combination with the compositions of the disclosure include, but are not limited to, chemotherapy, radiotherapy, immunotherapy, hormonal therapy, toxin therapy, targeted therapy, and surgery.

[00126] Non-limiting examples of therapies suitable for combining with the methods disclosed herein include an anti-CTLA4 antibody, an anti-PD-1 antibody, an anti-PD-L1 antibody, an anti-CD20 antibody, an anti-CD40 antibody, an anti-DR5 antibody, an anti-CD1d antibody, an anti-TIM3 antibody, an anti-SLAMF7 antibody, an anti-KIR receptor antibody, an anti-OX40 antibody, an anti-HER2 antibody, an anti-ErbB-2 antibody, an anti-EGFR antibody, cetuximab, rituximab, trastuzumab, and pembrolizumab. Additional therapies suitable for combining with the methods disclosed herein include, but are not limited to, radiotherapy such as single dose radiation, fractionated radiation, focal radiation, and whole organ radiation. Also suitable for combining with the methods disclosed herein are IL-12, IFNα, GM-CSF, chimeric antigen receptors, adoptively transferred T cells, anti-cancer vaccines, and oncolytic viruses.

[00127] In some embodiments, the first therapy and the at least one additional therapy are administered concomitantly. In some embodiments, the first therapy is administered at the same time as the at least one additional therapy. In some embodiments, the first therapy and the at least one additional therapy are administered sequentially. In some embodiments, the first therapy is administered before the at least one additional therapy. In some embodiments, the first therapy is administered after the at least one additional therapy. In some embodiments, the first therapy is administered before and/or after the at least one additional therapy. In some embodiments, the first therapy and the at least one additional therapy are administered in rotation. In some embodiments, the first therapy and the at least one additional therapy are administered together in a single formulation.

[00128] In yet another aspect, provided herein are various methods for obtaining a construct as disclosed herein, wherein the methods include (a) identifying a plurality of TCRs associated with a health condition; (b) determining a sequence of a CDR3 present in each of the identified TCRs; and (c) making a construct including a CDR3 sequence determined in (b). In some embodiments, the methods further include identifying one or more cognate antigens commonly recognized by the CDR3 sequences. In some embodiments, the condition is associated with a proliferative disease. In some embodiments, the proliferative disease is a cancer. In some embodiments, the cancer is a lung cancer. In some embodiments, the condition is a malignancy associated with a viral infection. In some embodiments, the condition is a malignancy associated with an infection by Epstein-Barr virus (EBV). In some embodiments, the condition is associated with an immune checkpoint blockade.
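
By way of illustration only, the bookkeeping behind steps (a) and (b) above can be sketched in Python as a tally of CDR3 sequences across a set of TCR records. This sketch is not the method of the disclosure: the record format, the function names `tally_cdr3` and `candidate_cdr3s`, the `min_tcrs` threshold, and the placeholder sequences are all hypothetical assumptions introduced for this example.

```python
from collections import Counter


def tally_cdr3(tcr_records):
    """Count how many distinct TCRs carry each CDR3 sequence.

    tcr_records is an iterable of (tcr_id, cdr3_sequence) pairs.
    Repeated (tcr_id, cdr3) pairs are counted once.
    """
    seen = set()
    counts = Counter()
    for tcr_id, cdr3 in tcr_records:
        key = (tcr_id, cdr3.upper())
        if key in seen:
            continue
        seen.add(key)
        counts[cdr3.upper()] += 1
    return counts


def candidate_cdr3s(tcr_records, min_tcrs=2):
    """Return CDR3 sequences found in at least min_tcrs distinct TCRs.

    The threshold is an assumption of this sketch, not a requirement
    stated in the disclosure.
    """
    counts = tally_cdr3(tcr_records)
    return sorted(seq for seq, n in counts.items() if n >= min_tcrs)


# Placeholder records; the sequences are made up and are not SEQ ID NOs.
records = [
    ("tcr1", "CAAAF"),
    ("tcr2", "CAAAF"),
    ("tcr3", "CGGGF"),
    ("tcr1", "caaaf"),  # duplicate of the first record, ignored
]
print(candidate_cdr3s(records))  # ['CAAAF']
```

Any CDR3 sequence returned by such a tally would still need to be incorporated into a construct and characterized experimentally, as in step (c).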

KITS

[00129] Also provided herein are various kits for the practice of a method described herein. In particular, some embodiments of the disclosure provide kits for the diagnosis of a condition in a subject. Some other embodiments relate to kits for the prevention of a condition in a subject in need thereof. Some other embodiments relate to kits for methods of treating a condition in a subject in need thereof. For example, provided herein, in some embodiments, are kits that include one or more of the constructs, recombinant nucleic acids, engineered cells, or pharmaceutical compositions as provided and described herein, as well as written instructions for making and using the same.

[00130] In some embodiments, the kits of the disclosure further include one or more means useful for the administration of any one of the provided constructs, recombinant nucleic acids, engineered cells, or pharmaceutical compositions to an individual. For example, in some embodiments, the kits of the disclosure further include one or more syringes (including pre-filled syringes) and/or catheters (including pre-filled syringes) used to administer any one of the provided constructs, recombinant nucleic acids, engineered cells, or pharmaceutical compositions to an individual. In some embodiments, a kit can have one or more additional therapeutic agents that can be administered simultaneously or sequentially with the other kit components for a desired purpose, e.g., for diagnosing, preventing, or treating a condition in a subject in need thereof.

[00131] Any of the above-described kits can further include one or more additional reagents, where such additional reagents can be selected from: dilution buffers, reconstitution solutions, wash buffers, control reagents, control expression vectors, negative control constructs, positive control constructs, and reagents suitable for in vitro production of the constructs.

[00132] In some embodiments, the components of a kit can be in separate containers. In some other embodiments, the components of a kit can be combined in a single container.

[00133] In some embodiments, a kit can further include instructions for using the components of the kit to practice the methods disclosed herein. For example, the kit can include a package insert including information concerning the pharmaceutical compositions and dosage forms in the kit. Generally, such information aids patients and physicians in using the enclosed pharmaceutical compositions and dosage forms effectively and safely. For example, the following information regarding a combination of the disclosure may be supplied in the insert: pharmacokinetics, pharmacodynamics, clinical studies, efficacy parameters, indications and usage, contraindications, warnings, precautions, adverse reactions, overdosage, proper dosage and administration, how supplied, proper storage conditions, references, manufacturer/distributor information and intellectual property information.

[00134] The instructions for practicing the methods are generally recorded on a suitable recording medium. For example, the instructions can be printed on a substrate, such as paper or plastic, etc. The instructions can be present in the kit as a package insert, in the labeling of the container of the kit or components thereof (e.g., associated with the packaging or sub-packaging), etc. The instructions can be present as an electronic storage data file present on a suitable computer readable storage medium, e.g., CD-ROM, diskette, flash drive, etc. In some instances, the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source (e.g., via the internet) can be provided. An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions can be recorded on a suitable substrate.

[00135] All publications and patent applications mentioned in this disclosure are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

[00136] No admission is made that any reference cited herein constitutes prior art. The discussion of the references states what their authors assert, and the Applicant reserves the right to challenge the accuracy and pertinence of the cited documents. It will be clearly understood that, although a number of information sources, including scientific journal articles, patent documents, and textbooks, are referred to herein, this reference does not constitute an admission that any of these documents forms part of the common general knowledge in the art.

[00137] The discussion of the general methods given herein is intended for illustrative purposes only. Other methods and alternatives will be apparent to those of skill in the art upon review of this disclosure, and are to be included within the spirit and purview of this application.

[00138] Additional embodiments are disclosed in further detail in the following examples, which are provided by way of illustration and are not in any way intended to limit the scope of this disclosure or the claims.

EXAMPLES

[00139] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, cell biology, biochemistry, nucleic acid chemistry, and immunology, which are well known to those skilled in the art. Such techniques are explained fully in the literature, such as Sambrook, J., & Russell, D. W. (2012). Molecular Cloning: A Laboratory Manual (4th ed.). Cold Spring Harbor, NY: Cold Spring Harbor Laboratory and Sambrook, J., & Russell, D. W. (2001). Molecular Cloning: A Laboratory Manual (3rd ed.). Cold Spring Harbor, NY: Cold Spring Harbor Laboratory (jointly referred to herein as “Sambrook”); Ausubel, F. M. (1987). Current Protocols in Molecular Biology. New York, NY: Wiley (including supplements through 2014); Bollag, D. M. et al. (1996). Protein Methods. New York, NY: Wiley-Liss; Huang, L. et al. (2005). Nonviral Vectors for Gene Therapy. San Diego: Academic Press; Kaplitt, M. G. et al. (1995). Viral Vectors: Gene Therapy and Neuroscience Applications. San Diego, CA: Academic Press; Lefkovits, I. (1997). The Immunology Methods Manual: The Comprehensive Sourcebook of Techniques. San Diego, CA: Academic Press; Doyle, A. et al. (1998). Cell and Tissue Culture: Laboratory Procedures in Biotechnology. New York, NY: Wiley; Mullis, K. B., Ferre, F. & Gibbs, R. (1994). PCR: The Polymerase Chain Reaction. Boston: Birkhauser Publisher; Greenfield, E. A. (2014). Antibodies: A Laboratory Manual (2nd ed.). New York, NY: Cold Spring Harbor Laboratory Press; Beaucage, S. L. et al. (2000). Current Protocols in Nucleic Acid Chemistry. New York, NY: Wiley (including supplements through 2014); and Makrides, S. C. (2003). Gene Transfer and Expression in Mammalian Cells. Amsterdam, NL: Elsevier Sciences B.V., the disclosures of which are incorporated herein by reference.

EXAMPLE 1 Clinical samples

[00140] Protocols for collection of human tissue and blood were approved by the Stanford Institutional Review Board (IRB 15166 and IRB 21319). Inclusion criteria were adult patients (age >= 18 years), known or suspected diagnosis of NSCLC, primary tumor >2 cm, and consent for research. Patients receiving neoadjuvant therapy or patients with underlying lung infection, inflammatory, or fibrotic disease were excluded. Overall, 21 patients with surgically resectable NSCLC treated at Stanford were included in this study. DNA was extracted from peripheral blood mononuclear cells (PBMCs) (Qiagen) for HLA typing.

EXAMPLE 2 Tissue processing

[00141] Tissue was processed within 2 hours from surgery. Tissue was divided into one section for cell suspensions and another section for histology. Cell suspensions were generated by mincing of tissue followed by digestion with collagenase III (200 IU/mL) and DNAse (100 U/mL) (Worthington Biochemical) for 40 minutes in RPMI and passing through a 70-µm filter. Sections for histology were fixed in 4% paraformaldehyde and transferred to 70% ethanol solution the following day.

EXAMPLE 3 FACS analyses

[00142] T cells were isolated from tumor single cell suspensions by antibody staining followed by cell sorting on a 5-laser FACSAria Fusion (Stanford FACS Facility) purchased using funds from the Parker Institute for Cancer Immunotherapy. Tumor cell suspensions were stained in PBS with Zombie Aqua dye (Biolegend) for viability assessment. This was followed by staining in PBS with 2% FBS in Fc Blocking solution (Biolegend) plus the following antibodies: anti-CD4 (OKT4, Biolegend), anti-CD8 (SK1, Biolegend), anti-CD3 (OKT3, Biolegend), anti-CD45 (HI30, Biolegend), anti-CD25 (BC96, Biolegend), anti-PD-1 (EH12.2H7, Biolegend), anti-CD137 (4B4-1, BD Biosciences), and anti-HLA-DR (L243, Biolegend). CD3+ CD45+ Zombie Aqua-negative cells were index sorted directly into 96-well plates preloaded with 4 µL of capture buffer, snap frozen on dry ice, and stored at -80°C.

EXAMPLE 4

GLIPH2 analyses and establishment of T cell specificity groups

[00143] The GLIPH2 algorithm was implemented for the establishment of T cell specificity groups using 778,938 distinct CDR3 sequences from the MD Anderson NSCLC dataset [25]. Briefly, by comparing with the reference dataset of 273,920 distinct CDR3 sequences (both CD4 and CD8) from 12 healthy individuals, GLIPH2 first discovered clusters of CDR3 sequences sharing either global or local motifs as previously described [24]. The output of CDR3 clusters with shared sequence motifs is accompanied by multiple statistical measurements to facilitate the calling of high-confidence specificity groups, including biases in Vβ gene usage, CDR3 length distribution (relevant only for local motifs), cluster size, HLA allele usage, and clonal expansion. To establish high-confidence specificity groups with the NSCLC dataset, TCR specificity groups with at least 3 distinct CDR3 members from a minimum of 3 different patients, and with significant biases in Vβ gene usage and CDR3β clonal expansion in comparison with the reference dataset, were prioritized. This led to the discovery of 4,226 specificity groups that formed the basis for further analyses throughout the study.
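
By way of non-limiting illustration, the group-size criteria described above (at least 3 distinct CDR3 members from a minimum of 3 different patients) can be sketched in Python as follows. This is a minimal sketch and not the GLIPH2 software itself: the function name, the record layout, and the toy sequences and patient labels are hypothetical, and the Vβ gene usage and clonal expansion statistics are not modeled.

```python
from collections import defaultdict


def high_confidence_groups(records, min_cdr3=3, min_patients=3):
    """Return IDs of groups meeting the size criteria.

    records: iterable of (group_id, cdr3_sequence, patient_id) tuples.
    The layout is an assumption made for this sketch only.
    """
    cdr3s = defaultdict(set)      # distinct CDR3 members per group
    patients = defaultdict(set)   # distinct contributing patients per group
    for group_id, cdr3, patient in records:
        cdr3s[group_id].add(cdr3)
        patients[group_id].add(patient)
    return sorted(
        g for g in cdr3s
        if len(cdr3s[g]) >= min_cdr3 and len(patients[g]) >= min_patients
    )


# Toy, hypothetical input: G1 has 3 distinct CDR3s from 3 patients,
# G2 has only 2 distinct CDR3s and is therefore not retained.
records = [
    ("G1", "CASSLGQAYEQYF", "P1"),
    ("G1", "CASSLGQSYEQYF", "P2"),
    ("G1", "CASSLGQTYEQYF", "P3"),
    ("G2", "CASSPDRGNTEAFF", "P1"),
    ("G2", "CASSPDRGNTEAFF", "P2"),
    ("G2", "CASSPDRANTEAFF", "P3"),
]
print(high_confidence_groups(records))
```

In practice, the remaining criteria (biases in Vβ gene usage and clonal expansion relative to the reference dataset) would be applied in addition to this size filter.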

EXAMPLE 5

Annotation of specificity groups with tetramer-derived CDR3β sequences

[00144] To annotate inferred specificity groups from lung cancer patients, a combined GLIPH analysis using both the MD Anderson lung cancer patient CDR3β sequences and publicly available, tetramer-derived CDR3β sequences was performed. To do so, tetramer-derived CDR3β sequences that could form TCR specificity groups were first identified by running an independent GLIPH analysis with a total of 10,051 CDR3β sequences from the tetramer datasets. This led to the formation of 395 specificity groups containing 1,561 CDR3 sequences. These 1,561 CDR3 sequences were then combined with the 778,938 CDR3 sequences from the MD Anderson lung cancer dataset for the aforementioned GLIPH2 analysis. Any specificity group that includes at least one CDR3 sequence from the tetramer data is considered “annotated” and would be assigned a specificity and HLA restriction according to the associated tetramer sequence(s). Of note, in all cases where multiple tetramer-derived CDR3 sequences were found in a given specificity group, there was only one dominant tetramer-defined specificity/HLA involved.
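
The annotation rule described above (a specificity group is “annotated” when it includes at least one tetramer-derived CDR3 sequence) can be illustrated with the following minimal Python sketch. The function name, the data structures, and the placeholder sequences and labels are assumptions made for illustration only and do not reproduce the actual analysis.

```python
def annotate_groups(groups, tetramer_cdr3):
    """Assign tetramer-defined labels to specificity groups.

    groups: dict mapping group_id -> set of member CDR3 sequences.
    tetramer_cdr3: dict mapping a tetramer-derived CDR3 sequence to a
        (specificity, HLA restriction) tuple.
    Both structures are hypothetical layouts chosen for this sketch.
    """
    annotations = {}
    for group_id, members in groups.items():
        # Collect every label carried by a tetramer-derived member.
        labels = sorted({tetramer_cdr3[c] for c in members if c in tetramer_cdr3})
        if labels:  # at least one tetramer-derived CDR3 -> "annotated"
            annotations[group_id] = labels
    return annotations


# Placeholder data; sequences and labels are not taken from the datasets.
groups = {
    "G1": {"CASSAAAYEQYF", "CASSAAGYEQYF"},
    "G2": {"CASSTTTNTEAFF"},
}
tetramer_cdr3 = {"CASSAAAYEQYF": ("antigen_X", "HLA_Y")}
print(annotate_groups(groups, tetramer_cdr3))
```

A group that receives more than one label in the returned list would need to be resolved separately, for example by taking the dominant specificity/HLA.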

EXAMPLE 6

Single-cell RNA-seq (scRNA-Seq) sample preparation with the Smart-seq2 method

[00145] Full transcriptomes from FACS sorted T cells at the single-cell level were generated according to the previously reported procedures with some modifications [54]. First strand cDNA was then generated with Takara’s SMARTScribe Reverse Transcriptase kit according to the manufacturer’s protocol (Takara Bio). Notable changes from the previously reported Smart-Seq2 RT step include: 2 mM of dNTP and 2 mM of oligo-dT were included in the capture buffer; 1 M of betaine and an additional 6 mM MgCl2 were included in the RT reaction buffer. The cDNA samples were then amplified with the KAPA Library Quantification kit for 22 - 25 cycles (Roche). One microliter of amplified cDNA (of total 25 µL/well) was used for single-cell TCR-sequencing, thus bypassing the RT step as reported previously [35]. To proceed with scRNA-Seq, full-length cDNA samples were first cleaned up with 0.6 - 0.8x volume of pre-calibrated AMPure XP beads (Beckman Coulter) to exclude DNA fragments smaller than 500 base pairs. The automatic liquid handler Biomek FXP Automated Workstation (Beckman Coulter) was used in order to eliminate cell-to-cell variabilities. The quality of purified full-length cDNA was validated with the AATI Fragment Analyzer (Agilent). Subsequently, the measurements from the Fragment Analyzer were used in order to normalize the cDNA input with a Mantis liquid handler (Formulatrix). The cDNA samples were then consolidated into a 384-well plate (LVSD) with a Mosquito X1 liquid handler (TTP labtech). After transfer, Illumina sequencing libraries were prepared using a Mosquito HTS liquid handler (TTP labtech). Only 0.4 µL (of total 23 µL) of cDNA per well was used to make the full transcriptome libraries with the Nextera XT DNA Library Preparation Kit (Illumina, FC-131-1096). Custom-made i5 and i7 unique 8-bp indexing primers (IDT) were used to multiplex 384 wells in a single sequencing run. The libraries were amplified on a C1000 Touch™ Thermal Cycler with 384-Well Reaction Module (Bio-rad). The pooled libraries were checked with the Agilent 2100 Bioanalyzer (Stanford PAN facility), and paired-end sequences (2 x 150 bp) were acquired on a HiSeq 4000 Sequencing System (Illumina) purchased with funds from NIH (S10OD018220) for the Stanford Functional Genomics Facility (SFGF).

EXAMPLE 7

Single-cell sequencing of the TCRα/β chains

[00146] Single T cells were sorted and captured as described above in the method for scRNA-Seq sample preparation. Following first strand cDNA synthesis (Takara) and amplification (Roche), one microliter of amplified cDNA (of total 25 µL/well) was used for single-cell TCR-sequencing, thus bypassing the RT step as reported previously [35]. Nested PCR was performed with TCRα/β primers carrying multiplexing barcodes that enabled pooled CDR3α/β sequencing in a single MiSeq run. Paired sequencing reads were joined, demultiplexed, and mapped to the human TCR references from the international ImMunoGeneTics information system® (IMGT) with custom scripts as reported previously.

EXAMPLE 8

Data analyses of scRNA-Seq results

[00147] Sequencing reads were first de-multiplexed and binned into separate fastq files that correspond with the full transcriptomes of individual T cells. STAR aligner (2.7.1a) was used to map the reads with default parameters against human genome reference GRCh38 (v21) from the UCSC genome browser. Mapped reads were sorted and indexed with samtools (1.4). Gene expression was first quantified by counting reads mapped to genes with htseq-count (HTSeq 0.9.1) using the following settings: --stranded=no --type=exon --idattr=gene_name --mode=intersection-nonempty. Unless otherwise stated, all single-cell T cell states were analyzed with Seurat (3.1.4) packages in R using raw read counts. To derive TCR repertoires from the scRNA-Seq results, reads mapped to both the TCRα and TCRβ genes were first reconstructed with the TraCeR algorithm as described previously [36]. The reconstructed DNA sequences were then submitted to the IMGT to call gene segment usage and the CDR3 amino acid sequences through HighV-QUEST.

EXAMPLE 9

GLIPH2 analysis on the CDR3β sequences from the TRACERx NSCLC cohort

[00148] Raw fastq files (n = 202) of the bulk CDR3 nucleotide sequences from the TRACERx cohort of NSCLC were downloaded from the Short Read Archive as reported [26]. The amino acid sequences of CDR3β, Vβ gene usage, and clonal counts were subsequently derived by using the custom pipeline established previously [35]. To quantify the percentages of tumor-enriched specificity groups shown in FIG. 1B, joint GLIPH2 analyses were first conducted with combined CDR3 sequences from the MD Anderson cohort (n = 778,938) and the bulk CDR3 sequences from each tumor sample of the TRACERx cohort. The total percentages (%) of the top-20 clonally expanded CDR3 clonotypes, as well as of the remaining CDR3 clonotypes, that belonged to the 449 tumor-enriched specificity groups were then derived for each tumor (n = 202).
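
As a non-limiting illustration of the per-tumor quantification described above, the following Python sketch computes the percentage of the top-N clonally expanded clonotypes, and of the remaining clonotypes, that belong to a set of specificity-group members. It is a sketch under stated assumptions: the function name and inputs are hypothetical, and the percentage is taken over distinct clonotypes (rather than over reads), which the text above does not specify.

```python
def percent_in_groups(clone_counts, group_members, top_n=20):
    """Return (pct_top, pct_rest) for one tumor sample.

    clone_counts: dict mapping CDR3 clonotype -> clonal count.
    group_members: set of CDR3 sequences belonging to the
        tumor-enriched specificity groups.
    Percentages are over distinct clonotypes (an assumption).
    """
    # Rank clonotypes by decreasing clonal count; ties broken by sequence.
    ranked = sorted(clone_counts, key=lambda c: (-clone_counts[c], c))
    top, rest = ranked[:top_n], ranked[top_n:]

    def pct(clones):
        if not clones:
            return 0.0
        return 100.0 * sum(c in group_members for c in clones) / len(clones)

    return pct(top), pct(rest)


# Toy example with top_n=2 so the split is easy to follow.
counts = {"A": 10, "B": 5, "C": 1, "D": 1}
members = {"A", "B", "C"}
print(percent_in_groups(counts, members, top_n=2))  # (100.0, 50.0)
```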

EXAMPLE 10

Soluble biotinylated TCRα/β chains for yeast screen

[00149] Soluble TCRα/β chains used for yeast selections were made as described previously [33]. Briefly, synthetic gene blocks (gBlocks®) of N-terminal truncated TCRα or TCRβ chain V and modified C gene fragments were assembled into the baculoviral pAcGP67a construct (BD Biosciences) with Gibson assembly (New England BioLabs). The final baculoviral plasmid was co-transfected into SF9 cells (ATCC) with Bestbac 2.0 (Expression systems) using FuGENE® 6 (Promega) to make the crude viral supernatant (P0). Subsequently, viruses were passaged at a dilution of 1:500 in 30-50 mL cultures at a density of 1 x 10^6 cells/mL to generate higher titer viruses (P1). To generate the soluble TCRα/β chains, up to 4 liters of High Five (Hi5, Thermo Fisher Scientific) cells were infected with P1 baculovirus at a dilution of 1:500-1:1000 at a density of 2 x 10^6 cells/mL for a week before protein purification. Recombinant TCRα/β chains were bound with Ni-NTA resin (QIAGEN) in the Hi5 cell media for 3 hours at room temperature, washed with 20 mM imidazole in 1X HBS at pH 7.2, and eluted in 200 mM imidazole in 1X HBS at pH 7.2. After buffer exchange to 1X HBS at pH 7.2 with a 30 kDa filter (Millipore), purified proteins were biotinylated overnight with birA ligase in the presence of 100 mM biotin, 40 mM Bicine at pH 8.3, 10 mM ATP, and 10 mM magnesium acetate at 4°C. Biotinylated proteins were purified by size-exclusion chromatography using an AKTAPurifier Superdex 200 column (GE Healthcare) and validated on an SDS-PAGE gel to confirm the stoichiometry and biotinylation with excess streptavidin.

EXAMPLE 11

Lentiviral TCR transduction

[00150] TCRα chain, P2A linker, and TCRβ chain fusion gene fragments were purchased from IDT and cloned into the MCS of the EF1a-MCS-GFP-PGK-puro lentiviral vector [23]. HEK-293T cells were plated on a 10-cm dish at a density of 7.5 x 10^6 cells in 10 mL of DMEM the day prior to transfection. 293T cells were co-transfected with 3.3 µg of the lentiviral plasmid, 2.5 µg of the gag-pol plasmid, and 0.83 µg of the VSV-G envelope plasmid pre-mixed with 33 µL of PEI in 120 µL of Opti-MEM (ThermoFisher Scientific). After 24 hours, the medium was replenished and viral supernatant was collected 24 and 48 hours later. TCR-deficient Jurkat cells (below) were transduced with viral supernatant, TCR expression was assessed by flow cytometry, and TCR-expressing cells were sorted based on the expression of GFP, CD3, and the transduced TCRα/β chains.

EXAMPLE 12

Retroviral TCR transduction

[00151] For retroviral-mediated expression of TCR27 and TCR28 in primary T cells, TCRα chain, P2A linker, and TCRβ chain were PCR amplified from the lentiviral vector (described above) and cloned into the MCS of an MSGV1-based retroviral vector (gift from Steve Rosenberg laboratory) using In-Fusion Cloning (Takara). For retroviral-mediated expression of TCR27 and TCR28 in primary T cells, TCRα chain, P2A linker, and TCRβ chain fusion gene fragments were purchased from IDT and cloned into the MCS of an MSGV1-based retroviral vector.

EXAMPLE 13

Cell cultures

[00152] The Jurkat 76 T-cell line deficient for both TCRα and TCRβ was provided by Dr. Shao-An Xue (Department of Immunology, University College London). Jurkat cells and primary T cells were grown in complete RPMI (ThermoFisher) containing 10% FBS, 25 mM HEPES, 290 µg/mL L-glutamine, 100 U/mL penicillin, 100 U/mL streptomycin, 1 mM sodium pyruvate, and 1x non-essential amino acids. T2 cells were grown in IMDM (Fisher Scientific) with 20% FBS, 290 µg/mL L-glutamine, 100 U/mL penicillin, and 100 U/mL streptomycin.

EXAMPLE 14

In vitro stimulation of the Jurkat T cell clones

[00153] Jurkat 76 cells expressing the exogenous TCR of interest were sorted and co-cultured with T2 cells in complete RPMI as detailed above. Peptides were dissolved in DMSO at 20 mM stock concentration and diluted to a final concentration of 2 mM. After 18 hours of stimulation, cells were washed and stained with anti-CD3 (OKT3, Biolegend), anti-CD69 (FN50, Biolegend), and anti-TCRα/β (IP26, Biolegend) antibodies. Cells were acquired on a FACS Fortessa (BD Biosciences) with an automated high-throughput sampler, and data were analyzed using FlowJo software (Treestar).

EXAMPLE 15

Whole-exome sequencing

[00154] Whole-exome sequencing of tumor DNA and matched germline leukocyte DNA was performed by inputting 75 ng of sheared genomic DNA for library preparation with the KAPA HyperPrep Kit (Roche) with modifications to the manufacturer's instructions, as described previously [57]. Library-prepared samples were captured with the SeqCap EZ MedExome Kit (NimbleGen) according to the manufacturer's instructions. Sequencing data were demultiplexed and mapped to hg19 using a custom bioinformatics pipeline, as described previously [58]. VarScan 2 [59], Mutect [60], and Strelka [61] were used to call variants using default parameters. Variants called by at least two of the approaches were then filtered by requiring: 1) variant allele frequency of at least 2.5%, 2) at least 30X depth in both tumor and germline samples, 3) zero germline reads, and 4) a population allele frequency of less than 0.1% in the Genome Aggregation database [62].
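
The variant filtering criteria listed above can be summarized, by way of non-limiting example, in the following Python sketch. The `Variant` record and its field names are hypothetical and do not correspond to the output format of any of the variant callers; “zero germline reads” is interpreted here as zero variant-supporting reads in the germline sample, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    callers: frozenset        # names of callers that reported the variant
    tumor_vaf: float          # variant allele frequency in tumor (fraction)
    tumor_depth: int          # read depth in the tumor sample
    germline_depth: int       # read depth in the germline sample
    germline_alt_reads: int   # variant-supporting germline reads (assumed meaning)
    population_af: float      # population allele frequency (fraction)


def passes_filters(v):
    """Apply the consensus and threshold criteria from the text."""
    return (
        len(v.callers) >= 2             # called by at least two approaches
        and v.tumor_vaf >= 0.025        # 1) VAF of at least 2.5%
        and v.tumor_depth >= 30         # 2) at least 30X depth in tumor...
        and v.germline_depth >= 30      #    ...and in germline
        and v.germline_alt_reads == 0   # 3) zero germline reads
        and v.population_af < 0.001     # 4) population AF below 0.1%
    )


kept = Variant(frozenset({"VarScan2", "Mutect"}), 0.10, 80, 60, 0, 0.0)
dropped = Variant(frozenset({"Strelka"}), 0.10, 80, 60, 0, 0.0)
print(passes_filters(kept), passes_filters(dropped))  # True False
```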

EXAMPLE 16 Statistical analysis

[00155] Unless stated otherwise, all statistical analyses performed in finding high-confidence specificity groups with GLIPH2 were Fisher’s exact tests using the contingency tables with the CDR3 query set (specificity group) and the reference set [24]. Poisson test was used to determine the representation bias in comparisons of distinct CDR3 sequences or specificity groups between tumors and uninvolved lungs. Student’s t test was used to assess the results from all in vitro assays. Statistical significance was defined as p value < 0.05.

EXAMPLE 17

Establishing specificity groups from tumor-infiltrating T cells in human lung cancer

[00156] This Example describes the results of experiments performed to identify T cells recognizing shared tumor antigens in lung cancer.

[00157] In order to identify T cells recognizing shared tumor antigens in lung cancer, specificity groups using GLIPH2 were established. As described previously, TCR clonotypes with a high probability of sharing specificities are grouped based on short amino acid sequence motifs embedded within the variable CDR3 regions of the TCR [23]. The improved GLIPH2 offers the advantage of analyzing large T cell repertoire datasets and identifying specificity groups carrying local or global sequence motifs with a much greater capacity [24]. GLIPH2 algorithms were applied to a recently published T cell repertoire dataset of 778,938 distinct CDR3 sequences from 178 HLA-typed, non-small cell lung cancer (NSCLC) patients with surgically resectable disease [25] (see, e.g., Table 1 below). Of note, the T cell clonotypes from bulk sequencing were derived from both the surgically removed tumor as well as the uninvolved lung (see, e.g., Table 1).

TABLE 1: Summary of data available for the 178 HLA-typed NSCLC patients from the MD Anderson Cancer Center

[00158] With this dataset, 435 specificity groups enriched in tumors from NSCLC patients were established after applying a set of criteria including Vβ gene enrichment and clonal expansion, the latter indicative of T cell antigen recognition (FIGS. 1A and 3A). To identify specificity groups related to antigens shared across these patients, a specificity group was further configured as including at least 3 distinct CDR3 sequences from a minimum of 3 patients. The fraction of clonotypes that are members of the specificity groups defined above was established. It was found that significantly higher percentages of the most expanded TCR clonotypes in tumor belonged to the tumor-enriched specificity groups (FIG. 1B). In contrast, TCR clonotypes from patients’ uninvolved lungs showed much lower percentages belonging to the tumor-enriched specificity groups (see, e.g., FIG. 3B). It was next established that the 435 tumor-enriched specificity groups are relevant to lung cancer, and not merely to normal lung tissue or other types of lung disease. In a validation cohort of 7,363,492 clonotypes from 202 tumor samples representing 68 NSCLC patients [26], a significantly higher percentage of top expanded TCR clonotypes in tumor belonged to the 435 tumor-enriched specificity groups compared to the non-expanded counterparts (FIG. 1B). In contrast, a lower percentage of TCR clonotypes belonging to the tumor-enriched specificity groups in lung tissue from healthy donors and patients with COPD (without a cancer diagnosis) [25] was observed, regardless of clonal expansion (FIG. 1B). In summary, the experimental data described herein have identified a set of specificity groups predicted to recognize shared tumor antigens across NSCLC patients.

EXAMPLE 18

In silico validation of TCR specificity groups using HLA tetramer sequences

[00159] In order to validate the specificity groups established by GLIPH2, publicly available CDR3 sequences from various HLA tetramer databases were included in combination with the MD Anderson CDR3 sequences for a joint GLIPH2 analysis [23, 27, 28]. The CDR3 sequences available from the tetramer datasets primarily cover viral specificities and have been experimentally shown to bind epitopes in the context of their respective HLAs. This allows annotation of some of the specificity groups with sequences from the tetramer databases linked to experimentally established antigen specificities and HLA restrictions. The joint analysis led to the annotation of 396 specificity groups (FIG. 1C). Of these specificity groups, 71 were clonally expanded and annotated with 10 different tetramers (FIG. 3C). As anticipated, it was found that clonotypes with inferred specificities to Flu, Epstein-Barr virus (EBV), or CMV antigens were not preferentially localized in the tumor compared to uninvolved lung (FIGS. 4A-4B). In addition, network analysis organized these tetramer-annotated specificity groups sharing some identical CDR3 sequence members into communities (FIGS. 1C and 3D-3G). Specificity groups belonging to a given community were consistently annotated with identical HLA tetramers (FIGS. 1C, 3D, and 3E), indicating that some antigen specificity groups, albeit sharing distinct sequence motifs, are likely related to the same specificity and HLA restriction.

EXAMPLE 19

Experimental validation of GLIPH2-inferred specificities

[00160] In order to experimentally validate peptide-MHC specificities, sequences from TCRα/β pairs are required from single T cells. Therefore, single-cell TCR sequencing (TCR-seq) from 15 early-stage NSCLC patients treated at Stanford was performed. Tumor-infiltrating T cells were prepared from surgically resected specimens and sorted by FACS before sequencing (see, e.g., FIG. 5). A total of 4,704 paired CDR3α and CDR3β sequences were sequenced and combined with the CDR3 sequences from MD Anderson for further analysis.

EXAMPLE 20

Expansion of EBV-specific T cell clones in patients responding to immune checkpoint blockade

[00161] To determine if pathogen-specific T cells can impact clinical responses to anti-PD1 checkpoint immunotherapy, the TCR repertoire of two NSCLC patients who experienced a clinical response to treatment (see, e.g., FIG. 6A) was analyzed. Paired CDR3α/β repertoires on both pre- and post-treatment blood samples were sequenced and used to identify 102 CDR3 clonotypes that expanded in post-treatment samples (see, e.g., FIG. 2A). Of these expanded clones, 41 belonged to 99 specificity groups identified in tumor-infiltrating T cell CDR3 repertoires (total n=66,094, see, e.g., FIG. 3A). Tetramer CDR3 sequences were then used to annotate these specificity groups, and 11 (total n=99) were found to contain 3 expanded CDR3 clones inferred to recognize EBV and Flu antigens (FIG. 2B). To validate the specificity inferences, two Jurkat cell clones expressing the TCRα/β chains inferred to recognize the EBV antigens and a T2 cell line expressing wildtype B*35 (FIGS. 2B and 6B) were created. Indeed, it was observed that upon co-culture with the T2-B*35 cells, both Jurkat-TCR27 and -TCR28 cells responded to the predicted EBV peptides (see, e.g., FIG. 2C). Of note, these EBV-specific specificity groups were not only expanded post-treatment, but also showed a bias in tumor compared to the adjacent lung, suggesting potential cross-reactivities to unknown TAAs (see, e.g., FIG. 6C). Previously, common pathogen-specific T cells found in tumors have been presumed to be “bystanders” and not specific for TAAs. The experimental data described herein showed that T cell specificities for TAAs and pathogen-derived antigens were not mutually exclusive. Furthermore, these pathogen-specific T cells in tumors exhibited an effector phenotype rather than an exhausted or stressed state and lacked CD39 expression. These data suggested that cross-reactive T cells might play a role in controlling cancer progression in the setting of anti-PD1 checkpoint blockade.

[00162] While particular alternatives of the present disclosure have been disclosed, it is to be understood that various modifications and combinations are possible and are contemplated within the true spirit and scope of the appended claims. There is no intention, therefore, of limitations to the exact abstract and disclosure herein presented.

INFORMAL SEQUENCE LISTING

REFERENCES

1. Kawakami, Y., et al., Cloning of the gene coding for a shared human melanoma antigen recognized by autologous T cells infiltrating into tumor. Proc Natl Acad Sci U S A, 1994. 91(9): p. 3515-9.

2. Coulie, P.G., et al., A new gene coding for a differentiation antigen recognized by autologous cytolytic T lymphocytes on HLA-A2 melanomas. J Exp Med, 1994. 180(1): p. 35-42.

3. van der Bruggen, P., et al., A gene encoding an antigen recognized by cytolytic T lymphocytes on a human melanoma. Science, 1991. 254(5038): p. 1643-7.

4. Coulie, P.G., et al., A mutated intron sequence codes for an antigenic peptide recognized by cytolytic T lymphocytes on a human melanoma. Proc Natl Acad Sci U S A, 1995. 92(17): p. 7976-80.

5. Wolfel, T., et al., A p16INK4a-insensitive CDK4 mutant targeted by cytolytic T lymphocytes in a human melanoma. Science, 1995. 269(5228): p. 1281-4.

6. Murray, R.J., et al., Identification of target antigens for the human cytotoxic T cell response to Epstein-Barr virus (EBV): implications for the immune control of EBV-positive malignancies. J Exp Med, 1992. 176(1): p. 157-68.

7. Koziel, M.J., et al., HLA class I-restricted cytotoxic T lymphocytes specific for hepatitis C virus. Identification of multiple epitopes and characterization of patterns of cytokine release. J Clin Invest, 1995. 96(5): p. 2311-21.

8. Rehermann, B., et al., The cytotoxic T lymphocyte response to multiple hepatitis B virus polymerase epitopes during and after acute viral hepatitis. J Exp Med, 1995. 181(3): p. 1047-58.

9. Tran, E., et al., Immunogenicity of somatic mutations in human gastrointestinal cancers. Science, 2015. 350(6266): p. 1387-90.

10. Gros, A., et al., Prospective identification of neoantigen-specific lymphocytes in the peripheral blood of melanoma patients. Nat Med, 2016. 22(4): p. 433-8.

11. Zacharakis, N., et al., Immune recognition of somatic mutations leading to complete durable regression in metastatic breast cancer. Nat Med, 2018. 24(6): p. 724-730.

12. Schumacher, T.N., W. Scheper, and P. Kvistborg, Cancer Neoantigens. Annu Rev Immunol, 2019. 37: p. 173-200.

13. Rosenberg, S.A. and M.E. Dudley, Adoptive cell therapy for the treatment of patients with metastatic melanoma. Curr Opin Immunol, 2009. 21(2): p. 233-40.

14. Hinrichs, C.S. and S.A. Rosenberg, Exploiting the curative potential of adoptive T-cell therapy for cancer. Immunol Rev, 2014. 257(1): p. 56-71.

15. de Vos van Steenwijk, P.J., et al., An unexpectedly large polyclonal repertoire of HPV-specific T cells is poised for action in patients with cervical cancer. Cancer Res, 2010. 70(7): p. 2707-17.

16. Piersma, S.J., et al., Human papillomavirus specific T cells infiltrating cervical cancer and draining lymph nodes show remarkably frequent use of HLA-DQ and -DP as a restriction element. Int J Cancer, 2008. 122(3): p. 486-94.

17. Evans, E.M., et al., Infiltration of cervical cancer tissue with human papillomavirus-specific cytotoxic T-lymphocytes. Cancer Res, 1997. 57(14): p. 2943-50.

18. Triozzi, P.L. and A.P. Fernandez, The role of the immune response in merkel cell carcinoma. Cancers (Basel), 2013. 5(1): p. 234-54.

19. Simoni, Y., et al., Bystander CD8(+) T cells are abundant and phenotypically distinct in human tumour infiltrates. Nature, 2018. 557(7706): p. 575-579.

20. Rosato, P.C., et al., Virus-specific memory T cells populate tumors and can be repurposed for tumor immunotherapy. Nat Commun, 2019. 10(1): p. 567.

21. Scheper, W., et al., Low and variable tumor reactivity of the intratumoral TCR repertoire in human cancers. Nat Med, 2019. 25(1): p. 89-94.

22. Yu, W., et al., Clonal Deletion Prunes but Does Not Eliminate Self-Specific alphabeta CD8(+) T Lymphocytes. Immunity, 2015. 42(5): p. 929-41.

23. Glanville, J., et al., Identifying specificity groups in the T cell receptor repertoire. Nature, 2017. 547(7661): p. 94-98.

24. Huang, H., et al., Analyzing the CD4+ T cell response repertoire to M. tuberculosis using GLIPH2 and whole-genome antigen screening. Nat Biotechnol, 2020.

25. Reuben, A., et al., Comprehensive T cell repertoire characterization of non-small cell lung cancer. Nat Commun, 2020. 11(1): p. 603.

26. Joshi, K., et al., Spatial heterogeneity of the T cell receptor repertoire reflects the mutational landscape in lung cancer. Nat Med, 2019. 25(10): p. 1549-1559.

27. Shugay, M., et al., VDJdb: a curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res, 2018. 46(D1): p. D419-D427.

28. Song, I., et al., Broad TCR repertoire and diverse structural solutions for recognition of an immunodominant CD8(+) T cell epitope. Nat Struct Mol Biol, 2017. 24(4): p. 395-406.

29. Sidney, J., et al., HLA class I supertypes: a revised and updated classification. BMC Immunol, 2008. 9: p. 1.

30. Harjanto, S., L.F. Ng, and J.C. Tong, Clustering HLA class I superfamilies using structural interaction patterns. PLoS One, 2014. 9(1): p. e86655.

31. Robins, H.S., et al., Overlap and effective size of the human CD8+ T cell receptor repertoire. Sci Transl Med, 2010. 2(47): p. 47ra64.

32. Arstila, T.P., et al., A direct estimate of the human alphabeta T cell receptor diversity. Science, 1999. 286(5441): p. 958-61.

33. Gee, M.H., et al., Antigen Identification for Orphan T Cell Receptors Expressed on Tumor-Infiltrating Lymphocytes. Cell, 2018. 172(3): p. 549-563 e16.

34. Kheir, F., et al., Detection of Epstein-Barr Virus Infection in Non-Small Cell Lung Cancer. Cancers (Basel), 2019. 11(6).

35. Han, A., et al., Linking T-cell receptor sequence to functional phenotype at the single-cell level. Nat Biotechnol, 2014. 32(7): p. 684-92.

36. Stubbington, M.J.T., et al., T cell fate and clonality inference from single-cell transcriptomes. Nat Methods, 2016. 13(4): p. 329-332.

37. Guo, X., et al., Global characterization of T cells in non-small-cell lung cancer by single cell sequencing. Nat Med, 2018. 24(7): p. 978-985.

38. Joglekar, A.V., et al., T cell antigen discovery via signaling and antigen-presenting bifunctional receptors. Nat Methods, 2019. 16(2): p. 191-198.

39. Li, G., et al., T cell antigen discovery via trogocytosis. Nat Methods, 2019. 16(2): p. 183-190.

40. Kula, T., et al., T-Scan: A Genome-wide Method for the Systematic Discovery of T Cell Epitopes. Cell, 2019. 178(4): p. 1016-1028 e13.

41. Sewell, A.K., Why must T cells be cross-reactive? Nat Rev Immunol, 2012. 12(9): p. 669-77.

42. McCarthy, E.F., The toxins of William B. Coley and the treatment of bone and soft-tissue sarcomas. Iowa Orthop J, 2006. 26: p. 154-8.

43. Morales, A., D. Eidinger, and A.W. Bruce, Intracavitary Bacillus Calmette-Guerin in the treatment of superficial bladder tumors. J Urol, 1976. 116(2): p. 180-3.

44. Gopalakrishnan, V., et al., Gut microbiome modulates response to anti-PD-1 immunotherapy in melanoma patients. Science, 2018. 359(6371): p. 97-103.

45. Routy, B., et al., Gut microbiome influences efficacy of PD-1-based immunotherapy against epithelial tumors. Science, 2018. 359(6371): p. 91-97.

46. Vetizou, M., et al., Anticancer immunotherapy by CTLA-4 blockade relies on the gut microbiota. Science, 2015. 350(6264): p. 1079-84.

47. Sivan, A., et al., Commensal Bifidobacterium promotes antitumor immunity and facilitates anti-PD-L1 efficacy. Science, 2015. 350(6264): p. 1084-9.

48. Matson, V., et al., The commensal microbiome is associated with anti-PD-1 efficacy in metastatic melanoma patients. Science, 2018. 359(6371): p. 104-108.

49. Riquelme, E., et al., Tumor Microbiome Diversity and Composition Influence Pancreatic Cancer Outcomes. Cell, 2019. 178(4): p. 795-806 e12.

50. Emerson, R.O., et al., Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat Genet, 2017. 49(5): p. 659-665.

51. Abazeed, M.E., et al., Integrative radiogenomic profiling of squamous cell lung cancer. Cancer Res, 2013. 73(20): p. 6289-98.

52. Barbie, D.A., et al., Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature, 2009. 462(7269): p. 108-12.

53. Altman, J.D. and M.M. Davis, MHC-peptide tetramers to visualize antigen-specific T cells. Curr Protoc Immunol, 2003. Chapter 17: p. Unit 17 3.

54. Picelli, S., et al., Full-length RNA-seq from single cells using Smart-seq2. Nat Protoc, 2014. 9(1): p. 171-81.

55. Rueden, C.T., et al., ImageJ2: ImageJ for the next generation of scientific image data. BMC Bioinformatics, 2017. 18(1): p. 529.

56. Schindelin, J., et al., Fiji: an open-source platform for biological-image analysis. Nat Methods, 2012. 9(7): p. 676-82.

57. Hellmann, M.D., et al., Circulating tumor DNA analysis to assess risk of progression after long-term response to PD-(L)1 blockade in NSCLC. Clin Cancer Res, 2020.

58. Newman, A.M., et al., An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat Med, 2014. 20(5): p. 548-54.

59. Koboldt, D.C., et al., VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res, 2012. 22(3): p. 568-76.

60. Cibulskis, K., et al., Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol, 2013. 31(3): p. 213-9.

61. Saunders, C.T., et al., Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics, 2012. 28(14): p. 1811-7.

62. Lek, M., et al., Analysis of protein-coding genetic variation in 60,706 humans. Nature, 2016. 536(7616): p. 285-91.