Title:
METHODS OF TREATING CELLS CONTAINING FUSION GENES BY GENOMIC TARGETING
Document Type and Number:
WIPO Patent Application WO/2018/112098
Kind Code:
A1
Abstract:
The present invention relates to methods for treating patients having cancer or a premalignant or neoplastic condition. It is based, at least in part, on the discovery that a genome editing technique that specifically targets a fusion gene can induce cell death in a cancer cell other than a prostate cancer cell, e.g., a hepatocellular cancer cell, having the fusion gene. The present invention provides methods for treating cancer patients that include performing a genome editing technique targeting a fusion gene present within one or more cells of a subject to produce an anti-cancer effect.

Inventors:
LUO, Jianhua (431 Mckean Dr, Wexford, PA, 15090, US)
CHEN, Zhanghui (Xinghai Ming-cheng 1 #902, Huzhou, Zhejiang, Zhejiang, CN)
YU, Yanping (431 Mckean Drive, Wexford, PA, 15090, US)
MICHALOPOULOS, George (172 Lancaster Avenue, Pittsburgh, PA, 15228, US)
NELSON, Joel, B. (148 Shady Lane, Pittsburgh, PA, 15215, US)
TSENG, Chien-Cheng (8181 Steamside Dr, Pittsburgh, PA, 15237, US)
Application Number:
US2017/066207
Publication Date:
June 21, 2018
Filing Date:
December 13, 2017
Assignee:
UNIVERSITY OF PITTSBURGH - OF THE COMMONWEALTH SYSTEM OF HIGHER EDUCATION (1st Floor Gardner Steel Conference Center, 130 Thackeray Avenue, Pittsburgh, PA, 15260, US)
International Classes:
A61K48/00; A61P35/00; C12Q1/68
Domestic Patent References:
WO2016011428A12016-01-21
Foreign References:
US9464327B22016-10-11
US20110027808A12011-02-03
Attorney, Agent or Firm:
KOLE, Lisa, B. et al. (Baker Botts L.L.P, 30 Rockefeller Plaza, New York, NY, 10112-4498, US)
Claims:
WHAT IS CLAIMED IS:

1. A method of treating a subject, comprising determining that at least one fusion gene is present in a sample obtained from a subject and then performing a genome editing technique targeting the fusion gene within one or more cells of the subject to achieve an anti-neoplastic effect, wherein the subject does not have prostate cancer.

2. A method of treating a subject having cancer, comprising determining that at least one fusion gene is present in a sample obtained from a subject and then performing a genome editing technique targeting the fusion gene within one or more cells of the cancer of the subject to produce an anti-cancer effect, wherein the subject does not have prostate cancer.

3. The method of claim 1 or 2, wherein the fusion gene is selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof.

4. The method of claim 1, wherein the subject has a pre-malignant or neoplastic condition.

5. The method of claim 1, wherein the subject has cancer.

6. The method of claim 2 or 5, wherein the cancer is breast cancer, liver cancer, lung cancer, cervical cancer, endometrial cancer, pancreatic cancer, ovarian cancer, gastric cancer, thyroid cancer, glioblastoma multiforme, colorectal cancer, diffuse large B cell-lymphoma, sarcoma, acute and chronic lymphocytic leukemia, Hodgkin’s lymphoma, non-Hodgkin’s lymphoma, or esophageal adenocarcinoma.

7. The method of claim 1 or 2, wherein the fusion gene is detected by FISH analysis.

8. The method of claim 1 or 2, wherein the fusion gene is detected by reverse transcription polymerase chain reaction.

9. The method of claim 1 or 2, wherein the fusion gene is MAN2A1-FER.

10. The method of claim 1 or 2, wherein the fusion gene is TMEM135-CCDC67.

11. The method of claim 1 or 2, wherein the fusion gene is PTEN-NOLC1.

12. The method of claim 1 or 2, wherein the genome editing technique uses the CRISPR/Cas system.

13. The method of claim 12, wherein the CRISPR/Cas system cleaves a sequence within the fusion gene genomic sequence to insert a nucleic acid within the fusion gene to induce cell death.

14. The method of claim 12, wherein the CRISPR/Cas system uses a Cas endonuclease selected from the group consisting of Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 or Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, Cpf1, c2c1, c2c3, Cas9HiFi and a combination thereof.

15. The method of claim 14, wherein the Cas endonuclease is Cas9.

16. The method of claim 1 or 2, wherein the genome editing technique comprises: transducing one or more cells or cancer cells with (i) a vector comprising a nucleic acid encoding a Cas protein and two guide RNAs (gRNA) and (ii) a vector comprising a donor nucleic acid and one or more targeting sequences.

17. The method of claim 16, wherein one gRNA is complementary to a region within one gene of the fusion gene and the other gRNA is complementary to a region within the second gene of the fusion gene.

18. The method of claim 16, wherein the donor nucleic acid encodes HSV-1 thymidine kinase.

19. The method of claim 16 or 18, wherein the method further comprises administering to the subject a therapeutically effective amount of ganciclovir or valganciclovir.

20. The method of claim 16, wherein one gRNA is complementary to a region within the MAN2A1 gene of the MAN2A1-FER fusion gene and another gRNA can be complementary to a region within the FER gene.

21. The method of claim 16, wherein one gRNA is complementary to a region within the TMEM135 gene of the TMEM135-CCDC67 fusion gene and another gRNA can be complementary to a region within the CCDC67 gene.

22. A method of treating a subject comprising determining whether one or more cells of the subject contains a fusion gene selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof; and where if the one or more cells contain a fusion gene, then performing a genome editing technique targeting the fusion gene present within the one or more cells of the subject, wherein the subject does not have prostate cancer.

23. A method of treating a subject having a premalignant or neoplastic condition comprising determining whether one or more cells of the subject contains a fusion gene selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof; and where if the one or more cells contain a fusion gene, then performing a genome editing technique targeting the fusion gene present within the one or more cells of the subject, wherein the subject does not have prostate cancer.

24. The method of claim 21, wherein the one or more cells of the subject are cancer cells.

25. The method of claim 24, wherein the cancer cells are breast cancer, liver cancer, lung cancer, cervical cancer, endometrial cancer, pancreatic cancer, ovarian cancer, gastric cancer, thyroid cancer, glioblastoma multiforme, colorectal cancer, diffuse large B cell lymphoma, sarcoma, acute and chronic lymphocytic leukemia, Hodgkin’s lymphoma, non-Hodgkin’s lymphoma or esophageal adenocarcinoma cells.

26. The method of any one of claims 2-3, 5-21 and 24-25, wherein the cancer is not lung adenocarcinoma, glioblastoma multiforme or hepatocellular carcinoma.

27. A kit for performing a genome editing technique targeting a fusion gene present within a cell, where the cell is not a prostate cancer cell, wherein the kit comprises: (i) a vector comprising a nucleic acid encoding a Cas protein and one or more guide RNAs (gRNA) and (ii) a vector comprising a donor nucleic acid and one or more targeting sequences.

28. The kit of claim 27, wherein one gRNA is complementary to a region within one gene of the fusion gene and the other gRNA is complementary to a region within the second gene of the fusion gene.

29. The kit of claim 27, wherein the donor nucleic acid encodes HSV-1 thymidine kinase.

30. The kit of claim 27, 28 or 29, wherein the kit further comprises ganciclovir and/or valganciclovir.

31. The kit of claim 27, wherein the fusion gene is selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof.

32. The kit of claim 27 further comprising nucleic acid primers for PCR analysis of one or more fusion genes selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, PTEN-NOLC1, CCNH-C5orf30, TRMT11-GRIK2, SLC45A2-AMACR, KDM4B-AC011523.2, MAN2A1-FER, MTOR-TP53BP1, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1 or PCMTD1-SNTG1.

33. The kit of claim 27 further comprising nucleic acid probes for FISH analysis of one or more fusion genes selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, PTEN-NOLC1, CCNH-C5orf30, TRMT11-GRIK2, SLC45A2-AMACR, KDM4B-AC011523.2, MAN2A1-FER, MTOR-TP53BP1, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1 and PCMTD1-SNTG1.

34. The kit of claim 27, wherein the cell is a cell from a breast cancer, liver cancer, lung cancer, cervical cancer, endometrial cancer, pancreatic cancer, ovarian cancer, gastric cancer, thyroid cancer, glioblastoma multiforme, colorectal cancer, diffuse large B cell lymphoma, sarcoma, acute and chronic lymphocytic leukemia, Hodgkin’s lymphoma, non-Hodgkin’s lymphoma or esophageal adenocarcinoma cell.

35. The kit of claim 27, wherein the cell is a neoplastic and/or pre-malignant cell.

36. The kit of claim 27, wherein the Cas protein is Cas9 or Cas9D10A.

37. The kit of claim 27, wherein the cell is not a cell from a lung adenocarcinoma, a glioblastoma multiforme or a hepatocellular carcinoma.

38. An agent capable of targeted genome editing for use in a method to treat or prevent cancer in a subject, the method comprising (i) determining whether a sample of the subject contains a fusion gene; and (ii) where the sample contains one or more fusion genes then performing a genome editing procedure using the agent to target the fusion gene in one or more cancer cells within the subject, where the cancer is not prostate cancer.

39. An agent capable of targeted genome editing for use in a method to treat a subject, the method comprising (i) determining whether a sample of the subject contains a fusion gene; and (ii) where the sample contains one or more fusion genes then performing a genome editing procedure using the agent to target the fusion gene in one or more cells within the subject, where the subject does not have prostate cancer.

40. An agent capable of targeted genome editing for use in a method to treat a subject that has a premalignant or neoplastic condition, the method comprising (i) determining whether a sample of the subject contains a fusion gene; and (ii) where the sample contains one or more fusion genes then performing a genome editing procedure using the agent to target the fusion gene in one or more cells within the subject, where the subject does not have prostate cancer.

41. The agent of claim 38, 39 or 40, wherein the agent is a Cas9 protein.

42. The agent of claim 41, wherein the Cas9 protein is Cas9D10A.

43. The agent of claim 38, wherein the cancer is breast cancer, liver cancer, lung cancer, cervical cancer, endometrial cancer, pancreatic cancer, ovarian cancer, gastric cancer, thyroid cancer, glioblastoma multiforme, colorectal cancer, diffuse large B cell lymphoma, sarcoma, acute and chronic lymphocytic leukemia, Hodgkin’s lymphoma, non-Hodgkin’s lymphoma or esophageal adenocarcinoma.

44. The agent of claim 38, wherein the cancer is not lung adenocarcinoma, glioblastoma multiforme or hepatocellular carcinoma.

45. A composition for use in treating a subject that has one or more cells that contains one or more fusion genes selected from the group consisting of TMEM135-CCDC67, TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof, comprising one or more agents for use in a genome editing technique targeting the one or more fusion genes present within the one or more cells, wherein the subject does not have prostate cancer.

46. The composition of claim 45, wherein the one or more agents for use in the genome editing technique comprises one or more guide RNAs and an endonuclease.

47. The composition of claim 46, wherein the endonuclease is Cas9.

48. The composition of any one of claims 45-47, wherein the one or more guide RNAs comprises one gRNA complementary to a region within one gene of the fusion gene and a second gRNA complementary to a region within the second gene of the fusion gene.

49. The composition of any one of claims 45-47, wherein the one or more guide RNAs comprises one gRNA complementary to a specific sequence within a chromosomal breakpoint of the one or more fusion genes.

50. A method of determining a treatment for a subject having one or more cells that contains one or more fusion genes, comprising

i) providing a sample from the subject;

ii) determining whether one or more cells of the subject contains one or more fusion genes selected from the group consisting of TMEM135-CCDC67, TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof; and

iii) instructing a genome editing technique to be performed if one or more fusion genes are detected in the one or more cells, wherein the genome editing technique targets the one or more of the fusion genes detected in the one or more cells, and wherein the subject does not have prostate cancer.

51. The method of claim 50, wherein the genome editing technique is performed using the CRISPR/Cas9 system.

Description:
METHODS OF TREATING CELLS CONTAINING FUSION GENES BY GENOMIC TARGETING

PRIORITY INFORMATION

This application claims priority to U.S. Provisional Patent Application Serial No. 62/433,608, filed December 13, 2016, and U.S. Provisional Patent Application Serial No. 62/572,960, filed October 16, 2017, the contents of both of which are herein incorporated by reference in their entireties.

GRANT INFORMATION

This invention was made with government support under Grant Nos. CA098249 and CA190766 awarded by the National Institutes of Health and Grant Nos. W81XWH-16-1-0541 and W81XWH-16-1-0364 awarded by the U.S. Army Medical Research & Materiel Command. The government has certain rights in the invention.

1. INTRODUCTION

The present invention relates to methods of treating patients carrying one or more specific fusion genes by performing a genome targeting technique.

2. BACKGROUND OF THE INVENTION

Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems were originally discovered to act as immune defense mechanisms against foreign pathogens in prokaryotic cells (Mojica et al. (2005) J. of Molecular Evolution 60:174-182). Cas9, a protein of the type II CRISPR/Cas system, was found to exhibit DNA cleavage activity. The nuclease activity of Cas9 can be guided by a CRISPR RNA and a trans-activating CRISPR RNA complementary to a targeted sequence of DNA in the genome (Jinek et al. (2012) Science 337:816-821). Since trans-activating CRISPR RNA and CRISPR RNA can be made into a chimeric RNA containing the full function of both RNA species, artificial fusion RNA sequences, also called guide RNAs (gRNAs), were generated to target the activity of Cas9 to a target DNA sequence (Esvelt et al. (2014) eLife:e03401). A D10A mutation in the catalytic domain of Cas9 converts it to a nickase that produces single-strand breaks at the target DNA (Jinek et al. (2012) Science 337:816-821). Double nicking of target DNA can increase genome editing specificity by 50-1500 fold (Ran et al. (2013) Cell 154:1380-1389), with the off-target rate as low as 1/10,000. Such specificity can make somatic genomic targeting a viable approach in treating human diseases.

In the U.S., prostate cancer is one of the most frequent malignancies observed in men. Deaths from prostate cancer reached 27,540 in 2014, making it the second most lethal cancer for men (Siegel et al. (2015) A Cancer Journal For Clinicians 65:5-29). As disclosed in WO 2015/103057 and WO 2016/011428, a number of fusion genes generated by chromosomal rearrangement were identified in prostate cancers that have been shown to be recurrent and lethal. The expression of these fusion genes is widespread among aggressive prostate cancers but absent in normal tissues. WO 2016/011428 discloses the genomic targeting of the chromosomal breakpoint of the fusion gene TMEM135-CCDC67, which resulted in cell death and remission of xenografted prostate cancer in mice.

Cancers, in general, are among the leading causes of death in the U.S. Cancer deaths reached 595,690 in 2015 in the U.S. alone, making cancer the second leading cause of death after cardiovascular diseases (Siegel et al. (2016) A Cancer Journal For Clinicians 66(1):7-30). Treatment of cancers, particularly of those that become metastatic, remains problematic, and cures for cancer remain elusive. Therefore, there remains a need in the art for methods of treating cancer.

3. SUMMARY OF THE INVENTION

The present invention relates to methods for treating patients suffering from cancer or a pre-malignant or neoplastic condition. It is based, at least in part, on the discovery that a genome editing technique that specifically targets a fusion gene can induce cell death in a cancer cell, for example a cancer cell other than a prostate cancer cell, for example a hepatocellular cancer cell, having the fusion gene.

In various non-limiting embodiments, the present invention provides for methods of treating a subject that carries a fusion gene. For example, and not by way of limitation, the subject can have cancer, a pre-malignant condition or a neoplastic condition. In certain embodiments, a method of the present invention comprises performing a genome editing technique that targets a fusion gene present within one or more cancer cells of the subject. Non-limiting examples of such fusion genes include TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1 and PCMTD1-SNTG1. In certain embodiments, the fusion gene is PTEN-NOLC1. In certain embodiments, the fusion gene is MAN2A1-FER. In certain embodiments, the cancer is not prostate cancer. In certain embodiments, the cancer is not lung adenocarcinoma, glioblastoma multiforme or hepatocellular carcinoma.

In certain non-limiting embodiments, the present invention further provides kits for performing methods of treating a subject that carries a fusion gene. For example, and not by way of limitation, the subject can have a cancer, a pre-malignant condition or a neoplastic condition. In certain embodiments, a kit of the present invention can comprise one or more vectors or plasmids comprising a nucleic acid encoding a Cas9 protein, e.g., Cas9D10A. In certain embodiments, the one or more vectors can further comprise one or more gRNAs specific to a fusion gene, e.g., specific to a breakpoint of a fusion gene and/or sequences flanking the breakpoint of a fusion gene.

In certain embodiments, a kit of the present invention can further include one or more vectors or plasmids comprising a nucleic acid that, when expressed, results in cell death. In certain embodiments, the nucleic acid encodes HSV-1 thymidine kinase. In certain embodiments, this vector can further comprise one or more targeting sequences that are complementary to sequences within the fusion gene to promote homologous recombination and insertion of the nucleic acid. In certain embodiments, where the nucleic acid encodes HSV-1 thymidine kinase, the kit can further comprise ganciclovir and/or valganciclovir.

In certain embodiments, the kit can include nucleic acid primers for PCR analysis or nucleic acid probes for RNA in situ analysis to detect the presence of one or more fusion genes in a sample from the subject. In certain non-limiting embodiments, the one or more fusion genes can be selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, PTEN-NOLC1, CCNH-C5orf30, TRMT11-GRIK2, SLC45A2-AMACR, KDM4B-AC011523.2, MAN2A1-FER, MTOR-TP53BP1, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof.

4. BRIEF DESCRIPTION OF THE FIGURES

FIGURE 1. Unique fusion gene events. Left panel: Miniature diagrams of genome of the fusion genes, the transcription directions, the distances between the joining genes and directions of the fusions. Middle panel: Representative sequencing chromograms of fusion genes. The joining gene sequences were indicated (SEQ ID NOs: 45-52). Right panel: Diagrams of translation products of fusion genes. Blue-driver gene translation product; Red-passenger gene translation product; Orange-novel translation products due to frameshift or translation products from a non-gene region.

FIGURE 2. Genome breakpoint analysis of fusion genes. Top panel: Miniature diagrams of genome of the fusion genes, the transcription directions, the distances between the joining genes and directions of the chromosome joining. Middle panel: Miniature of fusion genome and transcription direction. Bottom: Representative sequencing chromograms encompassing the joining breakpoint of chromosomes (SEQ ID NOs: 53-55).

FIGURE 3A-B. PTEN-NOLC1 fusion gene. (A) PTEN-NOLC1 fusion transcript. Top panel: Miniature diagrams of genome of the PTEN and NOLC1 genes, the transcription direction, the distance between the joining genes and direction of the fusion. Middle panel: Representative sequencing chromogram of PTEN-NOLC1 transcript. The joining gene sequences were indicated. Lower panel: Diagram of translation product of fusion transcript. Blue-head gene translation product; Red-tail gene translation product. (B) Schematic diagram of PTEN and NOLC1 genome recombination and FISH probe positions.

FIGURE 4. Motif analysis of MAN2A1-FER. Diagram of functional domains of MAN2A1, FER and MAN2A1-FER fusion proteins and the chromosomal breakpoints observed for the MAN2A1-FER fusion gene in different cell lines. In the fusion gene MAN2A1-FER, the N-terminus of FER suffers a loss of SH2 and FHC domains. These domains were replaced with the glycoside hydrolase and α-mannosidase middle domain from MAN2A1.

FIGURE 5. Schematic diagram of genome editing targeting a fusion gene breakpoint in cancer cells positive for CCNH-C5orf30. Genome recombination in prostate cancer case 3T produced a breakpoint in chromosome 5 that connects intron 6 of CCNH with intron 1 of C5orf30. A guide RNA (gRNA) of 23 bp, including the protospacer adjacent motif (PAM) sequence, is designed specific for the breakpoint region. The DNA sequence corresponding to this target sequence is artificially ligated into a vector containing the remainder of the gRNA and Cas9. This sequence is recombined and packaged into recombinant virus (adenovirus or lentivirus). A promoterless Herpes Simplex Virus Type 1 (HSV-1) thymidine kinase is constructed into a shuttle vector for adenovirus along with a splice tag sequence from the intron/exon juncture of CCNH exon 7. A 500 bp sequence surrounding the CCNH-C5orf30 breakpoint from each side is also ligated into the shuttle vector in order to produce efficient homologous recombination to complete the donor DNA construction. The vector is recombined and packaged into AdEasy to generate recombinant viruses. These viruses can be administered to patients or animals that have cancers positive for the CCNH-C5orf30 fusion transcript. This leads to insertion of donor DNA into the target site (fusion breakpoint). Since HSV-1 TK in the recombinant virus is promoterless, no transcription will occur if the HSV-1 TK cDNA does not integrate into a transcriptionally active genome. However, transcription of HSV-1 TK is active if HSV-1 TK is integrated into the target site of CCNH-C5orf30 in the patient. When ganciclovir or its oral homologue valganciclovir is administered to the patient, it is readily converted to a triphosphate guanine analogue by HSV-1 TK and incorporated into the genomes of cancer cells. This leads to stoppage of DNA elongation in cells that are positive for CCNH-C5orf30.
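By way of non-limiting illustration, the selection of a 23 bp target (a 20 nt protospacer followed by a 3 nt PAM) that spans a fusion breakpoint can be sketched in Python as follows. The sketch assumes an S. pyogenes Cas9 PAM of the form NGG; the function names, the example sequence and the breakpoint coordinate are hypothetical and are not taken from the disclosure.

```python
def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def breakpoint_targets(seq, breakpoint, protospacer_len=20):
    """List candidate targets (protospacer + NGG PAM) that span a breakpoint.

    seq        -- genomic sequence across the fusion junction (head gene first)
    breakpoint -- 0-based index of the first base contributed by the tail gene
    Returns (start, strand, target) tuples; start is the window position on
    seq and target is written 5'->3' on the strand that carries the PAM.
    Assumption: NGG PAM (S. pyogenes Cas9); other nucleases need other rules.
    """
    seq = seq.upper()
    site_len = protospacer_len + 3          # 23 bp for a 20 nt protospacer
    hits = []
    for start in range(len(seq) - site_len + 1):
        end = start + site_len
        # Keep only windows with bases on both sides of the junction.
        if not (start < breakpoint < end):
            continue
        site = seq[start:end]
        if site[-2:] == "GG":               # PAM on the given strand
            hits.append((start, "+", site))
        if site[:2] == "CC":                # PAM on the opposite strand
            hits.append((start, "-", revcomp(site)))
    return hits


# Hypothetical junction: 30 nt of "head" sequence joined to a "tail" sequence.
example = "ACTAT" * 6 + "TTATTATTAT" + "AGG" + "TTATTATT"
candidates = breakpoint_targets(example, breakpoint=30)
```

Because every returned window contains sequence from both fusion partners, a guide built from it would not be expected to match either parental gene in full, which is the rationale for targeting the breakpoint itself.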

FIGURE 6. Schematic diagram of fusion genes. Left panel: Schematic diagram of genome of fusion partners. Genetic locus, distance between partners, transcription direction and fusion direction are indicated. Middle panel: Histogram of Sanger sequencing surrounding the fusion point of each fusion gene (SEQ ID NOs: 40-44). Right panel: Predicted protein products of fusion genes. Blue: Head gene protein; Yellow: frameshift translation; Red: tail.

FIGURE 7. Schematic diagram of ZMPSTE24-ZMYM4 fusion formation. Functional domains are indicated. The fusion formation between ZMPSTE24 and ZMYM4 produces a truncation of 159 amino acids from the C-terminus of ZMPSTE24 and 1315 amino acids from the N-terminus of ZMYM4. Motif analysis suggests that ZMPSTE24-ZMYM4 fusion will delete about 50% of the peptidase domain from ZMPSTE24 and remove all zinc fingers from ZMYM4, but leave DUF3504 (domain of unknown function) and apoptosis inhibitor domain intact.

FIGURE 8. Schematic diagram of CLTC-ETV1 fusion formation. Functional domains are indicated. CLTC-ETV1 fusion preserves a largely intact transcription domain in ETV1, and deletes 3 clathrin domains from CLTC. Truncation in the N-terminus of ETV1 eliminates all these regulatory elements from ETV1.

FIGURE 9. Schematic diagram of ACPP-SEC13 fusion formation. Functional domains are indicated. In ACPP-SEC13 fusion, only the N-terminus 72 amino acids of ACPP is preserved, and over 2/3 of the phosphatase domain is truncated, while SEC13 loses 196 amino acids from its N-terminus and has 3 WD-repeat domains deleted.

FIGURE 10. Schematic diagram of DOCK7-OLR1 fusion formation. Functional domains are indicated. DOCK7-OLR1 does not produce a chimeric protein. Separate translation of DOCK7 and OLR1 occurs from the fusion transcript. The fusion gene deletes a significant portion of the cytokinesis domain of DOCK7, and the fusion transcript produces an intact OLR1 protein.

FIGURE 11. Schematic diagram of PCMTD1‐SNTG1 fusion formation. Functional domains are indicated. PCMTD1‐SNTG1 fusion does not produce a chimeric protein. PCMTD1‐SNTG1 fusion produces a truncated PCMTD1, which removes half of the methyl‐transferase domain of PCMTD1, and SNTG1 remains intact.

FIGURE 12. Schematic diagram of SLC45A2-AMACR chimeric protein. Fusion between SLC45A2 and AMACR results in truncation of two-thirds of the major facilitator superfamily (MFS) domain in SLC45A2, but largely retains the CoA-transferase domain of AMACR. SLC45A2-AMACR produces a chimeric protein with the N-terminal 187 amino acids of SLC45A2 and the C-terminal 311 amino acids of AMACR. SLC45A2-AMACR replaces 5 transmembrane and cytosolic domains of SLC45A2 with an intact racemase domain from AMACR, while leaving the extracellular and the N-terminal transmembrane domains intact.

FIGURE 13A-C. Schema of strategy to introduce EGFP-tk into the breakpoint of TMEM135-CCDC67 fusion gene. (A) Diagram representation and Sanger sequencing of TMEM135-CCDC67 chromosome breakpoint. Direction of transcription is indicated by the arrows. (B and C) Schematic diagrams of the strategy to introduce EGFP-tk into the breakpoint of TMEM135-CCDC67. The locations of gRNA- and gRNA+ are indicated by boxes. These gRNAs were ligated with Cas9D10A into VQAd5-CMV shuttle vector and recombined into pAd5 virus. Separately, 584 bp of TMEM135 intron 13 sequence and 561 bp of CCDC67 intron 9 sequence were designed to sandwich a promoterless EGFP-tk cDNA, ligated into PAdlox shuttle vector and recombined into adenovirus. A splice acceptor and a splice donor from exon 14 of TMEM135 were inserted between TMEM135 intron 13 and EGFP-tk, and between EGFP-tk and CCDC67 intron 9, respectively, to allow proper EGFP-tk RNA splicing to occur. Cells containing TMEM135-CCDC67 chromosome breakpoint were infected with these recombinant viruses. The integrated EGFP-tk was transcribed by the fusion head gene promoter in these cells, spliced and translated into protein product of EGFP-tk, which in turn blocks DNA synthesis by converting ganciclovir to ganciclovir triphosphate.
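By way of non-limiting illustration, the ordering of the donor cassette described above (left homology arm, splice acceptor, promoterless payload, splice donor, right homology arm) can be sketched in Python as follows. Only the arm lengths of 584 bp and 561 bp come from the description; every sequence shown is a placeholder, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    sequence: str


def assemble_donor(left_arm, splice_acceptor, payload, splice_donor, right_arm):
    """Return the donor segments in 5'->3' order after basic validation."""
    parts = [
        ("left_homology_arm", left_arm),
        ("splice_acceptor", splice_acceptor),
        ("payload", payload),            # promoterless cDNA, e.g. EGFP-tk
        ("splice_donor", splice_donor),
        ("right_homology_arm", right_arm),
    ]
    segments = []
    for name, seq in parts:
        seq = seq.upper()
        if not seq or set(seq) - set("ACGT"):
            raise ValueError(f"{name}: expected a non-empty A/C/G/T sequence")
        segments.append(Segment(name, seq))
    return segments


def layout(segments):
    """Return (name, start, end) per segment, 0-based and half-open."""
    coords, pos = [], 0
    for seg in segments:
        coords.append((seg.name, pos, pos + len(seg.sequence)))
        pos += len(seg.sequence)
    return coords


# Placeholder sequences only; real arms would be the intron sequences.
donor = assemble_donor("A" * 584, "AG", "ATGCCCCCCCCC", "GT", "T" * 561)
for name, start, end in layout(donor):
    print(f"{name:20s} {start:5d} {end:5d}")
```

Keeping the payload without its own promoter, as the description states, means the sketch has no promoter segment at all: expression would depend on the head gene promoter upstream of the integration site.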

FIGURE 14. qPCR to quantify the relative copy number of TMEM135-CCDC67 breakpoint and pCMV vector sequence in the genome of transformed prostate cancer cells. One microgram of genomic DNA of PC3 BP (PC3 cells transformed with pCMV-TMEM135int13-CCDC67int9), or DU145 BP (DU145 cells transformed with pCMV-TMEM135int13-CCDC67int9), or PC3 CMV (PC3 cells transformed with pCMVscript) or DU145 CMV (DU145 cells transformed with pCMVscript) was quantified for β-actin or the TMEM135-CCDC67 breakpoint through qPCR using the primers listed in Table 2. The copy numbers of BP and β-actin were fitted with standard curves generated with serial titrations of known copy numbers of BP and β-actin, respectively. The BP/β-actin ratios were plotted.
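By way of non-limiting illustration, the standard-curve quantification and the BP/β-actin ratio can be sketched in Python as follows. The sketch assumes that the threshold cycle (Ct) is linear in log10 of the copy number, which is the usual form of a qPCR standard curve but is not stated in the description; all Ct values, copy numbers and function names are hypothetical.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def bp_to_actin_ratio(bp_ct, actin_ct, bp_curve, actin_curve):
    """Ratio of breakpoint copies to beta-actin copies for one sample."""
    return copies_from_ct(bp_ct, *bp_curve) / copies_from_ct(actin_ct, *actin_curve)


# Hypothetical serial titration: known copies and the Ct measured for each.
standards = [1e2, 1e3, 1e4, 1e5]
bp_curve = fit_standard_curve(standards, [30.0, 26.7, 23.4, 20.1])
actin_curve = fit_standard_curve(standards, [30.0, 26.7, 23.4, 20.1])
ratio = bp_to_actin_ratio(26.7, 23.4, bp_curve, actin_curve)
```

Fitting a separate curve for each amplicon, as the legend describes, lets the ratio absorb differences in amplification efficiency between the BP and β-actin primer pairs.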

FIGURE 15A-I. Genome therapy targeting the MAN2A1-FER breakpoint. (A) Design of gRNA and recombination donor adenoviruses for the MAN2A1-FER fusion gene. Upper panel: Sanger sequencing diagram of the MAN2A1-FER chromosome breakpoint of HUH7 cells; Middle panel: Design of gRNA for pAD5-Cas9D10A-gRNAMAN2A1int13-gRNAFERint14; Lower panel: Design of homologous DNA sequences and EGFP-tk for pAD-MAN2A1int13-EGFP-tk-FERint14. The splicing acceptor and donor sequences correspond to the juncture sequences of intron 13-exon 14 of MAN2A1 and exon 15-intron 15 of FER. (B) Expression of MAN2A1-FER in HUH7 cells. Lanes 1 and 2: immunoblots of protein extracts from HUH7 and HEP3B cells with antibodies specific for FER or GAPDH. MAN2A1-FER (MF) and FER protein are indicated. Lanes 3 and 4: RT-PCR of RNA from HUH7 and HEP3B cells with primers specific for MAN2A1-FER (MF) or β-actin. (C) In vitro cleavage assays were performed on BamH1 linearized pTAMAN2A1int13-FERint14 vector using recombinant S. pyogenes Cas9 and in vitro transcribed gRNA- or gRNA+ as indicated. The cleavage generated 2446 and 1944 bp fragments of pTAMAN2A1int13-FERint14 vector for gRNA-, and 2484 and 1906 bp for gRNA+. (D) Infection of HUH7 or HEP3B cells led to expression of EGFP-tk in HUH7 but not HEP3B cells. HUH7 and HEP3B cells were infected with pAD5-Cas9D10A-gRNAMAN2A1int13-gRNAFERint14 (Ad-MF) and pAD-MAN2A1int13-EGFP-tk-FERint14 (Ad-MF-EGFP-tk). Expression of Cas9D10A-RFP is indicated by red fluorescence, while expression of EGFP-tk is indicated by green. HUH7 cells infected with pAD5-Cas9D10A-gRNATMEM135int13-gRNACCDC67int9 (Ad-gTC) and pADTMEM135int13-EGFP-tk-CCDC67int9 (Ad-TC-EGFP-tk) were used as specificity control. (E) Quantification of EGFP-tk integration/expression by flow cytometry. (F) Killing of HUH7 cells with ganciclovir. HUH7 or HEP3B cells were infected with pAD5-Cas9D10A-gRNAMAN2A1int13-gRNAFERint14/pAD-MAN2A1int13-EGFP-tk-FERint14 (Ad-MF). These cells were then incubated with various concentrations of ganciclovir for 24 hours. Cell death was then quantified with phycoerythrin-labeled Annexin V by flow cytometry. HUH7 cells infected with pAD5-Cas9D10A-gRNATMEM135int13-gRNACCDC67int9/pAD-TMEM135int13-EGFP-tk-CCDC67int9 (Ad-TC) were used as specificity controls. (G) HUH7 and HEP3B cells were xenografted into the subcutaneous regions of SCID mice. These tumors were allowed to grow for 2 weeks before the treatment. These mice were treated with the indicated viruses plus ganciclovir (G, 80 mg/kg) or PBS (P). The indicated drugs were applied through peritoneal injections 3 times a week until all the mice from control treatments died off. The tumor volumes were measured weekly. (H) Mice treated with MAN2A1-FER breakpoint therapy are free of cancer metastasis. (I) Mice treated with MAN2A1-FER breakpoint therapy had no mortality.
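As a consistency check on the fragment sizes reported for panel (C), the short sketch below sums each fragment pair and compares the two cut positions. It assumes, as an illustration only, that both fragment pairs derive from the same BamH1-linearized vector and that the larger fragment of each pair is measured from the same vector end; the variable names are hypothetical.

```python
# Fragment sizes (bp) reported for the in vitro cleavage assay, panel (C).
cleavage_fragments = {
    "gRNA-": (2446, 1944),
    "gRNA+": (2484, 1906),
}

# Each pair should add up to the length of the same linearized vector.
vector_lengths = {name: sum(frags) for name, frags in cleavage_fragments.items()}

# Assumption: the first fragment of each pair starts at the same vector end,
# so the difference between them is the spacing of the two cut sites.
cut_site_offset = cleavage_fragments["gRNA+"][0] - cleavage_fragments["gRNA-"][0]

print(vector_lengths)
print(f"Offset between the two cut sites: {cut_site_offset} bp")
```

Under these assumptions, both pairs give the same total vector length, which is what one would expect if the two gRNAs cut the same template at nearby positions.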

FIGURE 16A-B. (A) Expression of Cas9D10A and HSV1-tk in HUH7 or HEP3B tumors treated with Ad-TC or pAD5-Cas9D10A-gRNAMAN2A1int13-gRNAFERint14/pAD-MAN2A1int13-EGFP-tk-FERint14 (Ad-MF). Green arrows indicate unstained mouse stromal cells. (B) Genome therapy induced apoptosis of xenografted cancers that contain fusion gene breakpoints. Terminal deoxynucleotidyl transferase (TdT) dUTP Nick-End Labeling (TUNEL) assays were performed on the PC3 BP, DU145 BP, PC3 CMV, DU145 CMV, or HUH7 xenografted cancers treated with either Ad-TC or Ad-MF.

FIGURE 17A-C. Pten‐NOLC1 fusion. (A) Schematic diagram of Pten‐NOLC1 fusion. Top panel: Miniature diagrams of the genome of the fusion gene, the transcription direction, the distance between the joining gene and direction of the fusions. Middle panel: Representative sequencing chromogram of fusion transcript. The joining gene sequences were indicated. Lower panel: Diagrams of translation products of Pten‐NOLC1 fusion transcript. Blue‐head gene translation product; Red‐tail gene translation product. (B) Fluorescence in situ hybridization indicates genome recombination in prostate cancer cells. Schematic diagram of Pten and NOLC1 genome recombination and FISH probe positions. Representative FISH images were shown for normal prostate epithelial cells and cancer cells positive for Pten‐NOLC1 fusion. Orange denotes probe 1; Green denotes probe 2. Fusion signals are indicated by green arrows. (C) Genome breakpoint analysis of Pten‐NOLC1 fusion. Top panel: Miniature diagrams of the genome of the fusion genes, the transcription directions, the distances between the joining genes and directions of the chromosome joining. Middle panel: Miniature of fusion genome and transcription direction. Bottom panel: Representative sequencing chromogram encompassing the joining breakpoint of chromosomes. Intron 11 of Pten (blue) and intron 1 (red) of NOLC1 are indicated.

FIGURE 18. Unique fusion gene events. Left panel: Miniature diagrams of genome of the fusion genes, the transcription directions, the distances between the joining genes and directions of the fusions. Middle panel: Representative sequencing chromograms of fusion transcripts. The joining gene sequences were indicated. Right panel: Diagrams of translation products of fusion transcripts. Blue-head gene translation product; Red-tail gene translation product; Orange-novel translation products due to frameshift or translation products from a non-gene region.

FIGURE 19A-E. Fluorescence in situ hybridization suggests genome recombination in prostate cancer cells. (A) Schematic diagram of DOCK7 and OLR1 genome recombination and FISH probe positions. Representative FISH images were shown for normal prostate epithelial cells and cancer cells positive for DOCK7-OLR1 fusion. Orange denotes probe 1; Green denotes probe 2. Fusion joining signals are indicated by green arrows. (B) Schematic diagram of SNTG1 and PCMTD1 genome recombination and FISH probe positions. Representative FISH images were shown for normal prostate epithelial cells and cancer cells positive for SNTG1-PCMTD1 fusion. Orange denotes probe 1; Green denotes probe 2. Fusion joining signals are indicated by green arrows. (C) Schematic diagram of ACPP and SEC13 genome recombination and FISH probe positions. Representative FISH images were shown for normal prostate epithelial cells and cancer cells positive for ACPP-SEC13 fusion. Orange denotes probe 1; Green denotes probe 2. Fusion joining signals are indicated by green arrows. (D) Schematic diagram of ZMPSTE24 and ZMYM4 genome recombination and FISH probe positions. Representative FISH images were shown for normal prostate epithelial cells and cancer cells positive for ZMPSTE24-ZMYM4 fusion. Orange denotes probe 1; Green denotes probe 2. Fusion joining signals are indicated by green arrows. (E) Schematic diagram of CLTC and ETV1 genome recombination and FISH probe positions. Representative FISH images were shown for normal prostate epithelial cells and cancer cells positive for CLTC-ETV1 fusion. Orange denotes probe 1; Green denotes probe 2. Fusion joining signals are indicated by green arrows.

FIGURE 20A-C. Pten‐NOLC1 in human cancers. (A) Taqman qRT‐PCR to detect Pten‐NOLC1 in human cancer cell lines and a healthy organ donor prostate sample. (B) The frequency of Pten‐NOLC1 in primary human cancers. (C) Expression of Pten‐NOLC1 protein. Upper panel: diagram of functional domains of Pten and NOLC1 protein as well as Pten‐NOLC1 fusion protein. Truncation sites are indicated by arrows. NLS denotes nuclear localization signal; SRP40 denotes homolog domain to C‐terminus of the S. cerevisiae SRP40 protein; snoRNA binding denotes binding site for small nucleolar RNA; T denotes serine‐rich sequence homologous to HSV1 transcription factor ICP4. Lower panel: Immunoblotting of Pten and Pten‐NOLC1 proteins from primary prostate cancer samples (PCa638T, PCa207T, PCa624T, PCa099T and PCa090T) or healthy organ donor prostate samples (DO12 and DO17), or the indicated cell lines (lanes 8‐12).

FIGURE 21A-B. Pten-NOLC1 in metastasis and multiple loci of human cancers. (A) Pten-NOLC1 in primary cancers (P) and its matched lymph node metastases (LN). Red-positive; Grey-negative; Blank-no sample. (B) Pten-NOLC1 is present in most prostate cancer loci. Red-positive; Grey-negative; Blank-no sample.

FIGURE 22A-C. Pten‐NOLC1 is translocated to the nucleus and lacks lipid phosphatase activity. (A) Immunofluorescence analyses of Pten, NOLC1 and Pten‐NOLC1 in NIH3T3 and PC3 cells. Top panel: Immunostaining of Pten (left), NOLC1 (middle) and Pten‐NOLC1‐FLAG (right) in NIH3T3 cells, using antibodies specific for Pten, NOLC1 or FLAG, respectively. Lower panel: PC3 cells were transfected with pPten‐EGFP (left), pNOLC1‐mCherry (middle) and pPten‐NOLC1‐EGFP (right). (B) Immunoblotting of Pten and Pten‐NOLC1 in nuclear and cytoplasmic fractions of DU145 cells or NIH3T3 cells transfected with Pten‐NOLC1‐FLAG (PNOL‐FLAG). Immunoblotting using antibodies specific for GAPDH and Histone 3 was used as fraction purity controls. (C) Pten‐NOLC1 lacks PIP3 phosphatase activity in vitro. GST‐glutathione S‐transferase; PNOL‐Pten‐NOLC1; IP‐immunoprecipitation; T+‐tetracycline induced; Ab‐antibodies.

FIGURE 23A-I. Pten‐NOLC1 promotes cancer cell growth and invasion. (A) Schematic diagram of Pten‐NOLC1 knockout strategy. Intron 11 sequence of Pten is indicated by blue, while intron 1 sequence of NOLC1 is indicated by red. pCas9D10A‐EGFP and Pten donor‐Zeocin‐mCherry‐NOLC1 donor vectors were cotransfected into DU145 cells. The images represent the co‐expression of Cas9D10A‐EGFP and integrated zeocin‐mCherry, representing knockout of Pten‐NOLC1 in DU145KO1 cells. (B) Taqman quantitative RT‐PCR on Pten‐NOLC1 knockout DU145 and MCF7 cells. Taqman RT‐PCRs for β‐actin are for RNA quantity controls. (C) Pten‐NOLC1 expression promotes cell entry to S phase. Insets are representative images of BrdU labeling of 10,000 cells of DU145, DU145KO1, MCF7 and MCF7KO1. Triplicate experiments were performed. Standard deviations are indicated. (D) Pten‐NOLC1 promotes colony formation. Insets are representative crystal violet staining images of colonies of DU145/DU145KO1 and MCF7/MCF7KO1 cells. Triplicate experiments were performed. Standard deviations are indicated. (E) Pten‐NOLC1 promotes resistance to UV‐induced cell death. Insets are representative images of annexin V and PI staining of cells after exposure to 175 mJ UV irradiation. Triplicate experiments were performed. Standard deviations are indicated. (F) Removal of Pten‐NOLC1 reduced cancer cell invasion. Matrigel traverse analysis was performed on DU145, DU145KO1 and DU145KO2 cells. Triplicate experiments were performed. Standard deviations are indicated. MCF7 cells fail to migrate through matrigel. (G) Removal of Pten‐NOLC1 reduced tumor volume of xenografted DU145 cells. (H) Removal of Pten‐NOLC1 reduced incidence of metastasis of xenografted DU145 cancers. (I) Removal of Pten‐NOLC1 improves survival of animals xenografted with DU145 cancer.

FIGURE 24A-C. Pten‐NOLC1 promotes expression of pro‐growth genes. (A) Removal of Pten‐NOLC1 induced downregulation of EGFR, VEGFA, GAB1, EREG, AXL and c‐MET based on microarray analysis of DU145, DU145KO1 and DU145KO2. (B) Taqman quantitative RT‐PCR of EGFR, AXL, EREG, VEGFA, c‐MET and GAB1. Relative fold changes to parental DU145 cells are shown. Triplicate experiments were performed. Standard deviations are indicated. (C) Removal of Pten‐NOLC1 reduced c‐MET and GAB1 protein expression.

FIGURE 25A-E. Creation of Pten‐NOLC1 generates spontaneous liver cancer. (A) Schematic diagram of the process to delete Pten somatically, and hydrodynamic tail vein injection of pT3‐Pten‐NOLC1‐mCherry/pSB. Insets are representative images of PC3 cells transfected with pT3‐Pten‐NOLC1‐mCherry/pSB. (B) Representative images of livers from mice treated with AAV8‐cre and pT3‐Pten‐NOLC1‐mCherry/pSB (right) or treated with AAV8‐cre and pT3/pSB (left). (C) Representative histology images of liver cancers from AAV8‐cre and pT3‐Pten‐NOLC1‐mCherry/pSB treated mice (right), versus histology images for AAV8‐cre and pT3/pSB treated mice (left). (D) High frequency of Ki‐67 expression in liver cancer cells. The results are the average number of cells positive for Ki‐67 per high-power field. Seven fields per sample were counted. (E) Pten‐NOLC1 promotes expression of c‐MET and GAB1 in liver cancer cells.

FIGURE 26. Transcriptome sequencing read distributions of Pten and NOLC1 genes. The graphs represent distribution of individual sample on the ratio of read counts of first exon to reads of all exons of Pten (top left) or NOLC1 (bottom left) or the ratios of read counts of last exon to reads of all exons of Pten (top right) or NOLC1 (bottom right). Orange-Samples from TCGA data set (550 samples); Blue-Luo et al data set (86 samples). P-values are indicated.
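The per-sample ratios plotted in Figure 26 can be sketched as follows. This is only an illustration of the ratio described in the legend (reads of the first or last exon divided by reads of all exons); the function name and the read counts are hypothetical and are not taken from the TCGA or Luo et al data sets.

```python
def exon_read_ratios(exon_read_counts):
    """Return (first-exon ratio, last-exon ratio) for one gene in one sample.

    exon_read_counts lists read counts per exon in transcript order.
    Returns (None, None) when the gene has no mapped reads.
    """
    total = sum(exon_read_counts)
    if total == 0:
        return None, None
    return exon_read_counts[0] / total, exon_read_counts[-1] / total

# Hypothetical per-exon read counts for one gene in one sample.
example_counts = [120, 100, 90, 80, 10]
first_ratio, last_ratio = exon_read_ratios(example_counts)
print(f"first exon / all exons = {first_ratio:.3f}")
print(f"last exon / all exons  = {last_ratio:.3f}")
```

Comparing such ratios across samples is one way an imbalance between the 5' and 3' ends of a gene's transcript coverage could be examined.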

FIGURE 27A-C. Spanning deletion between Pten exon 11 and NOLC1 exon 2 in 17 types of human malignancies. (A) Schematic diagram of Pten and NOLC1 minigenomes as well as detection of the spanning deletion between the 2 genes (red line) through copy number analysis of TCGA Affymetrix SNP6.0 data. The number of samples deemed positive for deletion is indicated. (B) Frequency of spanning deletion between Pten and NOLC1 in 17 different types of human malignancies. Total number of samples of each type of human malignancies is indicated. (C) Frequency of Pten deletion in samples with suggestive Pten-NOLC1 fusion. Total number of samples with suggestive Pten-NOLC1 fusion is indicated.

FIGURE 28A-D. Pten-NOLC1 interacts with genomic DNA and activates expression of pro-growth genes. (A) Distribution of mapped DNA fragments from ChIP sequencing of DU145 versus DU145 KO1/KO2 (top panel), and MCF7 versus MCF7 KO1/KO2 (bottom panel), using antibodies specific for NOLC1. The distributions of DNA fragments of DU145 or MCF7 cells after subtraction from their knockout counterparts are shown on the right. (B) Taqman Q-PCR quantification of Pten-NOLC1 binding to promoter/enhancer regions of MET, EGFR, RAF1, AXL, GAB1 and VEGFA. (C) Removal of Pten-NOLC1 reduced c-MET, EGFR, RAF1 and GAB1 protein expression, and phosphorylation of STAT3 and RAF1. (D) The signaling pathways of MET, EGF and ECM are impacted by the presence of Pten-NOLC1. Red icons indicate genes that interact with Pten-NOLC1 but not with NOLC1 protein.

FIGURE 29. Pten-NOLC1 binding to promoter regions of MET, EGFR, RAF1, AXL and VEGFA. ChIP sequencing mapped peaks from Pten-NOLC1 positive samples to the promoter regions of these genes are shown in red. Genome positions of the mapped DNA (using HG19 as reference) are indicated. Transcription start sites are indicated by arrows. Box indicates the region enriched with Pten-NOLC1 binding fragments.

FIGURE 30. Cell death induced by genomic interruption of Pten-NOLC1. PC3 (prostate cancer), DU145 (prostate cancer), MCF7 (breast cancer), H1299 (lung cancer), SNU449 (liver cancer), SNU475 (liver cancer), HEP3B (liver cancer), T98G (glioblastoma multiforme), MB231 (breast cancer) and NIH3T3 (mouse immortalized fibroblasts) cells were treated with Cas9D10A, or Cas9D10A plus gRNA specific for Pten-NOLC1, or Cas9D10A plus gRNA specific for Pten-NOLC1 plus Pten-NOLC1 knockout cassette. Cell death was then analyzed 2 days after the treatment using Annexin V and propidium iodide staining.

FIGURE 31. Schematic diagram for the detection of TMEM135int13‐EGFP‐tk‐CCDC67int9 integration into TMEM135‐CCDC67 breakpoint in the PC3 cell genome. Arrows indicate the primer position for PCR. Putative integration sites that generated mutations are indicated by yellow stars. The PCR products obtained from xenografted PC3 cells that contain TMEM135‐CCDC67 breakpoint before virus treatment were used as reference control. PCR products obtained after viral (Ad‐TC) infections were sequenced. The positions of mutations due to DNA integration were detected through Sanger sequencing.

5. DETAILED DESCRIPTION OF THE INVENTION

For clarity, and not by way of limitation, the detailed description of the invention is divided into the following subsections:

(i) fusion genes;

(ii) fusion gene detection;

(iii) cancer targets;

(iv) methods of treatment;

(v) genome editing techniques; and

(vi) kits.

5.1 FUSION GENES

The term “fusion gene,” as used herein, refers to a nucleic acid or protein sequence which combines elements of the recited genes or their RNA transcripts in a manner not found in the wild type/normal nucleic acid or protein sequences. For example, but not by way of limitation, in a fusion gene in the form of genomic DNA, the relative positions of portions of the genomic sequences of the recited genes are altered relative to the wild type/normal sequence (for example, as reflected in the NCBI chromosomal positions or sequences set forth herein). In a fusion gene in the form of mRNA, portions of RNA transcripts arising from both component genes are present (not necessarily in the same register as the wild-type transcript and possibly including portions normally not present in the normal mature transcript). In non-limiting embodiments, such a portion of genomic DNA or mRNA may comprise at least about 10 consecutive nucleotides, or at least about 20 consecutive nucleotides, or at least about 30 consecutive nucleotides, or at least 40 consecutive nucleotides. In certain embodiments, such a portion of genomic DNA or mRNA may comprise up to about 10 consecutive nucleotides, up to about 50 consecutive nucleotides, up to about 100 consecutive nucleotides, up to about 200 consecutive nucleotides, up to about 300 consecutive nucleotides, up to about 400 consecutive nucleotides, up to about 500 consecutive nucleotides, up to about 600 consecutive nucleotides, up to about 700 consecutive nucleotides, up to about 800 consecutive nucleotides, up to about 900 consecutive nucleotides, up to about 1,000 consecutive nucleotides, up to about 1,500 consecutive nucleotides or up to about 2,000 consecutive nucleotides of the nucleotide sequence of a gene present in the fusion gene.
In certain embodiments, such a portion of genomic DNA or mRNA may comprise no more than about 10 consecutive nucleotides, about 50 consecutive nucleotides, about 100 consecutive nucleotides, about 200 consecutive nucleotides, about 300 consecutive nucleotides, about 400 consecutive nucleotides, about 500 consecutive nucleotides, about 600 consecutive nucleotides, about 700 consecutive nucleotides, about 800 consecutive nucleotides, about 900 consecutive nucleotides, about 1,000 consecutive nucleotides, about 1,500 consecutive nucleotides or about 2,000 consecutive nucleotides of the nucleotide sequence of a gene present in the fusion gene. In certain embodiments, such a portion of genomic DNA or mRNA does not comprise the full wildtype/normal nucleotide sequence of a gene present in the fusion gene. In a fusion gene in the form of a protein, portions of amino acid sequences arising from both component genes are present (not by way of limitation, at least about 5 consecutive amino acids or at least about 10 amino acids or at least about 20 amino acids or at least about 30 amino acids). 
In certain embodiments, such a portion of a fusion gene protein may comprise up to about 10 consecutive amino acids, up to about 20 consecutive amino acids, up to about 30 consecutive amino acids, up to about 40 consecutive amino acids, up to about 50 consecutive amino acids, up to about 60 consecutive amino acids, up to about 70 consecutive amino acids, up to about 80 consecutive amino acids, up to about 90 consecutive amino acids, up to about 100 consecutive amino acids, up to about 120 consecutive amino acids, up to about 140 consecutive amino acids, up to about 160 consecutive amino acids, up to about 180 consecutive amino acids, up to about 200 consecutive amino acids, up to about 220 consecutive amino acids, up to about 240 consecutive amino acids, up to about 260 consecutive amino acids, up to about 280 consecutive amino acids or up to about 300 consecutive amino acids of the amino acid sequence encoded by a gene present in the fusion gene. In certain embodiments, such a portion of a fusion gene protein may comprise no more than about 10 consecutive amino acids, about 20 consecutive amino acids, about 30 consecutive amino acids, about 40 consecutive amino acids, about 50 consecutive amino acids, about 60 consecutive amino acids, about 70 consecutive amino acids, about 80 consecutive amino acids, about 90 consecutive amino acids, about 100 consecutive amino acids, about 120 consecutive amino acids, about 140 consecutive amino acids, about 160 consecutive amino acids, about 180 consecutive amino acids, about 200 consecutive amino acids, about 220 consecutive amino acids, about 240 consecutive amino acids, about 260 consecutive amino acids, about 280 consecutive amino acids or about 300 consecutive amino acids of the amino acid sequence encoded by a gene present in the fusion gene. In certain embodiments, such a portion of a fusion gene protein does not comprise the full wildtype/normal amino acid sequence encoded by a gene present in the fusion gene. 
In this paragraph, portions arising from both genes, transcripts or proteins do not refer to sequences which may happen to be identical in the wild type forms of both genes (that is to say, the portions are “unshared”). As such, a fusion gene represents, generally speaking, the splicing together or fusion of genomic elements not normally joined together. See WO 2015/103057 and WO 2016/011428, the contents of which are hereby incorporated by reference, for additional information regarding the disclosed fusion genes.
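The notion of "unshared" portions from both component genes can be illustrated with a minimal sketch. It assumes plain string matching on one strand, a fixed minimum run length, and toy sequences; the function names and sequences are hypothetical, and the sketch is not a detection method described in this application.

```python
def has_unshared_portion(candidate, gene, other_gene, min_len):
    """True if candidate contains min_len consecutive nucleotides that occur
    in gene but do not occur in other_gene (an "unshared" portion)."""
    for start in range(len(candidate) - min_len + 1):
        kmer = candidate[start:start + min_len]
        if kmer in gene and kmer not in other_gene:
            return True
    return False

def contains_portions_of_both(candidate, head_gene, tail_gene, min_len=10):
    """True if candidate carries unshared portions of both component genes."""
    return (has_unshared_portion(candidate, head_gene, tail_gene, min_len)
            and has_unshared_portion(candidate, tail_gene, head_gene, min_len))

# Toy sequences for illustration only (not real gene sequences).
head_gene = "ATGGCCAAGTTCGACCTGAA"
tail_gene = "GGTACCTTAGCGATCGGATC"

# A hypothetical junction: the 5' part of the head gene joined to the
# 3' part of the tail gene.
junction = head_gene[:12] + tail_gene[8:]

fusion_like = contains_portions_of_both(junction, head_gene, tail_gene)
wild_type_like = contains_portions_of_both(head_gene, head_gene, tail_gene)
print(fusion_like, wild_type_like)
```

The `kmer not in other_gene` test is what excludes sequences that happen to be identical in both wild type genes, mirroring the "unshared" requirement stated above.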

The fusion gene TRMT11-GRIK2 is a fusion between the tRNA methyltransferase 11 homolog (“TRMT11”) and glutamate receptor, ionotropic, kainate 2 (“GRIK2”) genes. The human TRMT11 gene is typically located on chromosome 6q11.1 and the human GRIK2 gene is typically located on chromosome 6q16.3. In certain embodiments, the TRMT11 gene is the human gene having NCBI Gene ID No: 60487, sequence chromosome 6; NC_000006.11 (126307576..126360422) and/or the GRIK2 gene is the human gene having NCBI Gene ID No: 2898, sequence chromosome 6; NC_000006.11 (101841584..102517958). In certain embodiments, the junction (also referred to herein as chromosomal breakpoint and/or junction fragment) of a TRMT11-GRIK2 fusion gene comprises a sequence as shown in Figure 1 and/or Table 1.
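The NCBI-style locations used in this section follow the pattern "accession (start..end[, complement])". The sketch below parses that notation and computes the span of each gene and the gap between two genes on the same reference sequence. The regular expression and helper names are assumptions made for illustration; only the two coordinate strings are taken from the text above.

```python
import re

# Assumed pattern for locations such as "NC_000006.11 (126307576..126360422)"
# or "NC_000005.9 (33944721..33984780, complement)".
LOCUS_RE = re.compile(r"(NC_\d+\.\d+)\s*\((\d+)\.\.(\d+)(?:,\s*(complement))?\)")

def parse_locus(text):
    """Parse an accession, 1-based inclusive coordinates and strand."""
    match = LOCUS_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognized locus notation: {text!r}")
    accession, start, end, complement = match.groups()
    return {
        "accession": accession,
        "start": int(start),
        "end": int(end),
        "strand": "-" if complement else "+",
    }

def span_length(locus):
    """Length in bp of an inclusive coordinate range."""
    return locus["end"] - locus["start"] + 1

def gap_between(a, b):
    """bp between two loci on the same accession (0 if they overlap),
    or None if they lie on different reference sequences."""
    if a["accession"] != b["accession"]:
        return None
    first, second = sorted((a, b), key=lambda locus: locus["start"])
    return max(0, second["start"] - first["end"] - 1)

trmt11 = parse_locus("NC_000006.11 (126307576..126360422)")
grik2 = parse_locus("NC_000006.11 (101841584..102517958)")

print(span_length(trmt11), span_length(grik2), gap_between(trmt11, grik2))
```

Returning `None` for loci on different accessions keeps inter-chromosomal pairs, such as MTOR and TP53BP1, from being assigned a meaningless distance.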

The fusion gene SLC45A2-AMACR is a fusion between the solute carrier family 45, member 2 (“SLC45A2”) and alpha-methylacyl-CoA racemase (“AMACR”) genes. The human SLC45A2 gene is typically located on human chromosome 5p13.2 and the human AMACR gene is typically located on chromosome 5p13. In certain embodiments the SLC45A2 gene is the human gene having NCBI Gene ID No: 51151, sequence chromosome 5; NC_000005.9 (33944721..33984780, complement) and/or the AMACR gene is the human gene having NCBI Gene ID No: 23600, sequence chromosome 5; NC_000005.9 (33987091..34008220, complement). In certain embodiments, the junction and/or junction fragment of a SLC45A2-AMACR fusion gene comprises a sequence as shown in Figure 1 and/or Table 1.

The fusion gene MTOR-TP53BP1 is a fusion between the mechanistic target of rapamycin (“MTOR”) and tumor protein p53 binding protein 1 (“TP53BP1”) genes. The human MTOR gene is typically located on chromosome 1p36.2 and the human TP53BP1 gene is typically located on chromosome 15q15-q21. In certain embodiments, the MTOR gene is the human gene having NCBI Gene ID No: 2475, sequence chromosome 1; NC_000001.10 (11166588..11322614, complement) and/or the TP53BP1 gene is the human gene having NCBI Gene ID No: 7158, sequence chromosome 15; NC_000015.9 (43695262..43802707, complement). In certain embodiments, the junction and/or junction fragment of a MTOR-TP53BP1 fusion gene comprises a sequence as shown in Figure 1 and/or Table 1.

The fusion gene LRRC59-FLJ60017 is a fusion between the leucine rich repeat containing 59 (“LRRC59”) gene and the “FLJ60017” nucleic acid. The human LRRC59 gene is typically located on chromosome 17q21.33 and nucleic acid encoding human FLJ60017 is typically located on chromosome 11q12.3. In certain embodiments, the LRRC59 gene is the human gene having NCBI Gene ID No: 55379, sequence chromosome 17; NC_000017.10 (48458594..48474914, complement) and/or FLJ60017 has a nucleic acid sequence as set forth in GenBank AK_296299. In certain embodiments, the junction and/or junction fragment of a LRRC59-FLJ60017 fusion gene comprises a sequence as shown in Figure 1, Figure 2 and/or Table 1.

The fusion gene TMEM135-CCDC67 is a fusion between the transmembrane protein 135 (“TMEM135”) and coiled-coil domain containing 67 (“CCDC67”) genes. The human TMEM135 gene is typically located on chromosome 11q14.2 and the human CCDC67 gene is typically located on chromosome 11q21. In certain embodiments the TMEM135 gene is the human gene having NCBI Gene ID No: 65084, sequence chromosome 11; NC_000011.9 (86748886..87039876) and/or the CCDC67 gene is the human gene having NCBI Gene ID No: 159989, sequence chromosome 11; NC_000011.9 (93063156..93171636). In certain embodiments, the junction and/or junction fragment of a TMEM135-CCDC67 fusion gene comprises a sequence as shown in Figure 1, Figure 2, Figure 13 and/or Table 1.

The fusion gene CCNH-C5orf30 is a fusion between the cyclin H (“CCNH”) and chromosome 5 open reading frame 30 (“C5orf30”) genes. The human CCNH gene is typically located on chromosome 5q13.3-q14 and the human C5orf30 gene is typically located on chromosome 5q21.1. In certain embodiments, the CCNH gene is the human gene having NCBI Gene ID No: 902, sequence chromosome 5; NC_000005.9 (86687310..86708850, complement) and/or the C5orf30 gene is the human gene having NCBI Gene ID No: 90355, sequence chromosome 5; NC_000005.9 (102594442..102614361). In certain embodiments, the junction and/or junction fragment of a CCNH-C5orf30 fusion gene comprises a sequence as shown in Figure 1, Figure 2 and/or Table 1.

The fusion gene KDM4B-AC011523.2 is a fusion between lysine (K)-specific demethylase 4B (“KDM4B”) and chromosomal region “AC011523.2.” The human KDM4B gene is typically located on chromosome 19p13.3 and the human AC011523.2 region is typically located on chromosome 19q13.4. In certain embodiments the KDM4B gene is the human gene having NCBI Gene ID NO: 23030, sequence chromosome 19; NC_000019.9 (4969123..5153609); and/or the AC011523.2 region comprises a sequence as shown in Figure 1. In certain embodiments, the junction and/or junction fragment of a KDM4B-AC011523.2 fusion gene comprises a sequence as shown in Figure 1 and/or Table 1.

The fusion gene MAN2A1-FER is a fusion between mannosidase, alpha, class 2A, member 1 (“MAN2A1”) and (fps/fes related) tyrosine kinase (“FER”). The human MAN2A1 gene is typically located on chromosome 5q21.3 and the human FER gene is typically located on chromosome 5q21. In certain embodiments, the MAN2A1 gene is the human gene having NCBI Gene ID NO: 4124, sequence chromosome 5; NC_000005.9 (109025156..109203429) or NC_000005.9 (109034137..109035578); and/or the FER gene is the human gene having NCBI Gene ID NO: 2241, sequence chromosome 5; NC_000005.9 (108083523..108523373). In certain embodiments, the junction and/or junction fragment of a MAN2A1-FER fusion gene comprises a sequence as shown in Figure 1, Figure 4, Figure 15 and/or Table 1.

The fusion gene PTEN-NOLC1 is a fusion between the phosphatase and tensin homolog (“PTEN”) and nucleolar and coiled-body phosphoprotein 1 (“NOLC1”). The human PTEN gene is typically located on chromosome 10q23.3 and the human NOLC1 gene is typically located on chromosome 10q24.32. In certain embodiments, the PTEN gene is the human gene having NCBI Gene ID NO: 5728, sequence chromosome 10; NC_000010.11 (87863438..87970345) and/or the NOLC1 gene is the human gene having NCBI Gene ID NO: 9221, sequence chromosome 10; NC_000010.11 (102152176..102163871). In certain embodiments, the junction and/or junction fragment of a PTEN-NOLC1 fusion gene comprises a sequence as shown in Figure 3, Figure 17 and/or Table 1.

The fusion gene ZMPSTE24‐ZMYM4 is a fusion between zinc metallopeptidase STE24 (“ZMPSTE24”) and zinc finger, MYM-type 4 (“ZMYM4”). The human ZMPSTE24 gene is typically located on chromosome 1p34 and the human ZMYM4 gene is typically located on chromosome 1p32-p34. In certain embodiments, the ZMPSTE24 gene is the human gene having NCBI Gene ID NO: 10269, sequence chromosome 1; NC_000001.11 (40258050..40294184) and/or the ZMYM4 gene is the human gene having NCBI Gene ID NO: 9202, sequence chromosome 1; NC_000001.11 (35268850..35421944). In certain embodiments, the junction and/or junction fragment of a ZMPSTE24‐ZMYM4 fusion gene comprises a sequence as shown in Figure 6 and/or Figure 18.

The fusion gene CLTC‐ETV1 is a fusion between clathrin, heavy chain (Hc) (“CLTC”) and ets variant 1 (“ETV1”). The human CLTC gene is typically located on chromosome 17q23.1 and the human ETV1 gene is typically located on chromosome 7p21.3. In certain embodiments, the CLTC gene is the human gene having NCBI Gene ID NO: 1213, sequence chromosome 17; NC_000017.11 (59619689..59696956) and/or the ETV1 gene is the human gene having NCBI Gene ID NO: 2115, sequence chromosome 7; NC_000007.14 (13891229..13991425, complement). In certain embodiments, the junction and/or junction fragment of a CLTC‐ETV1 fusion gene comprises a sequence as shown in Figure 6 and/or Figure 18 or a fragment thereof.

The fusion gene ACPP‐SEC13 is a fusion between acid phosphatase, prostate (“ACPP”) and SEC13 homolog (“SEC13”). The human ACPP is typically located on chromosome 3q22.1 and the human SEC13 gene is typically located on chromosome 3p25-p24. In certain embodiments, the ACPP gene is the human gene having NCBI Gene ID NO: 55, sequence chromosome 3; NC_000003.12 (132317367..132368302) and/or the SEC13 gene is the human gene having NCBI Gene ID NO: 6396, sequence chromosome 3; NC_000003.12 (10300929..10321188, complement). In certain embodiments, the junction and/or junction fragment of an ACPP‐SEC13 fusion gene comprises a sequence as shown in Figure 6 and/or Figure 18.

The fusion gene DOCK7‐OLR1 is a fusion between dedicator of cytokinesis 7 (“DOCK7”) and oxidized low density lipoprotein (lectin-like) receptor 1 (“OLR1”). The human DOCK7 is typically located on chromosome 1p31.3 and the human OLR1 gene is typically located on chromosome 12p13.2-p12.3. In certain embodiments, the DOCK7 gene is the human gene having NCBI Gene ID NO: 85440, sequence chromosome 1; NC_000001.11 (62454726..62688368, complement) and/or the OLR1 gene is the human gene having NCBI Gene ID NO: 4973, sequence chromosome 12; NC_000012.12 (10158300..10172191, complement). In certain embodiments, the junction and/or junction fragment of a DOCK7‐OLR1 fusion gene comprises a sequence as shown in Figure 6 and/or Figure 18.

The fusion gene PCMTD1‐SNTG1 is a fusion between protein-L-isoaspartate (D-aspartate) O-methyltransferase domain containing 1 (“PCMTD1”) and syntrophin, gamma 1 (“SNTG1”). The human PCMTD1 gene is typically located on chromosome 8q11.23 and the human SNTG1 gene is typically located on chromosome 8q11.21. In certain embodiments, the PCMTD1 gene is the human gene having NCBI Gene ID NO: 115294, sequence chromosome 8; NC_000008.11 (51817575..51899186, complement) and/or the SNTG1 gene is the human gene having NCBI Gene ID NO: 54212, sequence chromosome 8; NC_000008.11 (49909789..50794118). In certain embodiments, the junction and/or junction fragment of a PCMTD1‐SNTG1 fusion gene comprises a sequence as shown in Figure 6 and/or Figure 18.

5.2 FUSION GENE DETECTION

Any of the foregoing fusion genes described above in section 5.1 may be identified and/or detected by methods known in the art. The fusion genes may be detected by detecting a fusion gene manifested in a DNA molecule, an RNA molecule or a protein. In certain embodiments, a fusion gene can be detected by determining the presence of a DNA molecule, an RNA molecule or protein that is encoded by the fusion gene. For example, and not by way of limitation, the presence of a fusion gene may be detected by determining the presence of the protein encoded by the fusion gene.

The fusion gene may be detected in a sample of a subject. A “patient” or “subject,” as used interchangeably herein, refers to a human or a non-human subject. Non-limiting examples of non-human subjects include non-human primates, dogs, cats, mice, etc. The subject may or may not be previously diagnosed as having cancer.

In certain non-limiting embodiments, a sample includes, but is not limited to, cells in culture, cell supernatants, cell lysates, serum, blood plasma, biological fluid (e.g., blood, plasma, serum, stool, urine, lymphatic fluid, ascites, ductal lavage, saliva and cerebrospinal fluid) and tissue samples. The source of the sample may be solid tissue (e.g., from a fresh, frozen, and/or preserved organ, tissue sample, biopsy, or aspirate), blood or any blood constituents, bodily fluids (such as, e.g., urine, lymph, cerebral spinal fluid, amniotic fluid, peritoneal fluid or interstitial fluid), or cells from the individual, including circulating cancer cells. In certain non-limiting embodiments, the sample is obtained from a cancer. In certain embodiments, the sample may be a “biopsy sample” or “clinical sample,” which are samples derived from a subject. In certain embodiments, the sample includes one or more cancer cells from a subject. In certain embodiments, the one or more fusion genes can be detected in one or more samples obtained from a subject, e.g., in one or more cancer cell samples. In certain embodiments, the sample is not a prostate cancer sample or one or more prostate cancer cells. In certain embodiments, the sample is not a lung adenocarcinoma, glioblastoma multiforme or hepatocellular carcinoma sample.

In certain non-limiting embodiments, the fusion gene is detected by nucleic acid hybridization analysis.

In certain non-limiting embodiments, the fusion gene is detected by fluorescent in situ hybridization (FISH) analysis. FISH is a technique that can directly identify a specific sequence of DNA or RNA in a cell or biological sample and enables visual determination of the presence and/or expression of a fusion gene in a tissue sample. In certain non-limiting embodiments, where a fusion gene combines genes not typically present on the same chromosome, FISH analysis may demonstrate probes binding to the same chromosome. For example, and not by way of limitation, analysis may focus on the chromosome where one gene normally resides and then hybridization analysis may be performed to determine whether the other gene is present on that chromosome as well.

In certain non-limiting embodiments, the fusion gene is detected by DNA hybridization, such as, but not limited to, Southern blot analysis.

In certain non-limiting embodiments, the fusion gene is detected by RNA hybridization, such as, but not limited to, Northern blot analysis. In certain embodiments, Northern blot analysis can be used for the detection of a fusion gene, where an isolated RNA sample is run on a denaturing agarose gel, and transferred to a suitable support, such as activated cellulose, nitrocellulose or glass or nylon membranes. Radiolabeled cDNA or RNA is then hybridized to the preparation, washed and analyzed by autoradiography to detect the presence of a fusion gene in the RNA sample.

In certain non-limiting embodiments, the fusion gene is detected by nucleic acid sequencing analysis.

In certain non-limiting embodiments, the fusion gene is detected by probes present on a DNA array, chip or a microarray. For example, and not by way of limitation, oligonucleotides corresponding to one or more fusion genes can be immobilized on a chip which is then hybridized with labeled nucleic acids of a sample obtained from a subject. A positive hybridization signal is obtained with a sample containing the fusion gene transcripts.

In certain non-limiting embodiments, the fusion gene is detected by a method comprising Reverse Transcription Polymerase Chain Reaction (“RT-PCR”). In certain embodiments, the fusion gene is detected by a method comprising RT-PCR using the one or more pairs of primers disclosed herein (see, for example, Table 5).

In certain non-limiting embodiments, the fusion gene is detected by antibody binding analysis such as, but not limited to, Western blot analysis and immunohistochemistry.

5.3 CANCER TARGETS

Non-limiting examples of cancers that may be subject to the presently disclosed invention include prostate cancer, breast cancer, liver cancer, hepatocarcinoma, hepatoma, lung cancer, non-small cell lung cancer, cervical cancer, endometrial cancer, pancreatic cancer, ovarian cancer, gastric cancer, thyroid cancer, glioblastoma multiforme, colorectal cancer, diffuse large B cell lymphoma, sarcoma, acute and chronic lymphocytic leukemia, Hodgkin’s lymphoma, non-Hodgkin’s lymphoma, and adenocarcinoma, e.g., esophageal adenocarcinoma. In certain embodiments, the target of treatment is a premalignant or neoplastic condition involving lung, cervix, endometrium, pancreas, ovary, stomach, thyroid, glia, intestine, esophagus, muscle or B cells. In certain embodiments, the cancer is not prostate cancer. In certain embodiments, the cancer is not lung adenocarcinoma, glioblastoma multiforme or hepatocellular carcinoma. In certain embodiments, the target of treatment is a cell that carries at least one fusion gene, e.g., PTEN-NOLC1 or MAN2A1-FER.

5.4 METHODS OF TREATMENT

The present invention provides methods of treating a subject that has one or more cells that carry a fusion gene. In certain embodiments, the subject has, or is suspected of having, cancer or a neoplastic or pre-malignant condition that carries one or more fusion genes (a pre-malignant condition is characterized, inter alia, by the presence of pre-malignant or neoplastic cells). Non-limiting examples of fusion genes are disclosed herein and in section 5.1. In certain embodiments, the methods of treatment include performing a targeted genome editing technique on one or more cancer cells within the subject to produce an anti-cancer or anti-neoplastic or anti-proliferative effect. Non-limiting examples of cancers that can be treated using the disclosed methods are provided in section 5.3. Non-limiting examples of genome editing techniques are disclosed in section 5.5.

An “anti-cancer effect” refers to one or more of a reduction in aggregate cancer cell mass, a reduction in cancer cell growth rate, a reduction in cancer progression, a reduction in cancer cell proliferation, a reduction in tumor mass, a reduction in tumor volume, a reduction in tumor cell proliferation, a reduction in tumor growth rate and/or a reduction in tumor metastasis. In certain embodiments, an anti-cancer effect can refer to a complete response, a partial response, a stable disease (without progression or relapse), a response with a later relapse or progression-free survival in a patient diagnosed with cancer. In certain embodiments, an anti-cancer effect can refer to the induction of cell death, e.g., in one or more cells of the cancer, and/or the increase in cell death within a tumor mass. Similarly, an “anti-neoplastic effect” refers to one or more of a reduction in aggregate neoplastic cell mass, a reduction in neoplastic cell growth rate, a reduction in neoplasm progression (e.g., progressive de-differentiation or epithelial to mesenchymal transition), a reduction in neoplastic cell proliferation, a reduction in neoplasm mass, a reduction in neoplasm volume, and/or a reduction in neoplasm growth rate.

In certain embodiments, a method of treating a subject comprises determining the presence of one or more fusion genes in a sample from the subject, where if one or more fusion genes are present in the sample then performing a targeted genome editing technique on one or more cells within the subject. In certain embodiments, the genome editing technique results in the reduction and/or elimination of the expression of a fusion gene and/or the expression of the protein encoded by the fusion gene in one or more cells of the subject. In certain embodiments, the genome editing technique specifically targets the cells that carry the fusion gene, e.g., by specifically targeting a nucleic acid sequence of the fusion gene. For example, and not by way of limitation, the methods of the current invention specifically target a chromosomal breakpoint of one or more of the fusion genes. In certain embodiments, the methods of the current invention involve the targeting of sequences that flank the breakpoint. In certain embodiments, the methods of the current invention involve the targeting of sequences that flank and partially overlap the breakpoint. Non-limiting examples of techniques for identifying and/or detecting a fusion gene are disclosed in section 5.2.

In certain embodiments, a method of treating a cancer in a subject comprises determining the presence of one or more fusion genes in a cancer cell-containing sample from the subject, where if one or more fusion genes are present in the sample then performing a targeted genome editing technique on one or more cancer cells within the subject to produce an anti-cancer effect or anti-neoplastic effect.

In certain embodiments, the method can include determining the presence or absence of a fusion gene. For example, and not by way of limitation, the method can include determining the presence or absence of one or more, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, ten or more, eleven or more, twelve or more, thirteen or more or all fourteen of the fusion genes disclosed herein. In certain embodiments, the one or more fusion genes can be TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24‐ZMYM4, CLTC‐ETV1, ACPP‐SEC13, DOCK7‐OLR1, PCMTD1‐SNTG1 or a combination thereof.

In certain embodiments, the fusion gene can be TMEM135-CCDC67.

In certain embodiments, the fusion gene can be CCNH-C5orf30.

In certain embodiments, the fusion gene can be MAN2A1-FER.

In certain embodiments, the fusion gene can be PTEN-NOLC1.

In certain embodiments, the fusion gene is not TMEM135-CCDC67 or CCNH-C5orf30.

In certain embodiments, the method of treating a subject comprises determining the presence of one or more fusion genes (e.g., selected from the group consisting of MAN2A1-FER, TMEM135-CCDC67, TRMT11-GRIK2, CCNH-C5orf30, LRRC59-FLJ60017, SLC45A2-AMACR, KDM4B-AC011523.2, PTEN-NOLC1, MTOR-TP53BP1 or a combination thereof) in a sample of the subject, where if one or more fusion genes are detected in the sample then performing a targeted genome editing technique on the fusion gene in one or more cancer cells within the subject to produce an anti-cancer effect.

In certain embodiments, the method of treating a subject having a cancer comprises determining the presence, in one or more cancer cell(s) of the subject, of one or more fusion genes selected from the group consisting of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24‐ZMYM4, CLTC‐ETV1, ACPP‐SEC13, DOCK7‐OLR1, PCMTD1‐SNTG1 or a combination thereof, where if one or more fusion genes are detected in the cancer cell(s) then performing a genome editing technique targeting the fusion gene present within one or more cancer cells of the subject to produce an anti-cancer effect. In certain embodiments, the normal or non-cancerous cells that are adjacent to the cancer are not subjected to a genome editing technique as the gRNAs are specific for the sequences of the fusion gene, e.g., specific to the sequence of the breakpoint.

In certain embodiments, the method of treating a subject having a cancer comprises determining the presence, in one or more cancer cell(s) of the subject, of one or more fusion genes selected from the group consisting of ZMPSTE24‐ZMYM4, CLTC‐ETV1, ACPP‐SEC13, DOCK7‐OLR1, PCMTD1‐SNTG1 or a combination thereof, where if one or more fusion genes are detected in the cancer cell(s) then performing a targeted genome editing technique on one or more cancer cells within the subject to produce an anti-cancer effect.

In certain embodiments, the method of treating a subject comprises determining the presence, in one or more cell(s) of the subject, of one or more fusion genes selected from the group consisting of CCNH-C5orf30, TMEM135-CCDC67, PTEN-NOLC1, MAN2A1-FER and a combination thereof, where if one or more fusion genes are detected in the cell(s) then performing a targeted genome editing technique on one or more cells within the subject, e.g., to reduce and/or eliminate the expression of the fusion gene and/or reduce and/or eliminate the expression of the protein encoded by the fusion gene in the one or more cells of the subject. In certain embodiments, the method of treating a subject having a cancer comprises determining the presence, in one or more cancer cell(s) of the subject, of one or more fusion genes selected from the group consisting of CCNH-C5orf30, TMEM135-CCDC67, PTEN-NOLC1, MAN2A1-FER and a combination thereof, where if one or more fusion genes are detected in the cancer cell(s) then performing a targeted genome editing technique on one or more cancer cells within the subject to produce an anti-cancer effect.

In certain embodiments, the present invention provides a method of producing an anti-cancer effect in a subject having a cancer comprising performing a targeted genome editing technique on one or more cancer cells that contain a fusion gene within the subject, e.g., by targeting the fusion gene, to produce an anti-cancer effect.

The present invention further provides a method of preventing, minimizing and/or reducing the growth of a tumor comprising determining the presence of one or more fusion genes in a sample of the subject, where if one or more fusion genes are present in the sample then performing a genome editing technique targeting the fusion gene present within the tumor of the subject to prevent, minimize and/or reduce the growth of the tumor.

The present invention provides a method of preventing, minimizing and/or reducing the growth and/or proliferation of a cancer cell comprising determining the presence of one or more fusion genes in a sample of the subject, where if one or more fusion genes are present in the sample then performing a genome editing technique targeting the fusion gene, e.g., by targeting the chromosome breakpoint, present within the cancer cell of the subject to prevent, minimize and/or reduce the growth and/or proliferation of the cancer cell. In certain embodiments, the sequences that flank the breakpoint can be targeted by the genome editing technique.

In certain non-limiting embodiments, the present invention provides for methods of treating and/or inhibiting the progression of cancer and/or tumor and/or neoplastic growth in a subject comprising determining the presence of one or more fusion genes in a sample of the subject, where if one or more fusion genes are present in the sample then performing a targeted genome editing technique on one or more cells from the cancer and/or tumor of the subject to treat and/or inhibit the progression of the cancer and/or the tumor.

In certain embodiments, the present invention provides a method for lengthening the period of survival of a subject having a cancer. In certain embodiments, the method comprises determining the presence of one or more fusion genes in a sample of the subject, where if one or more fusion genes are present in the sample then performing a targeted genome editing technique on one or more cancer cells within the subject to produce an anti-cancer effect. In certain embodiments, the period of survival of a subject having cancer can be lengthened by about 1 month, about 2 months, about 4 months, about 6 months, about 8 months, about 10 months, about 12 months, about 14 months, about 18 months, about 20 months, about 2 years, about 3 years, about 5 years or more using the disclosed methods.

In certain embodiments, the present invention provides a method for treating a subject that comprises determining that at least one fusion gene is present in a sample obtained from a subject and then performing a genome editing technique targeting the fusion gene within one or more cells of the subject to achieve an anti-neoplastic effect, wherein the subject does not have prostate cancer.

In certain embodiments, the present invention provides an agent, or a composition comprising an agent, capable of targeted genome editing for use in a method to treat a subject. For example, and not by way of limitation, the present invention provides an agent capable of targeted genome editing for use in a method to treat or prevent cancer in a subject, wherein the method comprises performing a targeted genome editing procedure using the agent on one or more cells, e.g., cancer cells, that contain a fusion gene within the subject. In certain embodiments, the invention provides an agent, or a composition thereof, capable of targeted genome editing for use in a method to treat or prevent cancer in a subject, wherein the method comprises (i) determining the presence of one or more fusion genes in a cancer sample of the subject and (ii) where the sample contains a fusion gene, performing a targeted genome editing procedure using the agent on one or more cancer cells within the subject. In certain embodiments, the agent targets a specific chromosomal breakpoint of one or more of the fusion genes. In certain embodiments, the methods of the current invention involve the targeting of sequences that flank the breakpoint. In certain embodiments, the agent is an endonuclease. For example, and not by way of limitation, the endonuclease is a Cas9 protein. In certain embodiments, the endonuclease is a mutated form of Cas9, e.g., Cas9 D10A. In certain embodiments, the agent is an endonuclease, e.g., Cas9, in complex with one or more gRNAs (e.g., a ribonucleoprotein). In certain embodiments, the agent is an siRNA molecule.

In certain embodiments, the present invention provides a method of determining a treatment for a subject having one or more cells that contain one or more fusion genes. In certain embodiments, the method can include i) providing a sample from the subject; ii) determining whether one or more cells of the subject contains one or more fusion genes selected from the group consisting of TMEM135-CCDC67, TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24‐ZMYM4, CLTC‐ETV1, ACPP‐SEC13, DOCK7‐OLR1, PCMTD1‐SNTG1 and a combination thereof; and iii) instructing a genome editing technique to be performed if one or more fusion genes are detected in the one or more cells, wherein the genome editing technique targets the one or more of the fusion genes detected in the one or more cells, and wherein the subject does not have prostate cancer. In certain embodiments, the genome editing technique is performed using the CRISPR/Cas9 system.

In certain embodiments, the sample in which the one or more fusion genes are detected is prostate cancer, breast cancer, liver cancer, hepatocarcinoma, adenocarcinoma, hepatoma, lung cancer, non-small cell lung cancer, cervical cancer, endometrial cancer, pancreatic cancer, ovarian cancer, gastric cancer, thyroid cancer, glioblastoma multiforme, colorectal cancer, sarcoma, diffuse large B-cell lymphoma, acute lymphocytic leukemia, chronic lymphocytic leukemia, Hodgkin’s lymphoma, non-Hodgkin’s lymphoma or esophageal adenocarcinoma.

In certain embodiments, the sample is a glioblastoma sample, a breast cancer sample, a lung cancer sample, a liver cancer sample, an ovarian cancer sample, an adenocarcinoma or a colon cancer sample.

In certain embodiments, the sample in which the one or more fusion genes are detected is a breast cancer sample, a lung cancer sample or a colon cancer sample.

In certain embodiments, the sample in which the one or more fusion genes are detected is not a prostate cancer sample.

In certain embodiments, the sample is not a lung adenocarcinoma, glioblastoma multiforme or hepatocellular carcinoma sample.

In certain embodiments, the fusion gene in a sample is detected by genome sequencing. In certain embodiments, the fusion gene in a sample is detected by RNA sequencing. For example, and not by way of limitation, RNA sequencing can be performed using the primers disclosed in Table 5. In certain embodiments, the fusion gene in a sample is detected by FISH.

In certain embodiments, the methods of treating a subject, e.g., a subject that has a cancer that carries a fusion gene disclosed herein, can further comprise administering a therapeutically effective amount of an anti-cancer agent or agent that results in an anti-neoplastic effect. A “therapeutically effective amount” refers to an amount that is able to achieve one or more of the following: an anti-cancer effect, an anti-neoplastic effect, a prolongation of survival and/or prolongation of period until relapse. An anti-cancer agent can be any molecule, compound, chemical or composition that has an anti-cancer effect. Anti-cancer agents include, but are not limited to, chemotherapeutic agents, radiotherapeutic agents, cytokines, anti-angiogenic agents, apoptosis-inducing agents or anti-cancer immunotoxins. In certain non-limiting embodiments, a genome-editing technique, disclosed herein, can be used in combination with one or more anti-cancer agents. “In combination with,” as used herein, means that the genome-editing technique and the one or more anti-cancer agents (or agents that result in an anti-neoplastic effect) are part of a treatment regimen or plan for a subject.

5.5 GENOME TARGETING/EDITING TECHNIQUES

Genome editing is a technique in which endogenous chromosomal sequences present in one or more cells within a subject can be edited, e.g., modified, using targeted endonucleases and single-stranded nucleic acids. The genome editing method can result in the insertion of a nucleic acid sequence at a specific region within the genome, the excision of a specific sequence from the genome and/or the replacement of a specific genomic sequence with a new nucleic acid sequence. In certain embodiments, the genome editing technique can result in the repression of the expression of a gene, e.g., fusion gene. For example, and not by way of limitation, a nucleic acid sequence can be inserted at a chromosomal breakpoint of a fusion gene. A non-limiting example of a genome editing technique for use in the disclosed methods is the CRISPR system, e.g., CRISPR/Cas9 system. Non-limiting examples of such genome editing techniques are disclosed in PCT Application Nos. WO 2014/093701 and WO 2014/165825, the contents of which are hereby incorporated by reference in their entireties.

In certain embodiments, the genome editing technique can include the use of one or more guide RNAs (gRNAs), complementary to a specific sequence within a genome, e.g., a chromosomal breakpoint associated with a fusion gene, including protospacer adjacent motifs (PAMs), to guide a nuclease, e.g., an endonuclease, to the specific genomic sequence. In certain embodiments, the genome editing technique can include the use of one or more guide RNAs (gRNAs), complementary to the sequences that are adjacent to and/or overlap the chromosomal breakpoint (see, e.g., Figures 13, 15 and 23), to guide one or more nucleases.
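As an illustration of how a gRNA target that overlaps a chromosomal breakpoint might be located, the following minimal sketch scans a fusion junction sequence for protospacers followed by an NGG PAM (the canonical PAM of Streptococcus pyogenes Cas9). The function name, the 20-nucleotide guide length, the forward-strand-only search and the toy sequences are assumptions made for illustration; none of them is taken from the disclosure or its figures.

```python
import re


def find_junction_protospacers(fusion_seq, breakpoint, guide_len=20):
    """List forward-strand protospacers that span the breakpoint.

    `breakpoint` is the 0-based index of the first base contributed by the
    tail gene (an assumed convention). Each hit is (start, protospacer, pam).
    """
    seq = fusion_seq.upper()
    hits = []
    # A zero-width lookahead lets overlapping NGG motifs be found as well.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - guide_len
        if start < 0:
            continue  # not enough sequence upstream of the PAM
        # Keep guides with at least one base on each side of the junction.
        if start < breakpoint < pam_start:
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


# Toy sequences (hypothetical, not real gene fragments).
head = "A" * 15                    # head-gene side of the junction
tail = "T" * 10 + "AGG" + "TTTT"   # tail-gene side, containing one PAM
hits = find_junction_protospacers(head + tail, breakpoint=len(head))
```

A fuller search would also scan the reverse complement of the junction sequence, since a PAM on either strand can be used.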

In certain embodiments, the one or more gRNAs can include a targeting sequence that is complementary to a sequence present within the fusion gene, e.g., complementary to the sequences that are adjacent to and/or overlap the chromosomal breakpoint. In certain embodiments, the one or more gRNAs used for targeting the fusion gene can comprise a sequence that is at least partially complementary to the breakpoint sequence of the fusion gene and at least partially complementary to a sequence of one of the genes that comprises the fusion gene. In certain embodiments, the targeting sequences are about 10 to about 50 nucleotides in length, e.g., from about 10 to about 45 nucleotides, from about 10 to about 40 nucleotides, from about 10 to about 35 nucleotides, from about 10 to about 30 nucleotides, from about 10 to about 25 nucleotides, from about 10 to about 20 nucleotides, from about 10 to about 15 nucleotides, from about 15 to about 50 nucleotides, from about 20 to about 50 nucleotides, from about 25 to about 50 nucleotides, from about 30 to about 50 nucleotides, from about 35 to about 50 nucleotides, from about 40 to about 50 nucleotides or from about 45 to about 50 nucleotides in length. In certain embodiments, the targeting sequence is greater than about 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more nucleotides in length.

In certain embodiments, the one or more gRNAs comprise a pair of offset gRNAs complementary to opposite strands of the target site. In certain embodiments, the one or more gRNAs comprise a pair of offset gRNAs complementary to opposite strands of the target site to generate offset nicks by an endonuclease. In certain embodiments, the offset nicks are induced using a pair of offset gRNAs with a nickase, e.g., a Cas9 nickase such as Cas9 D10A. In certain embodiments, the pair of offset gRNAs are offset by at least about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or at least 100 nucleotides. In certain embodiments, the pair of offset sgRNAs are offset by about 5 to about 100 nucleotides, about 10 to about 50 nucleotides, about 10 to about 40 nucleotides, about 10 to about 30 nucleotides, about 10 to about 20 nucleotides or about 15 to about 30 nucleotides.
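The selection of an offset gRNA pair can be sketched as a simple filter over candidate target sites on the two strands. In this sketch the offset is taken, purely as an assumption for illustration, to be the number of bases between the end of one protospacer and the start of the other in forward-strand coordinates; the disclosure does not fix a particular definition, and the function name and coordinates are hypothetical.

```python
from itertools import product


def offset_pairs(top_guides, bottom_guides, min_offset=5, max_offset=100):
    """Pair guides on opposite strands whose gap lies within the offset range.

    Each guide is a half-open (start, end) interval in forward-strand
    coordinates. Returns a list of (top, bottom, gap) tuples.
    """
    pairs = []
    for top, bottom in product(top_guides, bottom_guides):
        first, second = sorted([top, bottom])  # order by start coordinate
        gap = second[0] - first[1]             # negative if the sites overlap
        if min_offset <= gap <= max_offset:
            pairs.append((top, bottom, gap))
    return pairs


# Hypothetical coordinates: one top-strand site, three bottom-strand sites.
top_strand = [(0, 20)]
bottom_strand = [(30, 50), (22, 42), (200, 220)]
pairs = offset_pairs(top_strand, bottom_strand)
```

With the default range of 5 to 100 nucleotides, only the pair separated by 10 bases is retained; the pairs separated by 2 and by 180 bases fall outside the range.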

In certain non-limiting embodiments, a PAM can be recognized by a CRISPR endonuclease such as a Cas protein. Non-limiting examples of Cas proteins include, but are not limited to, Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 or Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, Cpf1, c2c1, c2c3, Cas9HiFi, homologues thereof or modified versions thereof.

In certain embodiments, the endonuclease can be the clustered, regularly interspaced short palindromic repeat (CRISPR) associated protein 9 (Cas9) endonuclease. In certain embodiments, the Cas9 endonuclease is obtained from Streptococcus pyogenes. In certain embodiments, the Cas9 endonuclease is obtained from Staphylococcus aureus. In certain embodiments, the endonuclease can result in the cleavage of the targeted genome sequence and allow modification of the genome at the cleavage site through nonhomologous end joining (NHEJ) or homologous recombination. In certain embodiments, the Cas9 endonuclease can be a mutated form of Cas9, e.g., that generates a single-strand break or “nick.” For example, and not by way of limitation, the Cas9 protein can include the D10A mutation, i.e., Cas9 D10A (see Cong et al. Science 339:819-823 (2013); Gasiunas et al. PNAS 109:E2579–2586 (2012); and Jinek et al. Science 337:816-821 (2012), the contents of which are incorporated by reference herein).

In certain embodiments, the genome editing method and/or technique can be used to target one or more sequences of a fusion gene present in a cell, e.g., in a cancer cell, to promote homologous recombination to insert a nucleic acid into the genome of the cell. For example, and not by way of limitation, the genome editing technique can be used to target the region where the two genes of the fusion gene are joined together (i.e., the junction and/or chromosomal breakpoint).

In certain embodiments, the genome editing method and/or technique can be used to knock out the fusion gene, e.g., by excising at least a portion of the fusion gene, to disrupt the fusion gene sequence. For example, and not by way of limitation, an endonuclease, e.g., a wild type Cas9 endonuclease, can be used to specifically cleave the double-stranded DNA sequence of a fusion gene, and in the absence of a homologous repair template non-homologous end joining can result in indels to disrupt the fusion gene sequence.

In certain embodiments, the genome editing method and/or technique can be used to repress the expression of the fusion gene, e.g., by using a nuclease-deficient Cas9. For example, and not by way of limitation, mutations in the catalytic domains of Cas9, e.g., H840A in the HNH domain and D10A in the RuvC domain, inactivate the cleavage activity of Cas9 but do not prevent DNA binding. In certain embodiments, Cas9 D10A H840A (referred to herein as dCas9) can be used to target the region where the two genes of the fusion gene are joined together without cleavage, and by fusing with various effector domains, dCas9 can be used to silence the fusion gene.

As normal, non-cancerous cells do not contain the fusion gene, and therefore do not contain the chromosomal breakpoint associated with the fusion gene, cells can be specifically targeted using this genome editing technique. In certain embodiments, the genome editing technique can be used to target the junction (i.e., breakpoint) of a fusion gene including, but not limited to, TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24‐ZMYM4, CLTC‐ETV1, ACPP‐SEC13, DOCK7‐OLR1 and PCMTD1‐SNTG1.
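The reasoning that a breakpoint-spanning target exists only in cells carrying the fusion gene can be illustrated with a simple exact-match check: a candidate target sequence should occur in the fusion junction but in neither unrearranged parent gene. This is a minimal sketch under stated assumptions; the function names and toy sequences are hypothetical, and a practical specificity analysis would also consider mismatched near-matches elsewhere in the genome.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_junction_specific(guide, fusion_seq, head_gene_seq, tail_gene_seq):
    """True if the guide target occurs in the fusion but in neither parent gene.

    Both strands are checked; only exact matches are considered.
    """
    guide = guide.upper()

    def occurs_in(seq):
        seq = seq.upper()
        return guide in seq or reverse_complement(guide) in seq

    return (
        occurs_in(fusion_seq)
        and not occurs_in(head_gene_seq)
        and not occurs_in(tail_gene_seq)
    )


# Toy sequences (hypothetical): the fusion joins the first half of the head
# gene to the second half of the tail gene.
head_gene = "A" * 10 + "C" * 10
tail_gene = "G" * 10 + "T" * 10
fusion = head_gene[:10] + tail_gene[10:]

spanning = is_junction_specific("AAAAATTTTT", fusion, head_gene, tail_gene)
one_sided = is_junction_specific("AAAAAAAA", fusion, head_gene, tail_gene)
```

Here the target that spans the junction is specific to the fusion, whereas the target lying entirely within the head-gene portion is also present in the normal head gene and is rejected.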

In certain embodiments, the one or more gRNAs that can be used in the disclosed methods can target the breakpoints that comprise the nucleotide sequences set forth in SEQ ID NOs: 40-56, 106 and 113 and/or the breakpoints that comprise the nucleotide sequences disclosed in Figures 4, 15, 17 and 18, e.g., SEQ ID NOs: 143-145 and 148-154. In certain embodiments, the one or more gRNAs used in the disclosed methods can be about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% homologous and/or complementary to the chromosomal breakpoints disclosed herein.

In certain embodiments, the gRNAs can be designed to target (e.g., be complementary to) the sequences flanking the chromosomal breakpoint region (see, for example, Figures 13, 15, 23 and 31) to guide an endonuclease, e.g., Cas9 D10A, to the chromosomal breakpoint region or a region surrounding the breakpoint. Non-limiting examples of the sequences of the gRNAs that can be used in the disclosed methods are detailed in Figure 13B, Figure 15A and Figure 23A (e.g., SEQ ID NOs: 107, 112, 146, 147, 272 and 274). In certain embodiments, the one or more gRNAs used in the disclosed methods can be about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% homologous to the gRNAs disclosed herein. In certain embodiments, the disclosed gRNAs can include about 1, about 2, about 3, about 4 or about 5 nucleotide substitutions and/or mutations.
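The relationship between a percentage of homology and a number of nucleotide substitutions can be made concrete with a position-by-position identity calculation. The sketch below assumes two sequences of equal length compared without gaps, which is a simplification chosen for illustration; the function name and the example sequence are hypothetical and are not any of the gRNA sequences of the figures.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-nucleotide targeting sequence and a variant carrying a
# single substitution at the last position.
guide = "ACGTACGTACGTACGTACGT"
variant = guide[:-1] + "A"
identity = percent_identity(guide, variant)
```

Under this definition, a 20-nucleotide targeting sequence with one, two, three or four substitutions is 95%, 90%, 85% or 80% identical to the original, respectively.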

In certain embodiments, the one or more gRNAs can target intron 13 of TMEM135 and intron 9 of CCDC67, e.g., one gRNA can target intron 13 of TMEM135 and the second gRNA can target intron 9 of CCDC67, which flank the breakpoint of the TMEM135-CCDC67 fusion gene. In certain embodiments, one or more gRNAs used to target TMEM135-CCDC67 fusion gene can have a nucleotide sequence that comprises one or more of the nucleotide sequences set forth in Figure 13B or 13C. In certain embodiments, the one or more gRNAs for targeting TMEM135-CCDC67 can be about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% homologous to the gRNAs disclosed in Figure 13.

In certain embodiments, the one or more gRNAs can target intron 13 and/or exon 14 (or the splicing acceptor site of intron 13 and exon 14) of MAN2A1 and intron 14 of FER, which flank the breakpoint of the MAN2A1-FER fusion gene. In certain embodiments, one or more gRNAs used to target MAN2A1-FER fusion gene can have a nucleotide sequence that comprises one or more of the nucleotide sequences set forth in Figure 15A. In certain embodiments, the one or more gRNAs for targeting MAN2A1-FER can be about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% homologous to the gRNAs disclosed in Figure 15.

In certain embodiments, the one or more gRNAs can target intron 11 of PTEN and intron 1 of NOLC1, which flank the breakpoint of the PTEN-NOLC1 fusion gene. In certain embodiments, one or more gRNAs used to target PTEN-NOLC1 fusion gene can have a nucleotide sequence that comprises one or more of the nucleotide sequences set forth in Figure 17A or Figure 23A. In certain embodiments, the one or more gRNAs for targeting PTEN-NOLC1 can be about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% homologous to the gRNAs disclosed in Figures 17 and 23.

In certain embodiments, the fusion gene is CCNH-C5orf30.

In certain embodiments, the fusion gene is TMEM135-CCDC67.

In certain embodiments, the fusion gene is MAN2A1-FER.

In certain embodiments, the fusion gene can be PTEN-NOLC1.

In certain embodiments, the fusion gene is not CCNH-C5orf30 or TMEM135-CCDC67.
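The percent-homology thresholds recited above for gRNAs (about 80% to about 99%) can be illustrated with a short calculation. The following is a minimal sketch, assuming that "homologous" is read as position-by-position sequence identity between two gRNA sequences of equal length; the function name and the example sequences are hypothetical and are not taken from the Figures.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two equal-length sequences match.

    Assumption: homology is measured as simple ungapped identity.
    """
    a, b = seq_a.upper(), seq_b.upper()
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 20-nucleotide sequences that differ at the last position only.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACGTACGTACGA"
print(percent_identity(reference, variant))
```

Under this reading, a 20-nucleotide gRNA with a single mismatch would fall at the 95% level of the recited range.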

In certain embodiments, the disclosed genome editing technique can be used to promote homologous recombination with a sequence of a fusion gene, e.g., at a chromosomal breakpoint (junction) of a fusion gene, in one or more cells of a subject to allow the insertion of a nucleic acid sequence that when expressed results in the death, e.g., apoptosis, of the one or more cells. For example, and not by way of limitation, the nucleic acid sequence (also referred to herein as a donor nucleic acid) can encode the Herpes Simplex Virus 1 (HSV-1) thymidine kinase, Exotoxin A from Pseudomonas aeruginosa, Diphtheria toxin from Corynebacterium diphtheriae, Ricin or abrin from Ricinus communis (castor oil plant), Cytosine deaminase from bacteria or yeast, Carboxyl esterase or Varicella Zoster virus (VZV) thymidine kinase. Additional non-limiting examples of nucleic acids and/or genes that can be inserted into the genome of a cell carrying a fusion gene to induce cell death are disclosed in Rajab et al. (2013) (J. of Genetics Syndromes and Gene Therapy, 4(9):187) and Zarogoulidis et al. (2013) (J. of Genetics Syndromes and Gene Therapy, 4(9):pii: 16849). In certain non-limiting embodiments, the nucleic acid sequence, e.g., the HSV-1 thymidine kinase nucleic acid sequence, is not operably linked to a regulatory sequence (e.g., a promoter) and requires integration into the genome for expression. For example, and not by way of limitation, the promoter of the head gene of the fusion gene can promote the expression of the donor nucleic acid sequence.

In certain embodiments where a nucleic acid encoding HSV-1 thymidine kinase is inserted in the genome of one or more cells of a subject, a therapeutically effective amount of the guanine derivative, ganciclovir, or its oral homolog, valganciclovir, can be administered to the subject. HSV-1 thymidine kinase can phosphorylate and convert ganciclovir and/or valganciclovir into the triphosphate forms of ganciclovir and/or valganciclovir in the one or more cells of the subject. The triphosphate form of ganciclovir and/or valganciclovir acts as a competitive inhibitor of deoxyguanosine triphosphate (dGTP) and is a poor substrate for DNA elongation, and can result in the inhibition of DNA synthesis. The inhibition of DNA synthesis, in turn, can result in the reduction and/or inhibition of growth and/or survival, and/or in the death, of cancer cells that contain the targeted chromosomal breakpoint and the integrated HSV-1 thymidine kinase nucleic acid sequence. This genome editing method can be used to produce an anti-cancer effect in a subject that has been determined to have a fusion gene.

In certain embodiments, a genome editing technique of the present disclosure can include the introduction of an expression vector comprising a nucleic acid sequence that encodes a Cas protein or a mutant thereof, e.g., Cas9D10A, into one or more cells of the subject, e.g., cancer cells, carrying a fusion gene. In certain embodiments, the cells are not prostate cancer cells. In certain embodiments, the vector can further comprise one or more gRNAs for targeting the Cas9 protein to a specific nucleic acid sequence within the genome. In certain embodiments, the expression vector can be a viral vector.

In certain embodiments, the one or more gRNAs can hybridize to a target sequence within a fusion gene. For example, and not by way of limitation, the one or more gRNAs can target the chromosomal breakpoint of a fusion gene and/or target the one or more sequences that flank the chromosomal breakpoint region. Non-limiting examples of sequences of fusion gene chromosomal breakpoints are disclosed herein and within the Figures (see, for example, Table 1). In certain embodiments, one gRNA can be complementary to a region within one of the genes of the fusion gene and another gRNA can be complementary to a region within the other gene of the fusion gene. For example, and not by way of limitation, one gRNA can be complementary to a region within the TMEM135 gene of the TMEM135-CCDC67 fusion gene and another gRNA can be complementary to a region within the CCDC67 gene. In certain embodiments, one gRNA can be complementary to a region within the MAN2A1 gene of the MAN2A1-FER fusion gene and another gRNA can be complementary to a region within the FER gene. In certain embodiments, one gRNA can be complementary to a region within the PTEN gene of the PTEN-NOLC1 fusion gene and another gRNA can be complementary to a region within the NOLC1 gene. In certain embodiments, one gRNA can be complementary to a region upstream of the chromosomal breakpoint of a fusion gene and another gRNA can be complementary to a region downstream of the chromosomal breakpoint. In certain embodiments, genome sequencing can be performed to determine the regions of the fusion gene that can be targeted by the gRNAs. In certain embodiments, the regions of the genes that are targeted by the gRNAs can be introns and/or exons.

In certain embodiments, the nucleic acid sequence encoding the Cas protein, e.g., Cas9, can be operably linked to a regulatory element, and when transcribed, the one or more gRNAs can direct the Cas protein to the target sequence in the genome and induce cleavage of the genomic loci by the Cas protein. In certain embodiments, the Cas9 protein cuts about 3-4 nucleotides upstream of the PAM sequence present adjacent to the target sequence. In certain embodiments, the regulatory element operably linked to the nucleic acid sequence encoding the Cas protein can be a promoter, e.g., an inducible promoter such as a doxycycline inducible promoter. The term "operably linked," when applied to DNA sequences, for example in an expression vector, indicates that the sequences are arranged so that they function cooperatively in order to achieve their intended purposes, i.e., a promoter sequence allows for initiation of transcription that proceeds through a linked coding sequence as far as the termination signal.
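The relationship between the target sequence, the PAM and the cleavage position described above can be sketched as follows. This is a minimal illustration, assuming an S. pyogenes Cas9 with an NGG PAM located immediately 3' of a 20-nucleotide target sequence on the strand being scanned, and a cut placed 3 nucleotides upstream of the PAM; the function name, parameters and example sequence are hypothetical.

```python
import re


def cas9_cut_sites(seq: str, guide_len: int = 20, cut_offset: int = 3):
    """List candidate target sites on the given strand.

    Returns tuples of (target_start, pam_start, cut_index), using 0-based
    indices. The cut falls between cut_index - 1 and cut_index, that is,
    cut_offset nucleotides upstream of the PAM.
    Assumptions: NGG PAM, forward strand only, fixed guide length.
    """
    seq = seq.upper()
    sites = []
    # The lookahead finds overlapping NGG motifs without consuming characters.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        if pam_start >= guide_len:  # need a full-length target 5' of the PAM
            sites.append((pam_start - guide_len, pam_start, pam_start - cut_offset))
    return sites


# Hypothetical sequence: a 20-nucleotide target followed by a TGG PAM.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "ACT"
print(cas9_cut_sites(example))
```

Setting `cut_offset=4` covers the other end of the "about 3-4 nucleotides" range stated in the text. A complete search would also scan the reverse complement, which this sketch leaves out for brevity.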

In certain embodiments, the Cas9 enzyme encoded by a vector of the present invention can comprise one or more mutations. The mutations may be artificially introduced mutations or gain- or loss-of-function mutations. Non-limiting examples of such mutations include mutations in a catalytic domain of the Cas9 protein, e.g., the RuvC and HNH catalytic domains, such as the D10 mutation within the RuvC catalytic domain and the H840 mutation in the HNH catalytic domain. In certain embodiments, a mutation in one of the catalytic domains of the Cas9 protein results in the Cas9 protein functioning as a "nickase," where the mutated Cas9 protein cuts only one strand of the target DNA, creating a single-strand break or "nick." In certain embodiments, the use of a mutated Cas9 protein, e.g., Cas9D10A, allows the use of two gRNAs to promote cleavage of both strands of the target DNA. Additional non-limiting examples of Cas9 mutations include VP64, KRAB and SID4X, FLAG, EGFP and RFP.

In certain embodiments, the genome editing technique of the present disclosure can further include introducing into the one or more cells an additional vector comprising a nucleic acid that, when expressed, results in the death, e.g., apoptosis, of the one or more cells. In certain embodiments, this vector can further comprise one or more targeting sequences that are complementary to (e.g., can hybridize to) the genomic sequences targeted by the gRNAs and/or to adjacent sequences, to allow homologous recombination to occur and insertion of the nucleic acid sequence (i.e., donor nucleic acid sequence) into the genome. In certain embodiments, the additional vector can further comprise one or more splice tag sequences of an exon/intron junction of a gene that makes up the fusion gene. In certain embodiments, the targeting sequences can be complementary to an intron, exon sequence and/or intron/exon splicing sequence within a gene of the fusion gene. In certain embodiments, one targeting sequence can be complementary to a region within one of the genes of the fusion gene targeted by the gRNAs and a second targeting sequence can be complementary to a region within the other gene of the fusion gene, to allow homologous recombination between the vector comprising the donor nucleic acid and the genome sequence cleaved by the Cas9 protein. For example, and not by way of limitation, one targeting sequence can be complementary to a region within the TMEM135 gene of the TMEM135-CCDC67 fusion gene and another targeting sequence can be complementary to a region within the CCDC67 gene. In certain embodiments, one targeting sequence can be complementary to a region within the MAN2A1 gene of the MAN2A1-FER fusion gene and another targeting sequence can be complementary to a region within the FER gene. In certain embodiments, one targeting sequence can be complementary to a region within the PTEN gene of the PTEN-NOLC1 fusion gene and another targeting sequence can be complementary to a region within the NOLC1 gene. In certain embodiments, one targeting sequence can be complementary to a region upstream of the cleavage site generated by the Cas9 protein and another targeting sequence can be complementary to a region downstream of the chromosomal breakpoint. Non-limiting examples of the types of nucleic acid sequences that can be inserted into the genome are disclosed above. In certain embodiments, the nucleic acid that is to be inserted into the genome encodes HSV-1 thymidine kinase. Additional non-limiting examples of nucleic acids and/or genes that can be inserted into the genome of a cell carrying a fusion gene to induce cell death are set forth above.

The vectors for use in the present disclosure can be any vector known in the art. For example, and not by way of limitation, the vector can be derived from plasmids, cosmids, viral vectors and yeast artificial chromosomes. In certain embodiments, the vector can be a recombinant molecule that contains DNA sequences from several sources. In certain embodiments, the vector can include additional segments such as, but not limited to, promoters, transcription terminators, enhancers, internal ribosome entry sites, untranslated regions, polyadenylation signals, selectable markers, origins of replication and the like. In certain embodiments, the vectors can be introduced into the one or more cells by any technique known in the art such as by electroporation, transfection and transduction. In certain embodiments, the vectors can be introduced by adenovirus transduction.

Targeted sequences are underlined and bolded.

5.5.1 PARTICULAR NON-LIMITING EXAMPLES

In certain embodiments, a genome editing technique of the present invention comprises introducing into one or more cells, e.g., cancer cells, of a subject: (i) a vector comprising a nucleic acid sequence that encodes a Cas9 protein, or mutant thereof; (ii) a vector comprising one or more gRNAs that are complementary to one or more target sequences of a fusion gene, that when expressed induce Cas9-mediated DNA cleavage within the fusion gene; and (iii) a vector comprising a donor nucleic acid sequence, that when expressed results in cell death, and one or more targeting sequences that are complementary to one or more sequences of the fusion gene to promote homologous recombination and the insertion of the donor nucleic acid sequence into the fusion gene. In certain embodiments, the cancer cell is not a prostate cancer cell.

In certain embodiments, a genome editing technique of the present invention comprises introducing into one or more cells of a subject: (i) a vector comprising a nucleic acid sequence that encodes a Cas9 protein, or mutant thereof (e.g., Cas9D10A), and one or more gRNAs that are complementary to one or more target sequences of a fusion gene, wherein when transcribed, the one or more gRNAs direct sequence-specific binding of one or more Cas9 proteins to the one or more target sequences of the fusion gene to promote cleavage of the fusion gene; and (ii) a vector comprising a donor nucleic acid sequence, that when expressed results in cell death, and one or more targeting sequences that are complementary to one or more sequences of the fusion gene to promote homologous recombination and the insertion of the donor nucleic acid sequence into the fusion gene. In certain embodiments, the one or more targeting sequences can include the chromosomal breakpoint of a fusion gene and/or the one or more sequences that flank the chromosomal breakpoint region or a combination thereof. For example, and not by way of limitation, the target sequence can comprise at least a part of the breakpoint sequence of the fusion gene and at least a part of a sequence of one of the genes that comprises the fusion gene.

In certain embodiments, a genome editing technique of the present invention comprises introducing into one or more cells of a subject: (i) a vector comprising a nucleic acid sequence that encodes Cas9 protein, or mutant thereof, and one or more gRNAs that are complementary to one or more target sequences of a fusion gene, wherein when transcribed, the one or more gRNAs direct sequence-specific binding of a Cas9 protein to the one or more target sequences of the fusion gene to promote cleavage of the fusion gene; and (ii) a vector comprising a donor nucleic acid sequence encoding HSV-1 thymidine kinase and one or more targeting sequences that are complementary to one or more sequences of the fusion gene to promote homologous recombination and the insertion of the donor nucleic acid sequence encoding HSV-1 thymidine kinase into the fusion gene. In certain embodiments, the genome editing technique further comprises the administration of a therapeutically effective amount of ganciclovir and/or valganciclovir.

5.6 KITS

The present invention further provides kits for treating a subject that carries one or more of the fusion genes disclosed herein and/or for carrying out any one of the above-listed detection and therapeutic methods. In certain embodiments, the present disclosure provides kits for performing a targeted genome editing technique on one or more cancer cells within the subject that carries one or more of the fusion genes disclosed herein. In certain embodiments, the one or more cancer cells are not prostate cancer cells.

Types of kits include, but are not limited to, packaged fusion gene-specific probe and primer sets (e.g., TaqMan probe/primer sets), arrays/microarrays, antibodies, which further contain one or more probes, primers, or other reagents for detecting one or more fusion genes and/or can comprise means for performing a genome editing technique.

In certain embodiments, the kit can include means for performing the genome editing techniques disclosed herein. For example, and not by way of limitation, a kit of the present disclosure can include a container comprising one or more vectors or plasmids comprising a nucleic acid encoding a Cas protein or a mutant thereof, e.g., Cas9D10A. In certain embodiments, the nucleic acid encoding the Cas protein can be operably linked to a regulatory element such as a promoter. In certain embodiments, the one or more vectors can further comprise one or more gRNAs specific to a fusion gene, e.g., specific to a breakpoint of a fusion gene and/or sequences flanking the breakpoint of a fusion gene.

In certain embodiments, a kit of the present invention can include, optionally in the same container as the vector comprising the nucleic acid encoding a Cas protein or in another container, one or more vectors or plasmids comprising a nucleic acid that, when expressed (in the presence or absence of a compound), results in cell death. For example, and not by way of limitation, the nucleic acid sequence can encode the Herpes Simplex Virus 1 (HSV-1) thymidine kinase, Exotoxin A from Pseudomonas aeruginosa, Diphtheria toxin from Corynebacterium diphtheriae, Ricin or abrin from Ricinus communis (castor oil plant), Cytosine deaminase from bacteria or yeast, Carboxyl esterase or Varicella Zoster virus (VZV) thymidine kinase. In certain embodiments, this vector can further comprise one or more targeting sequences that are complementary to sequences within the fusion gene to promote homologous recombination and insertion of the donor nucleic acid.

In certain embodiments, where the donor nucleic acid encodes HSV-1 thymidine kinase, the kit can further comprise ganciclovir and/or valganciclovir.

In certain non-limiting embodiments, a kit of the present disclosure can further comprise one or more nucleic acid primers or probes and/or antibody probes for use in carrying out any of the above-listed methods. Said probes may be detectably labeled, for example with a biotin, colorimetric, fluorescent or radioactive marker. A nucleic acid primer may be provided as part of a pair, for example for use in polymerase chain reaction. In certain non-limiting embodiments, a nucleic acid primer may be at least about 10 nucleotides or at least about 15 nucleotides or at least about 20 nucleotides in length and/or up to about 200 nucleotides or up to about 150 nucleotides or up to about 100 nucleotides or up to about 75 nucleotides or up to about 50 nucleotides in length. A nucleic acid probe may be an oligonucleotide probe and/or a probe suitable for FISH analysis. In specific non-limiting embodiments, the kit comprises primers and/or probes for analysis of at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve or thirteen, or all fourteen, of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1 and PCMTD1-SNTG1. In certain embodiments, the kit comprises primers for analysis of TMEM135-CCDC67, MAN2A1-FER, PTEN-NOLC1 and CCNH-C5orf30.
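The primer length ranges recited above lend themselves to a simple programmatic check. The sketch below assumes the broadest recited bounds (about 10 to about 200 nucleotides) as defaults; the function name and parameters are hypothetical, and the example primer is the D13S317 forward primer listed in the Materials and Methods of Example 1.

```python
def primer_length_in_range(primer: str, min_len: int = 10, max_len: int = 200) -> bool:
    """Return True if the primer length lies within [min_len, max_len].

    Assumption: whitespace inside the sequence is formatting only and is ignored.
    """
    length = len("".join(primer.split()))
    return min_len <= length <= max_len


d13s317_forward = "ACAGAAGTCTGGGATGTGGA"  # SEQ ID NO:80, 20 nucleotides
print(primer_length_in_range(d13s317_forward))                          # broadest bounds
print(primer_length_in_range(d13s317_forward, min_len=20, max_len=50))  # narrowest bounds
```

Passing the narrower recited bounds (for example, 20 to 50 nucleotides) restricts the check to the tightest range named in the paragraph.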

In certain non-limiting embodiments, the nucleic acid primers and/or probes may be immobilized on a solid surface, substrate or support, for example, on a nucleic acid microarray, wherein the position of each primer and/or probe bound to the solid surface or support is known and identifiable. The nucleic acid primers and/or probes can be affixed to a substrate, such as glass, plastic, paper, nylon or other type of membrane, filter, chip, bead, or any other suitable solid support. The nucleic acid primers and/or probes can be synthesized directly on the substrate, or synthesized separate from the substrate and then affixed to the substrate. The arrays can be prepared using known methods.

In non-limiting embodiments, a kit provides nucleic acid probes for FISH analysis to determine the presence of one or more fusion genes in a sample obtained from a subject. In certain embodiments, the one or more fusion genes are selected from the group consisting of: TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, CCNH-C5orf30, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1, PCMTD1-SNTG1 and a combination thereof. In certain embodiments, the one or more fusion genes can include TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, PTEN-NOLC1, CCNH-C5orf30, KDM4B-AC011523.2, MAN2A1-FER or combinations thereof. In specific non-limiting embodiments, probes to detect a fusion gene may be provided such that separate probes each bind to the two components of the fusion gene or a probe may bind to a "junction" that encompasses the boundary between the spliced genes. For example, and not by way of limitation, the junction is the region where the two genes are joined together. In specific non-limiting embodiments, the kit comprises said probes for analysis of at least two, at least three, at least four or all five of ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1 or PCMTD1-SNTG1.

In non-limiting embodiments, a kit provides nucleic acid primers for PCR analysis to determine the presence of one or more fusion genes in a sample obtained from a subject. In certain embodiments, the one or more fusion genes are selected from the group consisting of: TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, PTEN-NOLC1, CCNH-C5orf30, KDM4B-AC011523.2, MAN2A1-FER and a combination thereof. In non-limiting embodiments, a kit provides nucleic acid primers for PCR analysis of one or more fusion genes selected from the group consisting of: ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1 and PCMTD1-SNTG1. In specific non-limiting embodiments, the kit comprises said primers for analysis of at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve or thirteen, or all fourteen, of TRMT11-GRIK2, SLC45A2-AMACR, MTOR-TP53BP1, LRRC59-FLJ60017, TMEM135-CCDC67, KDM4B-AC011523.2, MAN2A1-FER, PTEN-NOLC1, CCNH-C5orf30, ZMPSTE24-ZMYM4, CLTC-ETV1, ACPP-SEC13, DOCK7-OLR1 and PCMTD1-SNTG1.

The following Examples are offered to more fully illustrate the disclosure, but are not to be construed as limiting the scope thereof.

6. EXAMPLE 1: GENOME THERAPY TARGETING AT THE CHROMOSOME BREAKPOINTS OF FUSION GENES RESULTED IN REMISSION OF XENOGRAFTED HUMAN CANCERS

6.1 INTRODUCTION

In this Example, a genome intervention approach was developed to kill cancer cells based on unique sequences resulting from genome rearrangement. The chromosome breakpoints from the MAN2A1-FER and TMEM135-CCDC67 fusion genes were exploited as therapeutic targets. The MAN2A1-FER and TMEM135-CCDC67 fusion genes have been previously determined to be present in prostate cancer (see WO 2016/011428, the contents of which are hereby incorporated by reference in their entirety). Additionally, the MAN2A1-FER fusion gene has been shown to be present in 16.8% of non-small cell lung cancer, 15.7% of liver cancer, 7.1% of glioblastoma multiforme (GBM), 25.9% of esophagus adenocarcinoma, 5.2% of prostate cancer and 1.7% of ovarian cancer.

Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) was originally discovered as one of the immune defense mechanisms against foreign pathogens in prokaryotic cells (2). Cas9, a critical protein for the type II CRISPR/Cas system, was found to contain DNA cleavage activity. The nuclease activity of Cas9 is guided to the targeted DNA by a 20-base complementary sequence from CRISPR RNA together with trans-activating CRISPR RNA (3). Since trans-activating CRISPR RNA and CRISPR RNA can be made into a chimeric RNA containing the full function of both RNA species, the term guide RNA (gRNA) was coined for the artificial fusion RNA (4). The D10A mutation in the catalytic domain of Cas9 converts it to a nickase that produces a single-strand break at the target DNA (4). Nicking genomic DNA can be used to precisely introduce sequences into a specific genomic locus by using the cellular homology-directed repair (HDR) pathway. Introducing two nicks in proximity (double nicking) in target DNA increases the efficiency of introducing sequences 50- to 1,500-fold (5) over natural homologous-recombination rates, with an off-target rate as low as 1/10,000. Such specificity makes somatic genomic targeting a viable approach in treating human diseases, especially neoplasms carrying fusion genes that do not exist in normal cells.

Herpes Simplex Virus 1 thymidine kinase (HSV1-tk) phosphorylates thymidine and forms thymidine monophosphate, a building block for DNA synthesis. However, the substrate specificity of HSV1-tk is different from that of its mammalian counterpart in that it also phosphorylates the synthetic nucleoside homolog ganciclovir (prodrug) (7), which is not recognized by mammalian thymidine kinase. This phosphorylation results in accumulation of ganciclovir monophosphate in mammalian cells that express HSV1-tk after treatment with ganciclovir. Ganciclovir monophosphate is converted to its triphosphate form by two other kinases (8). Ganciclovir triphosphate blocks DNA synthesis through elongation termination. Mammalian cells negative for HSV1-tk, in contrast, are immune from this effect, owing to their inability to phosphorylate ganciclovir.

In this Example, we show that, by using Cas9D10A-mediated genome editing, we have successfully inserted HSV1-tk into the chromosomal breakpoint of the MAN2A1-FER fusion gene. Treatment of tumors harboring this chromosome breakpoint with ganciclovir led to cell death in cell culture and remission of xenografted prostate and liver cancers in Severe Combined Immunodeficiency (SCID) mice.

6.2 MATERIALS AND METHODS

Materials and vector construction. PC3 (prostate cancer), DU145 (prostate cancer) and the hepatocellular carcinoma cell lines HUH7 and HEP3B were purchased from the American Type Culture Collection (ATCC; Manassas, VA). PC3 cells were cultured with F12K medium supplemented with 10% fetal bovine serum (Invitrogen, Carlsbad, CA). DU145 cells were cultured with modified Eagle medium supplemented with 10% fetal bovine serum (Invitrogen). HEP3B cells were cultured with modified Eagle medium supplemented with 10% fetal bovine serum (Invitrogen). HUH7 cells were cultured with Dulbecco's modified Eagle medium supplemented with 10% fetal bovine serum. The genomes of these cell lines were tested for a short tandem repeat (STR) DNA profile on eight different loci (CSF1PO, D13S317, D16S539, D5S818, D7S820, TH01, TPOX, and vWA) by PCR using the following sets of primers:

CSF1PO: AACCTGAGTCTGCCAAGGACTAGC (SEQ ID NO:78)/TTCCACACACCACTGGCCATCTTC (SEQ ID NO:79);

D13S317: ACAGAAGTCTGGGATGTGGA (SEQ ID NO:80)/GCCCAAAAAGACAGACAGAA (SEQ ID NO:81);

D16S539: GATCCCAAGCTCTTCCTCTT (SEQ ID NO:82)/ACGTTTGTGTGTGCATCTGT (SEQ ID NO:83);

D5S818: GGGTGATTTTCCTCTTTGGT (SEQ ID NO:84)/TGATTCCAATCATAGCCACA (SEQ ID NO:85);

D7S820: TGTCATAGTTTAGAACGAACTAACG (SEQ ID NO:86)/CTGAGGTATCAAAAACTCAGAGG (SEQ ID NO:87);

TH01: GTGGGCTGAAAAGCTCCCGATTAT (SEQ ID NO:88)/ATTCAAAGGGTATCTGGGCTCTGG (SEQ ID NO:89);

TPOX: ACTGGCACAGAACAGGCACTTAGG (SEQ ID NO:90)/GGAGGAACTGGGAACCACACAGGT (SEQ ID NO:91);

vWA: CCCTAGTGGATGATAAGAATAATCAGTATG (SEQ ID NO:92)/GGACAGATGATAAATACATAGGATGGATGG (SEQ ID NO:93).

These cell lines were authenticated because the STR profiles of the cell lines perfectly matched those published by ATCC. Rabbit polyclonal anti-Cas9 antibodies were purchased from Clontech Inc., CA. Rabbit anti-HSV-1 TK polyclonal antibodies were purchased from Sigma Inc., OH.

Construction of vector. To construct the gRNA expression vector, sequences flanking the breakpoint region of TMEM135-CCDC67 were analyzed and gRNAs were designed using the DNA 2.0 tool: https://www.dna20.com/eCommerce/cas9/input. Both gRNA- and gRNA+ were ligated into the All-in-One NICKASENINJA® vector, which also contains Cas9D10A. The insert was then released by restriction with XbaI, and ligated into similarly restricted VQAd5 shuttle vector to create VQAd5-Cas9D10A-gRNATMEM135int13-gRNACCDC67int9. The recombinant shuttle vector was then recombined with pAD5 virus to generate pAD5-Cas9D10A-gRNATMEM135int13-gRNACCDC67int9 using a method previously described (14).

To construct donor DNA recombinant virus, PCR was performed on pEGFP-N1 using the following primers:

GTACTCACGTAAGCTTTCGCCACCATGGTGAGCAAGG (SEQ ID NO:94); and GACTCAGATGGGCGCCCTTGTACAGCTCGTCCATGCC (SEQ ID NO:95). The PCR product was restricted with KasI and HindIII, and ligated into similarly restricted pSELECT-zeo-HSV1tk vector to create pEGFP-HSV1-tk.
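The cloning step above relies on restriction sites built into the 5' ends of the two primers. As a quick check, the following sketch locates the HindIII (AAGCTT) and KasI (GGCGCC) recognition sequences in SEQ ID NO:94 and SEQ ID NO:95. The recognition sequences are the standard ones for these enzymes; the function and variable names are hypothetical, and the sketch is not part of the described protocol.

```python
# Recognition sequences for the two enzymes named in the text.
RECOGNITION_SITES = {"HindIII": "AAGCTT", "KasI": "GGCGCC"}


def find_sites(primer: str) -> dict:
    """Map each enzyme name to the 0-based start positions of its site in the primer."""
    primer = primer.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = primer.find(site)
        while start != -1:
            positions.append(start)
            start = primer.find(site, start + 1)  # continue after the last hit
        hits[enzyme] = positions
    return hits


seq_id_94 = "GTACTCACGTAAGCTTTCGCCACCATGGTGAGCAAGG"
seq_id_95 = "GACTCAGATGGGCGCCCTTGTACAGCTCGTCCATGCC"
print(find_sites(seq_id_94))  # HindIII site after a 10-nucleotide 5' overhang
print(find_sites(seq_id_95))  # KasI site after a 10-nucleotide 5' overhang
```

Each primer carries exactly one of the two sites, preceded by ten extra 5' nucleotides, which is consistent with digesting the PCR product with KasI and HindIII before ligation.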

PCR was performed on genomic DNA from the sample in which the TMEM135-CCDC67 fusion was discovered to obtain the intron 13 sequence of TMEM135 using the following primers:

GACTCAGATGGCGGCCGCCTGTATTCTTTGTTTTACAGATTTGCTGTCAGGGGTTAGATAGCTTGCCAG (SEQ ID NO:96)/GTACTCACGTAAGCTTGAGCTAACATTACCAATGAGGC (SEQ ID NO:97). The PCR products were then restricted with NotI and HindIII, and ligated into similarly restricted pEGFPtk vector to create pTMEM135int13-EGFP-tk. Subsequently, PCR was performed on genomic DNA from the sample in which the TMEM135-CCDC67 fusion was discovered to obtain the intron 9 sequence of CCDC67 using the following primers:

GACTCAGATGGCTAGCAGTTCACTGAGTGTGCCATGC (SEQ ID NO:98)/GTACTCACGTGAATTCCTATTCTGCCTGCTTGCATACCTTTTGTTTTGGTTGCAGTATAGTGGGCTGAG (SEQ ID NO:99). The PCR product was then restricted with NheI and EcoRI, and ligated into the similarly restricted pTMEM135int13-EGFP-tk vector to create pTMEM135int13-EGFP-tk-CCDC67int9. The vector was then restricted with EcoRI and NotI and ligated into the similarly restricted pAdlox to create pAdlox-pTMEM135int13-EGFP-tk-CCDC67int9. The recombinant shuttle vector was then recombined with adenovirus to create pAd-TMEM135int13-EGFP-tk-CCDC67int9.

For the construction of the pCMV-TMEM135-CCDC67bp vector, PCR was performed on genomic DNA from a prostate cancer sample that is positive for the TMEM135-CCDC67 fusion using the following primers: GACTCAGATGAAGCTTAAGAGCATGGGCTTTGGAGTC (SEQ ID NO:100)/GTACTCACGTTCTAGACTGGAATCTAGGACTCTTGGC (SEQ ID NO:101). The PCR product was then sequenced to confirm the presence of the TMEM135-CCDC67 breakpoint. The PCR product was digested with HindIII and XbaI, and ligated into similarly digested pCMVscript vector. The construct was subsequently transfected into PC3 and DU145 cells using Lipofectamine 3000. Cells stably expressing TMEM135-CCDC67 breakpoint transcripts were selected by incubation of the transfected cells in medium containing G418 (200 μg/ml).

The construction of pAD5-Cas9D10A-gRNAMAN2A1int13-gRNAFERint14 followed a procedure similar to that used to construct the gRNA vector targeting the TMEM135-CCDC67 fusion gene, as described above. For construction of pAdlox-MAN2A1int13-EGFP-tk-FERint14, extended long PCR was performed on 1 μg of genomic DNA from HUH7 cells using the following primers:

GACTCAGATGGCGGCCGCGAACATCAGAACTGGGAGAGG (SEQ ID NO:102)/GTACTCACGTAAGCTTCAGGAGAATCACTTGAACCCG (SEQ ID NO:103). The PCR product was then digested with HindIII and NotI, and ligated into similarly digested pEGFP-tk vector to create pMAN2A1int13-EGFP-tk. A synthetic sequence corresponding to the splicing acceptor site of MAN2A1 intron 13/exon 14 (TAATGTTGGTTTTACCAAAAATATAAATGGTTTGCCTCTCAGTAGATAACATTTATCTTTAATAAATTCCCTTCCCTATCTTTTAAAGATCTCTTTTCGAGCACATAT (SEQ ID NO:104)/TAATATGTGCTCGAAAAGAGATCTTTAAAAGATAGGGAAGGGAATTTATTAAAGATAAATGTTATCTACTGAGAGGCAAACCATTTATATTTTTGGTAAAACCAACAT (SEQ ID NO:105)) was ligated to AseI-restricted pMAN2A1int13-EGFP-tk. Separately, a PCR was performed on HUH7 genomic DNA using primers GACTCAGATGGAATTCAAGGTGGAACACAGAAGGAGG (SEQ ID NO:121)/GTACTCACGTGAATTCGATTACTTTAAATAACTCACTTGGCTTCTTGCAGAGGTAGAGCTGAGAGAAG (SEQ ID NO:122) to generate a 1984 bp sequence corresponding to intron 14 of FER including a 31 bp splice donor site sequence corresponding to FER exon 15/intron 15. The PCR product was then restricted with EcoRI, and ligated into similarly restricted pMAN2A1int13-EGFP-tk-FERint14 to create pMAN2A1int13-EGFP-tk-FERint14. The vector was then restricted with NotI, and blunt-ended with T4 DNA polymerase. The product was then restricted with XmaI. Separately, pAdlox vector was restricted with HindIII, and blunt-ended with T4 DNA polymerase. The product was restricted with XmaI, and ligated with the digestion product of pMAN2A1int13-EGFP-tk-FERint14. The recombinant shuttle vector was then recombined with adenovirus to create pAd-MAN2A1int13-EGFP-tk-FERint14.

In vitro Cas9 target cleavage assays. gRNA DNA sequence plus scaffold DNA sequence for the + or - DNA strand was amplified from the all-in-one vector with the following primers: GGCCAGTGAATTGTAATACGACTCACTATAGGGAGGCGGTAGCATTAAGGGCCCCCTAAGTTTTAGAGCTAGAAATAGCAAG (SEQ ID NO:108) for the gRNA+ template of MAN2A1-FER, and GGCCAGTGAATTGTAATACGACTCACTATAGGGAGGCGGATAGCTAGAAGGTGGATCACGTTTTAGAGCTAGAAATAGCAAG (SEQ ID NO:109) for the gRNA- template of MAN2A1-FER. The PCR products were transcribed in vitro using the In Vitro Transcription kit from Ambion, CA, to obtain gRNA+ and gRNA- products. Cleavage assays were performed at 25°C for 10 min and then 37°C for 1 hour under the following conditions: 1x Cas9 nuclease reaction buffer, 30 nM gRNA, 3 nM DNA template and 30 nM Cas9 Nuclease, S. pyogenes. The cleaved DNA was visualized by 1% agarose gel electrophoresis.

Fluorescence activated cell sorting (FACS) analysis of apoptotic cells. The assays were previously described (8,9). Briefly, the cells treated with pAD5-Cas9 D10A -gRNAMAN2A1 int13 -gRNAFER int14 /pAD-MAN2A1 int13 -EGFP-tk-FER int14 and various concentrations of ganciclovir were trypsinized and washed twice with cold PBS. The cells were then resuspended in 100 µl of annexin binding buffer (Invitrogen), and incubated with 5 µl of phycoerythrin (PE)-conjugated annexin V and 1 µl of 100 µg/ml propidium iodide (PI) for 15 min in the dark at room temperature. The binding assays were terminated by addition of 400 µl of cold annexin binding buffer. FACS analysis was performed using a BD-LSR-II flow cytometer (BD Science, San Jose, CA). The fluorescence stained cells were analyzed at the fluorescence emission at 533 nm (FL2). The negative control, cells with neither PE nor PI in the incubation medium, was used to set the background for the acquisition. UV treated cells were used as a positive control for apoptosis. For each acquisition, 10,000 to 20,000 cells were analyzed and sorted based on the fluorescence color of the cells. For HUH7 and HEP3B FACS analysis, similar procedures were performed except that these cells were treated with 1 μM scr-7 along with viral infections to improve genome editing efficiency (10).

Tumor Growth and Spontaneous Metastasis. The xenografting procedure was described previously (11,12). For HUH7 and HEP3B xenografted tumor treatment, a similar procedure was applied as previously described in Example 7 of WO 2016/011428, except that the treatment was started 2 weeks after the tumor xenografting due to rapid growth of the cancers. The breakdown of the treated groups is the following: 5 mice xenografted with HUH7 cells were treated with pAD5-Cas9 D10A -gRNAMAN2A1 int13 -gRNAFER int14 /pAD-MAN2A1 int13 -EGFP-tk-FER int14 and ganciclovir; 5 mice xenografted with HUH7 cells were treated with pAD5-Cas9 D10A -gRNATMEM135 int13 -gRNACCDC67 int9 /pAD-TMEM135int13-EGFP-tk-CCDC67 int9 and ganciclovir (control); 5 mice xenografted with HUH7 cells were treated with pAD5-Cas9 D10A -gRNAMAN2A1 int13 -gRNAFER int14 /pAD-MAN2A1 int13 -EGFP-tk-FER int14 and PBS (control); 5 mice xenografted with HEP3B cells were treated with pAD5-Cas9 D10A -gRNAMAN2A1 int13 -gRNAFER int14 /pAD-MAN2A1 int13 -EGFP-tk-FER int14 and ganciclovir (control). All animals were treated with scr-7 (10 mg/kg) along with viruses. All animal procedures were approved by the University of Pittsburgh Institutional Animal Care and Use Committee.

Immunohistochemistry. Immunohistochemistry was performed as described previously (6) with antibodies specific for HSV-1 TK (1:100 dilution) or for Cas9 (1:100 dilution). The antibody was omitted in negative controls. The sections were then incubated with horseradish peroxidase conjugated anti-rabbit IgG for 30 minutes at room temperature (ABC kit from Vector Labs, Inc). Slides were then exposed to a 3,3’-diaminobenzidine solution to visualize immunostaining. Counterstaining was performed by incubating the slides in 1% Hematoxylin solution for 2 minutes at room temperature. The slides were then rinsed briefly in distilled water to remove excessive staining. The procedure for TUNEL assays was similar to that previously described (13).

6.3 RESULTS

One recurring fusion gene discovered in prostate cancer is located between the genes encoding transmembrane protein 135 (TMEM135) and coiled-coil domain containing 67 (CCDC67), i.e., TMEM135-CCDC67 (6,16). The fusion gene is created by a 6-Mb deletion in the region of chromosome 11q14.2-21. The deletion joins intron 13 of TMEM135 with intron 9 of CCDC67 in chromosome 11 (Figure 13A) and creates a unique sequence breakpoint not present in normal tissues, thus providing a unique target in cancer cells for therapeutic intervention.

To target this joining sequence, two sgRNAs were designed, each complementary to one of the regions flanking the chromosomal breakpoint on opposite strands (Figure 13B). These guide RNAs (gRNAs) and a Cas9 D10A sequence were ligated into the VQAd5-CMV shuttle vector and recombined into pAD5 adenovirus to create pAD5-Cas9 D10A -gRNA TMEM135int13 -gRNA CCDC67int9 . To provide a potential lethal gene for targeted cancer cells, cDNA of HSV-1 tk was ligated with EGFP cDNA in frame, thus yielding the chimeric gene EGFP-tk. The chimeric cDNA was promoterless but contained a full open reading frame and ribosome-binding site for independent translation initiation. To provide homologous sequences to engage the HDR pathway, the construct was then ligated with 584 bp of the intron 13 sequence of TMEM135 at the 5’ end and 561 bp of the intron 9 sequence of CCDC67 at the 3’ end. These sequences were subsequently ligated into the pAdlox shuttle vector and recombined into adenovirus, thus yielding pAD-TMEM135int13-EGFP-tk-CCDC67int9 (Figure 13C). Integration of TMEM135int13–EGFP-tk–CCDC67int9, expression of EGFP-tk and apoptosis were detected in PC3 or DU145 cells that contained the TMEM135–CCDC67 breakpoint and were treated with the recombinant viruses (Figure 31). Genetic targeting of TMEM135-CCDC67 in prostate cancer is disclosed in WO 2016/011428, the contents of which are hereby incorporated by reference in their entirety (see Figures 30, 31 and 32 of WO 2016/011428, and accompanying text).

Additional cancers were analyzed for fusion gene expression. Screening of the human hepatocellular carcinoma cell line HUH7 showed that it expresses one of the fusion genes, MAN2A1-FER (Figure 15A and B). Both MAN2A1-FER mRNA and protein were detected in this cell line (Figure 15A and B). A genome breakpoint was identified between intron 13 of MAN2A1 and intron 14 of FER in HUH7 cells. The chimeric protein retains the intact tyrosine kinase domain from FER but loses the SH2 domain that regulates the kinase activity. To evaluate the applicability of genome therapy targeting cancer cells with a native fusion gene breakpoint, we designed a pair of gRNAs specific for intron 13 of MAN2A1 and intron 14 of FER (Figure 15A). The gRNAs and Cas9 D10A were packaged into adenovirus to create pAD5-Cas9 D10A -gRNAMAN2A1 int13 -gRNAFER int14 . This recombinant virus was co-infected with a “donor” recombinant adenovirus containing the sequences flanking the nick sites (pAD-MAN2A1 int13 -EGFP-tk-FER int14 ). This “donor” virus also contains splicing sequences corresponding to the acceptor of intron 14 of MAN2A1 and the donor of intron 15 of FER, respectively, so that EGFP-tk is inserted into the mRNA of MAN2A1-FER. The results showed that up to 27% of HUH7 cells infected with these viruses expressed EGFP-tk (Figure 15D and E; Figure 16A), while similar infection of HEP3B cells, which are negative for the MAN2A1-FER fusion, with these viruses induced minimal fluorescent protein expression. When HUH7 cells, which are negative for the TMEM135-CCDC67 fusion, were infected with adenoviruses specific for the TMEM135-CCDC67 breakpoint, there was little expression of EGFP-tk (Figure 15D and E; Figure 16A). These results confirm the specificity of this genome targeting technique.

When HUH7 cells were infected with pAD5-Cas9 D10A -gRNAMAN2A1 int13 -gRNAFER int14 /pAD-MAN2A1 int13 -EGFP-tk-FER int14 (Ad-MF) and treated with various concentrations of ganciclovir, up to 27% of cells died at 10 μg/ml of ganciclovir, while HEP3B cells infected with the same viruses had minimal cell death even at high concentrations of ganciclovir (Table 3). When HUH7 cells were infected with Ad-TC and treated with ganciclovir, there was no appreciable increase in cell death, clearly indicating that cell death induced by ganciclovir is MAN2A1-FER breakpoint dependent compared to the PC3 BP clone and DU145 BP clone that contain the TMEM135-CCDC67 breakpoint (Table 3; Figure 16B). As shown in Figure 14, clones of transformed PC3 cells were selected to quantify the copy number of the TMEM135-CCDC67 breakpoint relative to that of β-actin in the genome. The PC3 BP clone was estimated to contain one copy of the TMEM135-CCDC67 breakpoint per genome and was selected based on its ratio (~1:4, PC3 cells are hyperploid for the chromosome region containing β-actin) to β-actin (Figure 14) (15). Similar selection was also applied to the DU145 clone (DU145 BP) that contains the TMEM135-CCDC67 breakpoint. PC3 BP and DU145 BP clones treated with ganciclovir exhibited significant cell death as observed by TUNEL staining (Figure 16B).

To examine the effectiveness of genome therapy targeting MAN2A1-FER in vivo, SCID mice were xenografted with HUH7 and HEP3B cells, and treated with recombinant viruses and ganciclovir 2 weeks after the xenografting (~400 mm3 on average). The mice xenografted with HUH7 cells and treated with Ad-MF and ganciclovir experienced up to a 29% reduction of tumor size from the peak, and had no notable metastasis or mortality in the treatment period. In contrast, the mice xenografted with HUH7 cells and treated with ganciclovir and Ad-TC, the adenoviruses specific for the TMEM135-CCDC67 breakpoint not carried by the HUH7 cells, experienced a 73-fold increase of tumor size. Four of 5 of these mice had metastasis in lung and liver. All 5 mice died within 40 days after xenografting. Similar rates of death, metastases and increase of tumor volume also occurred in mice treated with Ad-MF and PBS. Treatment of mice xenografted with HEP3B, a hepatocellular cancer cell line negative for the MAN2A1-FER fusion, with Ad-MF and ganciclovir was similarly ineffective. These results indicate that therapy targeting the cancer genome is highly specific and effective.

In addition, our approach appears highly specific, with average functional off-target rates being less than 1% for HEP3B cells and HUH7 cells (EGFP-tk+ cells/Cas9 D10A -RFP+ cells treated with adeno-MF, Table 4). These off-target rates were largely confirmed by quantitative sequencing methods: the off-target rates ranged from <0.1% to 2.5% in 100 million reads, including samples from in vitro tissue culture experiments, xenografted cancers and liver samples from mice that were treated with the recombinant viruses (Tables 18-21). On-target integration rates in vitro, on the basis of sequencing, ranged from 15.9% to 25.5%, whereas rates for xenografted tumors ranged from 21.1% to 33.5%. The higher integration rates in xenografted tumors probably reflect repeated application of the recombinant adenoviruses.


6.4 DISCUSSION

The impact of fusion genes on the function of genes that are involved could be dramatic due to creation of a new protein or generation of a large truncation of a protein domain. For example, without being limited to a particular theory, it appears that the elimination of SH2 domain of FER in the MAN2A1-FER fusion gene may lead to constitutive activation of FER tyrosine kinase.

Our gRNA target designs would produce two nicks on different strands 37 bp apart for the MAN2A1-FER breakpoint. It is unlikely that these nicks would generate a complete break of double stranded DNA. As a result, this DNA damage would likely be repaired by the homologous recombination process rather than by non-homologous end joining. Our results are consistent with several other studies showing that CRISPR/Cas mediated homologous recombination rates can reach between 20-30% (7).

To our knowledge, this is the first report to show that such a system can be applied to specifically target a cancer genome and to have a recombination rate sufficient to achieve remission of xenografted cancers. The precision, specificity and integration rate of EGFP-tk in genome therapy might make it possible to apply this approach to clinical settings. Future developments that enhance the integration rate of the targeting cassette into the genome target site may be helpful in enhancing the efficiency of genome therapy. Notably, the donor sequences of the adenoviral EGFP-tk genome are outside the gRNA target sites, and thus the viruses do not contain target sequences recognizable by gRNA-Cas9 activity. The recombination rate between the two viruses is probably low. In over 300,000 integrated EGFP-tk sequencing reads (Tables 20-22), we did not find any reads indicating recombination between EGFP-tk and Cas9.

The current therapeutic approach to human cancer heavily relies on interception of signaling pathways that drive cancer growth. However, such an approach invariably leads to drug tolerance and resistance to drug treatment as the cancer genome adjusts its gene expression patterns and, through new mutations that bypass the signaling blockade, develops new pathways to support growth. The subsequent application of second-tier chemotherapy may affect both cancer and normal tissues and thus may generally produce poor therapeutic outcomes. The genome approach may have substantial advantages over chemotherapy because it is specific for the cancer genome sequence, and it kills cancer cells regardless of whether the mutations are cancer drivers. It is possible that additional new mutations and fusion genes will be generated under the pressure of cancer therapy. In principle, additional vectors may be designed to target these genomic lesions and might even increase the integration rate, owing to multiple integrations per cell (Table 22). It remains to be determined whether such an adaptive strategy might be feasible in the clinic. For cancers comprising multiple populations of cancer cells with several different fusion-gene targets, these targets can be simultaneously targeted through the genome targeting scheme. Furthermore, our approach is not limited to using HSV-tk in the therapeutic cassette but instead can use a wide spectrum of gene devices, such as immunogens from viruses or toxins from plants or bacteria. The well-documented ‘bystander’ effect of several prodrugs on tumor cell killing may enhance the therapeutic effects of this genome targeting strategy (17). When necessary, genome targeting can be combined with other cancer therapeutic treatments, such as tumor immunotherapy or signaling-molecule targeting, to achieve better therapeutic results.

Table 18. On- and off-target sequences (SEQ ID NOs: 159-182, in order)

Table 19. On- and off-target sequencing primers for Illumina HiSEQ2500 (SEQ ID NOs: 182-223, in order)

6.5 REFERENCES

1. R. L. Siegel, K. D. Miller, and A. Jemal, CA: a cancer journal for clinicians 66 (1), 7 (2016).

2. F. J. Mojica, C. Diez‐Villasenor, J. Garcia‐Martinez et al., Journal of molecular evolution 60 (2), 174 (2005).

3. M. Jinek, K. Chylinski, I. Fonfara et al., Science (New York, N.Y.) 337 (6096), 816 (2012).

4. K. M. Esvelt, A. L. Smidler, F. Catteruccia et al., eLife, e03401 (2014).

5. F. A. Ran, P. D. Hsu, C. Y. Lin et al., Cell 154 (6), 1380 (2013).

6. Y. P. Yu, S. Liu, Z. Huo et al., PloS one 10 (8), e0135982 (2015).

7. C. Yu, Y. Liu, T. Ma et al., Cell stem cell 16 (2), 142 (2015); P. D. Hsu, D. A. Scott, J. A. Weinstein et al., Nature biotechnology 31 (9), 827; L. Cong, F. A. Ran, D. Cox et al., Science (New York, N.Y.) 339 (6121), 819.13 K. F. Kozarsky and J. M. Wilson, Current opinion in genetics & development 3 (3), 499 (1993).

8. H. Wang, K. Luo, L. Z. Tan et al., The Journal of biological chemistry 287 (20), 16890 (2012); Z. H. Zhu, Y. P. Yu, Z. L. Zheng et al., The American journal of pathology 177 (3), 1176 (2010); K. L. Luo, J. H. Luo, and Y. P. Yu, Cancer science 101 (3), 707 (2010); Y. C. Han, Y. P. Yu, J. Nelson et al., Cancer research 70 (11), 4375 (2010); Z. H. Zhu, Y. P. Yu, Y. K. Shi et al., Oncogene 28 (1), 41 (2009).

9. Y. K. Shi, Y. P. Yu, G. C. Tseng et al., Cancer gene therapy 17 (10), 694 (2010); Y. P. Yu, G. Yu, G. Tseng et al., Cancer research 67 (17), 8043 (2007); B. Ren, Y. P. Yu, G. C. Tseng et al., Journal of the National Cancer Institute 99 (11), 868 (2007); G. Yu, G. C. Tseng, Y. P. Yu et al., American Journal of Pathology 168 (2), 597 (2006).

10. T. Maruyama, S. K. Dougan, M. C. Truttmann et al., Nature biotechnology 33 (5), 538 (2015).

11. Y. C. Han, Z. L. Zheng, Z. H. Zuo et al., The Journal of pathology 230 (2), 184 (2013); B. Ren, G. Yu, G. C. Tseng et al., Oncogene 25 (7), 1090 (2006).

12. L. Jing, L. Liu, Y. P. Yu et al., The American journal of pathology 164 (5), 1799 (2004).

13. A. J. Demetris, E. C. Seaberg, A. Wennerberg et al., The American journal of pathology 149 (2), 439 (1996).

14. Anderson, R. D., Haskell, R. E., Xia, H., Roessler, B. J., & Davidson, B. L. (2000) A simple method for the rapid generation of recombinant adenovirus vectors. Gene therapy 7: 1034‐1038.

15. Y. Ohnuki, M. M. Marnell, M. S. Babcock et al., Cancer research 40 (3), 524 (1980); J. Bernardino, C. A. Bourgeois, M. Muleris et al., Cancer genetics and cytogenetics 96 (2), 123 (1997).

16. Yu, Y.P. et al. Novel fusion transcripts associate with progressive prostate cancer. Am. J. Pathol. 184, 2840–2849 (2014); Luo, J.H. et al. Discovery and classification of fusion transcripts in prostate cancer and normal prostate tissue. Am. J. Pathol. 185, 1834–1845 (2015).

17. Hanahan, D. & Weinberg, R.A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).

7. EXAMPLE 2: NUCLEAR PTEN-NOLC1 FUSION IS ONCOGENIC IN HUMAN CANCERS.

7.1 INTRODUCTION

In this Example, a genome intervention approach was developed to kill cancer cells based on unique sequences resulting from genome rearrangement. The chromosome breakpoints from the Pten-NOLC1 fusion gene were exploited as therapeutic targets. The Pten-NOLC1 fusion gene has been previously determined to be present in prostate cancer (see WO 2016/011428, the contents of which are hereby incorporated by reference in their entirety).

Phosphatase and tensin homolog (Pten) (1, 2), a phosphatase for PIP3 (3, 4), maintains the homeostatic Akt/PI3K signaling and is a crucial regulator of cell survival and growth (5, 6). Poly-ubiquitination of Pten leads to inactivation of Pten in the cytosol, while mono-ubiquitination of Pten translocates the protein into the nucleus (7, 8). The nuclear Pten regulates DNA repair activity of Rad51 and maintains the chromosome stability (9). Deletion of Pten may be associated with genome rearrangement (10). Pten deletion or mutation has been reported in a variety of human malignancies and is considered one of the most important driver events for human cancer development (11). However, whether a sole unchecked Akt/PI3K signaling or nuclear function of Pten is sufficient for cancer development, remains unclear.

Here, a novel Pten-related fusion protein is characterized. It was identified through high coverage (600-1500x) transcriptome sequencing analysis of 87 prostate samples, including 20 organ donor prostate samples from healthy individuals, 3 benign prostate tissues adjacent to cancer, and 64 prostate cancers. Through Fusioncatcher screening and multiple filtering, 96 cancer-specific fusion transcripts were identified (Table 6). Six of these fusion genes were validated through Sanger sequencing and fluorescence in situ hybridization (FISH, Figures 17-19). One of these fusion genes involves Pten and NOLC1. Pten, a tumor suppressor gene (1), is fused in frame with NOLC1 (12, 13), a nucleolar and coiled-body phosphoprotein for nucleolar organogenesis (13), to produce a chimeric Pten-NOLC1 protein.

Deletion or mutation of PTEN is frequent in a variety of human cancers, and is one of the mechanisms underlying human cancer development. Here, we report a fusion protein between Pten and nucleolar organogenesis protein NOLC1 in human malignancies.

Pten-NOLC1 fusion is highly recurrent in 8 types of human malignancies (66-85%). Gene fusion between Pten and NOLC1 leads to loss of the C2 domain of Pten protein, and translocation of the fusion gene product to the nucleus. Targeted interruption of the Pten-NOLC1 genome breakpoint in DU145 and MCF7 cells impeded tumor cell growth, retarded S phase entry, delayed cell migration and invasiveness, and rendered the cells prone to UV-induced apoptosis. Knockout of Pten-NOLC1 blocks xenografted DU145 tumor growth in SCID mice, while forced expression of Pten-NOLC1 through hydrodynamic injection into the mice led to the development of hepatocellular carcinoma. Up-regulation of C-Met/HGFR signaling was detected in the Pten-NOLC1 induced tumor. These results indicate that Pten-NOLC1 fusion produces a gain of function, and converts a tumor suppressor gene into an oncogene.

7.2 MATERIALS AND METHODS

Tissue samples:

A total of 815 tissue specimens were used in the study, consisting of 268 prostate cancers, 10 matched blood samples, 20 donor prostates, 102 non-small cell lung cancers, 61 ovarian cancers, 60 colon cancers, 70 liver cancers, 156 glioblastoma, 60 breast cancers, and 34 esophageal adenocarcinomas. They were obtained from the University of Pittsburgh Tissue Bank in compliance with institutional regulatory guidelines (Tables 7 and 12).

Procedures for microdissection of PCa samples and DNA extraction were previously described (21-23). The protocols for tissue procurement and processing were approved by the Institutional Review Board of the University of Pittsburgh. All cell lines were purchased from American Type Culture Collection (ATCC). Cell culture procedures followed the supplier's manuals.

Cancer tissues were also obtained from other institutes: 16 non-small cell lung cancer samples from University of Kansas; 28 samples of non-small cell lung cancer from University of Iowa; 3 samples of glioblastoma multiforme from Northwestern University; 50 samples of prostate cancer from Stanford University; and 163 samples from University of Wisconsin Madison. All protocols were approved by the Institutional Review Board. The cell lines used in the study were purchased from American Type Culture Collection (ATCC), and were cultured and maintained following the recommendations of the supplier.

Construction of vector: To construct cDNA for Pten-NOLC1, a PCR was performed on the cDNA template of Pten using primers

AGGGGCATCAGCTACCCTTAAGTCCAGAGCCATTTC (SEQ ID NO: 127)/GGCATTGGCATCCTGCTGTGTCTTAAAATTTGGAGAAAAGTA (SEQ ID NO: 128) to obtain cDNA corresponding to the 5’ end of Pten, under the following conditions: 94°C for 2 min, then 35 cycles of 94°C for 30 seconds, 61°C for 30 seconds, and 72°C for 30 seconds. Separately, a PCR was performed on the cDNA template of NOLC1 using primers

TACTTTTCTCCAAATTTTAAGACACAGCAGGATGCCAATGCC (SEQ ID NO: 129)/ ACCGAAGATGGCCTCTCTAGACTCGCTGTCAAACTT (SEQ ID NO: 130) to obtain 3’ end of NOLC1 under the same condition. The PCR products of the 2 reactions were pooled. A PCR was performed using

AGGGGCATCAGCTACCCTTAAGTCCAGAGCCATTTC (SEQ ID NO: 131)/ ACCGAAGATGGCCTCTCTAGACTCGCTGTCAAACTT (SEQ ID NO: 132) in the same condition to obtain full length Pten-NOLC1 cDNA. The PCR product was then restricted with AflII and XbaI, and ligated into similarly digested pCDNA4-FLAG vector to create pPten-NOLC1-FLAG vector.
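The two-step PCR described above joins the Pten and NOLC1 fragments through their inner primers. As an illustrative check only (it is not part of the described protocol), the following Python sketch tests whether the inner primers SEQ ID NO: 128 and SEQ ID NO: 129, as listed above, are exact reverse complements of each other, which would allow the pooled products to anneal at the junction and be extended into full-length cDNA. The function and variable names are hypothetical.

```python
# Illustrative sketch: check that the two inner primers used to build
# Pten-NOLC1 cDNA are reverse complements of one another.
# Names below are hypothetical; sequences are copied from the text above.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Inner reverse primer for the Pten fragment (SEQ ID NO: 128).
PTEN_INNER_REV = "GGCATTGGCATCCTGCTGTGTCTTAAAATTTGGAGAAAAGTA"
# Inner forward primer for the NOLC1 fragment (SEQ ID NO: 129).
NOLC1_INNER_FWD = "TACTTTTCTCCAAATTTTAAGACACAGCAGGATGCCAATGCC"


def primers_are_reverse_complements(a: str, b: str) -> bool:
    """True when primer b is exactly the reverse complement of primer a."""
    return reverse_complement(a) == b.upper()


if __name__ == "__main__":
    print(primers_are_reverse_complements(PTEN_INNER_REV, NOLC1_INNER_FWD))
```

Under this reading, the outer primers (SEQ ID NOs: 131 and 132, identical in sequence to SEQ ID NOs: 127 and 130) amplify only molecules in which the two fragments have been joined.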

To construct the Pten-EGFP vector, a PCR was performed on the cDNA template of Pten using primers: CTTAAAATTTGGAGATCTAGATCGGTTGGCTTTGTC (SEQ ID NO: 133)/ACCGAAGATGGCCTCTCTAGAGACTTTTGTAATTTGTGTATGCTGATC (SEQ ID NO: 134). The PCR product was restricted with NdeI and XbaI, and ligated into similarly restricted pEGFP vector to create pPten-EGFP. To construct pNOLC1-cherry, a PCR was performed on the cDNA template of NOLC1 using primers: TCAGCTACCCTTAAGCGGTAGTGACGCGTATTGC (SEQ ID NO: 135)/ACCGAAGATGGCCTCTCTAGACTCGCTGTCAAACTT (SEQ ID NO: 136). The PCR product was restricted with AflII and XbaI, and ligated into similarly restricted pCDNA4-Cherry vector.

To construct the pGST-Pten vector, a PCR was performed using primers GTGGGATCCACATGACAGCCATCATCAAAGAG (SEQ ID NO: 137)/CATACACAAATTACAAAAGTCTGAGGATCCCCAGGAATTCCCGGGTCGACTC (SEQ ID NO: 138) under the following conditions: 94°C for 2 min, then 35 cycles of 94°C for 30 seconds, 60°C for 30 seconds, and 72°C for 30 seconds. The PCR product was restricted with BamHI, and ligated to similarly restricted pGEX5T vector to create pGST-Pten.

To construct the pGST-Pten-NOLC1 vector, a PCR was performed on the Pten-NOLC1 cDNA template using primers GTGGGATCCACATGACAGCCATCATCAAAGAG (SEQ ID NO: 139)/CGAGTCGACCCGGGAATTCCTGGGGATCCTCACTCGCTGTCAAACTTAATAG (SEQ ID NO: 140). The PCR product was restricted with BamHI, and ligated into similarly restricted pGEX5T vector to create pGST-Pten-NOLC1.

To construct pPT3-EF1α-Pten-NOLC1, a PCR was performed on pcDNA4-Pten-NOLC1-Cherry as a template using primers 5’CTCCGGACTCTAGCGTCGACACTTAAGTCCAGAGCC (SEQ ID NO: 224)/5’ATGGTGATGGTGATGGCGGCCGCTTAACTAGATCCGG (SEQ ID NO: 225) to obtain the full length of Pten-NOLC1-cherry containing SalI and NotI restriction sites. The PCR product was then restricted with SalI and NotI, and ligated into similarly restricted pENTR 1A dual vector (Invitrogen) to create pENTR-attL1-Pten-NOLC1-attL2. Using Gateway® LR Clonase™ II enzyme mix (Invitrogen), pENTR-attL1-Pten-NOLC1-attL2 was then recombined with destination vector pPT3-EF1α to generate the Pten-NOLC1-cherry expression vector.

Genome and transcriptome sequencing library preparation and sequencing: The prostate samples were fresh-frozen and stored at -80°C. They were obtained from the University of Pittsburgh Tissue Bank. The protocol was approved by the University of Pittsburgh Institutional Review Board. For transcriptome sequencing, total RNA was extracted from samples of 20 organ donor prostates from individuals free of urological disease, 3 benign prostate tissues adjacent to cancer and 64 prostate cancer samples, using Trizol, and treated with DNAse1. Ribosomal RNA was then removed from the samples using the RIBO-Zero™ Magnetic kit (Epicentre, Madison, WI). The RNA was reverse-transcribed to cDNA and amplified using the TruSeq™ RNA Sample Prep Kit v2 from Illumina, Inc (San Diego, CA). Library preparation steps such as adenylation, ligation and amplification were performed following the manual provided by the manufacturer. The quality of the transcriptome libraries was then analyzed with qPCR using Illumina sequencing primers and quantified with an Agilent 2000 Bioanalyzer. The procedure for 200 cycle paired-end sequencing on the Illumina HiSeq2500 followed the manufacturer’s manual.

Detection of genome breakpoint of Pten-NOLC1 genome rearrangement: To detect the genome breakpoint between Pten and NOLC1, multiple primers were designed. The forward primers were designed to anneal to the region of exon 11 and intron 11 of Pten, and the reverse primers to the region of intron 1 and exon 2 of NOLC1 (Table 16). Multiple nested PCRs were performed with various primer combinations using AccuPrime™ Pfx DNA Polymerase (Invitrogen) with 35 heat cycles of 95°C for 15 seconds, 65°C for 30 seconds, and 72°C for 10 minutes. The intron primer pair, 5’ATTCACCACACTCGTTTCTTTCTC (SEQ ID NO: 226)/5’CCTGCCTGCCAATCTATATTGATC (SEQ ID NO: 227), was found to produce a PCR product. Direct sequencing of the purified PCR product confirmed the genome intron breakpoint sequence of Pten-NOLC1.

Knockout of Pten-NOLC1 with CRISPR/Cas9 genome editing. Vector pSPCAS9N(BB)-2A-GFP was obtained from Addgene, Inc (Addgene# PX461). Target sequences for gRNA were analyzed with the software CRISPR gRNA design tool-DNA2.0™, and selected as CGGTTATACCGCTTTGGGATcaaa (Pten) (SEQ ID NO: 228)/caccGAGATGGGGTTTCACCATGT (NOLC1) (SEQ ID NO: 229) flanking the breaking juncture of Pten and NOLC1. Synthetic oligonucleotides corresponding to gRNA(Pten) and gRNA(NOLC1) were constructed into the BbsI site of modified pSPCAS9N(BB)-2A-GFP and pX330s-2 vector, respectively. Both vectors were then restricted with BsaI, and gRNA(NOLC1) released from the pX330s-2 vector by the BsaI cut was subsequently inserted into the BsaI site of gRNA(Pten)-pSPCAS9n(BB)-2A-GFP. T7 ligase was used in the cloning. For homology-directed recombination of an insert gene at the breakpoint, a donor vector was constructed: mCherry vector (Clontech Inc, CA) was used as the backbone, and a cDNA of Zeocin was inserted upstream of the mCherry coding region. The promoter sequence of the mCherry vector was removed; a homologous arm sequence identical to a 952 bp segment of Pten intron 11 upstream of the gRNA(Pten) sequence, plus a 55 bp splice acceptor sequence, was inserted upstream of the ribosome-binding Kozak sequence for zeocin. The second homologous arm sequence, an 845 bp segment of NOLC1 intron 1 downstream from gRNA(NOLC1), was inserted into the region downstream of the SV40 poly-A for the mCherry coding sequence. The donor vector was co-transfected with the gRNA-Cas9D10A vector into DU145 and MCF7 cell lines using lipofectamine 3000 (Invitrogen). Integration of the insert gene was recognized by mCherry expression and zeocin resistance. Loss of Pten-NOLC1 expression was confirmed by Taqman qRT-PCR with the primers for Pten-NOLC1 fusion detection and by immunoblotting for dKO1 and dKO2, mKO1 and mKO2 clones with antibodies against the N-terminus of Pten or the C-terminus of NOLC1.

Fusion transcript detection: To identify fusion transcript events, the Fusioncatcher (v0.97) algorithm (24) was applied to the RNA sequencing samples. Embedded in Fusioncatcher, BOWTIE and BLAT were used to align sequences to the reference genome. The preliminary list of candidate fusion transcripts is filtered in Fusioncatcher based on existing biological knowledge from the literature, including: (1) if the genes are known to be each other's paralogs in Ensembl; (2) if one of the fusion transcripts is the partner's pseudogene; (3) if one of the fusion transcripts is a micro/transfer/small-nuclear RNA; (4) if the fusion transcript is known to be a false positive event (e.g., Conjoin gene database (25)); (5) if it has been found in healthy samples (Illumina Body Map 2.0 [http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-513/]); (6) if the head and tail genes overlap with each other on the same strand. Fusion genes were visualized with CIRCOS software (26).

TCGA SNP 6.0 data analysis: The CNV segmentation data were downloaded from TCGA level 3 data (http://cancergenome.nih.gov/). CNVs with fewer than 10 markers were filtered out. Only the segments overlapping with the region between exon 11 of Pten and exon 2 of NOLC1 (89728532-89728532/hg19) were selected. CNVs with a segment_mean value smaller than -0.23 were defined as deletions. For each sample, a spanning deletion between Pten and NOLC1 was defined as a ratio of the length of deletion segments to the whole length of the region between Pten and NOLC1 equal to or greater than 0.8 (illustrated in Figure 23). Seventeen types of cancers were analyzed for spanning deletion between Pten and NOLC1: bladder cancer (BLCA), breast cancer (BRCA), colon cancer (COAD), diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), glioblastoma multiforme (GBM), head and neck squamous cell carcinoma (HNSC), liver cancer (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), ovarian cancer (OV), pancreatic adenocarcinoma (PAAD), prostate cancer (PRAD), rectal adenocarcinoma (READ), sarcoma (SARC), thyroid cancer (THCA) and uterine endometrium carcinoma (UCEC).
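The spanning-deletion criterion in the TCGA SNP 6.0 analysis can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the `Segment` class, the function names, and the region coordinates in the usage example are hypothetical, and overlapping deletion segments are assumed to be merged before their length is summed. Only the three thresholds (at least 10 markers, segment_mean below -0.23, covered fraction of at least 0.8) come from the text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One CNV segment (hypothetical structure for this sketch)."""
    start: int
    end: int
    num_markers: int
    segment_mean: float


def deleted_fraction(segments: Iterable[Segment], region_start: int,
                     region_end: int, min_markers: int = 10,
                     deletion_cutoff: float = -0.23) -> float:
    """Fraction of [region_start, region_end) covered by deletion segments."""
    region_len = region_end - region_start
    if region_len <= 0:
        raise ValueError("region_end must be greater than region_start")
    # Keep segments with enough markers and a segment_mean below the cutoff,
    # clipped to the region of interest.
    clipped = []
    for seg in segments:
        if seg.num_markers < min_markers or seg.segment_mean >= deletion_cutoff:
            continue
        lo, hi = max(seg.start, region_start), min(seg.end, region_end)
        if hi > lo:
            clipped.append((lo, hi))
    # Merge overlapping intervals so shared bases are counted once (assumption).
    clipped.sort()
    covered = 0
    cur_lo = cur_hi = None
    for lo, hi in clipped:
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                covered += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        covered += cur_hi - cur_lo
    return covered / region_len


def has_spanning_deletion(segments: Iterable[Segment], region_start: int,
                          region_end: int, min_fraction: float = 0.8) -> bool:
    """Apply the >= 0.8 covered-fraction rule described in the text."""
    return deleted_fraction(segments, region_start, region_end) >= min_fraction
```

For example, with a hypothetical region spanning positions 1000 to 2000, two qualifying deletion segments covering 1000-1900 give a fraction of 0.9 and the sample is called positive, whereas segments with fewer than 10 markers or a segment_mean at or above -0.23 do not contribute.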

Fluorescence In-situ Hybridization: A similar procedure was described previously (12-14). Briefly, tissue slides (4 microns) were placed in 0.2 N HCl for 20 minutes, then in Pretreatment Solution (32-801200, Vysis) at 80°C for 30 minutes. Tissues were then digested in protease solution at 37°C for 36 minutes and air dried. The FISH probe was prepared by combining 7 ml of SpectrumOrange-labeled bacterial artificial chromosome (BAC) sequence containing the Pten 5’ end (RP11-124B18, InVitrogen, Inc, Grand Island, NY)/50% formamide with 1 ml of BAC sequence (CTD-3082D22) containing the 3’ end of NOLC1 labeled with SpectrumGreen. The probe was denatured for 5 min at 75°C. Sections of formalin-fixed tissues were denatured in 70% formamide for 3 min, and dehydrated in 70%, 85%, and 100% ethanol for 2 min each at room temperature. The denatured probe was placed on the slide, cover-slipped, sealed with rubber cement, placed in a humidified chamber and hybridized overnight at 37°C. Coverslips were removed and the slides were washed in 2X SSC/0.3% Igepal (Sigma) at 72°C for 2 min. Slides were air-dried in the dark. The slides were counterstained with DAPI. Analysis was performed using an Olympus BX61 with CytoVision equipped with a Chroma Technology 83000 filter set with single band exciters for SpectrumOrange, SpectrumGreen and DAPI (UV 360 nm). Only individual and well delineated cells with two hybridization signals were scored. Overlapping cells were excluded from the analysis. Fifty to 100 cells per sample were scored to obtain an average of signals. The cutoff for gain of MCM8 is an average of at least 2.5 copies per genome. Samples with more than 10% of cells showing merged Pten and NOLC1 signals were considered positive for Pten-NOLC1 fusion.
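The FISH positivity rule above (more than 10% of scored cells with merged signals, with 50 to 100 cells scored per sample) can be written as a short Python sketch. The function name and the input representation (one boolean per scored cell) are hypothetical and are used only for illustration; the 10% cutoff and the 50-cell minimum are taken from the text.

```python
def merged_signal_call(cells_merged, cutoff_percent=10, min_cells=50):
    """Call a sample fusion-positive from per-cell FISH scores.

    cells_merged: list of booleans, True when a scored cell shows a merged
    Pten/NOLC1 signal (hypothetical input format).
    Returns (is_positive, fraction_of_cells_with_merged_signal).
    """
    total = len(cells_merged)
    if total < min_cells:
        raise ValueError("at least %d scored cells are required" % min_cells)
    merged = sum(1 for flag in cells_merged if flag)
    # Integer comparison avoids floating-point edge effects at exactly 10%
    is_positive = merged * 100 > cutoff_percent * total
    return is_positive, merged / total
```

With 60 scored cells, 7 merged cells (about 11.7%) would be called positive, while 6 merged cells (exactly 10%) would not, because the text requires more than 10%.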

RNA extraction, cDNA synthesis and Taqman RT-PCR: Microdissection was performed on slides of FFPE samples to obtain at least 50% cancer cells. Total RNA was extracted from epithelial cells with the Trizol method (InVitrogen, CA). The extraction procedure was performed according to the manufacturer’s recommendation. Random hexamers were used in the first strand cDNA synthesis with 1 μg of total RNA and Superscript II™ (InVitrogen, Inc, CA). This was followed by Taqman PCR (94°C for 2 min, then 50 cycles of 94°C for 30 seconds, 61°C for 30 seconds and 72°C for 30 seconds) in an Eppendorf Realplex™ cycler using primers GAGCGTGCAGATAATGACAAGG (SEQ ID NO: 141)/GCCAGAAGCTATAGATGTCTAAGAG (SEQ ID NO: 142) and Taqman probe: 5’-/56-FAM/CAG GAT GCC/ZEN/AAT GCC TCT TCC C/3IABkFQ/-3’. Ct threshold of Pten-NOLC1 detection: 385cycles for formalin-fixed paraffin-embedded tissues, 35 cycles for frozen tissues. No-template controls and Pten-NOLC1 cDNA templates were used as negative and positive controls, respectively, in each batch.

Immunoblot Analysis and Immunoprecipitation. Pten-NOLC1 expression was examined in H522, H358, PC3, DU145 and A-172 cells. First, cells were washed with PBS and lysed with RIPA buffer (50 mM Tris-HCl at pH 7.4, 1% Nonidet P-40, 0.25% sodium deoxycholate, 150 mM NaCl, 1 mM EDTA, 1 mM phenylmethylsulfonyl fluoride, aprotinin at 1 μg/mL, leupeptin at 1 μg/mL, pepstatin at 1 μg/mL, and 1 mM Na3VO4). The lysates were sonicated and centrifuged at 12,000g at 4°C for 30 minutes to remove the insoluble materials. The proteins were separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) in 8.5% polyacrylamide gels, and bands were blotted onto a polyvinylidene difluoride (PVDF) membrane. The membrane was blocked with 5% powdered skim milk in Tris-Tween 20 buffer (0.1 M Tris-HCl and 0.1% Tween-20, pH 7.4) for 1 hour at room temperature, followed by a 2-hour incubation with primary anti-Pten antibodies (1:1000 dilution, Santa Cruz), anti-NOLC1 antibodies (1:1000 dilution; Santa Cruz, CA), anti-FLAG (Santa Cruz), anti-β-actin antibodies (1:500 dilution; Santa Cruz) or antibodies for C-MET, GAB1, C-RAF, Akt, MAPK, p-MAPK, Stat3, pStat3 (Cell Signaling). The membrane was then washed three times with Tris-Tween 20 buffer and incubated with a horseradish peroxidase-conjugated secondary antibody specific for rabbit (anti-β-actin, 1:1000 dilution), mouse (anti-Pten, 1:1000 dilution), or goat (anti-NOLC1, 1:1000 dilution) for 1 hour at room temperature. The protein expression was detected with the ECL system (Amersham Life Science) according to the manufacturer’s protocols. Similar immunoblotting was also performed on protein extracts from prostate cancer samples PCa638T, PCa207T, PCa624T, PCa099T, PCa090T, and organ-donor prostates from individuals free of urological disease, DO12 and DO17. To immunoprecipitate, the cell lysates were incubated with A/G magnetic beads (Millipore) for 20 minutes to remove non-specific binding. 
The supernatants were incubated with 4-6 μg primary antibody at 4°C overnight; the A/G magnetic beads were then added to the immune complex and the reaction was rocked at 4°C for 2 hours. The magnetic beads were collected on a magnetic stand and washed 3 times with 1x PBS containing proteinase inhibitor. SDS-PAGE loading buffer was used directly to re-suspend and elute the proteins from the beads.

Phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3) phosphatase assay of Pten and Pten-NOLC1. To quantify the phosphatase activity of purified GST-Pten, Pten-NOLC1-FLAG or GST-Pten-NOLC1 on PI(3,4,5)P3, Pten-NOLC1-FLAG was immunopurified with FLAG antibody and GST-Pten-NOLC1 was purified with a GST column. The purified proteins were subjected to a competitive ELISA: 2 to 10 pmol of purified protein was incubated with 8 µM PI(3,4,5)P3 at 37°C for 1 hour. The reaction was then stopped by heating to 95°C for 3 min. The reaction was then transferred to the “Detection Plate” (Echelon, Inc, UT) and incubated at 37°C for 60 min. This was followed by washing the “Detection Plate” 3 times with PBS-Tween 20. Then 100 µl of the secondary detector provided by Echelon, Inc was added to the plate, which was incubated at room temperature for 30 min and washed 3 times with PBS-Tween 20. The color was then developed by adding 100 µl TMB solution provided by Echelon for 15 min. The amount of PI(4,5)P2 was quantified by reading the absorbance at 450 nm in a spectrophotometer and fitting to a standard curve of known amounts of PI(4,5)P2. The purified Pten protein (Echelon) in the reaction was used as a positive control, and GST-only protein, FLAG only or IgG only were used as negative controls.

Cell growth and cell cycle analysis. DU145 cells or their Pten-NOLC1 knockout counterparts were used in the colony formation analyses, and 1000 cells of each clone were plated in each well in triplicate. Individual cells were allowed to grow and form colonies for 7-10 days. The colonies were then fixed with ice-cold methanol for 10 minutes and stained with 0.025% crystal violet for 15 minutes. The colonies were imaged and counted. For cell cycle analysis, the FITC BrdU flow kit (BD Biosciences) was used. DU145, DU145 KO1 or KO2 cells were synchronized by removing FBS from the culture for 48 hours, followed by feeding back 10% FBS and BrdU for 4 hours. The cells were then harvested for analysis with FITC-BrdU antibody and propidium iodide nuclear staining (BD Biosciences). The cells in different cell cycle phases were analyzed by flow cytometry (BD Facscalibur). Similar analyses were also performed for MCF7 cells versus mKO1 or mKO2, PC3-PNOL(T+) versus PC3-PNOL(T-), and NIH3T3-PNOL(T+) versus NIH3T3-PNOL(T-).

UV-induced cell death. Cultured DU145, dKO1, dKO2, MCF7 or mKO1 cells at 60-70% confluence were irradiated with UV at doses ranging from 50 mj to 200 mj. Seventeen hours later, these cells were harvested for apoptosis analysis. The Alexa Fluor 488-annexin V apoptosis assay kit (BD Biosciences) was used, and apoptotic cells were quantified by flow cytometry (BD Facscalibur). The same apoptosis analysis was also applied to PC3-PNOL(T+), PC3-PNOL(T-), NIH3T3-PNOL(T+) and NIH3T3-PNOL(T-) clones.

Pten-NOLC1 knockout cell death analysis. Cancer cell lines carrying the breakpoint of Pten-NOLC1 were used. Cells were treated with recombinant adenoviruses containing Cas9 D10A with gRNAs for Pten and mCherry-knockout cassette, or viruses containing Cas9 D10A with gRNAs for Pten donor cassette, or viruses containing Cas9 D10A only as a control. These viruses were applied to the cell cultures for 18-24 hours once the cultures reached 70-80% confluence. The culture medium was then changed. The cells were incubated for 3 days for Pten-NOLC1 disruption to occur. The cells were then harvested for cell death analysis. The PE-annexin V apoptosis assay kit (BD Biosciences) was used: the cells were re-suspended in 100 μl of annexin binding buffer (Invitrogen) and incubated with 5 μl of PE-conjugated annexin V and 5 μl of propidium iodide for 20 min in the dark at room temperature. The binding assays were terminated by addition of 400 μl of annexin binding buffer. FACS analysis was performed using a BD Facscalibur (BD Biosciences, San Jose, CA). Ten thousand cells were acquired and sorted. WinMDI 2.9 software (freeware from Joseph Trotter) was used to analyze the data.

Cell motility and invasion assays. An estimated 5x10^4 cells were plated in each Matrigel chamber (24-well) and in a control well without a Matrigel membrane. Twenty-two hours later, the cells on top of the membrane were wiped away, the cells that had migrated through the chamber membrane or the dividers of the controls were stained with H&E, and the cells were counted. The number of cells that invaded through the membrane was normalized to the number of cells that migrated through the divider in the control. Triplicates of each clone were included and the comparisons were made between clones of DU145 and DU145KO1 or DU145KO2; MCF7 and MCF7KO1 or KO2; PC3-PNOL(T+) and PC3-PNOL(T-); NIH3T3-PNOL(T+) and NIH3T3-PNOL(T-).

Sleeping beauty/transposon mediated hydrodynamic transfection of Pten-NOLC1. The pT3-EF1α-Pten-NOLC1-Cherry plasmid was transfected into PC3 cells, and the expression of Pten-NOLC1-Cherry protein was confirmed by the mCherry fluorescent tag. pT3-EF1α-Pten-NOLC1-Cherry (20 μg) and SB (1 μg) were then pooled in 2 ml saline. Hydrodynamic injections of the pooled plasmids were performed for transfection of Pten-NOLC1-mCherry in the liver (Chen et al., 2017a; Liu et al., 1999). loxP-Pten mice aged 4 to 6 weeks received a peritoneal injection of 10^10 of AAV8-Cre (Penn Vector Core, University of Pennsylvania) before the transfection.

Chromatin immunoprecipitation (ChIP). The Pten-NOLC1 knockout clones dKO1, dKO2, mKO1 and mKO2 and their parental cells, DU145 and MCF7, were used for ChIP analysis. The MagnaChIP A/G kit (Millipore, USA) was used and the manufacturer’s protocol was followed. Briefly, 2-3 x 10^5 cells at 80% confluence in culture were incubated in cold 4% formaldehyde at room temperature for 10 minutes to cross-link proteins and DNA, followed by quenching with glycine for 5 minutes at room temperature, washing 3 times with ice-cold 1x PBS, and spinning down the cell pellets. The cells were then lysed with lysis buffer containing protease inhibitors on ice for 15 minutes and centrifuged at 800g at 4°C for 5 minutes. The pellet was further lysed with nuclear lysis buffer. The nuclear lysates (protein/DNA) were then sheared by sonication to DNA fragments of 100-800 bp. A Focused-ultrasonicator M220 (COVARIS) was used with settings of power 7.5, 200 bursts/cycle and factor 10%. The fragmented and crosslinked chromatin was immunoprecipitated with antibodies against the C-terminus of NOLC1 or the N-terminus of Pten. A/G magnetic beads were used to collect the immune complex including DNA fragments. The DNA was then eluted and purified for library preparation (Illumina, CA). Similar assays were also performed on lysates from PC3-PNOL-Flag and RWPE1-PNOL-Flag, their controls PC3-Flag and RWPE1-Flag, and PC3, using antibodies specific for the FLAG tag.

ChIP sequencing: The manufacturer's recommended procedure was followed (Illumina). The quantity and size of the DNA fragments were analyzed in a Bioanalyzer (Agilent); the DNA was first blunt-ended with the end-repair reagents, followed by 3’ end adenylation and ligation with indexing adaptors; the modified DNA was purified with Ampure XP magnetic beads and resolved in a 2% agarose gel with SYBR Gold. DNA fragments of 250 to 300 bp were excised, purified with the MinElute Gel Extraction Kit, and enriched with 17 heat cycles of PCR with TruSeq™ reagents (Illumina). Each library was quantified again in an Agilent Bioanalyzer and normalized to 2 nM for loading. Sequencing on the Illumina HiSeq 2500 followed the manufacturer’s standard protocol. Two lanes of 70 G sequencing capacity were used for sequencing 15 ChIP samples.

Identification of peaks of ChIP-enriched reads. The ChIP-seq data were aligned to human genome reference hg19 with the Burrows-Wheeler Aligner (BWA) (Li and Durbin, 2009). Peaks were called for each individual sample with Model-based Analysis of ChIP-Seq (MACS) (Zhang et al., 2008). Significant peaks were defined as the DNA regions where reads are enriched compared to the local background alignment. The p-values were adjusted by the Benjamini-Hochberg procedure to account for multiple hypothesis testing, and the FDR was set to 0.05 to define significance (Benjamini and Hochberg 1995). Significant peaks were further compared between pairs of samples to detect differential peaks (supplementary table 1). Reads aligned to a given peak region were extracted with SAMtools for further analysis (Li et al., 2009).
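For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the following Python sketch shows the standard step-up calculation on a list of peak p-values. It is a generic illustration and not the MACS implementation; the function names and the example p-values in the usage note are hypothetical, and only the 0.05 FDR cutoff is taken from the text.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_peaks(pvalues, fdr=0.05):
    """Indices of peaks whose adjusted p-value is at or below the FDR cutoff."""
    adjusted = benjamini_hochberg(pvalues)
    return [i for i, q in enumerate(adjusted) if q <= fdr]
```

For the made-up p-values 0.001, 0.2, 0.03 and 0.04, the adjusted values are approximately 0.004, 0.2, 0.053 and 0.053, so only the first peak would pass an FDR of 0.05 even though three raw p-values are below 0.05.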

All programming for plotting and statistical analysis was implemented in R. Both the global Manhattan plot and the local plot of reads upstream of the transcription start site (TSS) of specific genes (supplemental figure 4) were generated. The global Manhattan plot was generated using the R package “qqman” (bioRxiv DOI: 10.1101/005165). Subtraction results were obtained by subtracting overlapped peaks (extending the peaks by a range of +/- 2 kb while matching) in knockout samples from wild-type samples. Peaks from the subtraction results (specifically “DU145-(DKOR1+DKOR2)” + “MCF7-(MKO1+MKO2)”) were annotated to genes and analyzed through Ingenuity Pathway Analysis (IPA®, QIAGEN Redwood City, www.qiagen.com/ingenuity), and the top enriched pathways were reported. Once the differentially enriched reads in wild-type DU145 were obtained and annotated, the reads in the promoter regions of several identified genes, MET, EGFR, AXL, VEGFA, RAF1 and GAB1, were further analyzed. All the reads of each gene, such as MET, identified in 8 wild-type samples and 7 knockout or other negative controls were plotted along the candidate promoter region, -4 kb to +500 bp of the TSS. The reads from different wild-type samples frequently mapped to nearby genome locations. Two or more reads mapped within 100 bp of each other were combined and shown as 2 to 3 or more reads at the same location in the plot. The read sequences in the densely distributed region for the wild-type group samples were used to design Taqman PCR primers and probes as listed in Table 17. The enriched DNA elements were validated through Taqman PCR using the ChIP samples and resolved on a 3% agarose gel.
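The peak subtraction step described above can be sketched as follows. The original analysis was implemented in R; this Python sketch is only an illustration, with a hypothetical function name and peak representation (chromosome, start, end). It assumes that the +/- 2 kb extension is applied to the knockout peaks before testing for overlap, which the text does not specify.

```python
def subtract_peaks(wild_type_peaks, knockout_peaks, extend=2000):
    """Return wild-type peaks with no matching knockout peak.

    Peaks are (chromosome, start, end) tuples. A wild-type peak is removed
    when it overlaps a knockout peak extended by `extend` bp on both sides
    (assumption: the extension is applied to the knockout peaks).
    """
    extended = {}
    for chrom, start, end in knockout_peaks:
        extended.setdefault(chrom, []).append((start - extend, end + extend))

    kept = []
    for chrom, start, end in wild_type_peaks:
        overlaps = any(
            start <= ko_end and end >= ko_start
            for ko_start, ko_end in extended.get(chrom, [])
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept
```

With made-up coordinates, a wild-type peak at chr7:1000-1500 is removed by a knockout peak at chr7:3000-3400 because the extended knockout interval (1000-5400) touches it, while a wild-type peak at chr7:10000-10500 is retained for annotation.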

7.3 RESULTS

Pten-NOLC1 fusion is the result of chromosome 10 rearrangement.

To investigate the mechanism underlying Pten-NOLC1 fusion transcript formation, fluorescence in situ hybridization (FISH) analysis was performed on the prostate cancer samples where the Pten-NOLC1 fusion transcript was detected, using two probes corresponding to the 5’ end of the Pten genome sequence (spectrum red) and to the 3’ end of NOLC1 (spectrum green). The results showed that the signals of Pten and NOLC1 overlapped to form a single hybridization signal (yellowish) in the cancer cells. An independent NOLC1 signal (green) was also clearly identified, but the wild-type Pten signal (red) was absent in this prostate cancer case. This is in contrast to the two distinct pairs of separate signals for Pten (red) and NOLC1 (green) visualized in normal organ donor prostate tissue. The co-existence of Pten-NOLC1 genome recombination and hemizygous Pten deletion in the cancer genome suggests a complete functional loss of Pten alleles in this cancer.

To identify the location of the chromosomal breakpoint for Pten-NOLC1, a series of nested-long-extended PCR was performed on the genomic DNA of this prostate cancer sample, using primers corresponding to the genome sequences adjacent to the exons flanking the breakpoint juncture.

The PCR product was purified and subjected to Sanger sequencing. The sequencing results of these PCR products (Figure 17C) indicated that the genome breakpoint of Pten-NOLC1 was located between intron 11 of Pten and intron 1 of NOLC1, with an 8 bp (TAGCTGGG) overlapping sequence shared by both the Pten and NOLC1 introns. In the subsequent Pten-NOLC1 breakpoint analysis, we found that the genomes of cancer cell lines LNCaP, DU145, VCaP, HEP3B, MCF7 and PC3 contain the identical breakpoint sequence, even though they are from different types of malignancies with diverse biological features. These results suggest that a common recombination mechanism underlies Pten-NOLC1 fusion formation in human cancer genomes.

Pten-NOLC1 is highly recurrent in human malignancies.

To investigate whether Pten-NOLC1 fusion is frequent in human cancers, Taqman RT-PCR analyses were performed on 26 cancer cell lines from 6 different types of human cancers using the primers and probe shown in Figure 20A, and the Pten-NOLC1 fusion transcript was found in most tumor cell lines tested, including 4 prostate cancer cell lines (PC3, DU145, LNCaP and VCaP), lung cancer cell lines (H358, H1299, H522 and H23), breast cancer cell lines (MDAMB231, VACC3133, MDA-MB330 and MCF7), liver cancer cell lines (HUH7, HEP3B, SNU449, SNU475, SNU375, SNU182 and HEPG2), glioblastoma cell lines (A-172, LN229, T98G, U138 and U118), and colon cancer cell lines (HCT8 and HCT15). Pten-NOLC1 fusion was not detected in 20 normal organ donor prostate samples and 10 blood samples from prostate cancer patients. To investigate whether Pten-NOLC1 is present in primary human cancer samples, Taqman RT-PCR analyses were performed on 1030 cancer samples from 8 different types of human malignancies. The results showed that Pten-NOLC1 is extensively present in these cancers: 70.1% (337 of 481) of PCa, 83% (50 of 60) of breast cancer, 75% (45 of 60) of colon cancer, 83% (127 of 153) of GBM, 82.9% (58 of 70) of liver cancer, 75.2% (109 of 145) of NSCLC, 70.5% (43 of 61) of ovarian cancer and 85% (29 of 34) of esophageal adenocarcinoma samples (Figures 20B, 18 and 19 and Tables 7-12).

To investigate whether this widely present Pten-NOLC1 transcript is translated into a protein in cancer cells or tissues, western blots were performed on tumor samples with antibodies specific for Pten. The Pten-NOLC1 fusion has a projected molecular weight of ~120 kDa, significantly larger than Pten (48.3 kDa). In the blotting of protein extracts from cell lines H522, H358, PC3, DU145 and A-172, a 110 kDa band was readily detected in cancer cell line samples by antibodies against Pten (Figure 20C, lanes 8-12). Pten-NOLC1 protein was also detected in primary cancer tissue samples that were positive for Pten-NOLC1 mRNA, but not in healthy organ donor prostate samples where Pten-NOLC1 fusion was absent (Figure 20C, lanes 1-7). Taken together, Pten-NOLC1 is a somatic genome rearrangement product and was recurrently detected as a fusion transcript and chimeric protein in human cancers of different origins.

Pten-NOLC1 is located only in the nucleus and has no phosphatase function.

Cytoplasmic PIP3 is the specific target of Pten phosphatase, while NOLC1 is exclusively a nucleolar protein. Thus, it is of interest to identify the subcellular location of the Pten-NOLC1 fusion protein, since it has significant implications for the function of the fusion protein. A cDNA of Pten-NOLC1 was created and cloned into a mammalian expression vector to create pCDNA4-Pten-NOLC1-FLAG, which was transfected into NIH3T3 cells to analyze the location of Pten-NOLC1 by immunostaining. As shown in Figure 22, when an antibody specific for the FLAG tag was used to analyze NIH3T3 cells transfected with pCDNA4-Pten-NOLC1-FLAG, the Pten-NOLC1 signal was exclusively localized in the nucleus, similar to the nuclear staining with NOLC1 antibody, while Pten antibody showed a diffuse distribution of Pten in NIH3T3 cells covering both cytoplasm and nucleus. Similar results were also observed in PC3 cells transfected with pCDNA4-Pten-NOLC1-FLAG (data not shown). In addition, pCDNA4 vectors expressing proteins with fluorescent tags were created to trace the expression of chimeric Pten-NOLC1-EGFP, truncated Pten del343-403-EGFP and truncated NOLC1 del1-40-mCherry in PC3 cells. When the vectors were separately transfected into PC3 cells, both Pten-NOLC1-EGFP and NOLC1-mCherry proteins were exclusively localized in the nucleus, while Pten-EGFP was mostly in the cytoplasm and only rarely (in one cell) in the nucleus during the mitotic stage. Further, when nuclear or cytoplasmic fractions of DU145 lysates were blotted with Pten antibodies, the Pten band (48 kDa) was found exclusively in the cytoplasmic fraction, while Pten-NOLC1 protein was detected in the nuclear fraction (Figure 22B), as evidenced by the detection of the same protein band with antibodies against Pten and FLAG, respectively, during Western blot analysis.

Since the phosphatase domain of Pten is intact in the Pten-NOLC1 fusion protein while the lipid-binding C2 domain of Pten is truncated, it is unclear whether Pten-NOLC1 has phospholipid phosphatase activity. An ELISA assay was performed (Echelon Pten Activity Elisa kit) to test the capability of purified recombinant GST-Pten or GST-Pten-NOLC1 to convert PI(3,4,5)P3 to PI(4,5)P2. The results showed that GST-Pten produced 28 pmol of PIP2 (per ng of GST-Pten). In contrast, GST-Pten-NOLC1 failed to produce measurable PIP2 (Figure 22C). Similar results were found for cellularly produced Pten or Pten-NOLC1: antibodies specific for Pten, NOLC1 or FLAG were used in immunoprecipitation to purify Pten-NOLC1 or Pten-related proteins from Pten-NOLC1-expressing PC3 clones, but none of these immunopurified proteins showed any phosphatase activity. The failure of endogenous Pten in PC3 cells to show PIP3 phosphatase activity probably reflects a mutated enzyme in the cancer cell line (14). In conclusion, Pten-NOLC1 resides exclusively in the nucleus and lacks the lipid phosphatase function of wild-type Pten.

Pten-NOLC1 promotes tumor growth and increases tumor survival.

Next, the impact of Pten-NOLC1 expression on tumor growth was tested. Most cancer cell lines of different organ origins tested showed weak to moderate expression of Pten-NOLC1 (Figure 20A). Specific knockdown of the Pten-NOLC1 fusion by siRNA could be used to investigate the impact of Pten-NOLC1 on cell growth. However, such an approach may not be feasible because of the sequences shared between Pten-NOLC1 and Pten, or between Pten-NOLC1 and NOLC1.

First, Pten-NOLC1 genome knockouts in DU145 cells and MCF7 cells were prepared using the CRISPR-Cas9 D10A approach. As shown in Figure 23A, the genome breakpoint of Pten-NOLC1 was targeted by two gRNAs flanking the breakpoint. The gRNA-directed Cas9 D10A nickase then nicked the target sites (gRNA-Cas9 D10A vector), and a promoterless but ribosome binding site-containing zeocin-mCherry cDNA (donor vector) was inserted into the breakpoint through homologous recombination. When both vectors were transfected into DU145 (prostate cancer cells), MCF7 (breast cancer cells) and other cancer cell lines, successful interruption of the Pten-NOLC1 fusion was indicated by zeocin resistance and mCherry fluorescence. Pten-NOLC1 knockout clones were obtained through zeocin resistance selection. As shown in Figure 23B, clones of DU145 and MCF7 showing knockout of Pten-NOLC1 expression were obtained.

Subsequently, these clones were analyzed for cell cycle characteristics. As shown in Figure 23C, the cell population in S phase decreased from 23.3% to 13% (p<0.05) and 8% (p<0.05) in Pten-NOLC1 knockout clones DU145KO1 and DU145KO2, respectively, in comparison with their parent DU145 cells, and from 23% to 15% (p<0.05) in MCF7KO1 cells, while the G0/G1 phase of these Pten-NOLC1 knockout cells moderately increased. In contrast, forced expression of Pten-NOLC1 in NIH3T3 and PC3 cells increased S-phase entry for these cells (7% to 26% for NIH3T3, p<0.05; and 9% to 32% for PC3, p<0.05). Knockout of Pten-NOLC1 produced an 8-13 fold decrease (p<0.05) in colony formation for DU145 cells and a 2.4 fold drop (p<0.05) for MCF7 cells (Figure 23D). The finding was reversed when Pten-NOLC1 was forcibly expressed in NIH3T3 (2.83 fold increase, p<0.05) and PC3 (2.8 fold increase, p<0.05) cells. Separately, we performed a UV-induced cell death analysis of these DU145 and MCF7 KO cells. As shown in an example of FACS analysis of Alexa Fluor 488 annexin V binding assays, 36% of the parental DU145 cells treated with 175 mj UV died versus 56% of their knockout counterparts (DU145-KO1; p<0.05), suggesting that the presence of Pten-NOLC1 increases the resistance of cancer cells to UV radiation. Indeed, the removal of Pten-NOLC1 made the cells more sensitive, decreasing the ED50 of DU145 cells from 174 mj to 135 mj and the ED50 of MCF7 cells from 175 mj to 127 mj (Figure 23E). To investigate the impact of Pten-NOLC1 on cancer invasiveness, a Matrigel traverse analysis was performed. As shown in Figure 23F, removal of Pten-NOLC1 from DU145 tumor cells reduced the invasion index by 3.8-4 fold. To test the aggressiveness of these cells in vivo, 5x10^6 of these cells were subcutaneously grafted in the left flank region of SCID mice. The DU145 tumors grew rapidly and displayed a visible bump on the flank region in the second week after grafting; by the end of the observation period, the bump had grown to almost 35 times its size in the second week, while the grafts of the two Pten-NOLC1 KO clones of DU145 only became visible in the 4th week after xenografting (Figure 23G). 
Most of the mice with DU145 tumors had metastatic lesions and ascites, while only one animal from the knockout groups showed such signs (Figure 23H). All animals xenografted with DU145 cells devoid of Pten-NOLC1 survived the 6-week period of xenografting, while 42% (3/7) of SCID mice xenografted with DU145 tumors died during the same period. MCF7 cells failed to migrate in the Matrigel, and their xenografts did not grow in SCID mice. Taken together, these results indicate that Pten-NOLC1 fusion contributes significantly to tumor growth and invasion.

Overexpression of c-met, GAB1 and EGFR is dependent on Pten-NOLC1.

NOLC1 is a cofactor of RNA polymerase I (12), and has been implicated as a co-transcription factor (15, 16). To test whether fusion between NOLC1 and Pten may alter its transcription spectrum, DNA fragments from chromatin immunoprecipitations of DU145 and MCF7 cells and their knockout counterparts using NOLC1 antibodies were sequenced (ChIPseq). As shown in Figure 28A, 6179 DNA fragment peaks were detected in DU145 cells. The peak events were reduced to 2868 in the 2 Pten-NOLC1 knockout clones. Similar results were also identified in MCF7 cells: 4347 peaks in total in MCF7, and 1196 peaks in MKO1 and MKO2. The presence of the Pten domain in NOLC1 produced 2.2 to 3.6-fold more genome binding regions, significantly broadening NOLC1 DNA binding activity. To verify whether these DNA binding activities are the direct result of Pten-NOLC1 fusion, Pten-NOLC1-FLAG was forcibly expressed in PC3 and RWPE1 cells, and large numbers of DNA peaks were identified in FLAG antibody ChIPseq. Many peaks overlapped with those found in DU145 and MCF7 cells. Pathway analysis of these Pten-NOLC1-associated DNA fragments showed that genes from the ‘HGF signaling’ and ‘mechanism of cancer’ pathways are among the most impacted (Table 15). The enrichment includes DNA fragments from the promoter/enhancer regions of MET, EGFR, RAF1, AXL, GAB1 and VEGFA (Figure 29 and Figure 28B). To investigate the impact of removing Pten-NOLC1 from DU145 cells, gene expression microarray assays were performed on DU145 cells and their knockout counterparts. The results showed that more than 500 genes and transcripts were down-regulated in both DU145KO1 and DU145KO2 clones in comparison with parent DU145 cells. Among these genes, c-MET was downregulated 2.8-6 fold, EGFR 1.4-1.41 fold, GAB1 1.8-1.9 fold, AXL 1.8-2.3 fold and VEGFA 1.5-1.6 fold (Figure 24A). 
Similar down-regulation of the expression of these genes was also identified using quantitative RT-PCR (Figure 24B). Interestingly, similar findings were obtained in MCF7 cells with knockout of Pten-NOLC1 (Figure 24B). Dramatic downregulation of MET, EGFR, RAF1 and GAB1 protein expression was identified in DU145KO1, DU145KO2 and MCF7KO1 (Figure 24C and Figure 28C). Both RAF1 and STAT3 showed very little phosphorylation. These results suggest that activation of these signaling pathways is largely dependent on the presence of Pten-NOLC1. When the ChIPseq, microarray and immunoblot analyses were combined, we found that the expression of many molecules in the ECM, EGF and HGF signaling pathways is dependent on Pten-NOLC1 (Figure 28D). Indeed, disruption of Pten-NOLC1 expression in 9 different cancer cell lines including PC3, DU145, SNU479, SNU449, HEP3B, T98G, MCF7, MB231 and H1299 cells by knocking in the zeocin-cherry cassette produced large numbers of cell deaths (Figure 30 and Table 18), in contrast to Pten-NOLC1 negative NIH3T3 cells, where the impact was minimal. These results suggest that many cancer cells may be addicted to Pten-NOLC1 for survival.

Pten-NOLC1 fusion induces hepatoma in mice.

Conversion of Pten to Pten-NOLC1 represents a loss of wild-type Pten and creation of an oncogene in one event. The hypothesis was that this single event may be sufficient to generate cancer in mammals. To test this hypothesis, Pten was somatically knocked out from the liver of C57Bl loxPten+/+ mice through intra-peritoneal injection of AAV8-cre. This was followed by hydrodynamic tail vein injection of pT3-Pten-NOLC1-mCherry and pSB so that about 1-5% of hepatocytes were transfected with the Pten-NOLC1-mCherry gene (Figure 25A). Within 18 weeks, six of seven mice developed hepatocellular carcinoma, showing distinctive tumor nodules in the liver (Figure 25B). One mouse developed cancer metastasis in the peritoneal cavity. Two animals had ascites accumulation. In contrast, none of the control mice with somatic Pten knockout and pT3/pSB injection developed signs of tumor in the same period. These tumor cells have large nucleoli and contain significant fatty accumulation in the cytoplasm, probably due to Pten gene knockout. The tumor cells also displayed a high frequency of Ki-67 staining, 4.6 fold more frequent than that of non-cancerous cells (Figure 25D). Immunoblot analyses showed Pten-NOLC1-mCherry protein expression in all the cancer samples, detected by antibodies specific for Pten or NOLC1, but the protein was absent in pT3/pSB transfected and AAV8-cre treated mouse liver tissues (Figure 25E). A dramatic up-regulation of c-MET and GAB1 was detected in all the tumor tissues, including the metastatic cancer sample. Thus, these results indicate that a single event of Pten-NOLC1 fusion creation is sufficient to drive the development of spontaneous liver cancer, and that over-activation of the c-MET signaling pathway contributes significantly to the oncogenic activity of Pten-NOLC1.

7.4 DISCUSSION

This example shows that there is a widespread presence of Pten-NOLC1 fusion in human cancers, which suggests a fundamental role of Pten-NOLC1 fusion in cancer development. The positive rate of Pten-NOLC1 in cancer metastasis is even higher, reaching 90% in some types of cancers (Figure 21). Despite the high frequency of Pten-NOLC1 fusion, to our knowledge, this is the first report to describe such a fusion in human cancers. In screening transcriptome sequencing data sets of 2586 cancer samples covering 17 different cancer types from TCGA, we failed to detect the presence of the Pten-NOLC1 fusion transcript. Intrigued by the discrepancy, we investigated the possible differences between the two data sets by comparing the normalized numbers of read alignments for the head or tail exon of the Pten or NOLC1 gene. The analysis showed that a significantly smaller proportion of mapped reads aligned to the first exon of either the Pten or NOLC1 gene in the TCGA prostate cancer data set in comparison with our data (Figure 26). Since detection of Pten-NOLC1 requires a large number of exon 1 mapped reads for NOLC1, 3’ end bias and relatively low sequencing coverage may impede the detection of the Pten-NOLC1 fusion transcript.

Creation of the Pten-NOLC1 fusion is the result of a large deletion (14.2 MB) of chromosome 10 sequence spanning the region between exon 11 of Pten and exon 1 of NOLC1. Such a deletion is readily detectable by copy number analysis. Indeed, among 12 Pten-NOLC1 fusion-positive prostate cancer samples that were also analyzed by Affymetrix SNP6.0 array, all showed a similar 14.2 MB deletion spanning exon 11 of Pten and exon 1 of NOLC1 (Figure 27). In contrast, 10 prostate cancer samples that were negative for Pten-NOLC1 fusion were also negative for such a spanning deletion. Thus, a spanning deletion between exon 11 of Pten and exon 1 of NOLC1 may serve as a surrogate indicator for the presence of Pten-NOLC1 fusion. We subsequently applied this analysis to SNP6.0 data sets from the TCGA database. Large numbers of cancer samples were found to contain such a spanning deletion, ranging from 2% in the thyroid cancer data set to 79% in the glioblastoma multiforme data set (Figure 27A-C). The wide presence of the Pten-NOLC1 spanning deletion in cancer genomes suggests the presence of the Pten-NOLC1 fusion gene in these cancer samples.
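As a rough illustration of using a spanning deletion as a surrogate indicator, the following Python sketch checks whether copy-number loss segments cover most of a target interval. It is a sketch under stated assumptions, not the analysis pipeline used here: the `Segment` type, the function `has_spanning_deletion`, the log2-ratio threshold, the coverage cutoff and the coordinates are all hypothetical placeholders (the coordinates are not real genome positions; only the 14.2 Mb interval length follows the text).

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    log2_ratio: float  # segment mean copy-number log2 ratio


def has_spanning_deletion(segments, chrom, left, right,
                          loss_threshold=-0.3, min_cover=0.9):
    """True if loss segments cover at least min_cover of [left, right)."""
    if right <= left:
        raise ValueError("right must be greater than left")
    # Clip every loss segment on the chromosome to the target interval.
    clipped = sorted(
        (max(s.start, left), min(s.end, right))
        for s in segments
        if s.chrom == chrom and s.log2_ratio <= loss_threshold
        and s.end > left and s.start < right
    )
    covered, cursor = 0, left
    for a, b in clipped:
        a = max(a, cursor)  # skip bases already counted
        if b > a:
            covered += b - a
            cursor = b
    return covered / (right - left) >= min_cover


# Placeholder coordinates: an interval of 14.2 Mb on chromosome 10.
LEFT, RIGHT = 1_000_000, 15_200_000

deleted = [Segment("chr10", 900_000, 16_000_000, -0.8)]
intact = [Segment("chr10", 1_000_000, 5_000_000, -0.8),
          Segment("chr10", 5_000_000, 15_200_000, 0.02)]

print(has_spanning_deletion(deleted, "chr10", LEFT, RIGHT))
print(has_spanning_deletion(intact, "chr10", LEFT, RIGHT))
```

Requiring coverage of most of the interval, rather than any overlap, is a design choice that keeps a short focal loss inside the region from being counted as a spanning deletion.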

Pten deletion and mutations occur in many human cancers. These mutations result in loss of lipid phosphatase activity and produce genome instability. Here, we identified a Pten-derived fusion molecule that functionally promotes cancer cell growth through the MET signaling system. Elevated MET is known to facilitate MET-FAS interactions on the membrane to increase survival (17). Pten-NOLC1 may increase the survival of cancer cells by elevating the expression level of c-MET such that it abrogates the impact of FAS cell death signaling. This may explain the Pten-NOLC1-induced resistance to cell death from UV irradiation. Interestingly, many cancer samples with hemizygous Pten deletion are also positive for Pten-NOLC1 (Figure 27C). As a result, these cancers are devoid of functional Pten protein, since Pten-NOLC1 is negative for phospholipid phosphatase activity and is translocated to the nucleus. These cancers may also have over-activated PI3K/Akt signaling due to the lack of deactivation of PIP3 (1, 4, 5). The analysis showed that a single event of creation of Pten-NOLC1 is sufficient to produce spontaneous liver cancer in animals in a short period of time. The Pten-NOLC1 fusion may have significant clinical implications. Clinical trials using drugs targeting the Pten signaling pathway have been initiated (18, 19). Recently, a genome therapy strategy targeting the chromosomal breakpoint of fusion genes through the CRISPR-Cas9 system has been developed (20). Using this approach, partial remissions in animals xenografted with cancers positive for fusion gene breakpoints have been achieved. Unlike other fusion genes, the breakpoints of Pten-NOLC1 from primary cancer samples or cancer cell lines of different organ origins appear identical, probably due to the utilization of the same anchoring sequence in the chromosome recombination process. This may give significant ease in designing a targeting strategy toward the chromosomal breakpoint of this fusion gene for all the cancers that contain Pten-NOLC1. Thus, the discovery of the Pten-NOLC1 fusion gene may lay an important foundation for future cancer treatment.
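To make the idea of targeting a shared chromosomal breakpoint concrete, the following Python sketch lists forward-strand candidate protospacers that straddle a junction. It is only an illustration under stated assumptions: it assumes an SpCas9-style NGG PAM and a 20-nt protospacer, ignores the reverse strand and off-target scoring, and uses a made-up junction sequence. The function `guides_spanning_breakpoint` is hypothetical and does not describe the guide design used in this work.

```python
import re


def guides_spanning_breakpoint(seq, breakpoint, guide_len=20):
    """List (start, protospacer, pam) for forward-strand NGG sites whose
    protospacer straddles the breakpoint.

    breakpoint is the 0-based index of the first base contributed by the
    downstream fusion partner.
    """
    seq = seq.upper()
    hits = []
    # The lookahead lets overlapping PAM candidates be reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - guide_len
        if start < 0:
            continue
        # Keep guides with bases on both sides of the junction.
        if start < breakpoint < pam_start:
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


# Made-up junction: 15 upstream bases, then downstream bases with one PAM.
junction = "A" * 15 + "C" * 15 + "TGG" + "A" * 5
print(guides_spanning_breakpoint(junction, breakpoint=15))
```

Because a protospacer that spans the junction exists only in cells carrying the fusion, a guide chosen this way would in principle not match either unrearranged parental sequence in full.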

Table 6 fusionListFilter

Attorney Ref. No.072396.0685 Table 8


Evidence for recurrent malignancy: No (0), Yes (1) or unknown (Blank) | Patient Survival Days
1 | 276
N/A | 8783
0 | 45
0 | 489
0 | 2954
1 | 265
1 | 1014
0 | 19
N/A | 6737
1 | 394
1 | 856
0 | 176
1 | 1776
N/A | 6326
0 | 227
1 | 381
1 | 638
1 | 2435
0 | 174
0 | 3376
0 | 40
0 | 3019
0 | 61
0 | 3190
N/A | 4960
1 | 1695
0 | 3835
N/A | 1500
1 | 678
N/A | N/A
N/A | N/A
N/A | 1473
N/A | N/A
N/A | N/A
N/A | 1317

Pten-NOLC1 status | Sex | Age at Op. | Primary Cause of Death | Evidence for recurrent malignancy: No (0), Yes (1) or unknown (Blank) | Patient Survival Days
P | M | 57 | N/A | N/A | 1365
P | M | 54 | N/A | N/A |
P | M | 51 | N/A | N/A | 792


path_n_reg path_m_reg os_months
0 0 107.9333333
0 0 72.53333333
0 0 14.13333333
0 0 148.2
2 0 7.033333333
0 0 75.83333333
0 0 55.93333333
0 0 7.7
0 0 138.7333333
0 128.3666667
0 0 17.46666667
0 0 19.73333333
1 20
1 0 74.2
0 0 8.033333333
0 9.633333333
1 0 19.8
0 0 21.9
2 0 7.333333333
0 20.23333333
1 0 100
0 0 72
1 99.83333333
1 0 96.36666667
0 0 67.13333333
0 0 0.7
1 0 8.133333333
1 13.9
0 0 2.233333333
0 1 6.766666667
10.43333333
0 0 101.6666667
0 0 139.3666667
1 0 4.833333333
1 0 4.833333333
0 0 62.46666667
1 0 18.26666667
2 0 23
2 0 5.9
0 37.23333333
0 0 67.7
0 0 14.23333333
1 92.4
1 0 107.1
0 0 119.2333333
1 0 5.5
0 17.83333333
0 0 94.76666667
0 0 107.9666667
0 0 80.1
43.8
0 0 3.2
0 0 64.13333333
1 0 7
0 0 50.3

ale 65 3A 3 2 19.16666667
male 68 1B 2 0 0 100.6666667
ale 76 4 1 0 30.86666667
ale 75 1B 2A 0 54.36666667
ale 68 1A 1A 0 0 61.6
male 75 1A 1 0 0 62.8
ale 66 3A 1 2 32.06666667
ale 84 1B 2 1 0 41.7
male 55 1B 4 1 0 60.83333333
male 66 1B 2 1 0 54.16666667
ale 84 1B 2 0 21.96666667
ale 56 2 1 0 15.36666667


clinstage | castatusdesc | OS_tumor_size_registry
 | … tumor |
3a | Evidence of this tumor | 40
1b | No evidence of this tumor | 25
3a | No evidence of this tumor | 15
1b | No evidence of this tumor | 35
2b | No evidence of this tumor | 18
1a | No evidence of this tumor | 18
1b | No evidence of this tumor | 9
3b | No evidence of this tumor | 19
2b | No evidence of this tumor | 33
2b | No evidence of this tumor | 32
1a | Evidence of this tumor | 15
2b | No evidence of this tumor | 24
1a | No evidence of this tumor | 28
3a | Evidence of this tumor | 40

male/female | t_stage | n_stage | m_stage | clinstage | castatusdesc | OS_tumor_size_registry
F | T2 | N0 | MX | 1b | No evidence of this tumor | 25
F | T4 | N0 | MX | 3b | Evidence of this tumor | 19
M | T1 | N0 | MX | 1a | No evidence of this tumor | 10
M | T4 | N0 | MX | 3b | Evidence of this tumor | 35


Clinical M | Clinical Stage Group | Pathologic T | Pathologic N
2B 3 1A 2B 3 1A 2A 1C 2 1 1B 1A 1 1C 1A 2A 2 2A 2A 3 1M 4 2 1A 1 2 1A 2A 0 0
0 0 IS 0 99 0 2A 2 2A 0 2B 1B 2A 0 1 1C 1M 0 1 1B 0 0 1 1C 0 0 2A 2 0 0 1 1C 0 0 0 1C 0 0 1 1C 0

Clinical N | Clinical M | Clinical Stage Group | Pathologic T | Pathologic N
0 | 0 | 2A | 2 | 0
0 | 0 | 1 | 3 | 1A
0 | 0 | 2A | 2 | 0
0 | 0 | 1 | 2 | 0
0 | 0 | 1 | 1C | 0
0 | 0 | 1 | 2 | 0
0 | 0 | 1 | 1C | 0
0 | 0 | 1 | 1C | 0
0 | 0 | 1 | 1C | 0

1st Recurrence-Desc | Months from Dx To 1st Recurrence | Survival in Months | Vital Status-Desc
Patient became disease-free after treatment and has not had a recurrence. | 0 | 12 | Dead
Patient became disease-free after treatment and has not had a recurrence. | 000 | 103 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 73 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 95 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 83 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 61 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 62 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 65 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 68 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 65 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 68 | Alive
Distant recurrence of an invasive tumor in multiple sites (recurrences that can be coded to more than one …ory 51–59 | 0 | 73 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 63 | Dead
Patient became disease-free after treatment and has not had a recurrence. | 0 | 66 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 62 | Alive
Distant recurrence of an invasive tumor in bone … This includes bones … than the primary site. | 68 | 70 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 80 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 75 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 79 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 72 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 0 | 89 | Alive

Clinical N | Clinical M | Clinical Stage Group | Pathologic T
1 | 0 | 3A | 3
1 | 0 | 2B | 1C
2 | 0 | 3A | 2
1 | 0 | 2B | 3
0 | 0 | 2A | 2
0 | 0 | 1 | 1C
0 | 0 | 1 | 1C
1 | X | 99 | 3
3 | 0 | 3C | 3
0 | 0 | 2B | 1C
1 | 0 | 2A | 1C
1 | 0 | 3B | 2
1 | 0 | 2B | 1A
0 | 0 | 1 | 2
0 | 0 | 1 | 1B
0 | 0 | 1 | 1C
0 | 0 | 1 | 2
0 | 0 | 1 | 2
0 | 0 | 1 | 2
0 | 0 | 1 | 1C
0 | 0 | 1 | 1B
0 | 0 | 2A | 1C
0 | 0 | 2B | 2
0 | 0 | 2B | X
0 | 0 | 1 | 1C
0 | 0 | 2A | 1A
0 | 0 | 1 | 1B
0 | 0 | 1 | 1B
0 | 0 | 1 | 1C
0 | 0 | 1 | 1B

1st Recurrence-Desc | Months from Dx to 1st Recurrence | Survival In Months | Vital Status-Desc
Distant recurrence of an invasive tumor in multiple sites (recurrences that can be coded to more than one …–59 | 018 | 028 | Dead
Patient became disease-free after treatment and has not had a recurrence. | 000 | 060 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 060 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 068 | Dead
Patient became disease-free after treatment and has not had a recurrence. | 000 | 063 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 061 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 063 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 061 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 060 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 063 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 058 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 062 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 061 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 063 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 062 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 063 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 065 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 064 | Alive
Patient became disease-free after treatment and has not had a recurrence. | 000 | 060 | Alive
Patient became disease-free after treatment and has not had a recurrence. | | |
Patient became disease-free after treatment and has not had a recurrence. | 000 | 058 | Alive

Pathologic Stage Group | ER Assay | PR Assay | HER2: Summary Result | Type 1st Recurrence-Desc | Months from Dx to 1st Recurrence | Survival In Months | Vital Status-Desc
1 | Positive/Elevated | Positive/Elevated | 988 | Patient became disease-free after treatment and has not had a recurrence. | 000 | 060 | Alive

Attorney Ref. No.072396.0685

Path

pM Stg Type 1st Surv Vital Grp Recurrence (Months) Status 0 1A Patient became 78 Dead disease-free after

treatment and has

not had a

recurrence.

0 3B Patient became 80 Alive disease-free after

treatment and has

not had a

recurrence.

0 3B Patient became 80 Alive disease-free after

treatment and has

not had a

recurrence.

0 1 Patient became 81 Alive disease-free after

treatment and has

not had a

recurrence.

0 1 Patient became 81 Alive disease-free after

treatment and has

not had a

recurrence.

0 2A Patient became 85 Alive disease-free after

treatment and has

not had a

recurrence.

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status 1C Patient became 85 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 85 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 85 Alive disease-free after

treatment and has

not had a

recurrence.

2C Patient became 76 Alive disease-free after

treatment and has

not had a

recurrence.

2C Patient became 76 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 80 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 80 Alive disease-free after

treatment and has

not had a

recurrence.

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status 99 Patient became 64 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 64 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 87 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 87 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 87 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 60 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 60 Alive disease-free after

treatment and has

not had a

recurrence.

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status 1C Distant systemic 84 Alive recurrence of an

invasive tumor only.

This includes

lymphoma,

leukemia, bone

marrow

1C Distant systemic 84 Alive recurrence of an

invasive tumor only.

This includes

lymphoma,

leukemia, bone

marrow

1C Distant systemic 84 Alive recurrence of an

invasive tumor only.

This includes

lymphoma,

leukemia, bone

marrow

1C Patient became 83 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 83 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 86 Alive disease-free after

treatment and has

not had a

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status recurrence.

99 Patient became 86 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 84 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 84 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 58 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 58 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 82 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 82 Alive disease-free after

treatment and has

not had a

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status recurrence.

1C Patient became 80 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 80 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 66 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 66 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 73 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 73 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 70 Alive disease-free after

treatment and has

not had a

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status recurrence.

1C Patient became 70 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 70 Alive disease-free after

treatment and has

not had a

recurrence.

3B Patient became 64 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 63 Alive disease-free after

treatment and has

not had a

recurrence.

99 Patient became 63 Alive disease-free after

treatment and has

not had a

recurrence.

3B Patient became 62 Alive disease-free after

treatment and has

not had a

recurrence.

3B Patient became 62 Alive disease-free after

treatment and has

not had a

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status recurrence.

99 Patient became 75 Dead disease-free after

treatment and has

not had a

recurrence.

99 Patient became 75 Dead disease-free after

treatment and has

not had a

recurrence.

1A Patient became 93 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 93 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 90 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 90 Alive disease-free after

treatment and has

not had a

recurrence.

1A Patient became 91 Alive disease-free after

treatment and has

not had a

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status recurrence.

2A Patient became 84 Alive disease-free after

treatment and has

not had a

recurrence.

0 1A Patient 89 became

disease- free after treatment and has not had a

recurrence.

1A Patient became 89 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 84 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 84 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 84 Alive disease-free after

treatment and has

not had a

recurrence.

1C Patient became 85 Alive

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status disease-free after

treatment and has

not had a

recurrence.

1C Patient became 85 Alive disease-free after

treatment and has

not had a

recurrence.

3C Since diagnosis, 7 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 7 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 7 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

99 Since diagnosis, 24 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

99 Since diagnosis, 24 Dead patient has never

Attorney Ref. No.072396.0685 Path

pM Stg Type 1st Surv Vital Grp Recurrence (Months) Status been disease-free.

This includes cases

with distant

metastasis at

3C Distant recurrence 39 Dead of an invasive tumor

in lymph node only.

Refer to the staging

scheme for a descri

3C Distant recurrence 39 Dead of an invasive tumor

in lymph node only.

Refer to the staging

scheme for a descri

3C Distant recurrence 39 Dead of an invasive tumor

in lymph node only.

Refer to the staging

scheme for a descri

3C Since diagnosis, 7 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at 4 Recurrence of an 75 Dead invasive tumor in

regional lymph

nodes only.

4 Recurrence of an 75 Dead invasive tumor in

regional lymph

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status nodes only.

4 Recurrence of an 75 Dead invasive tumor in

regional lymph

nodes only.

4 Recurrence of an 75 Dead invasive tumor in

regional lymph

nodes only.

4 Recurrence of an 75 Dead invasive tumor in

regional lymph

nodes only.

4 Recurrence of an 75 Dead invasive tumor in

regional lymph

nodes only.

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status 4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status metastasis at

4 Since diagnosis, 80 Alive patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 9 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 9 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 9 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 9 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 9 Dead patient has never

been disease-free.

This includes cases

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status with distant

metastasis at

4 Since diagnosis, 9 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Patient became 74 Alive disease-free after

treatment and has

not had a

recurrence.

3C Patient became 74 Alive disease-free after

treatment and has

not had a

recurrence.

4 Since diagnosis, 58 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 58 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 58 Dead patient has never

been disease-free.

This includes cases

with distant

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status metastasis at

4 Since diagnosis, 58 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 58 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 58 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Distant recurrence, 57 Dead to a site not listed in

46-62 or there is

insufficient

information

available to

3C Distant recurrence, 57 Dead to a site not listed in

46-62 or there is

insufficient

information

available to

3C Distant recurrence, 57 Dead to a site not listed in

46-62 or there is

insufficient

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status information

available to

3C Patient became 78 Alive disease-free after

treatment and has

not had a

recurrence.

3C Patient became 78 Alive disease-free after

treatment and has

not had a

recurrence.

4 Distant recurrence 34 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

4 Distant recurrence 34 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

4 Since diagnosis, 3 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 3 Dead patient has never

been disease-free.

This includes cases

with distant

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status metastasis at

4 Since diagnosis, 3 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 3 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 3 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Distant recurrence 38 Dead of an invasive tumor

in lymph node only.

Refer to the staging

scheme for a descri

3C Distant recurrence 38 Dead of an invasive tumor

in lymph node only.

Refer to the staging

scheme for a descri

3C Patient became 76 Alive disease-free after

treatment and has

not had a

recurrence.

3C Patient became 76 Alive

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status disease-free after

treatment and has

not had a

recurrence.

3C Distant recurrence 9 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Distant recurrence 9 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Distant recurrence 9 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Recurrence of an 70 Alive invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 70 Alive invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 70 Alive invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 58 Dead

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status invasive tumor in

adjacent tissue or

organ(s) only.

3C Recurrence of an 58 Dead invasive tumor in

adjacent tissue or

organ(s) only.

3C Recurrence of an 58 Dead invasive tumor in

adjacent tissue or

organ(s) only.

3C Since diagnosis, 16 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 16 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Regional 30 Dead recurrence, and

there is insufficient

information

available to code to

21–27.

4 Regional 30 Dead recurrence, and

there is insufficient

information

available to code to

21–27.

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status 4 Regional 30 Dead recurrence, and

there is insufficient

information

available to code to

21–27.

4 Regional 30 Dead recurrence, and

there is insufficient

information

available to code to

21–27.

4 Regional 30 Dead recurrence, and

there is insufficient

information

available to code to

21–27.

4 Since diagnosis, 10 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 10 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 10 Dead patient has never

been disease-free.

This includes cases

with distant

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status metastasis at

4 Since diagnosis, 2 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 2 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Recurrence of an 81 Dead invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 81 Dead invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 81 Dead invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 81 Dead invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 81 Dead invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 81 Dead invasive tumor in

Attorney Ref. No.072396.0685 Path

Stg Type 1st Surv Vital Grp Recurrence (Months) Status regional lymph

nodes only.

3C Recurrence of an 81 Dead invasive tumor in

regional lymph

nodes only.

3C Recurrence of an 81 Dead invasive tumor in

regional lymph

nodes only.

3C Since diagnosis, 30 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 30 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 27 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 27 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 27 Dead

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 27 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Since diagnosis, 27 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status 4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 78 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 25 Dead patient has never

been disease-free.

This includes cases

with distant

Attorney Ref. No.072396.0685 Path

pM Stg Type 1st Surv Vital Grp Recurrence (Months) Status metastasis at

4 Since diagnosis, 25 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

3C Recurrence of an 11 Dead invasive tumor in

adjacent tissue or

organ(s) only.

3C Recurrence of an 11 Dead invasive tumor in

adjacent tissue or

organ(s) only.

3C Recurrence of an 11 Dead invasive tumor in

adjacent tissue or

organ(s) only.

4 Since diagnosis, 32 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 32 Dead patient has never

been disease-free.

This includes cases

with distant

metastasis at

4 Since diagnosis, 32 Dead patient has never

been disease-free.

This includes cases

Attorney Ref. No.072396.0685 Path

M Stg Type 1st Surv Vital Grp Recurrence (Months) Status with distant

metastasis at

3C Distant recurrence 85 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Distant recurrence 85 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Distant recurrence 85 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Distant recurrence 85 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Distant recurrence 85 Dead of an invasive tumor

in multiple sites

(recurrences that

can be coded to

more tha

3C Distant recurrence 85 Dead of an invasive tumor

in multiple sites

Attorney Ref. No.072396.0685 en- Grade/ Reg Regl Path

LC1 Age at pT pN pM Stg Type 1st Surv Vital tus Diagnosis Differentiation- Nodes Nodes

Desc Exam Positive Grp Recurrence (Months) Status

(recurrences that

can be coded to

more tha

150

Table 15 (cont’d)

Table 16

Primer sequences for nested PCR to identify the Pten-NOLC1 breakpoint (SEQ ID NOs: 230-250, in order)

Table 17

Primer and probe sequences for TaqMan quantitative PCR (SEQ ID NOs: 251-271, in order from left to right)

7.5 REFERENCES

1. Li J, Yen C, Liaw D, Podsypanina K, Bose S, Wang SI, Puc J, Miliaresis C, Rodgers L, McCombie R, Bigner SH, Giovanella BC, Ittmann M, Tycko B, Hibshoosh H, Wigler MH, Parsons R: PTEN, a putative protein tyrosine phosphatase gene mutated in human brain, breast, and prostate cancer. Science (New York, NY) 1997, 275:1943‐7.

2. Steck PA, Pershouse MA, Jasser SA, Yung WK, Lin H, Ligon AH, Langford LA, Baumgard ML, Hattier T, Davis T, Frye C, Hu R, Swedlund B, Teng DH, Tavtigian SV: Identification of a candidate tumour suppressor gene, MMAC1, at chromosome 10q23.3 that is mutated in multiple advanced cancers. Nature genetics 1997, 15:356‐62.

3. Myers MP, Pass I, Batty IH, Van der Kaay J, Stolarov JP, Hemmings BA, Wigler MH, Downes CP, Tonks NK: The lipid phosphatase activity of PTEN is critical for its tumor suppressor function. Proceedings of the National Academy of Sciences of the United States of America 1998, 95:13513‐8.

4. Maehama T, Dixon JE: The tumor suppressor, PTEN/MMAC1, dephosphorylates the lipid second messenger, phosphatidylinositol 3,4,5‐trisphosphate. The Journal of biological chemistry 1998, 273:13375‐8.

5. Sansal I, Sellers WR: The biology and clinical relevance of the PTEN tumor suppressor pathway. J Clin Oncol 2004, 22:2954‐63.

6. McCubrey JA, Steelman LS, Abrams SL, Lee JT, Chang F, Bertrand FE, Navolanic PM, Terrian DM, Franklin RA, D'Assoro AB, Salisbury JL, Mazzarino MC, Stivala F, Libra M: Roles of the RAF/MEK/ERK and PI3K/PTEN/AKT pathways in malignant transformation and drug resistance. Adv Enzyme Regul 2006, 46:249‐79.

7. Baker SJ: PTEN enters the nuclear age. Cell 2007, 128:25‐8.

8. Trotman LC, Wang X, Alimonti A, Chen Z, Teruya‐Feldstein J, Yang H, Pavletich NP, Carver BS, Cordon‐Cardo C, Erdjument‐Bromage H, Tempst P, Chi SG, Kim HJ, Misteli T, Jiang X, Pandolfi PP: Ubiquitination regulates PTEN nuclear import and tumor suppression. Cell 2007, 128:141‐56.

9. Shen WH, Balajee AS, Wang J, Wu H, Eng C, Pandolfi PP, Yin Y: Essential role for nuclear PTEN in maintaining chromosomal integrity. Cell 2007, 128:157‐70.

10. Han B, Mehra R, Lonigro RJ, Wang L, Suleman K, Menon A, Palanisamy N, Tomlins SA, Chinnaiyan AM, Shah RB: Fluorescence in situ hybridization study shows association of PTEN deletion with ERG rearrangement during prostate cancer progression. Mod Pathol 2009, 22:1083‐93.

11. Yin Y, Shen WH: PTEN: a new guardian of the genome. Oncogene 2008, 27:5443‐53.

12. Chen HK, Pai CY, Huang JY, Yeh NH: Human Nopp140, which interacts with RNA polymerase I: implications for rRNA gene transcription and nucleolar structural organization. Molecular and cellular biology 1999, 19:8536‐46.

13. Pai CY, Chen HK, Sheu HL, Yeh NH: Cell‐cycle‐dependent alterations of a highly phosphorylated nucleolar protein p130 are associated with nucleologenesis. Journal of cell science 1995, 108 ( Pt 5):1911‐20.

14. Vlietstra RJ, van Alewijk DC, Hermans KG, van Steenbrugge GJ, Trapman J: Frequent inactivation of PTEN in prostate cancer cell lines and xenografts. Cancer research 1998, 58:2720‐3.

15. Lee YM, Miau LH, Chang CJ, Lee SC: Transcriptional induction of the alpha‐1 acid glycoprotein (AGP) gene by synergistic interaction of two alternative activator forms of AGP/enhancer‐binding protein (C/EBP beta) and NF‐kappaB or Nopp140. Molecular and cellular biology 1996, 16:4257‐63.

16. Hwang YC, Lu TY, Huang DY, Kuo YS, Kao CF, Yeh NH, Wu HC, Lin CT: NOLC1, an enhancer of nasopharyngeal carcinoma progression, is essential for TP53 to regulate MDM2 expression. The American journal of pathology 2009, 175:342‐54.

17. Wang X, DeFrances MC, Dai Y, Pediaditakis P, Johnson C, Bell A, Michalopoulos GK, Zarnegar R: A mechanism of cell survival: sequestration of Fas by the HGF receptor Met. Mol Cell 2002, 9:411‐21.

18. Cloughesy TF, Yoshimoto K, Nghiemphu P, Brown K, Dang J, Zhu S, Hsueh T, Chen Y, Wang W, Youngkin D, Liau L, Martin N, Becker D, Bergsneider M, Lai A, Green R, Oglesby T, Koleto M, Trent J, Horvath S, Mischel PS, Mellinghoff IK, Sawyers CL: Antitumor activity of rapamycin in a Phase I trial for patients with recurrent PTEN‐deficient glioblastoma. PLoS medicine 2008, 5:e8.

19. Courtney KD, Corcoran RB, Engelman JA: The PI3K pathway as drug target in human cancer. J Clin Oncol 2010, 28:1075‐83.

20. Zhang‐Hui Chen, Yan P. Yu, Ze‐Hua Zuo, Joel B. Nelson, George K. Michalopoulos, Satdarshan Monga, Silvia Liu, Tseng G, Luo J‐H: Targeting genomic rearrangements in tumor cells using Cas9‐mediated insertion of a suicide gene. Nature biotechnology 2017, in press.

21. Y.P. Yu et al., Am J Pathol (Oct 7, 2013).

22. J.H. Luo et al., Am J Pathol 182, 2028 (Jun, 2013).

23. Y.P. Yu et al., Am J Pathol 180, 2240 (Jun, 2012).

24. H. Edgren et al., Genome Biol 12, R6.

25. T. Prakash et al., PLoS One 5, e13284 (2010).

26. C.-W.F. Wei Zeng, Stefan Muller Arisona, Huamin Qu, Computer Graphics Forum, 271 (2013).

Various references are cited in this document, which are hereby incorporated by reference in their entireties herein.