Title:
COMPOSITIONS AND METHODS FOR DETECTING MICROBIAL SIGNATURES ASSOCIATED WITH DIFFERENT BREAST CANCER TYPES
Document Type and Number:
WIPO Patent Application WO/2018/200813
Kind Code:
A1
Abstract:
The present invention includes compositions and methods for the detection of breast cancer. The invention further includes detecting and distinguishing the different types of breast cancer (BRTN, BRTP, BRER, and BRHR). Compositions and methods are provided for detecting a metagenomic signature in a tissue sample from a subject that indicates the subject has breast cancer and/or a specific type of breast cancer.

Inventors:
ROBERTSON ERLE S (US)
ALWINE JAMES C (US)
Application Number:
PCT/US2018/029572
Publication Date:
November 01, 2018
Filing Date:
April 26, 2018
Assignee:
UNIV PENNSYLVANIA (US)
International Classes:
C12M1/00; C12N15/09; C12Q1/02
Domestic Patent References:
WO2016172179A2 2016-10-27
Foreign References:
US20070134652A1 2007-06-14
Attorney, Agent or Firm:
DOYLE, Kathryn et al. (US)
Claims:
CLAIMS

What is claimed:

1. A method of detecting breast cancer in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature,

breast cancer is detected in the tumor tissue sample.

2. The method of claim 1, wherein the microbial hybridization signature is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip,

wherein the probes are from microbes selected from the group consisting of: Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas.

3. A method of detecting breast cancer in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a first microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a second microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the tumor tissue sample.

4. A method of detecting endocrine receptor positive breast cancer (BRER) in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus,

BRER is detected in the tumor tissue sample.

5. A method of distinguishing BRER from human epidermal growth factor receptor 2 (HER2) positive breast cancer (BRHR), triple positive breast cancer (BRTP), and triple negative breast cancer (BRTN) in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus,

BRER is distinguished from BRHR, BRTP, and BRTN in the tumor tissue sample.

6. A method of detecting BRHR in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern; hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia,

BRHR is detected in the tumor tissue sample.

7. A method of distinguishing BRHR from BRER, BRTP and BRTN in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia,

BRHR is distinguished from BRER, BRTP, and BRTN in the tumor tissue sample.

8. A method of detecting BRTP in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus,

BRTP is detected in the tumor tissue sample.

9. A method of distinguishing BRTP from BRHR, BRER, and BRTN in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern; hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus,

BRTP is distinguished from BRHR, BRER, and BRTN in the tumor tissue sample.

10. A method of distinguishing BRTN from BRHR, BRER, and BRTP in a tumor tissue sample from a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris,

BRTN is distinguished from BRHR, BRER, and BRTP in the tumor tissue sample.

11. The method of any one of claims 1-10, wherein the tumor tissue sample is selected from the group consisting of a biopsy, formalin-fixed, paraffin-embedded (FFPE) sample, or non-solid tumor.

12. The method of any one of claims 1-10, wherein the subject is human.

13. The method of any one of claims 1-10, wherein the detectably-labeled nucleic acid is labeled with a fluorophore, radioactive phosphate, biotin, or enzyme.

14. The method of claim 13, wherein the fluorophore is Cy3 or Cy5.

15. The method of any one of claims 1-10, further comprising wherein when breast cancer is detected in the tumor tissue sample from a subject, the subject is provided with a treatment for breast cancer.

16. The method of claim 15, wherein the treatment comprises surgery, chemotherapy, or radiotherapy.

17. A kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas, and instructional material for use thereof.

18. A kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and instructional material for use thereof.

19. A kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and instructional material for use thereof.

20. A kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and instructional material for use thereof.

21. A kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, and instructional material for use thereof.

22. The kit of any one of claims 17-21, wherein the nucleic acid probes are selected from between about 10 to about 30 microbes and comprise about 3 to about 5 probes per microbe.

23. The kit of any one of claims 17-21, wherein the microarray is a biochip, glass slide, bead, or paper.

24. A method of detecting and treating breast cancer in a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a PathoChip array to generate a first hybridization pattern; hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the subject, and

administering a treatment to the subject.

25. The method of claim 24, wherein the microbial hybridization signature is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip,

wherein the probes are from microbes selected from the group consisting of: Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas.

26. A method of detecting and treating breast cancer in a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a first microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a first hybridization pattern;

hybridizing a detectably-labeled nucleic acid from a reference sample to a second microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the subject, and

administering a treatment to the subject.

27. A method of detecting and treating BRER in a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a PathoChip array to generate a first hybridization pattern; hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, BRER is detected in the subject, and administering a treatment to the subject.

28. A method of detecting and treating BRHR in a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a PathoChip array to generate a first hybridization pattern; hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, BRHR is detected in the subject, and

administering a treatment to the subject.

29. A method of detecting and treating BRTP in a subject, the method comprising:

hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from a subject to a PathoChip array to generate a first hybridization pattern; hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject;

comparing the first and second hybridization patterns, wherein when the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, BRTP is detected in the subject, and

administering a treatment to the subject.

30. The method of any one of claims 24-29, wherein the treatment comprises surgery, chemotherapy, or radiotherapy.

Description:
TITLE OF THE INVENTION

Compositions and Methods for Detecting Microbial Signatures Associated with Different Breast Cancer Types

CROSS-REFERENCE TO RELATED APPLICATION

The present application is entitled to priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application No. 62/490,375, filed April 26, 2017, which is hereby incorporated by reference in its entirety herein.

BACKGROUND OF THE INVENTION

Breast cancer, the second leading cause of cancer death in women, is responsible for the death of 1 in 36 women. Based on the hormone receptor status in the cancerous breast cells, there are 4 major groups of breast cancers: endocrine receptor (estrogen or progesterone receptor) positive (BRER), human epidermal growth factor receptor 2 (HER2) positive (BRHR), triple positive (estrogen, progesterone and HER2 receptor positive) (BRTP) and triple negative (absence of estrogen, progesterone and HER2 receptors) (BRTN). These four types have specific prognoses and responses to therapy. Specifically, the hormone receptor positive breast cancers (BRER, BRTP) respond to endocrine therapy and show better prognosis, while the hormone receptor negative types (BRHR, BRTN) are more aggressive, non-responsive to endocrine therapy and have poor prognosis. BRTN cancer is seen in 15-20% of breast cancer patients, is the most aggressive of all the breast cancers, is unresponsive to treatment, is highly angiogenic and proliferative, and has the lowest survival rate.

Among the risk factors for developing cancer in general, infectious agents are known to be the third highest after tobacco usage and obesity, contributing 15-20% of cancer incidence. Age and genetic predisposition are also known cancer risk factors; however, the majority of cancers have unknown etiology. Recent studies of microbiome dysbiosis in human health suggest specific changes in the microbiome in a number of disease states, including cancer. Further, studies have suggested the association of a particular microbiome with specific cancers. Thus, a distinct microbiome may contribute to the cause or development of cancer. Conversely, the tumor microenvironment may provide a specialized niche in which these viruses and microorganisms may persist. In either case, cancer-type specific microbiome signatures may provide biomarkers for early diagnosis, prognosis and treatment strategies. Distinct microbiome signatures associated with triple negative breast cancer have been identified. However, it was not known whether the microbiome signatures associated with BRTN are shared by other breast cancer types, or whether different breast cancer types have unique signatures.

The PathoChip is a pan-pathogen array containing oligonucleotide probes for the detection of all known, sequenced viruses, as well as known human bacterial, parasitic and fungal pathogens (Baldwin et al. MBio. 2014; 5:e01714-14). Additionally, PathoChip contains viral family specific conserved probes that allow for detection of uncharacterized members of the viral families. The PathoChip screen includes a whole genome and transcriptome amplification step that allows detection of very low copy numbers of both DNA and RNA viruses and microorganisms from cancer tissues.

A need exists for compositions and methods for detection and treatment of breast cancer, and importantly, distinguishing between different types of breast cancer. The present invention satisfies this need.

SUMMARY OF THE INVENTION

As described herein, the present invention relates to compositions and methods for detecting, treating, and distinguishing between different types of breast cancer.

One aspect of the invention includes a method of detecting breast cancer in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the tumor tissue sample.

Another aspect of the invention includes a method of detecting breast cancer in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a first microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a first hybridization pattern. A detectably-labeled nucleic acid from a reference sample is hybridized to a second microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the tumor tissue sample.
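The comparison described above can be illustrated with a minimal sketch. It is not the applicants' implementation. It assumes that positive probe calls have already been made upstream (how a probe is called positive is not specified in this passage), that "at least three nucleic acid probes" means three distinct probes whose source microbe is in the listed group, and that "substantially not" means fewer than three such probes in the reference sample. The function names, the `(probe_id, microbe)` input format, and the threshold parameter are hypothetical.

```python
# Illustrative sketch only. Assumptions (not from the source text):
#   - each hit is a (probe_id, microbe) pair already called positive upstream
#   - "substantially not" is read as "fewer than min_probes signature probes"

# Microbe group listed in the text for the general breast cancer signature.
BREAST_CANCER_SIGNATURE = frozenset({
    "Adenoviridae", "Anelloviridae", "Arenaviridae", "Bunyaviridae",
    "Coronaviridae", "Filoviridae", "Flaviviridae", "Herpesviridae",
    "Iridoviridae", "Papillomaviridae", "Paramyxoviridae", "Parvoviridae",
    "Picornaviridae", "Poxviridae", "Reoviridae", "Retroviridae",
    "Rhabdoviridae", "Actinomyces", "Bartonella", "Brevundimonas",
    "Coxiella", "Mobiluncus", "Mycobacterium", "Rickettsia", "Sphingomonas",
})


def count_signature_probes(hits, signature):
    """Count distinct positive probes whose source microbe is in the group."""
    return len({probe_id for probe_id, microbe in hits if microbe in signature})


def signature_detected(tumor_hits, reference_hits,
                       signature=BREAST_CANCER_SIGNATURE, min_probes=3):
    """True when the tumor pattern meets the probe threshold and the
    reference pattern does not (hypothetical reading of the comparison)."""
    tumor_count = count_signature_probes(tumor_hits, signature)
    reference_count = count_signature_probes(reference_hits, signature)
    return tumor_count >= min_probes and reference_count < min_probes


if __name__ == "__main__":
    tumor = [("p1", "Herpesviridae"), ("p2", "Retroviridae"), ("p3", "Mycobacterium")]
    reference = [("p1", "Herpesviridae")]
    print(signature_detected(tumor, reference))
```

Counting distinct probe identifiers rather than raw hits keeps a probe that is reported twice from being counted twice; whether that matches the intended reading of "at least three nucleic acid probes" is itself an assumption.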

Yet another aspect of the invention includes a method of detecting endocrine receptor positive breast cancer (BRER) in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, then BRER is detected in the tumor tissue sample.

Still another aspect of the invention includes a method of distinguishing BRER from human epidermal growth factor receptor 2 (HER2) positive breast cancer (BRHR), triple positive breast cancer (BRTP), and triple negative breast cancer (BRTN) in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, then BRER is distinguished from BRHR, BRTP, and BRTN in the tumor tissue sample.

In one aspect, the invention includes a method of detecting BRHR in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, then BRHR is detected in the tumor tissue sample.

In another aspect, the invention includes a method of distinguishing BRHR from BRER, BRTP and BRTN in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, then BRHR is distinguished from BRER, BRTP, and BRTN in the tumor tissue sample.

In yet another aspect, the invention includes a method of detecting BRTP in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, then BRTP is detected in the tumor tissue sample.

In still another aspect, the invention includes a method of distinguishing BRTP from BRHR, BRER, and BRTN in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, then BRTP is distinguished from BRHR, BRER, and BRTN in the tumor tissue sample.

Another aspect of the invention includes a method of distinguishing BRTN from BRHR, BRER, and BRTP in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, then BRTN is distinguished from BRHR, BRER, and BRTP in the tumor tissue sample.

Yet another aspect of the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas, and instructional material for use thereof.

Still another aspect of the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and instructional material for use thereof.

In another aspect, the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and instructional material for use thereof.

In yet another aspect, the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and instructional material for use thereof.

In yet another aspect, the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, and instructional material for use thereof.

Another aspect of the invention includes a method of detecting and treating breast cancer in a subject. The method comprises hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, then breast cancer is detected in the subject, and a treatment is administered to the subject.

Yet another aspect of the invention includes a method of detecting and treating breast cancer in a subject. The method comprises hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a first microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a first hybridization pattern. A detectably-labeled nucleic acid from a reference sample is hybridized to a second microarray comprising at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the subject, and a treatment is administered to the subject.

Still another aspect of the invention includes a method of detecting and treating BRER in a subject. The method comprises hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, then BRER is detected in the subject. A treatment is then administered to the subject.

In another aspect, the invention includes a method of detecting and treating BRHR in a subject. The method comprises hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from the subject to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, then BRHR is detected in the subject. A treatment is then administered to the subject.

In yet another aspect, the invention includes a method of detecting and treating BRTP in a subject. The method comprises hybridizing a detectably-labeled nucleic acid from a tumor tissue sample from a subject to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, then BRTP is detected in the subject. A treatment is then administered to the subject.

In various embodiments of the above aspects or any other aspect of the invention delineated herein, the microbial hybridization signature is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip. The probes are from microbes selected from the group consisting of: Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas.
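By way of non-limiting illustration only, the comparison of the first and second hybridization patterns described above may be pictured as a simple set test. The following is a minimal sketch under stated assumptions: the function names, the probe identifiers, the probe-to-microbe mapping, and the reading of "substantially not" as "fewer than three matching probes" are all hypothetical and are not taken from the disclosure. How a probe is scored as hybridized is left to the assay.

```python
from typing import Dict, Iterable, Set

# A few of the microbes from the group listed above (illustrative subset only).
SIGNATURE_MICROBES: Set[str] = {
    "Herpesviridae", "Retroviridae", "Actinomyces", "Bartonella", "Mycobacterium",
}


def count_signature_probes(
    hybridized_probes: Iterable[str],
    probe_to_microbe: Dict[str, str],
    signature_microbes: Set[str],
) -> int:
    """Count distinct hybridized probes that derive from a signature microbe."""
    return sum(
        1
        for probe in set(hybridized_probes)
        if probe_to_microbe.get(probe) in signature_microbes
    )


def signature_detected(
    tumor_probes: Iterable[str],
    reference_probes: Iterable[str],
    probe_to_microbe: Dict[str, str],
    signature_microbes: Set[str],
    min_probes: int = 3,
) -> bool:
    """True when the tumor pattern meets the probe threshold and the reference does not.

    Treating "substantially not" as "below the same threshold" is an
    assumption made for this sketch.
    """
    tumor_hits = count_signature_probes(tumor_probes, probe_to_microbe, signature_microbes)
    reference_hits = count_signature_probes(reference_probes, probe_to_microbe, signature_microbes)
    return tumor_hits >= min_probes and reference_hits < min_probes


# Hypothetical probe identifiers and mapping, for illustration only.
EXAMPLE_MAP = {
    "p1": "Herpesviridae",
    "p2": "Retroviridae",
    "p3": "Actinomyces",
    "p4": "Escherichia",
}
print(signature_detected(["p1", "p2", "p3", "p4"], ["p4"], EXAMPLE_MAP, SIGNATURE_MICROBES))
```

The same test applies to any of the type-specific groups recited above by substituting the corresponding set of microbes.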

In one embodiment, the tumor tissue sample is selected from the group consisting of a biopsy, formalin-fixed, paraffin-embedded (FFPE) sample, or non-solid tumor.

In one embodiment, the detectably-labeled nucleic acid is labeled with a fluorophore, radioactive phosphate, biotin, or enzyme. In one embodiment, the fluorophore is Cy3 or Cy5.

In one embodiment, the subject is human. In one embodiment, when breast cancer is detected in the tumor tissue sample from a subject, the subject is provided with a treatment for breast cancer. In one embodiment, the treatment comprises surgery, chemotherapy, or radiotherapy.

In one embodiment, the nucleic acid probes are selected from between about 10 to about 30 microbes and comprise about 3 to about 5 probes per microbe.

In one embodiment, the microarray is a biochip, glass slide, bead, or paper.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description of specific embodiments of the invention will be better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, there are shown in the drawings exemplary embodiments. It should be understood, however, that the invention is not limited to the precise arrangements and instrumentalities of the embodiments shown in the drawings.

FIGs. 1A-1E are a series of plots and images illustrating viral signatures associated with different breast cancer types. FIG. 1A is a Venn diagram showing the common and unique viral signatures in the 4 types of breast cancers. FIG. 1B is a heat map of common viral signatures in the 4 breast cancer types. FIG. 1C shows relative hybridization signals of viral probes detected in breast cancer types. For example, hybridization signals for Polyomaviridae probes were 4%, 6% and 3% of the total hybridization signals detected in BRER, BRTP and BRHR respectively. FIG. 1D shows the prevalence of viral signatures in 4 breast cancer types. Since the hybridization signals for Polyomaviridae, Hepadnaviridae and Parapoxviridae were lower than the cut-off [log fold change in hybridization signal > 1] in one or more breast cancer types they are depicted as negative in this figure. FIG. 1E shows a heat map of hybridization signals for viral signatures that are significantly higher in the cancers when compared to the control.
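For illustration only, the cut-off and the relative hybridization signals referred to in the description of FIGs. 1C and 1D may be computed along the following lines. This is a sketch under stated assumptions: the base of the logarithm is not given in the text and base 2 is assumed here, and the function names and example signal values are hypothetical.

```python
import math
from typing import Dict


def log_fold_change(tumor_signal: float, control_signal: float, base: float = 2.0) -> float:
    """Logarithm of the tumor-to-control signal ratio (base 2 is an assumption)."""
    if tumor_signal <= 0 or control_signal <= 0:
        raise ValueError("hybridization signals must be positive")
    return math.log(tumor_signal / control_signal, base)


def passes_cutoff(tumor_signal: float, control_signal: float, cutoff: float = 1.0) -> bool:
    """True when the log fold change exceeds the cut-off (> 1 in the text)."""
    return log_fold_change(tumor_signal, control_signal) > cutoff


def relative_signal_percent(signals: Dict[str, float]) -> Dict[str, float]:
    """Express each signal as a percentage of the total signal."""
    total = sum(signals.values())
    if total <= 0:
        raise ValueError("total hybridization signal must be positive")
    return {name: 100.0 * value / total for name, value in signals.items()}


# Hypothetical signal values, not taken from the figures.
print(passes_cutoff(400.0, 100.0))   # four-fold increase
print(passes_cutoff(150.0, 100.0))   # 1.5-fold increase
print(relative_signal_percent({"family_A": 1.0, "family_B": 3.0}))
```

Under the base-2 assumption, a signature passes the cut-off only when the tumor signal is more than twice the control signal.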

FIGs. 2A-2E are a series of plots and images illustrating bacterial signatures associated with different breast cancer types. FIG. 2A shows bacterial phyla associated with breast cancer types. FIG. 2B is a Venn diagram showing the common and unique bacterial signatures in the 4 types of breast cancers. FIG. 2C is a heat map of common viral signatures in the 4 breast cancer types. FIG. 2D shows hybridization signals of bacterial probes detected in breast cancer types. FIG. 2E shows the prevalence of bacterial signatures in 4 breast cancer types.

FIGs. 3A-3F are a series of graphs illustrating fungal and parasitic signatures associated with different breast cancer types. FIG. 3A shows relative hybridization signals of fungal probes detected in breast cancer types. For example, hybridization signals for Ajellomyces were 7%, 8% and 14% of the total hybridization signals detected in BRER, BRTP and BRHR respectively, and that of Rhizomucor is 19% of the hybridization signals detected in BRTN. FIG. 3B shows prevalence of viral signatures in 4 breast cancer types. FIG. 3C is a Venn diagram showing the common and unique fungal signatures in the 4 types of breast cancers. FIG. 3D shows relative hybridization signals of parasitic probes detected in breast cancer types. For example, hybridization signals for Plasmodium were 10%, 6% and 21% of the total hybridization signals detected in BRER, BRTP and BRHR respectively, and that of Mansonella is 7% and 12% of the hybridization signals detected in BRTP and BRTN respectively. FIG. 3E shows prevalence of parasitic signatures in 4 breast cancer types. FIG. 3F is a Venn diagram showing the common and unique parasitic signatures in the 4 types of breast cancers.

FIGs. 4A-4E are a series of plots illustrating hierarchical clustering of different breast cancer types based on their microbial signature detection pattern. FIG. 4A shows clustering of BRER. FIG. 4B shows clustering of BRTP. FIG. 4C shows clustering of BRHR. FIG. 4D shows clustering of BRTN. FIG. 4E shows the comparison of the microbiome signatures from all four breast cancer types together in the clustering analysis.

FIGs. 5A-5B are a series of plots and images illustrating PCR validation of microbial signatures in the 4 types of breast cancers and non-matched control, using the primers from FIG. 13. The left panels show the cropped gel pictures of EtBr stained amplicons run on agarose gel, where M is DNA ladder of RsaI-digested φX174, NTC is non-template control. The sequenced amplicons were subjected to nucleotide blast program in NCBI, and the results are shown in the right panels. In the Polyomavirus PCR gel picture, the reverse and the forward arrow heads signify Simian virus 40 and Merkel cell polyomavirus amplicons respectively, the electropherogram of the sequences of which are marked with the same arrow heads in FIGs. 6A-6B.

FIGs. 6A-6B are a series of images showing parts of the electropherograms of the sequenced amplicons validating the PathoChip screen results by PCR and Sanger sequencing.

FIG. 7 is a table displaying unique and common microbial signatures for 4 breast cancer types.

FIGs. 8A-8G are a series of tables showing the average hybridization signals of the probes of microorganisms detected in the cancers versus the controls, with respective adjusted p-values with multiple corrections. FIGs. 8A-8B show average hybridization signals for BRER. FIGs. 8C-8D show average hybridization signals for BRTP. FIGs. 8E-8F show average hybridization signals for BRHR. FIG. 8G shows average hybridization signals for BRTN.

FIG. 9 is a table illustrating BRER cluster 1 vs 2 from FIG. 4E.

FIG. 10 is a set of tables displaying BRER Cluster 1ER vs. Ungrouped 1ER and BRER Cluster 2ER vs ungrouped 1ER from FIG. 4E.

FIG. 11 is a set of tables displaying BRHR Cluster 1HR VS Ungrouped (1HR+2HR) and BRHR Cluster 2HR VS ungrouped (1HR+2HR) from FIG. 4C.

FIG. 12 is a set of tables displaying BRTN Cluster 1TN VS ungrouped 1TN and BRTN Cluster 2TN VS ungrouped 1TN from FIG. 4D.

FIG. 13 is a table displaying the primers used for PCR validation of PathoChip screen (SEQ ID NOs: 1-16).

FIG. 14 is a table displaying BLAST results of the sequenced PCR products for the validation of PathoChip screen.

DETAILED DESCRIPTION

Definitions

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice for testing of the present invention, exemplary materials and methods are described herein. In describing and claiming the present invention, the following terminology will be used.

It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

The articles "a", "an", and "the" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.

"About" as used herein when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate to perform the disclosed methods.

A "biomarker" or "marker" as used herein generally refers to a nucleic acid molecule, clinical indicator, protein, or other analyte that is associated with a disease. In certain embodiments, a nucleic acid biomarker is indicative of the presence in a sample of a pathogenic organism, including but not limited to, viruses, viroids, bacteria, fungi, helminths, and protozoa. In various embodiments, a marker is differentially present in a biological sample obtained from a subject having or at risk of developing a disease (e.g., an infectious disease) relative to a reference. A marker is differentially present if the mean or median level of the biomarker present in the sample is statistically different from the level present in a reference. A reference level may be, for example, the level present in an environmental sample obtained from a clean or uncontaminated source. A reference level may be, for example, the level present in a sample obtained from a healthy control subject or the level obtained from the subject at an earlier timepoint, i.e., prior to treatment. Common tests for statistical significance include, among others, t-test, ANOVA, Kruskal-Wallis, Wilcoxon, Mann-Whitney and odds ratio. Biomarkers, alone or in combination, provide measures of relative likelihood that a subject belongs to a phenotypic status of interest. The differential presence of a marker of the invention in a subject sample can be useful in characterizing the subject as having or at risk of developing a disease (e.g., an infectious disease), for determining the prognosis of the subject, for evaluating therapeutic efficacy, or for selecting a treatment regimen.
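By way of non-limiting illustration, whether the mean level of a marker differs between sample and reference measurements may be assessed as sketched below. The sketch uses an exact two-sample permutation test, which is not among the tests named above and is substituted here only because it requires nothing beyond the Python standard library. The function name and the signal values are hypothetical.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence


def permutation_p_value(sample: Sequence[float], reference: Sequence[float]) -> float:
    """Two-sided exact permutation p-value for a difference in means.

    Every way of splitting the pooled values into groups of the original
    sizes is enumerated, so this is practical only for small groups.
    """
    pooled = list(sample) + list(reference)
    indices = range(len(pooled))
    observed = abs(mean(sample) - mean(reference))
    total = 0
    at_least_as_extreme = 0
    for chosen in combinations(indices, len(sample)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Hypothetical marker levels for four tumor and four reference samples.
tumor_levels = [5.1, 5.3, 5.0, 5.4]
reference_levels = [1.0, 1.2, 0.9, 1.1]
print(permutation_p_value(tumor_levels, reference_levels))
```

With two groups of four there are 70 possible splits, so the smallest attainable two-sided p-value is 2/70; larger groups are needed to reach stricter significance thresholds.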

By "agent" is meant any nucleic acid molecule, small molecule chemical compound, antibody, or polypeptide, or fragments thereof. By "alteration" or "change" is meant an increase or decrease. An alteration may be by as little as 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, or by 40%, 50%, 60%, or even by as much as 70%, 75%, 80%, 90%, or 100%.

By "biologic sample" is meant any tissue, cell, fluid, or other material derived from an organism.

By "capture reagent" is meant a reagent that specifically binds a nucleic acid molecule or polypeptide to select or isolate the nucleic acid molecule or polypeptide.

As used herein, the terms "determining", "assessing", "assaying", "measuring" and "detecting" refer to both quantitative and qualitative determinations, and as such, the term "determining" is used interchangeably herein with "assaying," "measuring," and the like. Where a quantitative determination is intended, the phrase "determining an amount" of an analyte and the like is used. Where a qualitative and/or quantitative determination is intended, the phrase "determining a level" of an analyte or "detecting" an analyte is used.

By "detectable moiety" is meant a composition that when linked to a molecule of interest renders the latter detectable, via spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive isotopes, magnetic beads, metallic beads, colloidal particles, fluorescent dyes, electron-dense reagents, enzymes (for example, as commonly used in an ELISA), biotin, digoxigenin, or haptens.

A "disease" is a state of health of an animal wherein the animal cannot maintain homeostasis, and wherein if the disease is not ameliorated then the animal's health continues to deteriorate. In contrast, a "disorder" in an animal is a state of health in which the animal is able to maintain homeostasis, but in which the animal's state of health is less favorable than it would be in the absence of the disorder. Left untreated, a disorder does not necessarily cause a further decrease in the animal's state of health.

"Effective amount" or "therapeutically effective amount" are used

interchangeably herein, and refer to an amount of a compound, formulation, material, or composition, as described herein effective to achieve a particular biological result or provides a therapeutic or prophylactic benefit. Such results may include, but are not limited to, anti -tumor activity as determined by any means suitable in the art.

"Encoding" refers to the inherent property of specific sequences of nucleotides in a polynucleotide, such as a gene, a cDNA, or an mRNA, to serve as templates for synthesis of other polymers and macromolecules in biological processes having either a defined sequence of nucleotides (i.e. , rRNA, tRNA and mRNA) or a defined sequence of amino acids and the biological properties resulting therefrom. Thus, a gene encodes a protein if transcription and translation of mRNA corresponding to that gene produces the protein in a cell or other biological system. Both the coding strand, the nucleotide sequence of which is identical to the mRNA sequence and is usually provided in sequence listings, and the non-coding strand, used as the template for transcription of a gene or cDNA, can be referred to as encoding the protein or other product of that gene or cDNA.

By "fragment" is meant a portion of a nucleic acid molecule. This portion contains, preferably, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the entire length of the reference nucleic acid molecule or polypeptide. A fragment may contain 5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, or 100 nucleotides.

"Homologous" as used herein, refers to the subunit sequence identity between two polymeric molecules, e.g., between two nucleic acid molecules, such as two DNA molecules or two RNA molecules, or between two polypeptide molecules. When a subunit position in both of the two molecules is occupied by the same monomeric subunit, e.g., if a position in each of two DNA molecules is occupied by adenine, then they are homologous at that position. The homology between two sequences is a direct function of the number of matching or homologous positions; e.g., if half (e.g., five positions in a polymer ten subunits in length) of the positions in two sequences are homologous, the two sequences are 50% homologous; if 90% of the positions (e.g., 9 of 10) are matched or homologous, the two sequences are 90% homologous.

"Hybridization" means hydrogen bonding, which may be Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen bonding, between complementary nucleobases. For example, adenine and thymine are complementary nucleotides that pair through the formation of hydrogen bonds.
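As an illustration only, the Watson-Crick case of the definition above can be sketched in a few lines of Python. The function names are hypothetical, the sketch covers DNA bases only, and it assumes perfect antiparallel complementarity; it does not model Hoogsteen pairing, mismatches, or stringency conditions.

```python
# Watson-Crick pairing partners for DNA bases (A pairs with T, G pairs with C).
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    return "".join(WATSON_CRICK[base] for base in reversed(seq.upper()))


def can_hybridize(probe: str, target: str) -> bool:
    """True if probe and target are perfectly complementary antiparallel strands.

    Assumption: an exact match to the reverse complement is required; real
    hybridization tolerates mismatches depending on stringency.
    """
    return probe.upper() == reverse_complement(target)


print(reverse_complement("ATGC"))     # GCAT
print(can_hybridize("GCAT", "ATGC"))  # True
```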

"Identity" as used herein refers to the subunit sequence identity between two polymeric molecules particularly between two amino acid molecules, such as, between two polypeptide molecules. When two amino acid sequences have the same residues at the same positions; e.g. , if a position in each of two polypeptide molecules is occupied by an Arginine, then they are identical at that position. The identity or extent to which two amino acid sequences have the same residues at the same positions in an alignment is often expressed as a percentage. The identity between two amino acid sequences is a direct function of the number of matching or identical positions; e.g. , if half (e.g. , five positions in a polymer ten amino acids in length) of the positions in two sequences are identical, the two sequences are 50% identical; if 90% of the positions (e.g., 9 of 10), are matched or identical, the two amino acids sequences are 90% identical.

As used herein, an "instructional material" includes a publication, a recording, a diagram, or any other medium of expression which can be used to communicate the usefulness of the compositions and methods of the invention. The instructional material of the kit of the invention may, for example, be affixed to a container which contains the nucleic acid, peptide, and/or composition of the invention or be shipped together with a container which contains the nucleic acid, peptide, and/or composition. Alternatively, the instructional material may be shipped separately from the container with the intention that the instructional material and the compound be used cooperatively by the recipient.

The terms "isolated," "purified," or "biologically pure" refer to material that is free to varying degrees from components which normally accompany it as found in its native state. "Isolate" denotes a degree of separation from original source or surroundings. "Purify" denotes a degree of separation that is higher than isolation. A "purified" or "biologically pure" protein is sufficiently free of other materials such that any impurities do not materially affect the biological properties of the protein or cause other adverse consequences. That is, a nucleic acid or peptide of this invention is purified if it is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized. Purity and homogeneity are typically determined using analytical chemistry techniques, for example, polyacrylamide gel electrophoresis or high performance liquid chromatography. The term "purified" can denote that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. For a protein that can be subjected to modifications, for example, phosphorylation or glycosylation, different modifications may give rise to different isolated proteins, which can be separately purified.

By "marker profile" is meant a characterization of the signal, level, expression or expression level of two or more markers (e.g., polynucleotides).

By the term "microbe" is meant any and all organisms classed within the commonly used term "microbiology," including but not limited to, bacteria, viruses, fungi and parasites.

By the term "microarray" is meant a collection of nucleic acid probes immobilized on a substrate.

As used herein, the term "nucleic acid" refers to deoxyribonucleotides, ribonucleotides, or modified nucleotides, and polymers thereof in single- or double-stranded form. The term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring. Nucleic acid molecules useful in the methods of the invention include any nucleic acid molecule that specifically binds a target nucleic acid (e.g., a nucleic acid biomarker). Such nucleic acid molecules need not be 100% identical with an endogenous nucleic acid sequence, but will typically exhibit substantial identity.

Polynucleotides having "substantial identity" to an endogenous sequence are typically capable of hybridizing with at least one strand of a double-stranded nucleic acid molecule. By "hybridize" is meant pair to form a double-stranded molecule between complementary polynucleotide sequences (e.g., a gene described herein), or portions thereof, under various conditions of stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 152:399; Kimmel, A. R. (1987) Methods Enzymol. 152:507).

By the term "modulating," as used herein, is meant mediating a detectable increase or decrease in the level of a response in a subject compared with the level of a response in the subject in the absence of a treatment or compound, and/or compared with the level of a response in an otherwise identical but untreated subject. The term encompasses perturbing and/or affecting a native signal or response thereby mediating a beneficial therapeutic response in a subject, preferably, a human.

In the context of the present invention, the following abbreviations for the commonly occurring nucleic acid bases are used. "A" refers to adenosine, "C" refers to cytosine, "G" refers to guanosine, "T" refers to thymidine, and "U" refers to uridine.

"Parenteral" administration of an immunogenic composition includes, e.g. , subcutaneous (s.c), intravenous (i.v.), intramuscular (i.m.), or intrasternal injection, or infusion techniques.

As used herein, the terms "peptide," "polypeptide," and "protein" are used interchangeably, and refer to a compound comprised of amino acid residues covalently linked by peptide bonds. A protein or peptide must contain at least two amino acids, and no limitation is placed on the maximum number of amino acids that can comprise a protein's or peptide's sequence. Polypeptides include any peptide or protein comprising two or more amino acids joined to each other by peptide bonds. As used herein, the term refers to both short chains, which also commonly are referred to in the art as peptides, oligopeptides and oligomers, for example, and to longer chains, which generally are referred to in the art as proteins, of which there are many types. "Polypeptides" include, for example, biologically active fragments, substantially homologous polypeptides, oligopeptides, homodimers, heterodimers, variants of polypeptides, modified polypeptides, derivatives, analogs, fusion proteins, among others. The polypeptides include natural peptides, recombinant peptides, synthetic peptides, or a combination thereof.

By "reference" is meant a standard of comparison. As is apparent to one skilled in the art, an appropriate reference is where an element is changed in order to determine the effect of the element. In one embodiment, the level of a target nucleic acid molecule present in a sample may be compared to the level of the target nucleic acid molecule present in a clean or uncontaminated sample. For example, the level of a target nucleic acid molecule present in a sample may be compared to the level of the target nucleic acid molecule present in a corresponding healthy cell or tissue or in a diseased cell or tissue (e.g., a cell or tissue derived from a subject having a disease, disorder, or condition).

As used herein, the term "sample" includes a biologic sample such as any tissue, cell, fluid, or other material derived from an organism.

By "specifically binds" is meant a compound (e.g., nucleic acid probe or primer) that recognizes and binds a molecule (e.g. , a nucleic acid biomarker), but which does not substantially recognize and bind other molecules in a sample, for example, a biological sample.

By "substantially identical" is meant a polypeptide or nucleic acid molecule exhibiting at least 50% identity to a reference amino acid sequence (for example, any one of the amino acid sequences described herein) or nucleic acid sequence (for example, any one of the nucleic acid sequences described herein). Preferably, such a sequence is at least 60%, more preferably 80% or 85%, and more preferably 90%, 95%, 96%, 97%, 98%, or even 99% or more identical at the amino acid level or nucleic acid to the sequence used for comparison.
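The position-by-position counting used in the "homologous" and "identity" definitions, together with the 50% floor for "substantially identical", can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the two sequences are assumed to be already aligned and of equal length, and gaps are not handled (real sequence analysis software performs the alignment and scoring).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match.

    Assumption: both sequences have the same, non-zero length and no gaps.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def is_substantially_identical(seq_a: str, seq_b: str, threshold: float = 50.0) -> bool:
    """True if identity meets the threshold (default: the 50% floor in the text)."""
    return percent_identity(seq_a, seq_b) >= threshold


# Five of ten positions match, so the sequences are 50% identical.
print(percent_identity("AAAAATTTTT", "AAAAACCCCC"))  # 50.0
# Nine of ten positions match, so the sequences are 90% identical.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))  # 90.0
```

Raising `threshold` to 60, 80, 90, or higher corresponds to the more preferred identity levels listed in the definition.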

Sequence identity is typically measured using sequence analysis software (for example, Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705, BLAST, BESTFIT, GAP, or PILEUP/PRETTYBOX programs). Such software matches identical or similar sequences by assigning degrees of homology to various substitutions, deletions, and/or other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine. In an exemplary approach to determining the degree of identity, a BLAST program may be used, with a probability score between e^-3 and e^-100 indicating a closely related sequence.

The term "substantially microbial hybridization signature" is a relative term and means a hybridization signature that indicates the presence of more microbes in a tumor sample than in a reference sample. The term "substantially not a microbial hybridization signature" is a relative term and means a hybridization signature that indicates the presence of fewer microbes in a reference sample than in a tumor sample.

By "subject" is meant a mammal, including, but not limited to, a human or non-human mammal, such as a bovine, equine, canine, ovine, feline, mouse, or monkey. The term "subject" may refer to an animal, which is the object of treatment, observation, or experiment (e.g., a patient).

By "target nucleic acid molecule" is meant a polynucleotide to be analyzed. Such polynucleotide may be a sense or antisense strand of the target sequence. The term "target nucleic acid molecule" also refers to amplicons of the original target sequence. In various embodiments, the target nucleic acid molecule is one or more nucleic acid biomarkers.

A "target site" or "target sequence" refers to a genomic nucleic acid sequence that defines a portion of a nucleic acid to which a binding molecule may specifically bind under conditions sufficient for binding to occur.

The term "therapeutic" as used herein means a treatment and/or prophylaxis. A therapeutic effect is obtained by suppression, remission, or eradication of a disease state.

As used herein, the terms "treat," "treating," "treatment," and the like refer to reducing or ameliorating a disorder and/or symptoms associated therewith. It will be appreciated that, although not precluded, treating a disorder or condition does not require that the disorder, condition or symptoms associated therewith be completely eliminated.

By the term "tumor tissue sample" is meant any sample from a tumor in a subject including any solid and non-solid tumor in the subject.

Ranges: throughout this disclosure, various aspects of the invention can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 2.7, 3, 4, 5, 5.3, and 6. This applies regardless of the breadth of the range.

Description

The present invention features compositions and methods for the detection or diagnosis of breast cancer in a subject. The invention also features methods and kits for detecting specific types of breast cancer including endocrine receptor (estrogen or progesterone receptor) positive (BRER), human epidermal growth factor receptor 2 (HER2) positive (BRHR), triple positive (estrogen, progesterone and HER2 receptor positive) (BRTP) and triple negative (absence of estrogen, progesterone and HER2 receptors) (BRTN), and for distinguishing the different types of breast cancer from one another. Metagenomic signatures from a number of viral, bacterial, fungal, and parasitic microbes were identified herein that indicate that a subject has breast cancer and that determine the specific type of breast cancer (BRER, BRHR, BRTP, or BRTN).

The microbiome potentially has a role in the pathogenesis of many different diseases including cancer. Breast cancer is the second leading cause of cancer death in women; thus, the diversity of the microbiome in four different types of breast cancer was investigated herein: endocrine receptor (ER) positive, triple positive, Her2 positive and triple negative breast cancers. Using whole genome and transcriptome amplification and a pan-pathogen microarray (PathoChip), unique and common viral, bacterial, fungal and parasitic signatures were detected for each of the breast cancer types. Validation was provided by PCR and Sanger sequencing. Hierarchical cluster analysis of the breast cancer samples, based on their microbiome signatures, showed distinct signature patterns for the triple negative and triple positive samples, while the ER positive and Her2 positive samples shared similar microbiome signatures. These signatures, unique or common to the different breast cancer types, provide a new line of investigation to gain insight into prognosis, treatment strategies and clinical outcome, as well as better understanding of the role of the microbiome in the development and progression of breast cancer.

Methods

The present invention includes methods of detecting breast cancer in a tumor tissue sample from a subject. In one aspect, the method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern, then hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the tumor tissue sample.
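The comparison step described above can be pictured with a small sketch. Everything here is an assumption made for illustration: a hybridization pattern is modeled as a mapping from probe identifier to signal intensity, the detection threshold of 2.0 is arbitrary, and "substantially a microbial hybridization signature" is read simply as more probes detected in the tumor sample than in the reference sample. It is not the analysis pipeline used with the PathoChip.

```python
from typing import Dict, Set


def detected_probes(pattern: Dict[str, float], threshold: float) -> Set[str]:
    """Probes whose signal meets a (hypothetical) detection threshold."""
    return {probe for probe, signal in pattern.items() if signal >= threshold}


def tumor_has_microbial_signature(
    tumor_pattern: Dict[str, float],
    reference_pattern: Dict[str, float],
    threshold: float = 2.0,  # assumed cutoff, not taken from the source
) -> bool:
    """Relative comparison: more microbial probes detected in tumor than reference."""
    tumor_hits = detected_probes(tumor_pattern, threshold)
    reference_hits = detected_probes(reference_pattern, threshold)
    return len(tumor_hits) > len(reference_hits)


# Made-up intensities for four hypothetical probes.
tumor = {"probe_1": 5.1, "probe_2": 3.4, "probe_3": 2.8, "probe_4": 0.3}
reference = {"probe_1": 0.4, "probe_2": 0.2, "probe_3": 2.5, "probe_4": 0.1}
print(tumor_has_microbial_signature(tumor, reference))  # True
```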

In another aspect of the invention, the microbial hybridization signature is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas.

Another aspect of the invention includes a method of detecting breast cancer in a tumor tissue sample from a subject comprising hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a first microarray. The first microarray comprises at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas. A first hybridization pattern is generated. A detectably-labeled nucleic acid from a reference sample is then hybridized to a second microarray. The second microarray comprises at least three nucleic acid probes from microbes selected from the group consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas. A second hybridization pattern is generated. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is substantially a microbial hybridization signature and the second hybridization pattern is substantially not a microbial hybridization signature, breast cancer is detected in the tumor tissue sample.

In another aspect, the invention includes a method of detecting endocrine receptor positive breast cancer (BRER) in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, then BRER is detected in the tumor tissue sample.

In yet another aspect, the invention includes a method of distinguishing BRER from human epidermal growth factor receptor 2 (HER2) positive breast cancer (BRHR), triple positive breast cancer (BRTP), and triple negative breast cancer (BRTN) in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, then BRER is distinguished from BRHR, BRTP, and BRTN in the tumor tissue sample.

Another aspect of the invention includes a method of detecting human epidermal growth factor receptor 2 (HER2) positive breast cancer (BRHR) in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, then BRHR is detected in the tumor tissue sample.

Yet another aspect of the invention includes a method of distinguishing BRHR from BRER, BRTP and BRTN in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, then BRHR is distinguished from BRER, BRTP, and BRTN in the tumor tissue sample.

Still another aspect of the invention includes a method of detecting BRTP in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern, wherein the reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, then BRTP is detected in the tumor tissue sample.

Another aspect of the invention includes a method of distinguishing BRTP from BRHR, BRER, and BRTN in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, then BRTP is distinguished from BRHR, BRER, and BRTN in the tumor tissue sample.

In another aspect, the invention includes a method of distinguishing BRTN from BRHR, BRER, and BRTP in a tumor tissue sample from a subject. The method comprises hybridizing a detectably-labeled nucleic acid from the tumor tissue sample to a PathoChip array to generate a first hybridization pattern and hybridizing a detectably-labeled nucleic acid from a reference sample to a PathoChip array to generate a second hybridization pattern. The reference sample is from an otherwise identical non-tumor tissue from a subject. The first and second hybridization patterns are compared. When the first hybridization pattern is generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, and the second hybridization pattern is substantially not generated by hybridization of the detectably-labeled nucleic acid from the tumor tissue sample to at least three nucleic acid probes on the PathoChip, wherein the probes are from microbes selected from the group consisting of: Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, then BRTN is distinguished from BRHR, BRER, and BRTP in the tumor tissue sample.
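The type-specific methods above share one decision rule: at least three hybridized probes from a type's microbe group in the first pattern, and fewer than three in the second. A minimal sketch of that rule follows, assuming each hybridized probe has already been mapped to the family or genus it targets. The data layout and the function names are hypothetical, and the taxon names are spelled as in the text.

```python
# Type-specific microbe groups as listed in the methods above.
TYPE_GROUPS = {
    "BRER": {"Arcanobacterium", "Bifidobacterium", "Cardiobacterium", "Citrobacter",
             "Escherichia", "Filobasidiella", "Mucor", "Trichophyton", "Brugia",
             "Paragonimus"},
    "BRHR": {"Novaviridae", "Streptococcus", "Epidermophyton", "Fonsecaea",
             "Pseudallescheria", "Balamuthia"},
    "BRTP": {"Birnaviridae", "Hepeviridae", "Bordetella", "Campylobacter", "Chlamydia",
             "Chlamydophila", "Legionella", "Pasteurella", "Penicillium", "Ancylostoma",
             "Angiostrongylus", "Echinococcus", "Sarcocystis", "Trichomonas",
             "Trichostrongylus"},
    "BRTN": {"Aerococcus", "Arcobacter", "Geobacillus", "Orientia", "Rothia",
             "Alternaria", "Malassezia", "Piedraia", "Rhizomucor", "Centrocestus",
             "Contracaecum", "Leishmania", "Necator", "Onchocerca", "Toxocara",
             "Trichinella", "Trichuris"},
}
MIN_PROBES = 3  # "at least three nucleic acid probes"


def count_group_hits(probe_taxa, group):
    """Count hybridized probes whose target taxon belongs to the group."""
    return sum(1 for taxon in probe_taxa if taxon in group)


def call_types(first_pattern_taxa, second_pattern_taxa):
    """Return the types whose group is hit by >= 3 probes in the first pattern
    and by fewer than 3 probes in the second pattern."""
    return [
        cancer_type
        for cancer_type, group in TYPE_GROUPS.items()
        if count_group_hits(first_pattern_taxa, group) >= MIN_PROBES
        and count_group_hits(second_pattern_taxa, group) < MIN_PROBES
    ]


# Made-up example: one taxon per hybridized probe.
first = ["Streptococcus", "Epidermophyton", "Balamuthia", "Escherichia"]
second = ["Escherichia"]
print(call_types(first, second))  # ['BRHR']
```

Returning a list rather than a single label leaves room for samples that satisfy the rule for more than one group, or for none.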

In the methods disclosed herein, the tumor tissue sample can be a biopsy, a formalin-fixed, paraffin-embedded (FFPE) sample, or a non-solid tumor. The detectably-labeled nucleic acid can be labeled with a fluorophore, radioactive phosphate, biotin, or enzyme. The fluorophore can be Cy3 or Cy5.

The methods can also include providing the subject with a treatment for breast cancer, when breast cancer is detected in the tumor tissue sample from the subject. This includes all types of breast cancer including but not limited to BRTN, BRTP, BRER, and BRHR. Examples of treatments include, but are not limited to, surgery, chemotherapy, or radiotherapy. The subject can be any human or non-human mammal, such as a bovine, equine, canine, ovine, feline, mouse, or monkey. In one embodiment, the subject is a human.

Kits

The invention provides kits for the detection of biomarkers, which are indicative of the presence of one or more biological sequences or agents associated with breast cancer. The kits may be used for detecting the presence of multiple biological agents associated with breast cancer. The kits may be used for the diagnosis or detection of different types of breast cancer. In some embodiments, the kit comprises a panel or collection of probes to nucleic acid biomarkers (e.g., PathoChip) delineated herein as specific for detection of breast cancer. In some embodiments, the kit comprises a panel or collection of probes to nucleic acid biomarkers (e.g., PathoChip) delineated herein as specific for detection of a specific type of breast cancer (BRTN, BRTP, BRER, and BRHR). In additional or alternative embodiments, the kit comprises an antibody specific for a pathogenic organism associated with breast cancer. Such antibodies may be used for ELISA detection or for extraction of a pathogenic organism associated with breast cancer (e.g., a biotin labeled antibody in conjunction with Streptavidin bound magnetic beads).

In some embodiments, the kit comprises one or more sterile containers which contain the panel of probes, nucleic acid biomarkers, or a microarray. Such containers can be boxes, ampoules, bottles, vials, tubes, bags, pouches, blister-packs, or other suitable container forms known in the art. Such containers can be made of plastic, glass, laminated paper, metal foil, or other materials suitable for holding medicaments.

The instructions will generally include information about the use of the composition for the detection or diagnosis of breast cancer or a specific type of breast cancer (i.e. BRTN, BRTP, BRER, and BRHR). In other embodiments, the instructions include at least one of the following: description of the therapeutic agent; dosage schedule and administration for treatment or prevention of breast cancer or symptoms thereof; precautions; warnings; indications; counter-indications; overdosage information; adverse reactions; animal pharmacology; clinical studies; and/or references. The instructions may be printed directly on the container (when present), or as a label applied to the container, or as a separate sheet, pamphlet, card, or folder supplied in or with the container.

One aspect of the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas, and instructional material for use thereof.

Another aspect of the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus, and instructional material for use thereof.

Yet another aspect of the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia, and instructional material for use thereof. Still another aspect includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus, and instructional material for use thereof.

Another aspect of the invention includes a kit comprising a microarray comprising at least three nucleic acid probes selected from the group of microbes consisting of Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris, and instructional material for use thereof.

The kit can include probes from about 10-30 organisms with about 3-5 probes per organism. In certain embodiments, the microarray in the kit is a biochip, glass slide, bead, or paper.

Target Nucleic Acid Molecules

Methods and compositions of the invention are useful for the identification of a target nucleic acid molecule in a biological sample to be analyzed. Target sequences are amplified from any biological sample that comprises a target nucleic acid molecule. Such samples may comprise fungi, spores, viruses, or cells (e.g., prokaryotes, eukaryotes, including human). Such samples may comprise viral, bacterial, fungal, and parasitic nucleic acid molecules. In specific embodiments, compositions and methods of the invention detect one or more nucleic acid sequences from one or more pathogenic organisms, including viruses, viroids, bacteria, fungi, helminths, and/or protozoa.

In one embodiment, a sample is a biological sample, such as a tissue or tumor sample. The level of one or more polynucleotide biomarkers (e.g., to detect or identify viruses, viroids, bacteria, fungi, helminths, and/or protozoa) is measured in the biological sample. In one embodiment, the biological sample is a tissue sample that includes a tumor cell, for example, from a biopsy or formalin-fixed, paraffin-embedded (FFPE) sample. Exemplary test samples also include body fluids (e.g., blood, serum, plasma, amniotic fluid, sputum, urine, cerebrospinal fluid, lymph, tear fluid, feces, or gastric fluid), tissue extracts, and culture media (e.g., a liquid in which a cell, such as a pathogen cell, has been grown). If desired, the sample is purified prior to detection using any standard method typically used for isolating a nucleic acid molecule from a biological sample. In one embodiment, a target nucleic acid of a pathogen is amplified by primer oligonucleotides to detect the presence of the nucleic acid sequence of an infectious agent in the sample. Such nucleic acid sequences may derive from pathogens including fungi, bacteria, viruses and yeast.

Target nucleic acid molecules include double-stranded and single-stranded nucleic acid molecules (e.g., DNA, RNA, and other nucleobase polymers known in the art capable of hybridizing with a nucleic acid molecule described herein). RNA molecules suitable for detection with a detectable oligonucleotide probe or detectable primer/template oligonucleotide of the invention include, but are not limited to, double-stranded and single-stranded RNA molecules that comprise a target sequence (e.g., messenger RNA, viral RNA, ribosomal RNA, transfer RNA, microRNA and microRNA precursors, and siRNAs or other RNAs described herein or known in the art). DNA molecules suitable for detection with a detectable oligonucleotide probe or primer/template oligonucleotide of the invention include, but are not limited to, double stranded DNA (e.g., genomic DNA, plasmid DNA, mitochondrial DNA, viral DNA, and synthetic double stranded DNA). Single-stranded DNA target nucleic acid molecules include, for example, viral DNA, cDNA, and synthetic single-stranded DNA, or other types of DNA known in the art. In general, a target sequence for detection is between about 30 and about 300 nucleotides in length (e.g., 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300 nucleotides). In a specific embodiment the target sequence is about 60 nucleotides in length. A target sequence for detection may also have at least about 70, 80, 90, 95, 96, 97, 98, 99, or even 100% identity to a probe sequence. Probe sequences may be longer or shorter than the target sequence. For example, a 60-nucleotide probe may hybridize to at least about 44 nucleotides of a target sequence.
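The percent-identity figures in the preceding paragraph can be made concrete with a small calculation. The following is a minimal sketch, assuming an ungapped, position-by-position comparison of a probe against an equal-length stretch of target sequence. The function name `percent_identity` and the example sequences are hypothetical and are not taken from the PathoChip design.

```python
def percent_identity(probe, target_region):
    """Percent of positions at which two equal-length sequences match.

    Assumption: ungapped comparison; real alignments may contain gaps.
    """
    if len(probe) != len(target_region):
        raise ValueError("sequences must have the same length")
    if not probe:
        raise ValueError("sequences must not be empty")
    pairs = zip(probe.upper(), target_region.upper())
    matches = sum(1 for p, t in pairs if p == t)
    return 100.0 * matches / len(probe)


# Hypothetical 10-nucleotide example with one mismatch at the last position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

On this measure, 44 matching positions within a 60-nucleotide probe would correspond to roughly 73% identity, although hybridization over a stretch of nucleotides and sequence identity are not strictly the same quantity.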

In particular embodiments, a biomarker is a biomolecule (e.g., nucleic acid molecule) that is differentially present in a biological sample. For example, a biomarker is taken from a subject of one phenotypic status (e.g., having breast cancer) as compared with another phenotypic status (e.g., not having breast cancer). A biomarker is differentially present between different phenotypic statuses if the mean or median expression level of the biomarker in the different groups is calculated to be statistically significant. Common tests for statistical significance include, among others, t-test, ANOVA, Kruskal-Wallis, Wilcoxon, Mann-Whitney and odds ratio. Biomarkers, alone or in combination, provide measures of relative risk that a subject belongs to one phenotypic status or another. Therefore, they are useful as markers for characterizing a disease (e.g., having breast cancer).
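As an illustration of one of the tests named above, the following is a minimal sketch of an exact two-sided Mann-Whitney comparison between two small groups of marker levels. It assumes that exhaustive enumeration of group assignments is affordable, which holds only for small sample sizes. The function names and the example values are hypothetical and are not data from this disclosure.

```python
from itertools import combinations


def u_statistic(group_a, group_b):
    """Mann-Whitney U for group_a: pairs with a > b, ties counted as one half."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def exact_mann_whitney(group_a, group_b):
    """Return (U, exact two-sided p-value) by enumerating all regroupings."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    center = n_a * len(group_b) / 2.0  # expected U under the null hypothesis
    observed = u_statistic(group_a, group_b)
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Count regroupings at least as far from the null expectation.
        if abs(u_statistic(a, b) - center) >= abs(observed - center) - 1e-12:
            extreme += 1
    return observed, extreme / total


# Hypothetical normalized signal values for one probe.
tumor_levels = [5.1, 6.3, 7.2, 8.0]
control_levels = [1.0, 1.5, 2.2, 3.1]
u_value, p_value = exact_mann_whitney(tumor_levels, control_levels)
```

The number of regroupings grows combinatorially with sample size, so for cohorts of the size used in a real study a statistics package with an asymptotic approximation would be the more practical choice.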

Sample Preparation

The invention provides a means for analyzing multiple types of nucleic acids present in a sample, including DNA and RNA. In various embodiments, sample preparation involves extracting a mixture of nucleic acid molecules (e.g., DNA and RNA). In other embodiments, sample preparation involves extracting a mixture of nucleic acids from multiple organisms, cell types, infectious agents, or any combination thereof. In one embodiment, sample preparation involves the workflow below.

A. Fragment genomic DNA

B. Convert total RNA to first strand cDNA by random-primed reverse transcriptase

C. Label genomic DNA with biotin or fluorescent dye by chemical or enzymatic incorporation

D. Label cDNA with biotin or fluorescent dye by chemical or enzymatic incorporation

E. Label a mixture of genomic DNA and cDNA in the same chemical or enzymatic reaction

F. Mix C + D and co-hybridize to microarray of probes

G. Hybridize E to microarray of probes

H. Amplify targeted genomic DNA

1. Use whole-genome amplification (GE GenomiPhi, Sigma WGA, NuGEN Ovation DNA) to non-specifically amplify genomic DNA

2. Use amplified products as input for C or E.

I. Amplify targeted total RNA

1. Use whole-transcriptome amplification (Sigma WTA, Ambion in vitro transcription, NuGEN Ovation RNA) to non-specifically amplify total RNA

2. Use amplified products as input.

The samples are hybridized to the microarray (e.g., PathoChip), and the microarrays are washed at various stringencies. Microarrays are scanned for detection of fluorescence. Background correction and inter-array normalization algorithms are applied. Detection thresholds are applied. The results are analyzed for statistical significance.
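The post-scan steps named above (background correction, inter-array normalization, and detection thresholds) can be sketched as follows. This is a minimal illustration under stated assumptions: the disclosure does not specify which algorithms are used, so simple background subtraction and quantile normalization are chosen here only as common examples, and all function names, the floor value, and the sample intensities are hypothetical.

```python
def background_correct(intensities, background, floor=1.0):
    """Subtract a background estimate, clamping at a small positive floor."""
    return [max(value - background, floor) for value in intensities]


def quantile_normalize(arrays):
    """Give every array the same intensity distribution (mean of sorted values).

    arrays: list of equal-length lists, one list of probe intensities per array.
    Ties within an array are broken by probe order (assumption).
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda idx: a[idx])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def detection_calls(intensities, threshold):
    """True where a probe's normalized signal meets the detection threshold."""
    return [value >= threshold for value in intensities]


# Hypothetical intensities for three probes on two arrays.
corrected = background_correct([100.0, 20.0, 35.0], background=30.0)
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
calls = detection_calls(normalized[0], threshold=3.5)
```

Quantile normalization forces every array onto a common distribution, which is convenient for cross-array comparison but would mask a genuine global shift in signal between arrays.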

Nucleic Acid Amplification

Target nucleic acid sequences are optionally amplified before being detected. The term "amplified" defines the process of making multiple copies of the nucleic acid from a single or lower copy number of nucleic acid sequence molecule. The amplification of nucleic acid sequences is carried out in vitro by biochemical processes known to those of skill in the art. Prior to or concurrent with identification, the viral sample may be amplified by a variety of mechanisms, some of which may employ PCR. For example, primers for PCR may be designed to amplify regions of the sequence. For RNA viruses a first reverse transcriptase step may be used to generate double stranded DNA from the single stranded RNA. See, for example, PCR Technology: Principles and Applications for DNA Amplification (Ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (Eds. McPherson et al., IRL Press, Oxford); and US Patent Nos 4,683,202, 4,683,195, 4,800,159, 4,965,188, and 5,333,675. The sample may be amplified on the array. See, for example, US Patent No 6,300,070 and US Ser No 09/513,300.
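To give a sense of scale for what "making multiple copies" means in PCR, the following minimal sketch uses the textbook exponential model, in which each cycle multiplies the copy number by one plus a per-cycle efficiency. The model ignores the plateau that real reactions reach, and the function name and parameter values are hypothetical.

```python
def expected_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized PCR yield: initial_copies * (1 + efficiency) ** cycles.

    efficiency = 1.0 means perfect doubling every cycle (assumption).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule after 30 ideal cycles.
ideal_yield = expected_copies(1, 30)
```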

Other suitable amplification methods include the ligase chain reaction (LCR) (for example, Wu and Wallace, Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 (1988) and Barringer et al., Gene 89: 117 (1990)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989) and WO88/10315), self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. USA, 87, 1874 (1990) and WO90/06995), selective amplification of target polynucleotide sequences (US Patent No 6,410,276), consensus sequence primed PCR (CP-PCR) (US Patent No 4,437,975), arbitrarily primed PCR (AP-PCR) (US Patent Nos 5,413,909, 5,861,245) and nucleic acid based sequence amplification (NABSA) (see, US Patent Nos 5,409,818, 5,554,517, and 6,063,603). Other amplification methods that may be used are described in US Patent Nos 5,242,794, 5,494,810, 4,988,617 and in US Ser No 09/854,317.

Additional methods of sample preparation and techniques for reducing the complexity of a nucleic acid sample are described in Dong et al., Genome Research 11, 1418 (2001), in US Patent Nos 6,361,947, 6,391,592 and US Ser Nos 09/916,135, 09/920,491 (US Patent Application Publication 20030096235), 09/910,292 (US Patent Application Publication 20030082543), and 10/013,598.

Detection of Biomarkers

The biomarkers of this invention can be detected by any suitable method. The methods described herein can be used individually or in combination for a more accurate detection of the biomarkers. Methods for conducting polynucleotide hybridization assays have been developed in the art. Hybridization assay procedures and conditions will vary depending on the application and are selected in accordance with the general binding methods known including those referred to in: Sambrook and Russell, Molecular Cloning: A Laboratory Manual (3rd Ed. Cold Spring Harbor, N.Y., 2001); Berger and Kimmel, Methods in Enzymology, Vol. 152, Guide to Molecular Cloning Techniques (Academic Press, Inc., San Diego, Calif., 1987); Young and Davis, P.N.A.S., 80: 1194 (1983). Methods and apparatus for carrying out repeated and controlled hybridization reactions have been described in US Patent Nos 5,871,928, 5,874,219, 6,045,996, 6,386,749 and 6,391,623. A data analysis algorithm (E-predict) for interpreting the hybridization results from an array is publicly available (see Urisman, 2005, Genome Biol 6:R78).

In one embodiment, the hybridized nucleic acids are detected by detecting one or more labels attached to, or incorporated within, the sample nucleic acids. The labels may be attached or incorporated by any of a number of means well known to those of skill in the art. In one embodiment, the label is simultaneously incorporated during the amplification step in the preparation of the sample nucleic acids. Thus, for example, PCR with labeled primers or labeled nucleotides will provide a labeled amplification product. In another embodiment, transcription amplification, as described above, using a labeled nucleotide (e.g. fluorescein-labeled UTP and/or CTP) incorporates a label into the transcribed nucleic acids. In another embodiment PCR amplification products are fragmented and labeled by terminal deoxytransferase and labeled dNTPs. Alternatively, a label may be added directly to the original nucleic acid sample (e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after the amplification is completed.

Means of attaching labels to nucleic acids are well known to those of skill in the art and include, for example, nick translation or end-labeling (e.g. with a labeled RNA) by kinasing the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore). In another embodiment label is added to the end of fragments using terminal deoxytransferase.

Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include, but are not limited to: biotin for staining with labeled streptavidin conjugate; anti-biotin antibodies, magnetic beads (e.g., Dynabeads™); fluorescent dyes (e.g., Cy3, Cy5, fluorescein, Texas red, rhodamine, green fluorescent protein, and the like); radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P); phosphorescent labels; enzymes (e.g., horse radish peroxidase, alkaline phosphatase and others commonly used in an ELISA); and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents teaching the use of such labels include US Patent Nos 3,817,837, 3,850,752, 3,939,350, 3,996,345, 4,277,437, 4,275,149 and 4,366,241.

Means of detecting such labels are well known to those of skill in the art. Thus, for example, radiolabels may be detected using photographic film or scintillation counters; fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label.

Methods and apparatus for signal detection and processing of intensity data are disclosed in, for example, US Patent Nos 5,143,854, 5,547,839, 5,578,832, 5,631,734, 5,800,992, 5,834,758, 5,856,092, 5,902,723, 5,936,324, 5,981,956, 6,025,601, 6,090,555, 6,141,096, 6,185,030, 6,201,639, 6,218,803, and 6,225,625, in US Ser Nos 10/389,194, 60/493,495 and in PCT Application PCT/US99/06097 (published as WO99/47964).

Detection by Microarray

In certain aspects of the invention, a sample is analyzed by means of a microarray. The nucleic acid molecules of the invention are useful as hybridizable array elements in a microarray. Microarrays generally comprise solid substrates and have a generally planar surface, to which a capture reagent (also called an adsorbent or affinity reagent) is attached. Frequently, the surface of a biochip comprises a plurality of addressable locations, each of which has the capture reagent bound there. The array elements are organized in an ordered fashion such that each element is present at a specified location on the substrate. Useful substrate materials include membranes, composed of paper, nylon or other materials, filters, chips, glass slides, and other solid supports. The ordered arrangement of the array elements allows hybridization patterns and intensities to be interpreted as expression levels of particular genes or proteins. Methods for making nucleic acid microarrays are known to the skilled artisan and are described, for example, in U.S. Pat. No. 5,837,832, Lockhart, et al. (Nat. Biotech. 14: 1675-1680, 1996), and Schena, et al. (Proc. Natl. Acad. Sci. 93: 10614-10619, 1996), herein incorporated by reference. US Patent Nos 5,800,992 and 6,040,138 describe methods for making arrays of nucleic acid probes that can be used to detect the presence of a nucleic acid containing a specific nucleotide sequence. Methods of forming high-density arrays of nucleic acids, peptides and other polymer sequences with a minimal number of synthetic steps are known. The nucleic acid array can be synthesized on a solid substrate by a variety of methods, including, but not limited to, light-directed chemical coupling, and mechanically directed coupling. For additional descriptions and methods relating to resequencing arrays see US Patent Application Ser Nos 10/658,879, 60/417,190, 09/381,480, 60/409,396, and US Patent Nos 5,861,242, 6,027,880, 5,837,832, 6,723,503.

By "hybridize" is meant pair to form a double-stranded molecule between complementary polynucleotide sequences (e.g., a gene described herein), or portions thereof, under various conditions of stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 152:399; Kimmel, A. R. (1987) Methods Enzymol. 152:507). For example, stringent salt concentration will ordinarily be less than about 750 mM NaCl and 75 mM trisodium citrate, preferably less than about 500 mM NaCl and 50 mM trisodium citrate, and more preferably less than about 250 mM NaCl and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the absence of organic solvent, e.g., formamide, while high stringency hybridization can be obtained in the presence of at least about 35% formamide, and more preferably at least about 50% formamide. Stringent temperature conditions will ordinarily include temperatures of at least about 30° C, more preferably of at least about 37° C, and most preferably of at least about 42° C. Varying additional parameters, such as hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS), and the inclusion or exclusion of carrier DNA, are well known to those skilled in the art. Various levels of stringency are accomplished by combining these various conditions as needed. In a preferred embodiment, hybridization will occur at 30° C in 750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more preferred embodiment, hybridization will occur at 37° C in 500 mM NaCl, 50 mM trisodium citrate, 1% SDS, 35% formamide, and 100 μg/ml denatured salmon sperm DNA (ssDNA). In a most preferred embodiment, hybridization will occur at 42° C in 250 mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide, and 200 μg/ml ssDNA. Useful variations on these conditions will be readily apparent to those skilled in the art.

For most applications, washing steps that follow hybridization will also vary in stringency. Wash stringency conditions can be defined by salt concentration and by temperature. As above, wash stringency can be increased by decreasing salt concentration or by increasing temperature. For example, stringent salt concentration for the wash steps will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. Stringent temperature conditions for the wash steps will ordinarily include a temperature of at least about 25° C, more preferably of at least about 42° C, and even more preferably of at least about 68° C. In a preferred embodiment, wash steps will occur at 25° C in 30 mM NaCl, 3 mM trisodium citrate, and 0.1% SDS. In a more preferred embodiment, wash steps will occur at 42° C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. In a most preferred embodiment, wash steps will occur at 68° C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Additional variations on these conditions will be readily apparent to those skilled in the art. Hybridization techniques are well known to those skilled in the art and are described, for example, in Benton and Davis (Science 196: 180, 1977); Grunstein and Hogness (Proc. Natl. Acad. Sci., USA 72:3961, 1975); Ausubel et al. (Current Protocols in Molecular Biology, Wiley Interscience, New York, 2001); Berger and Kimmel (Guide to Molecular Cloning Techniques, 1987, Academic Press, New York); and Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York.
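The paired NaCl and trisodium citrate concentrations quoted for hybridization and washing keep a constant 10:1 ratio, which is the ratio found in saline-sodium citrate (SSC) buffer. The following minimal sketch converts such a pair into the more familiar "×SSC" notation, assuming the standard definition of 1× SSC as 150 mM NaCl and 15 mM trisodium citrate. The disclosure itself does not use SSC notation, and the function name is hypothetical.

```python
SSC_1X_NACL_MM = 150.0     # NaCl in 1x SSC (standard recipe, assumption)
SSC_1X_CITRATE_MM = 15.0   # trisodium citrate in 1x SSC


def ssc_fold(nacl_mm, citrate_mm):
    """Return the SSC fold-concentration for a NaCl/citrate pair in mM."""
    nacl_fold = nacl_mm / SSC_1X_NACL_MM
    citrate_fold = citrate_mm / SSC_1X_CITRATE_MM
    if abs(nacl_fold - citrate_fold) > 1e-9:
        raise ValueError("NaCl and citrate are not in the 10:1 SSC ratio")
    return nacl_fold


# Conditions quoted in the text, expressed as multiples of 1x SSC.
hybridization_low = ssc_fold(750, 75)    # 30 degrees C hybridization
hybridization_high = ssc_fold(250, 25)   # 42 degrees C hybridization
wash_low = ssc_fold(30, 3)               # 25 degrees C wash
wash_high = ssc_fold(15, 1.5)            # 42 and 68 degrees C washes
```

Under that assumption, the hybridization conditions run from about 5× down to about 1.7× SSC, and the washes correspond to 0.2× and 0.1× SSC, which matches the statement that lower salt gives higher stringency.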

One embodiment of the invention includes a microarray. In one embodiment, the microarray comprises at least three nucleic acid probes selected from the group of microbes consisting of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae, Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas. In one embodiment, the microarray comprises at least three nucleic acid probes selected from the group of microbes consisting of Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter, Escherichia, Filobasidiella, Mucor, Trichophyton, Brugia and Paragonimus. In one embodiment, the microarray comprises at least three nucleic acid probes selected from the group of microbes consisting of Novaviridae, Streptococcus, Epidermophyton, Fonsecaea, Pseudallescheria, and Balamuthia. In one embodiment, the microarray comprises at least three nucleic acid probes selected from the group of microbes consisting of Birnaviridae, Hepeviridae, Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella, Pasteurella, Penicillium, Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas, and Trichostrongylus. In one embodiment, the microarray comprises at least three nucleic acid probes selected from the group of microbes consisting of Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia, Alternaria, Malassezia, Piedraia, Rhizomucor, Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella, and Trichuris. The microarray can be a biochip, or on a glass slide, bead, or paper. The nucleic acid probes can be selected from between about 10 to about 30 microbes and comprise about 3 to about 5 probes per microbe.
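The numeric constraints in the preceding paragraph (at least three probes from a listed group, about 10 to 30 microbes, about 3 to 5 probes per microbe) can be sketched as simple checks. This is a minimal illustration only: the probe identifiers are hypothetical, the ranges are treated as hard limits although the text says "about", and the sketch counts positive probes without requiring that they come from distinct microbes, which the text does not specify.

```python
MIN_HITS = 3  # "at least three nucleic acid probes"

# One of the microbe groups listed in the text.
SIGNATURE_GROUP = frozenset({
    "Arcanobacterium", "Bifidobacterium", "Cardiobacterium", "Citrobacter",
    "Escherichia", "Filobasidiella", "Mucor", "Trichophyton", "Brugia",
    "Paragonimus",
})


def panel_within_stated_ranges(panel):
    """panel maps microbe name -> list of probe IDs on the array."""
    if not 10 <= len(panel) <= 30:
        return False
    return all(3 <= len(probes) <= 5 for probes in panel.values())


def count_group_hits(positive_probes, probe_to_microbe, group):
    """Number of positive probes whose microbe belongs to the group."""
    return sum(
        1 for probe in positive_probes
        if probe_to_microbe.get(probe) in group
    )


def signature_detected(positive_probes, probe_to_microbe, group):
    return count_group_hits(positive_probes, probe_to_microbe, group) >= MIN_HITS


# Hypothetical panel: three probes for each microbe in the group.
panel = {
    microbe: [f"{microbe}_p{i}" for i in range(1, 4)]
    for microbe in sorted(SIGNATURE_GROUP)
}
probe_to_microbe = {
    probe: microbe for microbe, probes in panel.items() for probe in probes
}
positives = {"Mucor_p1", "Mucor_p2", "Brugia_p1"}
detected = signature_detected(positives, probe_to_microbe, SIGNATURE_GROUP)
```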

Detection by Nucleic Acid Biochip

In aspects of the invention, a sample is analyzed by means of a nucleic acid biochip (also known as a nucleic acid microarray). To produce a nucleic acid biochip, oligonucleotides may be synthesized or bound to the surface of a substrate using a chemical coupling procedure and an ink jet application apparatus, as described in PCT application WO95/251116 (Baldeschweiler et al.). Alternatively, a gridded array may be used to arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a vacuum system, thermal, UV, mechanical or chemical bonding procedure.

Exemplary nucleic acid molecules useful in the invention include polynucleotides that specifically bind nucleic acid biomarkers to one or more pathogenic organisms, and fragments thereof.

A nucleic acid molecule (e.g., RNA or DNA) derived from a biological sample may be used to produce a hybridization probe as described herein. The biological samples are generally derived from a patient, e.g., as a bodily fluid (such as blood, blood serum, plasma, saliva, urine, ascites, cyst fluid, and the like); a homogenized tissue sample (e.g., a tissue sample obtained by biopsy); or a cell or population of cells isolated from a patient sample. For some applications, cultured cells or other tissue preparations may be used. The mRNA is isolated according to standard methods, and cDNA is produced and used as a template to make complementary RNA suitable for hybridization. Such methods are well known in the art. The RNA is amplified in the presence of fluorescent nucleotides, and the labeled probes are then incubated with the microarray to allow the probe sequence to hybridize to complementary oligonucleotides bound to the biochip.

Incubation conditions are adjusted such that hybridization occurs with precise complementary matches or with various degrees of less complementarity depending on the degree of stringency employed. For example, stringent salt concentration will ordinarily be less than about 750 mM NaCl and 75 mM trisodium citrate, less than about 500 mM NaCl and 50 mM trisodium citrate, or less than about 250 mM NaCl and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the absence of organic solvent, e.g., formamide, while high stringency hybridization can be obtained in the presence of at least about 35% formamide, and most preferably at least about 50% formamide. Stringent temperature conditions will ordinarily include temperatures of at least about 30°C, of at least about 37°C, or of at least about 42°C. Varying additional parameters, such as hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS), and the inclusion or exclusion of carrier DNA, are well known to those skilled in the art. Various levels of stringency are accomplished by combining these various conditions as needed. In a preferred embodiment, hybridization will occur at 30°C in 750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In embodiments, hybridization will occur at 37°C in 500 mM NaCl, 50 mM trisodium citrate, 1% SDS, 35% formamide, and 100 μg/ml denatured salmon sperm DNA (ssDNA). In other embodiments, hybridization will occur at 42°C in 250 mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide, and 200 μg/ml ssDNA. Useful variations on these conditions will be readily apparent to those skilled in the art.

The removal of nonhybridized probes may be accomplished, for example, by washing. The washing steps that follow hybridization can also vary in stringency. Wash stringency conditions can be defined by salt concentration and by temperature. As above, wash stringency can be increased by decreasing salt concentration or by increasing temperature. For example, stringent salt concentration for the wash steps will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. Stringent temperature conditions for the wash steps will ordinarily include a temperature of at least about 25°C, of at least about 42°C, or of at least about 68°C. In embodiments, wash steps will occur at 25°C in 30 mM NaCl, 3 mM trisodium citrate, and 0.1% SDS. In a more preferred embodiment, wash steps will occur at 42°C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. In other embodiments, wash steps will occur at 68°C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Additional variations on these conditions will be readily apparent to those skilled in the art.

Detection systems for measuring the absence, presence, and amount of hybridization for all of the distinct nucleic acid sequences are well known in the art. For example, simultaneous detection is described in Heller et al., Proc. Natl. Acad. Sci. 94:2150-2155, 1997. In certain embodiments, a scanner is used to determine the levels and patterns of fluorescence.

Diagnostic assays

The present invention provides a number of diagnostic assays that are useful for the identification or characterization of a disease or disorder (e.g., breast cancer), or a propensity to develop such a condition. In one embodiment, breast cancer is characterized by quantifying the level of one or more biomarkers from one or more organisms, including viruses, viroids, bacteria, fungi, helminths, and protozoa. While the examples provided herein describe specific methods of detecting levels of these markers, the skilled artisan appreciates that the invention is not limited to such methods. Marker levels are quantifiable by any standard method; such methods include, but are not limited to, real-time PCR, Southern blot, PCR, and/or mass spectroscopy.

The level of any two or more of the markers described herein defines the marker profile of a disease, disorder, or condition. The level of marker is compared to a reference. In one embodiment, the reference is the level of marker present in a control sample obtained from a patient that does not have breast cancer. In another embodiment, the reference is a healthy tissue or cell (i.e., that is negative for breast cancer). In another embodiment, the reference is a baseline level of marker present in a biologic sample derived from a patient prior to, during, or after treatment for breast cancer. In yet another embodiment, the reference is a standardized curve. The level of any one or more of the markers described herein (e.g., a combination of viral, bacterial, fungal, helminth, and/or protozoan biomarkers) is used, alone or in combination with other standard methods, to characterize the disease, disorder, or condition (e.g., breast cancer).

In certain embodiments, one or more organisms described herein may be isolated or extracted from a sample using a capture reagent (e.g., an antibody) and/or detected using ELISA. In a particular embodiment, reagents for capturing the pathogenic organism include Streptavidin bound magnetic beads and biotin labeled probes. Such techniques can be further used to obtain nucleic acids for pathogenic organism detection using nucleic acid based probes or for direct sequencing (e.g., MiSeq; Illumina).

The practice of the present invention employs, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are well within the purview of the skilled artisan. Such techniques are explained fully in the literature, such as, "Molecular Cloning: A Laboratory Manual", fourth edition (Sambrook, 2012); "Oligonucleotide Synthesis" (Gait, 1984); "Culture of Animal Cells" (Freshney, 2010); "Methods in Enzymology"; "Handbook of Experimental Immunology" (Weir, 1997); "Gene Transfer Vectors for Mammalian Cells" (Miller and Calos, 1987); "Short Protocols in Molecular Biology" (Ausubel, 2002); "Polymerase Chain Reaction: Principles, Applications and Troubleshooting" (Babar, 2011); "Current Protocols in Immunology" (Coligan, 2002). These techniques are applicable to the production of the polynucleotides and polypeptides of the invention, and, as such, may be considered in making and practicing the invention. Particularly useful techniques for particular embodiments will be discussed herein.

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the assay, screening, and therapeutic methods of the invention, and are not intended to limit the scope of what the inventors regard as their invention.

EXPERIMENTAL EXAMPLES

The invention is further described in detail by reference to the following experimental examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Thus, the invention should in no way be construed as being limited to the following examples, but rather, should be construed to encompass any and all variations which become evident as a result of the teaching provided herein.

Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples, therefore, specifically point out the exemplary embodiments of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

The materials and methods employed in these experiments are now described.

Study samples: In the present study, 50 endocrine receptor (estrogen or progesterone receptor) positive (BRER), 34 human epidermal growth factor receptor 2 (HER2) positive (BRHR), 24 triple positive (estrogen, progesterone and HER2 receptor positive) (BRTP) and 40 triple negative (absence of estrogen, progesterone and HER2 receptors) (BRTN) breast cancer tissues were included, along with 20 breast control samples from healthy individuals. These tissues were obtained as de-identified archived samples.

Tumors needing macro-dissection were received in the form of 10 μm sections on glass slides with marked guiding H&E slides, while tumors that did not require macro-dissection were received as 10 μm paraffin rolls. The 20 non-matched control tissues were derived from breast reduction surgeries and obtained as 10 μm paraffin rolls. A resident pathologist reviewed case histories, confirmed the tumor types and demarcated the cancer cells. All the samples were de-identified FFPE (formalin fixed paraffin embedded) samples of breast tumors or controls, and were received from the Abramson Cancer Center Tumor Tissue and Biosample Core.

PathoChip design, sample preparation and microarray processing: The PathoChip Array design has been previously described in detail (Baldwin et al. MBio. 2014; 5: e01714-14). The probes were generated in silico from the genomes of all known viruses as well as known human bacterial, parasitic and fungal pathogens. The PathoChip comprises 60,000 probe sets manufactured as SurePrint glass slide microarrays (Agilent Technologies Inc.), containing 8 replicate arrays per slide. Each probe is a 60-nt DNA oligomer that targets multiple genomic regions of the micro-organisms. PathoChip screening was done using both DNA and RNA extracted from formalin-fixed paraffin-embedded (FFPE) tumor tissues as described previously (Banerjee et al. Sci Rep. 2015; 5: 15162; Baldwin et al. MBio. 2014; 5: e01714-14). The quality of the extracted nucleic acids was determined by agarose gel electrophoresis and the A260/280 ratio. The extracted RNA and DNA samples were subjected to whole transcriptome amplification (WTA) as previously described (Banerjee et al. Sci Rep. 2015; 5: 15162). The WTA products were analyzed by agarose gel electrophoresis. Human reference RNA and DNA were also extracted from the human B cell line BJAB and were used for WTA as previously described (Banerjee et al. Sci Rep. 2015; 5: 15162). The WTA products were purified (PCR purification kit, Qiagen, Germantown, MD, USA); the WTA products from the cancers were labelled with Cy3 and those from the human reference DNA were labelled with Cy5 (SureTag labeling kit, Agilent Technologies, Santa Clara, CA). The labelled DNAs were purified and hybridized to the PathoChip as described previously (Banerjee et al. Sci Rep. 2015; 5: 15162). Post-hybridization, the slides were washed, scanned and visualized using an Agilent SureScan G4900DA array scanner (Banerjee et al. Sci Rep. 2015; 5: 15162).

Microarray Data Extraction and Statistical analysis: Agilent Feature Extraction software (Banerjee et al. Sci Rep. 2015; 5: 15162; Baldwin et al. MBio. 2014; 5: e01714-14) was used to extract the raw data from the microarray images. The R program was used for normalization and data analyses (R Core Team. R Foundation for Statistical Computing, Vienna, Austria. (2015)). A scale factor was calculated using the signals of the green (Cy3) and red (Cy5) channels for human probes: the scale factor is the ratio of the sum of green signals to the sum of red signals of the human probes. The scale factor was then used to obtain normalized signals for all other probes. For all probes except human probes, the normalized signal was the log2-transformed ratio of the green signal to the scale-factor-modified red signal, i.e. log2(g) - log2(scale factor × r). On the normalized signals, a one-sided t-test was applied to select probes significantly present in cancer samples by comparing cancer samples versus controls. The significance cut-off was log2 fold change of signal > 1, adjusted p-value < 0.01, control prevalence < 25% and case prevalence > 40%. Prevalence was calculated as the percentage of cancer samples and of control samples in which a signature was detected.
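The normalization and cut-off described above can be illustrated with a minimal Python sketch. This is not the R code used in the study: the function names (scale_factor, normalized_signal, passes_cutoff) and the example numbers are hypothetical, and the reading of the normalized signal as log2(g) - log2(scale factor × r) is an assumption based on the wording above. The t-test and the p-value adjustment are not reproduced; the adjusted p-value is simply taken as an input.

```python
import math


def scale_factor(human_green, human_red):
    """Sum of green (Cy3) signals over sum of red (Cy5) signals, human probes only."""
    return sum(human_green) / sum(human_red)


def normalized_signal(green, red, sf):
    """Assumed form of the normalized signal for one non-human probe:
    log2(green) - log2(scale factor * red)."""
    return math.log2(green) - math.log2(sf * red)


def passes_cutoff(log2_fold_change, adjusted_p, control_prevalence, case_prevalence):
    """Apply the stated significance cut-off; prevalences are percentages (0-100)."""
    return (
        log2_fold_change > 1
        and adjusted_p < 0.01
        and control_prevalence < 25
        and case_prevalence > 40
    )


# Hypothetical raw signals for two human probes on one array.
sf = scale_factor(human_green=[1000.0, 3000.0], human_red=[500.0, 1500.0])

# Hypothetical raw signals for one microbial probe on the same array.
signal = normalized_signal(green=800.0, red=100.0, sf=sf)
print("scale factor:", sf, "normalized signal:", signal)
```

Dividing the red channel by nothing and instead multiplying it by the scale factor puts the reference channel on the same overall intensity scale as the sample channel before the log ratio is taken, which is why only the human probes are used to estimate the factor.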

The cancer samples were also subjected to hierarchical clustering, based on the detection of microbial signatures in the samples. A hierarchical clustering technique (Euclidean distance, complete linkage, normalized hybridization signals not scaled) was used to cluster the samples. The clusters were then validated by the CH index (Calinski and Harabasz index), which is implemented in the R package NbClust (Charrad et al. Journal of Statistical Software 2014; 61: 1-36). The CH index is a cluster validity index that is highest when inter-cluster distances are large and intra-cluster distances are small. Possible cluster solutions were calculated, and the solution that maximized the index value was taken as the best clustering of the data. Statistical significance between different groups was determined using the two-sided t-test.
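The clustering and validation steps can be sketched as follows. This is a naive pure-Python illustration of complete-linkage agglomerative clustering with Euclidean distance and of the Calinski-Harabasz index; it is not the NbClust implementation used in the study, and the function names and the toy data are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(points, k):
    """Merge clusters until k remain; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                d = max(euclidean(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def ch_index(points, clusters):
    """Calinski-Harabasz index: (between-SS / (k - 1)) / (within-SS / (n - k)).
    Requires 2 <= k < n and a non-zero within-cluster sum of squares."""
    n, k, dim = len(points), len(clusters), len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for members in clusters:
        centroid = [sum(points[m][d] for m in members) / len(members)
                    for d in range(dim)]
        between += len(members) * euclidean(centroid, overall) ** 2
        within += sum(euclidean(points[m], centroid) ** 2 for m in members)
    return (between / (k - 1)) / (within / (n - k))


# Toy data: each tuple stands in for one sample's normalized signals on two probes.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Score every candidate number of clusters and keep the one with the highest index.
scores = {k: ch_index(samples, complete_linkage(samples, k))
          for k in range(2, len(samples))}
best_k = max(scores, key=scores.get)
print("CH index by k:", scores, "best k:", best_k)
```

The index rewards solutions whose clusters are far apart relative to their internal spread, so scanning k and keeping the maximum is one simple way to pick the number of clusters.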

PCR validation of PathoChip results: PCR primers from the conserved and/or specific regions of the micro-organisms detected by the PathoChip screen were used. The PCR amplification reaction mixtures for each reaction contained 200-400 ng of WTA product and 20 pm each of forward and reverse primers (FIG. 13), 300 μM of dNTPs and 2.5 U of LongAmp Taq DNA polymerase (NEB). DNA was denatured at 94°C for 3 min, followed by 30 cycles of 94°C for 30 s, different annealing temperatures for different sets of primers for 30-45 s, and 65°C for 30 s. The PCR conditions for each of the primer sets are displayed in FIG. 13.

The results of the experiments are now described.

Example 1: Microbial signatures associated with different breast cancer types

Unique and common microbial signatures associated with different breast cancer types are listed in FIG. 7 and are represented in FIGs. 1A, 2B, 3C and 3F. To establish the microbial signatures in the cancers, the average hybridization signal for each probe in the cancer samples was compared to the controls. Those probes that detected significant hybridization signals in the cancer samples (p-value<0.05, log fold change in hybridization signal >log1) were considered. The average hybridization signals of the significant probes for each microbial genus and viral family are shown in FIGs. 1A-1E, FIGs. 2A-2E and FIGs. 3A-3F. FIGs. 8A-8G show the average hybridization signals of the probes of microorganisms detected in the cancers versus the controls, with respective adjusted p-values with multiple corrections. Additionally, the percent prevalence of the significant microbial signatures in the cancer samples was calculated. These data indicated how prevalent a significant virus or microorganism signature was in the cancer samples regardless of the hybridization intensity.
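Percent prevalence, as used above, can be illustrated with a short Python sketch. The text does not state how a signature is called "detected" in an individual sample, so the per-sample detection threshold below is an assumption made only for illustration, and the function name and values are hypothetical.

```python
def percent_prevalence(sample_signals, detection_threshold):
    """Percentage of samples whose signal for a signature exceeds the threshold.

    detection_threshold is a hypothetical per-sample call; the source does not
    specify the detection rule.
    """
    detected = sum(1 for s in sample_signals if s > detection_threshold)
    return 100.0 * detected / len(sample_signals)


# Hypothetical normalized signals for one signature across five cancer samples.
signals = [1.2, 0.3, 2.5, 0.9, 1.7]
prevalence = percent_prevalence(signals, detection_threshold=1.0)
print("percent prevalence:", prevalence)
```

Because only the count of positive samples enters the calculation, a signature with a modest signal in most samples can have a higher prevalence than one with a very strong signal in a few samples.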

Example 2: Viral signatures associated with different breast cancer types

Significant hybridization (described above), at levels above the controls, was detected for 28 viral families among the four breast cancer types (FIGs. 1A and 1D). Of these, 17 viral families were detected with significantly higher hybridization signals in greater than 50% of the samples representing all 4 breast cancer types, as compared to the controls (FIGs. 1B and 1D). They included signatures of Adenoviridae, Anelloviridae, Arenaviridae, Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Herpesviridae, Iridoviridae, Papillomaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Poxviridae, Reoviridae, Retroviridae and Rhabdoviridae (FIG. 1B). Importantly, in examining the percent hybridization signal (FIG. 1C) and percent prevalence (FIG. 1D), a number of viral families were significantly detected only in a subset of breast cancer types. Specifically, the signatures for Birnaviridae and Hepeviridae were only detected in BRTP, and Nodaviridae only in BRHR (FIGs. 1C and 1D). Further examination of the percent prevalence (FIG. 1D) showed that BRTN samples had low or no prevalence of Arteriviridae, Astroviridae, Birnaviridae, Caliciviridae, Circoviridae, Hepadnaviridae, Nodaviridae, Orthomyxoviridae, Polyomaviridae and Togaviridae; BRHR samples show low or no prevalence of Birnaviridae, Hepadnaviridae and Hepeviridae; BRTP samples show low or no prevalence of Caliciviridae and Nodaviridae; and BRER samples show low or no prevalence of Arteriviridae, Birnaviridae, Hepeviridae and Nodaviridae.

Hybridization signal intensity offered an additional way to compare the data. Here, marked differences were noted for specific viral families between the different breast cancer types. For example, probes for Polyomaviridae were detected with the highest hybridization signal in the BRHRs, followed by BRERs and BRTPs (FIG. 1E). Polyomaviridae were detected in the BRTNs compared to the controls; however, at a lower hybridization signal [log fold change in hybridization signal = 0.4 to 1] (FIG. 1E), which is below the cut-off to consider the signal positive; thus Polyomaviridae were not shown to be present in the BRTNs in FIG. 1C or in FIG. 1D. Similarly, probes of Hepadnaviridae were significantly detected with low hybridization signal in the BRTNs (FIG. 1E), while detected with higher hybridization signal intensity [log fold change in hybridization signal >1] in the BRERs and BRTPs (FIG. 1E).

Signatures of Herpesviridae, Adenoviridae and Poxviridae were detected in >90% of the BRER samples screened (FIG. 1D), while the highest hybridization signal was detected for Anelloviridae and Flaviviridae (FIG. 1B). Signatures of Astroviridae, Herpesviridae and Reoviridae were detected in all of the BRTP samples tested (FIG. 1D), with the highest hybridization signal detected for Polyomaviridae signatures (FIG. 1C). For BRHR samples, signatures of Reoviridae and Flaviviridae were detected in >90% of the samples screened (FIG. 1D), with signatures of Togaviridae showing the highest hybridization signal (FIG. 1C). Among the BRTN samples, we detected signatures of Reoviridae in 90% of the samples screened (FIG. 1D), with signatures of Picornaviridae and Anelloviridae showing the highest hybridization signal (FIG. 1B).

Probes of the Poxviridae family were detected significantly in >80% of all the breast cancer types analyzed. Interestingly, probes of Parapoxviridae were detected significantly with high hybridization signal intensity in BRER cancers versus the controls (FIG. 1E). Probes of Parapoxviridae were also detected significantly in the other 3 types of breast cancers compared to the controls, but showed much lower hybridization signal intensity for those probes (log fold change in hybridization signal ~0.5) (FIG. 1E).

The data show that the cancer samples as a whole had a robust viral signature. However, there were significant and defining differences between the four types with BRTN having the least complex viral signature.

Example 3: Bacterial signatures associated with different breast cancer types

FIGs. 2A-2E show the analysis of bacterial signatures in the 4 breast cancer types. Significant hybridization, above the levels of the controls, was detected for 56 bacterial genera, of which the majority (50-60%) were Proteobacteria, the major group of gram-negative bacteria. These genera partitioned into bacterial signatures unique to each cancer type, as well as signatures that were common to multiple breast cancer types (FIG. 7, FIGs. 2B-2D). Significant hybridization signals common to all 4 breast cancer types were detected for Actinomyces, Bartonella, Brevundimonas, Coxiella, Mobiluncus, Mycobacterium, Rickettsia and Sphingomonas (FIGs. 2B-2C).

FIG. 2E shows the marked diversity in bacterial signatures between the breast cancer types. Distinct bacterial signatures uniquely associated with each type of breast cancer analyzed were identified. In this regard, BRTN had the least complex bacterial signature, while BRER had the most complex. Signals for Arcanobacterium, Bifidobacterium, Cardiobacterium, Citrobacter and Escherichia were significantly detected in the BRER samples compared to the controls, while those of Bordetella, Campylobacter, Chlamydia, Chlamydophila, Legionella and Pasteurella were significantly associated with the BRTPs. Signals for Streptococcus were detected significantly in the BRHRs, whereas Aerococcus, Arcobacter, Geobacillus, Orientia and Rothia were found associated with the BRTNs.

Hybridization signal intensity again provided an additional view of the complexity of the bacterial microbiome and its diversity among the different breast cancers (FIGs. 2C-2D). Signals for Brevundimonas were detected with higher average hybridization signals in the endocrine receptor positive BRER and BRTP compared to the endocrine receptor negative BRHR and BRTN (FIGs. 2C-2D). Hybridization signals of Mobiluncus and Mycobacterium were predominantly detected in the endocrine receptor negative samples.

Bacterial signatures of Actinomyces were detected in all 4 cancer types; however, their hybridization signal intensity was markedly lower in the BRTN samples (FIG. 2C). Similarly, Bartonella was significantly detected in all cancer types, but its hybridization signal intensity was markedly lower in the BRER samples compared to the others (FIG. 2C). The bacterial probes detected with the highest hybridization signals were those for Acinetobacter in BRER and BRHR samples, Brevundimonas in BRTP samples and Caulobacter in BRTN samples (FIG. 2D).

As in the case of the viruses, the data showed that the cancer samples had a robust bacterial signature with significant and defining differences between the four breast cancer types.

Example 4: Fungal signatures associated with different breast cancer types

Significant hybridization, above the levels of the controls, was detected for 21 different genera of fungi among the 4 types of breast cancer (FIGs. 3A-3B). Interestingly, none of these genera were detected in all four cancer types (FIGs. 3B-3C). In fact, the fungal signatures for each type of breast cancer were relatively unique; only 7 fungal genera (Aspergillus, Candida, Coccidioides, Cunninghamella, Geotrichum, Pleistophora and Rhodotorula) were detected in more than one type of breast cancer. The receptor positive cancer samples showed much more complex fungal diversity than the BRTN samples (FIGs. 3A-3B). FIG. 7 and FIG. 3C show the unique fungal signatures associated with different breast cancer types. Fungal signatures of Filobasidiella, Mucor and Trichophyton were found to be significantly associated with BRER samples; Penicillium with BRTP samples; Epidermophyton, Fonsecaea and Pseudallescheria with BRHR samples; and Alternaria, Malassezia, Piedraia and Rhizomucor with BRTN samples.

Example 5: Parasitic signatures associated with different breast cancer types

Significant hybridization, above the levels of the controls, was detected for 29 different genera of parasites among the 4 types of breast cancer (FIGs. 3D-3E). As in the case of the fungi, no single genus of parasite was significantly detected in all four breast cancer types (FIGs. 3E-3F). Each cancer showed a relatively distinct parasitic signature pattern, with BRHR showing the least diverse signatures. FIG. 7 and FIG. 3F show the unique and common parasitic signatures among the different breast cancer types.

Analysis of hybridization signal intensity in FIG. 3D showed that Plasmodium was detected with the highest hybridization signal in the BRHR samples and was also detected in the BRER and BRTP samples but not in the BRTN samples. In BRTN, the highest hybridization signal intensity was detected for the probes of Mansonella followed by Centrocestus, whereas Strongyloides was detected in almost all of the BRTN samples. Naegleria was detected with the highest hybridization signal intensity in BRTP, while Sarcocystis and Babesia were detected in 92% of BRTP samples (FIG. 3E). Among the BRER samples, Brugia showed the highest hybridization signal intensity (FIG. 3D), while Thelazia showed the highest prevalence (FIG. 3E). Signatures of Brugia and Paragonimus were only detected in BRER samples. Ancylostoma, Angiostrongylus, Echinococcus, Sarcocystis, Trichomonas and Trichostrongylus were found uniquely associated with BRTP samples. Balamuthia signatures were associated significantly with BRHR samples, and those of Centrocestus, Contracaecum, Leishmania, Necator, Onchocerca, Toxocara, Trichinella and Trichuris were detected significantly only in BRTN samples.

Example 6: Hierarchical clustering of the breast cancer samples based on the detection of microbial signatures

Hierarchical clustering analysis based on the detection of microbial signatures associated with the 4 breast cancer types was used to determine whether the breast cancer types fell into any unique and identifiable clusters. While this analysis identified distinct clusters within each of the breast cancer types based on their microbial signature patterns (FIGs. 4A-4D), it also showed that BRTNs and BRTPs each had a distinct microbial signature pattern, whereas BRER and BRHR shared similar microbial signatures (FIG. 4E). Individually, the different BC types fell into distinct microbial signature clusters. BRER samples fell into 2 distinct clusters, 1ER and 2ER, along with 2 ungrouped samples (ungrouped 1ER) (FIG. 4A). Samples grouped in Cluster 1ER and 2ER differed significantly based on the higher detection of mostly bacterial and viral and certain fungal and parasitic signatures in the samples of Cluster 2ER (FIG. 10). The ungrouped BRER samples (ungrouped 1ER) were significantly different from clusters 1ER and 2ER (FIG. 10).

The majority of the BRTP samples had similar microbial detections and grouped together into 1 major cluster (cluster 1TP), while a few samples remained ungrouped (FIG. 4B).

The BRHR samples formed 2 major clusters (cluster 1HR and cluster 2HR) (FIG. 4C), and they differed from each other in having higher detection of certain bacterial and viral signatures in cluster 2HR compared to samples in cluster 1HR (FIG. 11). Bacterial signatures of Kingella, Brevundimonas, Eikenella, Bartonella, Acinetobacter, Nodaviridae, Actinomyces, Aeromonas, Mobiluncus, Fusobacterium, Alcaligenes, Brucella and Staphylococcus; viral signatures of Orthomyxoviridae, Parvoviridae, Papillomaviridae, Nodaviridae and Astroviridae; and fungal signatures of Aspergillus showed significantly higher detection in cluster 2HR. The 3 BRHRs that could not be grouped (ungrouped 1HR and 2HR) showed higher detection of certain microbial signatures listed in FIG. 11 compared to the clustered BRHR samples; in particular, these included the parasitic signature of Entamoeba and bacterial signatures of Listeria and Corynebacterium.

The BRTN samples formed two distinct clusters (cluster 1TN and 2TN) with 2 samples that did not cluster into a distinct group (ungrouped 1TN) (FIG. 4D). Cluster 1TN differed from Cluster 2TN in having higher detection of bacterial probes of Caulobacter, Brevundimonas, Peptoniphilus, Rothia, Geobacillus, Aerococcus, Mobiluncus, Actinomyces and Bartonella; fungal probes of Malassezia, Piedraia, Rhodotorula and Rhizomucor; and parasitic signatures of Leishmania, Toxocara, Contracaecum, Centrocestus, Trichuris and Strongyloides (FIG. 12). In contrast, samples in Cluster 2TN had significantly higher hybridization signal intensity for viral signatures of Poxviridae, Paramyxoviridae, Reoviridae, Parvoviridae and Arenaviridae; bacterial signatures of Sphingomonas, Brucella, Orientia and Stenotrophomonas; fungal signatures of Pleistophora; and parasitic signatures of Trichinella. The ungrouped samples differed from the grouped samples in having significantly higher detection of certain viral probes of Anelloviridae, Retroviridae, Poxviridae and Arenaviridae compared to Cluster 1TN and Cluster 2TN samples (FIG. 12).

FIG. 4E shows the comparison of the microbiome signatures from all four breast cancer types together in the clustering analysis. The data show that the different breast cancers grouped into 4 major clusters plus a few ungrouped BRER (2 samples), BRHR (3 samples) and BRTN (2 samples) samples (ungrouped 1, 2 and 3 respectively). Most of the BRTNs were very distinct in their microbial signature pattern association, and they clustered together (cluster 3). Similarly, all the BRTPs screened clustered together to form a distinct cluster 4. Conversely, most of the BRER samples shared a similar microbial signature pattern with all of the BRHR samples, forming the distinct cluster 1, while the remaining 11 BRER samples formed cluster 2. The BRERs in cluster 2 differed from those in Cluster 1 in having significantly higher hybridization signals for certain bacterial signatures like Brevundimonas, Sphingomonas, Erysipelothrix, Mycoplasma, Brucella, Prevotella, Arcanobacterium, Staphylococcus, Rickettsia, Propionibacterium, Lactobacillus and Shigella; viral signatures of Polyomaviridae, Circoviridae, Herpesviridae, Papillomaviridae, Retroviridae, Orthomyxoviridae, Flaviviridae, Iridoviridae, Poxviridae and Reoviridae; fungal signatures of Trichophyton, Mucor, Rhodotorula, Geotrichum and Pleistophora; and parasitic signatures of Paragonimus, Macracanthorhynchus and Hartmannella (FIG. 9).

Thus, specific microbial signature patterns associated with different breast cancer types were identified herein.

Example 7: Validation of PathoChip screen results by PCR

Several viruses and microorganisms detected in the BC samples were selected for verification by non-quantitative PCR and sequencing; these included several viral families and individual viruses (Herpesvirus, Polyoma, Papilloma, Parapox and MMTV), as well as a prevalent bacterium (Brevundimonas) and fungus (Pleistophora). The primers used were either previously published (FIG. 13) or were designed based on sequences from the conserved and specific regions of the micro-organisms. For detection of parasites, pan-parasite diagnostic PCR primers were used, enabling exhaustive detection of non-human eukaryotic species-specific small subunit rDNA in human clinical samples. For the validation experiments, the same amplification products used for the screening were used. The amplification products of all the samples for each type of cancer were pooled together and 200-400 ng of the products were used for PCR. The same protocol was used for the controls. The PCR amplification showed the expected amplicons for the PathoChip-detected viruses, as well as the selected bacterium, fungus and parasite (FIGs. 5A-5B). Sequencing of the PCR products verified the detection of the appropriate virus or other microorganism (FIG. 14).

Example 8: Discussion

The human microbiome is comprised of mutualistic, pathogenic, transient and residential viruses and microorganisms. Many recent studies have suggested that the body's microbiome dramatically affects health, where perturbation of the microbiome leads to altered physiology and pathology, including cancer. However, the reverse may also be true: different human diseases may create disease microenvironments amenable to the persistence of a differential microbiome, with or without a direct effect on the establishment or progression of the disease. Such differential microbiomes could be specific to each such disease, and thus serve as a biomarker. Using the metagenomic array technology PathoChip, distinct microbiome signatures were previously established in triple negative breast cancers (BRTNs) (Banerjee et al. Sci Rep. 2015; 5: 15162). In the present study, the microbiome signatures in 4 major breast cancer types (BRTN, BRTP, BRER, BRHR) were simultaneously determined in order to resolve whether the microbiome associated with the BRTNs was a specific feature of BRTNs, or a generic feature shared with other types of breast cancers. The data showed that the various breast cancers have a robust and varied microbiome with aspects that are unique to each type as well as shared components. The data demonstrate that breast cancer microbiome signatures provide type-specific biomarkers.

Examining viral signatures showed that the majority of the viral families detected were associated with all 4 breast cancer types. However, several important viruses were differentially detected; for example, among known oncogenic viruses, the signatures of Polyomaviridae were detected with high significance (signal intensity) in the BRER and BRHR samples and with low signal intensity in the other breast cancer types. Signatures of Hepadnaviridae were similarly detected in BRERs and BRTPs with high signal intensity, but with very low signal intensity in the other two cancer types. It is intriguing that signatures for the Parapoxviridae family were found in all the breast cancers, with BRERs showing the highest level of detection.

There were a number of bacterial families shared by all four breast cancer types. For example, all four breast cancer types had dominant signatures for Proteobacteria followed by Firmicutes. In particular, the signature of the proteobacterial genus Brevundimonas was detected with high hybridization signal and prevalence in all four breast cancer types. Additionally, the genus Mobiluncus was detected in all four types. Actinomyces signatures were also detected in all four breast cancers, especially in BRHRs, where they were detected with very high signal intensity. All of the bacterial families detected in all four breast cancers could serve as biomarkers for breast cancer in general. However, each type of breast cancer held signatures for unique bacterial genera, which provides the ability to detect specific breast cancer types.

Among the fungal signatures detected were yeasts like Candida, Geotrichum, Rhodotorula and Trichosporon, as well as fungi causing Mucormycosis, Aspergillosis (cutaneous infections) and dermatophytes like Epidermophyton and Trichophyton. Also, Fonsecaea infection was detected.

Possibly the most intriguing and unexpected result of the PathoChip screening is the detection of parasite signatures in the different breast cancer types. These signatures were quite unique to the different breast cancer types with no single parasite being prevalently found in all four. Many parasite signatures were distinctly detected in only one type of breast cancer. It should be kept in mind that the sensitive detection approach used herein allows detection of low abundance organisms, as well as unknown members of parasite families.

Findings presented herein demonstrated that the microbiome of breast cancers is diverse, extensive and has unique aspects that differentiate the four different breast cancers tested. It is possible that organisms in the breast cancer microbiomes could contribute to the origin, potentiation or modulation of oncogenesis. However, it is equally possible that the tumor microenvironment provides favorable conditions for a specific microbiome to persist more readily than in the normal tissue microenvironment. These data demonstrate for the first time that the BRTN and BRTP microbiomes are distinct and significantly different from the microbiome which is largely shared by BRER and BRHR. The unique characteristics of the breast cancer microbiomes potentially provide biomarkers for specific diagnosis and treatment of these cancers.

Other Embodiments

The recitation of a listing of elements in any definition of a variable herein includes definitions of that variable as any single element or combination (or subcombination) of listed elements. The recitation of an embodiment herein includes that embodiment as any single embodiment or in combination with any other embodiments or portions thereof.

The disclosures of each and every patent, patent application, and publication cited herein are hereby incorporated herein by reference in their entirety. While this invention has been disclosed with reference to specific embodiments, it is apparent that other embodiments and variations of this invention may be devised by others skilled in the art without departing from the true spirit and scope of the invention. The appended claims are intended to be construed to include all such embodiments and equivalent variations.