Title:
METHODS FOR CLASSIFYING AND TREATING GASTROINTESTINAL TUMORS BASED ON MYCOBIOME ANALYSIS
Document Type and Number:
WIPO Patent Application WO/2024/035638
Kind Code:
A1
Abstract:
The present disclosure provides methods for classifying and treating gastrointestinal tumors in a patient based on tumor-associated mycobiota. Also disclosed herein are methods for determining the prognosis of patients suffering from GI cancers comprising detecting the presence of tumor associated Candida nucleic acids in a biological sample obtained from the GI cancer patients.

Inventors:
ILIEV ILIYAN (US)
DOHLMAN ANDERS (US)
SHEN XILING (US)
Application Number:
PCT/US2023/029634
Publication Date:
February 15, 2024
Filing Date:
August 07, 2023
Assignee:
UNIV CORNELL (US)
UNIV DUKE (US)
International Classes:
C12Q1/6809; A61P31/10; A61P35/00; C12Q1/6895
Foreign References:
US20180071329A1 2018-03-15
US20160354416A1 2016-12-08
Other References:
COKER OLABISI OLUWABUKOLA, NAKATSU GEICHO, DAI RUDIN ZHENWEI, WU WILLIAM KA KEI, WONG SUNNY HEI, NG SIEW CHIEN, CHAN FRANCIS KA LE: "Enteric fungal microbiota dysbiosis and ecological alterations in colorectal cancer", GUT MICROBIOTA, BRITISH MEDICAL ASSOCIATION , LONDON, UK, vol. 68, no. 4, 1 April 2019 (2019-04-01), UK , pages 654 - 662, XP093142825, ISSN: 0017-5749, DOI: 10.1136/gutjnl-2018-317178
ZHONG MENGYA, XIONG YUBO, ZHAO JIABAO, GAO ZHI, MA JINGSONG, WU ZHENGXIN, SONG YONGXI, HONG XUEHUI: "Candida albicans disorder is associated with gastric carcinogenesis", THERANOSTICS, IVYSPRING INTERNATIONAL PUBLISHER, AU, vol. 11, no. 10, 1 January 2021 (2021-01-01), AU , pages 4945 - 4956, XP093142827, ISSN: 1838-7640, DOI: 10.7150/thno.55209
Attorney, Agent or Firm:
EWING, James F. et al. (US)
Claims:
CLAIMS

1. A method for classifying the mycobiome of gastrointestinal (GI) tumors in a patient comprising

(a) receiving a whole genome sequencing (WGS) dataset corresponding to a GI tumor sample obtained from the patient, wherein the WGS dataset comprises GI tumor-associated fungal sequence reads corresponding to a plurality of fungal species;

(b) mapping the tumor-associated fungal sequence reads in the WGS dataset to a plurality of fungal reference genome libraries using at least one sequence alignment tool to identify the plurality of fungal species present in the GI tumor sample, wherein the plurality of fungal species comprises at least one Candida species and at least one Saccharomyces species;

(c) normalizing the tumor-associated fungal sequence reads corresponding to the plurality of fungal species to obtain expected reads per kilobase of genome per million primary reads (eRPKM) values;

(d) calculating relative abundance (%) values for each species in the plurality of fungal species in the WGS dataset by scaling the eRPKM values and computing a metric for Candida-to-Saccharomyces abundance for the GI tumor sample; and

(e) classifying the GI tumor sample as Candida-dominant (Ca-type) when the metric for Candida-to-Saccharomyces abundance is above a first predetermined threshold or Saccharomyces-dominant (Sa-type) when the metric for Candida-to-Saccharomyces abundance is below a second predetermined threshold.

2. The method of claim 1, further comprising applying a prevalence-based decontamination model to remove fungal sequence reads corresponding to fungal species or genera that are not associated with GI tumors.

3. The method of claim 2, further comprising applying quality control models to eliminate false-positive fungal sequence reads.

4. The method of any one of claims 1-3, wherein the Sa-type tumor sample comprises one or more of S. cerevisiae, S. eubayanus, Cyberlindnera jadinii, Pichia membranifaciens, C. parapsilosis and C. glabrata.

5. The method of any one of claims 1-3, wherein the Ca-type tumor sample comprises one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii.

6. The method of any one of claims 1-3 or 5, further comprising determining that the patient has a poor prognosis when the GI tumor sample is Ca-type.

7. The method of any one of claims 1-3 or 5-6, further comprising determining that the patient is at risk for metastatic disease when the GI tumor sample is Ca-type.

8. The method of any one of claims 1-7, wherein the GI tumor sample is a head-neck tumor, an esophageal tumor, a stomach tumor, a colon tumor, or a rectal tumor.

9. A method for treating a gastrointestinal (GI) cancer in a patient in need thereof comprising: administering to the patient an effective amount of an antifungal agent that targets at least one Candida species, wherein the patient comprises Candida-dominant (Ca-type) tumors, wherein the Ca-type tumors have an abundance ratio of Candida species relative to Saccharomyces species that is higher than a predetermined threshold, and comprise one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii, and wherein the GI cancer is head-neck cancer, esophageal cancer, stomach cancer, colon cancer, or rectal cancer.

10. The method of claim 9, further comprising administering an effective amount of an anti-cancer therapy, wherein the anti-cancer therapy is selected from among chemotherapy, radiation therapy, immunotherapy, monoclonal antibodies, anti-cancer nucleic acids or proteins, anti-cancer viruses or microorganisms, and any combinations thereof.

11. A method for prolonging survival of a patient diagnosed with or suffering from gastrointestinal (GI) cancer comprising administering to the patient an effective amount of an antifungal agent that targets at least one Candida species, wherein the patient comprises Candida-dominant (Ca-type) tumors, wherein the Ca-type tumors have an abundance ratio of Candida species relative to Saccharomyces species that is higher than a predetermined threshold, and comprise one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii, and wherein the GI cancer is head-neck cancer, esophageal cancer, stomach cancer, colon cancer, or rectal cancer.

12. The method of any one of claims 9-11, wherein the antifungal agent that targets at least one Candida species comprises one or more of echinocandins, caspofungin, micafungin, anidulafungin, fluconazole, amphotericin B, traconazole, voriconazole, Posaconazole, isavuconazole, nystatin, miconazole, clotrimazole, and itraconazole.

13. The method of any one of claims 9-12, wherein the antifungal agent that targets at least one Candida species is administered orally, intravenously, intraperitoneally, subcutaneously, intramuscularly, rectally, or intratumorally.

14. The method of any one of claims 9-13, wherein the Ca-type tumors are detected by detecting tumor-associated Candida nucleic acids in a biological sample obtained from the patient.

15. The method of claim 14, wherein the tumor-associated Candida nucleic acids are detected via whole genome sequencing, shotgun sequencing, targeted sequencing, RNA sequencing, or any combination thereof.

16. The method of claim 14 or 15, wherein the biological sample comprises whole blood, plasma, serum, saliva or tumor cells.

17. The method of any one of claims 1-16, wherein the GI cancer or tumor is stage 1, stage 2, stage 3, stage 4 or metastatic.

18. The method of any one of claims 1-17, wherein the Ca-type tumors exhibit elevated expression levels and/or activity of one or more of IL22, IL24, CARD10, CD44, IL1A, IL1B, IL6, IL8, CXCL1, CXCL2, IL17C, BMP15, PFN3, CCL27, PIP, and SAGE1 relative to a GI tissue sample from a healthy control subject or a predetermined threshold.

19. The method of any one of claims 1-18, wherein the Ca-type tumors exhibit reduced expression levels and/or activity of one or more of TP53, CDKN2A, fibronectin (FN1), PTK2B, CDKN2C, NET1, ALAD, FTL, IL17D, CST5, ELN, and TREM2 relative to a GI tissue sample from a healthy control subject or a predetermined threshold.

20. The method of any one of claims 1-19, wherein the patient is human.

Description:
METHODS FOR CLASSIFYING AND TREATING GASTROINTESTINAL TUMORS BASED ON MYCOBIOME ANALYSIS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of and priority to U.S. Provisional Application No. 63/396,078, filed August 8, 2022, the contents of which are incorporated by reference in their entirety for any and all purposes.

STATEMENT OF FEDERALLY FUNDED RESEARCH

[0002] This invention was made with government support under grant numbers R01DK113136, R01DK121977, R01AI163007, R35GM122465, DK119795, and NIH-U01CA214300 awarded by the National Institutes of Health. The government has certain rights in the invention.

TECHNICAL FIELD

[0003] The present technology relates to methods for classifying and treating gastrointestinal tumors in a patient based on tumor-associated mycobiota. Also disclosed herein are methods for determining the prognosis of patients suffering from GI cancers comprising detecting the presence of tumor associated Candida nucleic acids in a biological sample obtained from the GI cancer patients.

BACKGROUND

[0004] The following description of the background of the present technology is provided simply as an aid in understanding the present technology and is not admitted to describe or constitute prior art to the present technology.

[0005] Cancer is among the leading causes of death worldwide. Host-bacterial immune interactions profoundly influence tumorigenesis, cancer progression, and response to therapies. Nevertheless, the role of fungi (mycobiota) in these processes remains largely unexplored, missing a potential avenue for developing novel diagnostic and preventative strategies. Mycobiota and bacteria co-colonize the mammalian gastrointestinal (GI) tract, skin epithelium, respiratory tract, and reproductive organs, forming a complex ecosystem of microbe-microbe and host-microbe interactions with significant implications for human health. Despite comprising around just 0.1% of the microbial DNA present in the gut, fungal infections are responsible for more than 1.5 million global deaths per year and species from this kingdom have a disproportionate influence on the overall microbiome and host immunity.

[0006] A growing body of evidence links the human microbiome to cancer and cancer outcomes, including viruses, bacteria, and fungi (Helmink et al., 2019; Vogtmann and Goedert, 2016). In the lower GI tract, genotoxic Escherichia coli (Arthur et al., 2012; Pleguezuelos-Manzano et al., 2020), Bacteroides fragilis (Sears et al., 2014), Streptococcus bovis/gallolyticus (Abdulamir et al., 2011), and Fusobacterium nucleatum (Castellarin et al., 2012; Kostic et al., 2013) have each been implicated in the pathogenesis of colorectal cancer. Common among these cancer-associated bacteria is their ability to modulate host immunity and provoke chronic inflammation, features which are proposed to contribute to their tumorigenic capacity (Francescone et al., 2014; Jain et al., 2021; Kostic et al., 2013). However, few conclusive associations have so far linked the fungal microbiome and inflammation to cancer.

[0007] Accordingly, there is an urgent need for therapeutic and prognostic methods that target species within the fungal microbiome that contribute to the pathogenesis of GI cancers and predict disease outcomes.

SUMMARY OF THE PRESENT TECHNOLOGY

[0008] In one aspect, the present disclosure provides a method for classifying the mycobiome of gastrointestinal (GI) tumors in a patient comprising (a) receiving a whole genome sequencing (WGS) dataset corresponding to a GI tumor sample obtained from the patient, wherein the WGS dataset comprises GI tumor-associated fungal sequence reads corresponding to a plurality of fungal species; (b) mapping the tumor-associated fungal sequence reads in the WGS dataset to a plurality of fungal reference genome libraries using at least one sequence alignment tool to identify the plurality of fungal species present in the GI tumor sample, wherein the plurality of fungal species comprises at least one Candida species and at least one Saccharomyces species; (c) normalizing the tumor-associated fungal sequence reads corresponding to the plurality of fungal species to obtain expected reads per kilobase of genome per million primary reads (eRPKM) values; (d) calculating relative abundance (%) values for each species in the plurality of fungal species in the WGS dataset by scaling the eRPKM values and computing a metric for Candida-to-Saccharomyces abundance for the GI tumor sample; and (e) classifying the GI tumor sample as Candida-dominant (Ca-type) when the metric for Candida-to-Saccharomyces abundance is above a first predetermined threshold or Saccharomyces-dominant (Sa-type) when the metric for Candida-to-Saccharomyces abundance is below a second predetermined threshold.
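By way of illustration only, steps (c) through (e) above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the present disclosure: the eRPKM value is approximated here by the standard reads-per-kilobase-per-million formula, the metric is taken to be a log10 ratio of summed relative abundances with a small pseudocount, and the function names, the pseudocount, the threshold values, and the assignment of species to the Candida and Saccharomyces groups are all hypothetical choices left to the practitioner.

```python
from math import log10


def erpkm(reads, genome_length_bp, primary_reads):
    """Reads per kilobase of genome per million primary reads.

    Assumption: the plain RPKM formula stands in for the eRPKM value;
    the disclosure does not spell out the exact normalization here.
    """
    return reads / ((genome_length_bp / 1_000) * (primary_reads / 1_000_000))


def relative_abundance(erpkm_by_species):
    """Scale eRPKM values so that they sum to 100 (%)."""
    total = sum(erpkm_by_species.values())
    if total == 0:
        return {species: 0.0 for species in erpkm_by_species}
    return {s: 100.0 * v / total for s, v in erpkm_by_species.items()}


def ca_to_sa_metric(rel_abund, ca_species, sa_species, pseudocount=1e-6):
    """Hypothetical metric: log10 ratio of summed group abundances.

    ca_species and sa_species are caller-supplied sets; the pseudocount
    (an assumption) avoids division by zero when a group is absent.
    """
    ca = sum(rel_abund.get(s, 0.0) for s in ca_species)
    sa = sum(rel_abund.get(s, 0.0) for s in sa_species)
    return log10((ca + pseudocount) / (sa + pseudocount))


def classify(metric, first_threshold=1.0, second_threshold=-1.0):
    """Apply two thresholds; both default values are placeholders."""
    if metric > first_threshold:
        return "Ca-type"
    if metric < second_threshold:
        return "Sa-type"
    return "unclassified"


# Toy example with made-up read counts and genome sizes.
primary = 500_000_000
values = {
    "Candida albicans": erpkm(900, 14_000_000, primary),
    "Saccharomyces cerevisiae": erpkm(12, 12_000_000, primary),
}
abund = relative_abundance(values)
metric = ca_to_sa_metric(
    abund, {"Candida albicans"}, {"Saccharomyces cerevisiae"}
)
label = classify(metric)
```

Using two separate thresholds, as recited in step (e), leaves room for samples whose metric falls between them; the sketch labels those "unclassified" rather than forcing a call.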

[0009] In some embodiments, the method further comprises applying a prevalence-based decontamination model to remove fungal sequence reads corresponding to fungal species or genera that are not associated with GI tumors.

[0010] In certain embodiments, the method further comprises applying quality control models to eliminate false-positive fungal sequence reads.

[0011] In any of the preceding embodiments of the methods disclosed herein, the Sa-type tumor sample comprises one or more of S. cerevisiae, S. eubayanus, Cyberlindnera jadinii, Pichia membranifaciens, C. parapsilosis and C. glabrata.

[0012] Additionally or alternatively, in some embodiments the Ca-type tumor sample comprises one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii.

[0013] In some embodiments, the method further comprises determining that the patient has a poor prognosis when the GI tumor sample is Ca-type.

[0014] Additionally or alternatively, in some embodiments, the method further comprises determining that the patient is at risk for metastatic disease when the GI tumor sample is Ca-type.

[0015] In any of the preceding embodiments of the methods disclosed herein, the GI tumor sample is a head-neck tumor, an esophageal tumor, a stomach tumor, a colon tumor, or a rectal tumor.

[0016] In one aspect, the present disclosure provides a method for treating a gastrointestinal (GI) cancer in a patient in need thereof comprising: administering to the patient an effective amount of an antifungal agent that targets at least one Candida species, wherein the patient comprises Candida-dominant (Ca-type) tumors, wherein the Ca-type tumors have an abundance ratio of Candida species relative to Saccharomyces species that is higher than a predetermined threshold, and comprise one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii, and wherein the GI cancer is head-neck cancer, esophageal cancer, stomach cancer, colon cancer, or rectal cancer.

[0017] In some embodiments, the method further comprises administering an effective amount of an anti-cancer therapy, wherein the anti-cancer therapy is selected from among chemotherapy, radiation therapy, immunotherapy, monoclonal antibodies, anti-cancer nucleic acids or proteins, anti-cancer viruses or microorganisms, and any combinations thereof.

[0018] In another aspect, the present disclosure provides a method for prolonging survival of a patient diagnosed with or suffering from gastrointestinal (GI) cancer comprising administering to the patient an effective amount of an antifungal agent that targets at least one Candida species, wherein the patient comprises Candida-dominant (Ca-type) tumors, wherein the Ca-type tumors have an abundance ratio of Candida species relative to Saccharomyces species that is higher than a predetermined threshold, and comprise one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii, and wherein the GI cancer is head-neck cancer, esophageal cancer, stomach cancer, colon cancer, or rectal cancer.

[0019] In some embodiments, the antifungal agent that targets at least one Candida species comprises one or more of echinocandins, caspofungin, micafungin, anidulafungin, fluconazole, amphotericin B, traconazole, voriconazole, Posaconazole, isavuconazole, nystatin, miconazole, clotrimazole, and itraconazole.

[0020] In certain embodiments, the antifungal agent that targets at least one Candida species is administered orally, intravenously, intraperitoneally, subcutaneously, intramuscularly, rectally, or intratumorally.

[0021] Additionally, or alternatively, in some embodiments, the Ca-type tumors are detected by detecting tumor-associated Candida nucleic acids in a biological sample obtained from the patient. In certain embodiments, the tumor-associated Candida nucleic acids are detected via whole genome sequencing, shotgun sequencing, targeted sequencing, RNA sequencing, or any combination thereof. The biological sample may comprise whole blood, plasma, serum, saliva or tumor cells.

[0022] In any and all embodiments of the methods disclosed herein, the GI cancer or tumor is stage 1, stage 2, stage 3, stage 4 or metastatic.

[0023] In any and all embodiments of the methods disclosed herein, the Ca-type tumors exhibit elevated expression levels and/or activity of one or more of IL22, IL24, CARD10, CD44, IL1A, IL1B, IL6, IL8, CXCL1, CXCL2, IL17C, BMP15, PFN3, CCL27, PIP, and SAGE1 relative to a GI tissue sample from a healthy control subject or a predetermined threshold.

[0024] In any and all embodiments of the methods disclosed herein, the Ca-type tumors exhibit reduced expression levels and/or activity of one or more of TP53, CDKN2A, fibronectin (FN1), PTK2B, CDKN2C, NET1, ALAD, FTL, IL17D, CST5, ELN, and TREM2 relative to a GI tissue sample from a healthy control subject or a predetermined threshold.

[0025] In another aspect, the present disclosure provides a method for predicting the prognosis of a patient diagnosed with or suffering from gastrointestinal (GI) cancer comprising detecting the presence of tumor-associated Candida nucleic acids in a biological sample obtained from the patient, and determining that the patient has poor prognosis and/or is at risk for metastatic disease when the abundance/level of tumor-associated Candida nucleic acids in the biological sample is higher relative to a predetermined threshold or a sample obtained from a healthy control subject. In certain embodiments, the tumor-associated Candida nucleic acids are detected via whole genome sequencing, shotgun sequencing, targeted sequencing, RNA sequencing, or any combination thereof. The biological sample may comprise whole blood, plasma, serum, saliva or tumor cells.

[0026] In any and all embodiments of the methods disclosed herein, the patient is human.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] FIGs. 1A-1F: Fungal DNA is present in multiple cancer types not explained by contamination. FIG. 1A shows the geometric mean of reads per million (RPM) of fungal DNA detected in tumor and tumor-associated tissue samples from head & neck (HNSC), lung (LUSC), rectum (READ), colon (COAD), stomach (STAD), breast (BRCA), esophageal (ESCA) and brain (LGG) cancers. FIG. 1B shows mean RPM of bacteria and fungi in the brain, upper gastrointestinal (GI) tract and lower GI tract, where both bacteria and fungi were more abundant in the lower GI tract (COAD, READ) than the upper GI tract (HNSC, ESCA, STAD), and were more abundant in both GI groups compared to brain (LGG). FIGs. 1C-1F show genome alignments in samples from head & neck (HNSC), lung (LUSC), rectum (READ), colon (COAD), stomach (STAD), breast (BRCA), esophageal (ESCA) and brain (LGG) cancers to C. albicans (FIG. 1C), to S. cerevisiae (FIG. 1D), to B. dermatitidis (FIG. 1E), and to M. globosa (FIG. 1F). Statistical significance between groups is calculated using a two-sided Wilcoxon rank-sum statistic. The direction of the inequality symbol indicates which sample group is larger, while the number of symbols indicates the degree of significance (1: p < 0.05, 2: p < 0.01, 3: p < 0.001).

[0028] FIGs. 2A-2C: Primary tumor samples harbor disease-specific mycobiomes. FIG. 2A shows a principal coordinate analysis (PCoA) of normalized species abundances from head-neck (HNSC), esophageal (ESCA), stomach (STAD), colon (COAD), rectal (READ), lung (LUSC), breast (BRCA), and brain (LGG) that reveals clustering by tumor type, after filtering out potential contaminants and false positive signals. FIG. 2B is a hierarchically clustered heatmap showing the difference in relative fungal species abundances (RPM) between tissues from each TCGA cancer type, after filtering out potential contaminants and false positive signals. Species are included if they were classified as tissue-resident in any of GI, lung, or breast samples, even if they were classified as contaminants in others. Heatmap values are z-scored by species abundance to highlight tissue-specific differences. FIG. 2C is a series of boxplots showing distribution of relative abundances (RA) from the 10 most abundant species detected in each TCGA cancer type after removing potential contaminants, provided they were detected in at least 5 samples (left to right first row: HNSC, ESCA, STAD, COAD; left to right second row: READ, LUSC, and BRCA).

[0029] FIGs. 3A-3G show a trans-kingdom analysis that reveals Candida- and Saccharomyces-associated GI cancer co-abundance groups. FIG. 3A is a hierarchically clustered heatmap showing co-abundance among fungal species, using the SparCC correlation statistic to reveal clusters of Candida- and Saccharomyces-associated species across GI tumor samples. Purple boxes indicate clusters forming Candida- and Saccharomyces-associated tumor co-abundance groups. FIGs. 3B-3D are hierarchically clustered heatmaps showing gene expression patterns in head-neck (HNSC; FIG. 3B), stomach (STAD; FIG. 3C), and colon (COAD; FIG. 3D) cancers. Heatmaps are clustered by row, while column clustering is determined by FIG. 3A. Gray columns indicate species not detected in certain cancer types. FIGs. 3E-3G show SparCC correlations between Candida and Saccharomyces and bacterial genera found in matched tumor samples from TCMA (Dohlman et al., 2020) in HNSC (FIG. 3E), STAD (FIG. 3F), and COAD (FIG. 3G).

[0030] FIGs. 4A-4E show that Candida is associated with late-stage and metastatic GI cancers. FIG. 4A shows kernel density estimations (KDE) of Candida-to-Saccharomyces ratios in head-neck (HNSC), stomach (STAD), and colon (COAD) cancers, suggesting that GI tumor samples can be classified into Candida- and Saccharomyces-associated cancers. FIG. 4B shows volcano plots of genes differentially expressed between Candida-negative (blue) and Candida-high (red) tumor samples in head-neck, stomach, and colon cancers. FIG. 4C is a series of boxplots depicting the ratio of Candida-to-Saccharomyces in early-stage (I-III) and late-stage (IV) head-neck, stomach, and colon cancers, revealing enrichment of Candida in late-stage colon cancers. FIG. 4D is a KDE analysis of Candida-to-Saccharomyces ratios in metastatic (orange) and non-metastatic (blue) tumor samples, which finds that Ca-type colon tumors are significantly more likely to be metastatic. FIG. 4E is a series of violin plots showing the distribution of Bray-Curtis distances between the fungal species compositions of patient-matched tumor and blood samples (blue) and unmatched tumor and blood samples (orange). Matched samples were significantly more likely to have similar composition than unmatched samples.
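For readers unfamiliar with the Bray-Curtis distance referred to for FIG. 4E, the following is a minimal Python sketch of the standard definition, namely the sum of absolute abundance differences divided by the sum of all abundances. The function name and the dictionary-based input format are assumptions made for illustration; the disclosure does not specify how the distances were computed.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis distance between two species-abundance profiles.

    Each profile maps a species name to a non-negative abundance.
    Species missing from one profile are treated as abundance 0.
    Returns 0.0 for identical profiles and 1.0 for profiles that
    share no species.
    """
    species = set(sample_a) | set(sample_b)
    numerator = sum(
        abs(sample_a.get(s, 0.0) - sample_b.get(s, 0.0)) for s in species
    )
    denominator = sum(
        sample_a.get(s, 0.0) + sample_b.get(s, 0.0) for s in species
    )
    if denominator == 0:
        raise ValueError("both profiles are empty or all-zero")
    return numerator / denominator


# Toy tumor and blood profiles with made-up abundances.
tumor = {"C. albicans": 3.0, "S. cerevisiae": 1.0}
blood = {"C. albicans": 1.0, "S. cerevisiae": 3.0}
distance = bray_curtis(tumor, blood)
```

A lower distance between a patient's tumor and blood profiles than between profiles from different patients is the pattern described for matched versus unmatched samples.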

(A) FIGs. 5A-5F show that live, transcriptionally active Candida species are associated with GI tumors. FIG. 5A is a spatial distribution of Ascomycota abundance along the colorectal tract. Ascomycota were most abundant in the ascending colon. Significance was calculated between adjacent tumor sites. FIG. 5B is a targeted analysis showing spatial distribution of C. albicans abundance by reads per kilobase of genome, per million (RPKM). FIG. 5C is a comparison of Candida abundance detected in TCGA WGS data (left) and in matched tissues by independent ITS sequencing (right). FIG. 5D is a MALDI-TOF analysis showing that live C. albicans and C. tropicalis are present in the mucosa of adenocarcinomas from ascending colon. Isolated fungal colonies from each individual subject were identified by MALDI-TOF, and viable fungal colony forming units (cfu) per mL of sample were determined. No live M. globosa or S. cerevisiae were isolated from these tissues. FIG. 5E shows the abundance of RNA transcripts aligning to Candida in brain (gray) and sites across the lower GI tract (blue) from solid tissues in the HCMI cohort. FIG. 5F is a series of graphs showing the correlation between fungal species abundances (log10-eRPKM) determined by PathSeq analysis of TCGA WGS and RNA-seq data in GI samples (top) and brain samples (bottom), indicating that Candida spp. are transcriptionally active in GI tissues, while other species are not.

[0031] FIGs. 6A-6D show that Candida species are present in GI cancers and high abundance is associated with early-stage stomach cancer. FIG. 6A is two targeted analyses (see Methods) measuring the abundance (RPKM) of C. albicans and C. tropicalis across TCGA cancer types, with gastrointestinal tumors harboring higher rates of C. albicans and C. tropicalis sequences than brain and breast tissues. FIG. 6B is a targeted analysis showing S. cerevisiae is more abundant in GI tumor samples than brain and breast tissues. FIG. 6C is a series of graphs showing that the abundance of C. albicans and C. tropicalis is elevated in stage 1 stomach cancer tumors and stage 4 colon cancer tumors. Statistical significance was calculated between stage 1 tumors and each subsequent stage. FIG. 6D is a series of graphs showing that the abundance of S. cerevisiae is elevated in stage 1 stomach cancer tumors and stage 4 colon cancer tumors. Statistical significance was calculated between stage 1 tumors and each subsequent stage. Statistical significance between groups is calculated using a two-sided Wilcoxon rank-sum statistic. The direction of the inequality symbol indicates which sample group is larger, while the number of symbols indicates the degree of significance (1: p < 0.05, 2: p < 0.01, 3: p < 0.001).

[0032] FIGs. 7A-7G show that cancer-associated fungal mycobiota and clinical outcomes highlight the predictive value of Candida. FIG. 7A is three heat-trees depicting differential abundance of genera between tumor (blue) and matched adjacent normal tissue (yellow) in head-neck (HNSC), stomach (STAD), and colon (COAD) cancers. Across cancer types, Candida is enriched in tumors compared to uninvolved tissue. FIG. 7B is a volcano plot showing differential abundance of genera between tumor (blue) and matched adjacent normal tissue (yellow) in stomach cancer. Candida is enriched in stomach tumors compared to matched adjacent normal tissue. FIG. 7C is three bar graphs showing genera identified as predictive features for distinguishing head-neck, stomach, and colon tumors from other tumor types. Feature importances were calculated from random forest (RF) classifiers using the Gini coefficient. Site-specific contaminants (#) were set to 0 prior to running the analysis and therefore some are predictive due to their absence in samples. FIG. 7D is three targeted analyses of Candida spp. showing that C. albicans and C. tropicalis increase in abundance from the proximal to distal stomach, while S. cerevisiae abundance is relatively stable across these sites. FIG. 7E is nine survival analyses comparing outcomes for stomach cancer patients with high rates of tumor-associated C. albicans, C. tropicalis, and S. cerevisiae, compared to patients whose blood was negative for these species. These findings suggest that C. tropicalis and other Candida spp. may be prognostic of survival in stomach cancer. FIG. 7F is a survival comparison showing that across GI cancer types, patients with high levels of tumor-associated Candida experience decreased survival compared to Candida-negative patients. FIG. 7G is three gene-set enrichment analyses (GSEA) which reveal that genes related to cytosolic DNA sensing, Toll-like receptor, and Nod-like receptor signaling are up-regulated in stomach cancers with higher rates of Candida. Statistical significance between groups is calculated using a two-sided Wilcoxon rank-sum statistic. The direction of the inequality symbol indicates which sample group is larger, while the number of symbols indicates the degree of significance (1: p < 0.05, 2: p < 0.01, 3: p < 0.001).

[0033] FIGs. 8A-8E show that fungal DNA is present in multiple cancer types not explained by contamination. FIG. 8A is a boxplot showing the distribution of reads per million (RPM) of fungal DNA detected in tumor and tumor-associated tissue samples from head & neck (HNSC), lung (LUSC), rectum (READ), colon (COAD), stomach (STAD), breast (BRCA), esophageal (ESCA) and brain (LGG) cancers, ordered by upper quartiles. FIG. 8B is a graph of a multi-well aliquot plate A19H which shows significant signs of contamination; samples shown in orange were discarded. FIG. 8C shows that the distribution of sequencing reads aligning to the M. restricta genome displays similar depth across sequencing projects including brain, but reads are distributed randomly, a signature of biological contamination. FIG. 8D shows that the distribution of sequencing reads aligning to the A. bisporus genome displays uneven depth, but reads are horizontally distributed in a predictable manner, a signature of false-positive assignments. FIG. 8E is a re-analysis of human-subtracted sequencing data using TaxaTarget (Commichaux et al., 2021) which validates the presence of Candida spp., S. cerevisiae, and Malassezia spp. across TCGA samples. R- and p-values indicate the result of a Spearman rank correlation test.

[0034] FIG. 9 is a series of graphs showing that primary tumor samples harbor disease-specific mycobiomes, including the prevalence of the top 10 fungal species detected in tissue from each cancer type, after removing potential contaminants.

[0035] FIGs. 10A-10F are a trans-kingdom analysis that shows Candida- and Saccharomyces-associated GI cancer co-abundance groups. FIGs. 10A-10C are three SparCC correlations between Candida and Saccharomyces and the most abundant bacterial species found in matched tumor samples from TCMA (Dohlman et al., 2020) in head-neck (HNSC; FIG. 10A), stomach (STAD; FIG. 10B), and colon (COAD; FIG. 10C) cancers. FIGs. 10D-10F are three scatterplots with fitted regression lines showing correlations between Candida, Saccharomyces, and Lactobacillus in head-neck (HNSC; FIG. 10D), stomach (STAD; FIG. 10E), and colon (COAD; FIG. 10F) cancers. R- and p-values indicate the result of a Spearman rank correlation test.

[0036] FIGs. 11A-11D show that Candida is associated with late-stage and metastatic GI cancers. FIG. 11A is three kernel density estimates (KDE) of Candida-to-Saccharomyces ratios in esophageal (ESCA), rectum (READ), lung (LUSC), and brain tumors. FIG. 11B is three volcano plots showing genes differentially expressed between Saccharomyces-negative (blue) and Saccharomyces-high (red) tumor samples in head-neck, stomach, and colon cancers. FIG. 11C is three boxplots depicting the ratio of Candida-to-Saccharomyces in early-stage (I-III) and late-stage (IV) ESCA, READ, and LUSC. FIG. 11D is a series of KDE analyses showing the distribution of Candida-to-Saccharomyces ratios in metastatic (orange) and non-metastatic (blue) tumor samples across cancer types HNSC, ESCA, STAD, READ, and LUSC.

[0037] FIG. 12 is a spatial distribution of Ascomycota and Basidiomycota abundance along the colorectal tract, showing that live and transcriptionally active Candida species are associated with GI tumors.

[0038] FIGs. 13A-13B show that cancer-associated fungal mycobiota and clinical outcomes highlight the predictive value of Candida. FIG. 13A is a volcano plot showing differential abundance of genera between tumor (blue) and matched adjacent normal tissue (yellow) in lung cancer. Blastomyces was enriched in lung tumors compared to matched adjacent normal tissue. FIG. 13B is a correlation analysis across GI cancer types comparing survival between patients with high rates of Saccharomyces in their tumors and patients whose tumors are negative for this taxon.

[0039] FIG. 14A shows the overall experimental design in a mouse model of colorectal cancer (AOM-DSS model). FIG. 14B shows that the C. albicans 3SO2 isolate from colorectal adenocarcinoma mucosa, but not the C. lusitaniae isolate, promotes augmented tumor progression in the AOM-DSS model.

DETAILED DESCRIPTION

[0040] It is to be appreciated that certain aspects, modes, embodiments, variations and features of the present methods are described below in various levels of detail in order to provide a substantial understanding of the present technology. The present disclosure is not to be limited in terms of the particular embodiments described in this application, which are intended as single illustrations of individual aspects of the disclosure. All the various embodiments of the present disclosure will not be described herein. Many modifications and variations of the disclosure can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. Functionally equivalent methods and apparatuses within the scope of the disclosure, in addition to those enumerated herein, will be apparent to those skilled in the art from the foregoing descriptions. Such modifications and variations are intended to fall within the scope of the appended claims. The present disclosure is to be limited only by the terms of the appended claims, along with the full scope of equivalents to which such claims are entitled.

[0041] It is to be understood that the present disclosure is not limited to particular uses, methods, reagents, compounds, compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

[0042] In practicing the present methods, many conventional techniques in molecular biology, protein biochemistry, cell biology, immunology, microbiology and recombinant DNA are used. See, e.g., Sambrook and Russell eds. (2001) Molecular Cloning: A Laboratory Manual, 3rd edition; the series Ausubel et al. eds. (2007) Current Protocols in Molecular Biology; the series Methods in Enzymology (Academic Press, Inc., N.Y.); MacPherson et al. (1991) PCR 1: A Practical Approach (IRL Press at Oxford University Press); MacPherson et al. (1995) PCR 2: A Practical Approach; Harlow and Lane eds. (1999) Antibodies, A Laboratory Manual; Freshney (2005) Culture of Animal Cells: A Manual of Basic Technique, 5th edition; Gait ed. (1984) Oligonucleotide Synthesis; U.S. Patent No. 4,683,195; Hames and Higgins eds. (1984) Nucleic Acid Hybridization; Anderson (1999) Nucleic Acid Hybridization; Hames and Higgins eds. (1984) Transcription and Translation; Immobilized Cells and Enzymes (IRL Press (1986)); Perbal (1984) A Practical Guide to Molecular Cloning; Miller and Calos eds. (1987) Gene Transfer Vectors for Mammalian Cells (Cold Spring Harbor Laboratory); Makrides ed. (2003) Gene Transfer and Expression in Mammalian Cells; Mayer and Walker eds. (1987) Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); and Herzenberg et al. eds (1996) Weir's Handbook of Experimental Immunology. Methods to detect and measure levels of polypeptide gene expression products (i.e., gene translation level) are well-known in the art and include the use of polypeptide detection methods such as antibody detection and quantification techniques. (See also, Strachan & Read, Human Molecular Genetics, Second Edition. (John Wiley and Sons, Inc., NY, 1999)).

[0043] As demonstrated herein, GI tumors can be classified as either Candida-dominant (Ca-type) or Saccharomyces-dominant (Sa-type) according to their Candida-to-Saccharomyces abundance ratio, and these tumor types had different implications with respect to the pathogenesis and prognosis of GI cancers, and decision-making regarding treatment selection. As described in the Examples herein, GI cancer patients with high levels of Candida at the tumor site had significantly decreased survival rates compared to patients who were Candida-negative (FIG. 7F), whereas Saccharomyces presence at the tumor site was not associated with survival (FIG. 13B). These observations were surprising because strong antibody responses against both Saccharomyces cerevisiae and Candida sp. have been previously correlated with poor prognosis in other cancer patients. See US20110003335 (demonstrating that renal cell carcinoma (RCC) patients with elevated serum levels of IgG antibodies against S. cerevisiae have an unfavorable clinical course, and that such levels are associated with poor survival in patients with metastatic RCC).

Definitions

[0044] Unless defined otherwise, all technical and scientific terms used herein generally have the same meaning as commonly understood by one of ordinary skill in the art to which this technology belongs. As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. For example, reference to “a cell” includes a combination of two or more cells, and the like. Generally, the nomenclature used herein and the laboratory procedures in cell culture, molecular genetics, organic chemistry, analytical chemistry and nucleic acid chemistry and hybridization described below are those well-known and commonly employed in the art.

[0045] As used herein, the term “about” in reference to a number is generally taken to include numbers that fall within a range of 1%, 5%, or 10% in either direction (greater than or less than) of the number unless otherwise stated or otherwise evident from the context (except where such number would be less than 0% or exceed 100% of a possible value).

[0046] The term “adapter” refers to a short, chemically synthesized, nucleic acid sequence which can be used to ligate to the end of a nucleic acid sequence in order to facilitate attachment to another molecule. The adapter can be single-stranded or double-stranded. An adapter can incorporate a short (typically less than 50 base pairs) sequence useful for PCR amplification or sequencing.

[0047] As used herein, the “administration” of an agent or drug to a subject includes any route of introducing or delivering to a subject a compound to perform its intended function. Administration can be carried out by any suitable route, including but not limited to, orally, intranasally, parenterally (intravenously, intramuscularly, intraperitoneally, or subcutaneously), rectally, intrathecally, or topically. Administration includes self-administration and the administration by another.

[0044] As used herein, the terms “amplify” or “amplification” with respect to nucleic acid sequences, refer to methods that increase the representation of a population of nucleic acid sequences in a sample. Nucleic acid amplification methods are well known to the skilled artisan and include ligase chain reaction (LCR), ligase detection reaction (LDR), ligation followed by Q-replicase amplification, PCR, primer extension, strand displacement amplification (SDA), hyperbranched strand displacement amplification, multiple displacement amplification (MDA), nucleic acid sequence-based amplification (NASBA), two-step multiplexed amplifications, rolling circle amplification (RCA), recombinase-polymerase amplification (RPA) (TwistDx, Cambridge, UK), transcription mediated amplification, signal mediated amplification of RNA technology, loop-mediated isothermal amplification of DNA, helicase-dependent amplification, single primer isothermal amplification, and self-sustained sequence replication (3SR), including multiplex versions or combinations thereof. Copies of a particular nucleic acid sequence generated in vitro in an amplification reaction are called “amplicons” or “amplification products.”

[0045] The terms “complementary” or “complementarity” as used herein with reference to polynucleotides (i.e., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) refer to the base-pairing rules. The complement of a nucleic acid sequence as used herein refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in “antiparallel association.” For example, the sequence “5'-A-G-T-3'” is complementary to the sequence “3'-T-C-A-5'.” Certain bases not commonly found in naturally-occurring nucleic acids may be included in the nucleic acids described herein. These include, for example, inosine, 7-deazaguanine, Locked Nucleic Acids (LNA), and Peptide Nucleic Acids (PNA). Complementarity need not be perfect; stable duplexes may contain mismatched base pairs, degenerative, or unmatched bases. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs. A complement sequence can also be an RNA sequence complementary to the DNA sequence or its complement sequence, and can also be a cDNA.
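By way of illustration only, the base-pairing rules and the antiparallel example above can be sketched in Python. The names `DNA_PAIRS`, `reverse_complement`, and `is_complementary` are hypothetical helpers introduced for this sketch; only the four standard DNA bases are handled, and perfect complementarity is assumed even though, as noted above, complementarity need not be perfect.

```python
# Watson-Crick pairing for the four standard DNA bases.
# Non-standard bases (e.g., inosine) are deliberately not handled in this sketch.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complement of `seq`, written in the 5'->3' direction."""
    return "".join(DNA_PAIRS[base] for base in reversed(seq.upper()))


def is_complementary(a: str, b: str) -> bool:
    """True when `a` and `b` (both written 5'->3') pair perfectly
    when aligned in antiparallel association."""
    return len(a) == len(b) and reverse_complement(a) == b.upper()


# The example from the text: 5'-A-G-T-3' pairs with 3'-T-C-A-5',
# which reads 5'-A-C-T-3' when written in the conventional direction.
print(reverse_complement("AGT"))
print(is_complementary("AGT", "ACT"))
```

Writing both strands 5'->3' and reversing one of them is simply a convenient way to express the antiparallel alignment in code.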

[0046] As used herein, a "control" is an alternative sample used in an experiment for comparison purpose. A control can be "positive" or "negative." For example, where the purpose of the experiment is to determine a correlation of the efficacy of a therapeutic agent for the treatment for a particular type of disease, a positive control (a compound or composition known to exhibit the desired therapeutic effect) and a negative control (a subject or a sample that does not receive the therapy or receives a placebo) are typically employed.

[0047] “Detecting” as used herein refers to determining the presence of a polynucleotide or polypeptide of interest in a sample. Detection does not require the method to provide 100% sensitivity. Analysis of nucleic acid markers can be performed using techniques known in the art including, but not limited to, sequence analysis, and electrophoretic analysis. Non-limiting examples of sequence analysis include Maxam-Gilbert sequencing, Sanger sequencing, capillary array DNA sequencing, thermal cycle sequencing (Sears et al., Biotechniques, 13:626-633 (1992)), solid-phase sequencing (Zimmerman et al., Methods Mol. Cell Biol, 3:39-42 (1992)), sequencing with mass spectrometry such as matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/MS; Fu et al., Nat. Biotechnol, 16:381-384 (1998)), and sequencing by hybridization. Chee et al., Science, 274:610-614 (1996); Drmanac et al., Science, 260:1649-1652 (1993); Drmanac et al., Nat. Biotechnol, 16:54-58 (1998). Non-limiting examples of electrophoretic analysis include slab gel electrophoresis such as agarose or polyacrylamide gel electrophoresis, capillary electrophoresis, and denaturing gradient gel electrophoresis. Additionally, next generation sequencing methods can be performed using commercially available kits and instruments such as the Life Technologies/Ion Torrent PGM or Proton, the Illumina HiSeq or MiSeq, and the Roche/454 next generation sequencing system.

[0048] As used herein, the term “effective amount” refers to a quantity sufficient to achieve a desired therapeutic and/or prophylactic effect, e.g., an amount which results in the prevention of, or a decrease in a disease or condition described herein or one or more signs or symptoms associated with a disease or condition described herein. In the context of therapeutic or prophylactic applications, the amount of a composition administered to the subject will vary depending on the composition, the degree, type, and severity of the disease and on the characteristics of the individual, such as general health, age, sex, body weight and tolerance to drugs. The skilled artisan will be able to determine appropriate dosages depending on these and other factors. The compositions can also be administered in combination with one or more additional therapeutic compounds. In the methods described herein, the therapeutic compositions may be administered to a subject having one or more signs or symptoms of a disease or condition described herein. As used herein, a "therapeutically effective amount" of a composition refers to composition levels in which the physiological effects of a disease or condition are ameliorated or eliminated. A therapeutically effective amount can be given in one or more administrations.

[0049] “Homology” or “identity” or “similarity” refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. A polynucleotide or polynucleotide region (or a polypeptide or polypeptide region) has a certain percentage (for example, at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%) of “sequence identity” to another sequence means that, when aligned, that percentage of bases (or amino acids) are the same in comparing the two sequences. This alignment and the percent homology or sequence identity can be determined using software programs known in the art. In some embodiments, default parameters are used for alignment. One alignment program is BLAST, using default parameters. In particular, programs are BLASTN and BLASTP, using the following default parameters: Genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by =HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+SwissProtein+SPupdate+PIR. Details of these programs can be found at the National Center for Biotechnology Information. Biologically equivalent polynucleotides are those having the specified percent homology and encoding a polypeptide having the same or similar biological activity. Two sequences are deemed “unrelated” or “non-homologous” if they share less than 40% identity, or less than 25% identity, with each other.
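As a non-limiting illustration of the percent sequence identity described above, the following Python sketch counts matching positions in two sequences that have already been aligned. The function name `percent_identity`, the use of "-" as the gap character, and the choice to divide by the number of alignment columns are assumptions made for this sketch; alignment programs such as BLAST perform the alignment itself and may report identity under other conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same
    base or amino acid. Inputs must be pre-aligned and of equal length;
    '-' marks a gap (an assumption of this sketch)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # a column of two gaps carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Nine of ten aligned positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Under this convention a gap opposite a residue counts as a non-matching column; a different denominator (for example, the length of the shorter sequence) would give a different percentage for the same alignment.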

[0050] The term “hybridize” as used herein refers to a process where two substantially complementary nucleic acid strands (at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, at least about 75%, or at least about 90% complementary) anneal to each other under appropriately stringent conditions to form a duplex or heteroduplex through formation of hydrogen bonds between complementary base pairs. Hybridizations are typically and preferably conducted with probe-length nucleic acid molecules, preferably 15-100 nucleotides in length, more preferably 18-50 nucleotides in length. Nucleic acid hybridization techniques are well known in the art. See, e.g., Sambrook, et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Plainview, N.Y. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) is influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, and the thermal melting point (Tm) of the formed hybrid. Those skilled in the art understand how to estimate and adjust the stringency of hybridization conditions such that sequences having at least a desired level of complementarity will stably hybridize, while those having lower complementarity will not. For examples of hybridization conditions and parameters, see, e.g., Sambrook, et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Plainview, N.Y.; Ausubel, F. M. et al. 1994, Current Protocols in Molecular Biology, John Wiley & Sons, Secaucus, N.J. In some embodiments, specific hybridization occurs under stringent hybridization conditions. An oligonucleotide or polynucleotide (e.g., a probe or a primer) that is specific for a target nucleic acid will “hybridize” to the target nucleic acid under suitable conditions.

[0051] As used herein, the terms “individual”, “patient”, or “subject” are used interchangeably and refer to an individual organism, a vertebrate, a mammal, or a human. In a preferred embodiment, the individual, patient or subject is a human.

[0052] As used herein, “microbiome” refers to the collective genetic content of the communities of microbes that live in and on the human body, both sustainably and transiently, including eukaryotes, fungi, archaea, bacteria, and viruses (including bacterial viruses (i.e., phage)), wherein “genetic content” includes genomic DNA, RNA such as micro RNA and ribosomal RNA, the epigenome, plasmids, and all other types of genetic information. As used herein, the term “gut microbiome” refers to the collective genetic content of the communities of microbes present in the gastrointestinal tract (GIT).

[0053] As used herein, “microbiota” refers to the collective microbes that live in and on the human body, both sustainably and transiently, including eukaryotes, fungi, archaea, bacteria, and viruses (including bacterial viruses (i.e., phage)). “Gut microbiota” as used herein refers to the totality of the microbes present in the GIT, including eukaryotes, fungi, archaea, bacteria, and viruses (including bacterial viruses (i.e., phage)).

[0054] “Next-generation sequencing or NGS” as used herein, refers to any sequencing method that determines the nucleotide sequence of either individual nucleic acid molecules (e.g., in single molecule sequencing) or clonally expanded proxies for individual nucleic acid molecules in a high throughput parallel fashion (e.g., greater than 10^3, 10^4, 10^5 or more molecules are sequenced simultaneously). In one embodiment, the relative abundance of the nucleic acid species in the library can be estimated by counting the relative number of occurrences of their cognate sequences in the data generated by the sequencing experiment. Next generation sequencing methods are known in the art, and are described, e.g., in Metzker, M. Nature Biotechnology Reviews 11:31-46 (2010).

[0055] As used herein, “oligonucleotide” refers to a molecule that has a sequence of nucleic acid bases on a backbone comprised mainly of identical monomer units at defined intervals. The bases are arranged on the backbone in such a way that they can bind with a nucleic acid having a sequence of bases that are complementary to the bases of the oligonucleotide. The most common oligonucleotides have a backbone of sugar phosphate units. A distinction may be made between oligodeoxyribonucleotides that do not have a hydroxyl group at the 2' position and oligoribonucleotides that have a hydroxyl group at the 2' position. Oligonucleotides may also include derivatives, in which the hydrogen of the hydroxyl group is replaced with organic groups, e.g., an allyl group. Oligonucleotides of the method which function as primers or probes are generally at least about 10-15 nucleotides long and more preferably at least about 15 to 25 nucleotides long, although shorter or longer oligonucleotides may be used in the method. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any manner, including, for example, chemical synthesis, DNA replication, restriction endonuclease digestion of plasmids or phage DNA, reverse transcription, PCR, or a combination thereof. The oligonucleotide may be modified e.g., by addition of a methyl group, a biotin or digoxigenin moiety, a fluorescent tag or by using radioactive nucleotides.

[0056] As used herein, the term “polynucleotide” or “nucleic acid” means any RNA or DNA, which may be unmodified or modified RNA or DNA. Polynucleotides include, without limitation, single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, RNA that is a mixture of single- and double-stranded regions, and hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. In addition, polynucleotide refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. Nucleic acid molecules can be naturally occurring, recombinant, or synthetic. The term polynucleotide also includes DNAs or RNAs containing one or more modified bases and DNAs or RNAs with backbones modified for stability or for other reasons. Nucleic acid modifications include, for example, methylation, substitution of one or more of the naturally occurring nucleotides with a nucleotide analog, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, carbamates, and the like), charged linkages (e.g., phosphorothioates, phosphorodithioates, and the like), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, and the like), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, and the like).

[0057] The terms “nucleotide” and “nucleotide monomer” refer to naturally occurring ribonucleotide or deoxyribonucleotide monomers, as well as non-naturally occurring derivatives and analogs thereof. Accordingly, nucleotides can include, for example, nucleotides comprising naturally occurring bases (e.g., adenosine, thymidine, guanosine, cytidine, uridine, inosine, deoxyadenosine, deoxythymidine, deoxyguanosine, or deoxycytidine) and nucleotides comprising modified bases (e.g., 2-aminoadenosine, 2-thiothymidine, pyrrolo-pyrimidine, 3-methyl adenosine, C5-propynylcytidine, C5-propynyluridine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, 2-thiocytidine).

[0058] As used herein, a “sample” refers to a substance that is being assayed for the presence of a mutation in a nucleic acid of interest. Processing methods to release or otherwise make available a nucleic acid for detection are well known in the art and may include steps of nucleic acid manipulation. A biological sample may be a body fluid or a tissue sample. In some cases, a biological sample may consist of or comprise blood, plasma, sera, urine, feces, epidermal sample, vaginal sample, skin sample, cheek swab, sperm, amniotic fluid, cultured cells, bone marrow sample, aspirate and/or chorionic villi, and the like.

[0059] As used herein, the term “separate” therapeutic use refers to an administration of at least two active ingredients at the same time or at substantially the same time by different routes.

[0060] As used herein, the term “sequential” therapeutic use refers to administration of at least two active ingredients at different times, the administration route being identical or different. More particularly, sequential use refers to the whole administration of one of the active ingredients before administration of the other or others commences. It is thus possible to administer one of the active ingredients over several minutes, hours, or days before administering the other active ingredient or ingredients. There is no simultaneous treatment in this case.

[0061] As used herein, the term “simultaneous” therapeutic use refers to the administration of at least two active ingredients by the same route and at the same time or at substantially the same time.

[0062] The term “stringent hybridization conditions” as used herein refers to hybridization conditions at least as stringent as the following: hybridization in 50% formamide, 5x SSC, 50 mM NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5x Denhardt's solution at 42° C. overnight; washing with 2x SSC, 0.1% SDS at 45° C; and washing with 0.2x SSC, 0.1% SDS at 45° C. In another example, stringent hybridization conditions should not allow for hybridization of two nucleic acids which differ over a stretch of 20 contiguous nucleotides by more than two bases.
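The second example above (no more than two differing bases over any stretch of 20 contiguous nucleotides) can be expressed as a simple sliding-window count. The following Python sketch is illustrative only: the helper names are hypothetical, the two sequences are assumed to be already aligned base-for-base without gaps, and the check says nothing about actual duplex thermodynamics.

```python
def max_window_mismatches(a: str, b: str, window: int = 20) -> int:
    """Largest number of differing positions found in any stretch of
    `window` contiguous positions of two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    mismatch = [int(x != y) for x, y in zip(a.upper(), b.upper())]
    if len(mismatch) <= window:
        return sum(mismatch)
    current = sum(mismatch[:window])
    worst = current
    for i in range(window, len(mismatch)):
        # Slide the window one position to the right.
        current += mismatch[i] - mismatch[i - window]
        worst = max(worst, current)
    return worst


def within_mismatch_limit(a: str, b: str, window: int = 20, limit: int = 2) -> bool:
    """True when no `window`-nucleotide stretch differs by more than `limit` bases."""
    return max_window_mismatches(a, b, window) <= limit


probe = "A" * 20
print(within_mismatch_limit(probe, "A" * 18 + "CC"))   # two differences
print(within_mismatch_limit(probe, "A" * 17 + "CCC"))  # three differences
```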

[0063] As used herein, the term “therapeutic agent” is intended to mean a compound that, when present in an effective amount, produces a desired therapeutic effect on a subject in need thereof.

[0064] “Treating” or “treatment” as used herein covers the treatment of a disease or disorder described herein, in a subject, such as a human, and includes: (i) inhibiting a disease or disorder, i.e., arresting its development; (ii) relieving a disease or disorder, i.e., causing regression of the disorder; (iii) slowing progression of the disorder; and/or (iv) inhibiting, relieving, or slowing progression of one or more symptoms of the disease or disorder. In some embodiments, treatment means that the symptoms associated with the disease are, e.g., alleviated, reduced, cured, or placed in a state of remission.

[0065] It is also to be appreciated that the various modes of treatment of disorders as described herein are intended to mean “substantial,” which includes total but also less than total treatment, and wherein some biologically or medically relevant result is achieved. The treatment may be a continuous prolonged treatment for a chronic disease or a single, or few time administrations for the treatment of an acute condition.

NGS Platforms

[0066] In some embodiments, high throughput, massively parallel sequencing employs sequencing-by-synthesis with reversible dye terminators. In other embodiments, sequencing is performed via sequencing-by-ligation. In yet other embodiments, sequencing is single molecule sequencing. Examples of Next Generation Sequencing techniques include, but are not limited to, pyrosequencing, reversible dye-terminator sequencing, SOLiD sequencing, ion semiconductor sequencing, HeliScope single molecule sequencing, etc.

[0067] The Ion Torrent™ (Life Technologies, Carlsbad, CA) amplicon sequencing system employs a flow-based approach that detects pH changes caused by the release of hydrogen ions during incorporation of unmodified nucleotides in DNA replication. For use with this system, a sequencing library is initially produced by generating DNA fragments flanked by sequencing adapters. In some embodiments, these fragments can be clonally amplified on particles by emulsion PCR. The particles with the amplified template are then placed in a silicon semiconductor sequencing chip. During replication, the chip is flooded with one nucleotide after another, and if a nucleotide complements the DNA molecule in a particular microwell of the chip, then it will be incorporated. A proton is naturally released when a nucleotide is incorporated by the polymerase in the DNA molecule, resulting in a detectable local change of pH. The pH of the solution then changes in that well and is detected by the ion sensor. If homopolymer repeats are present in the template sequence, multiple nucleotides will be incorporated in a single cycle. This leads to a corresponding number of released hydrogens and a proportionally higher electronic signal.
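The flow-based readout described above, including the proportionally higher signal produced by homopolymer repeats, can be illustrated with a toy simulation in Python. This is a sketch under stated assumptions: `simulate_flows` is a hypothetical helper, the repeating flow order "TACG" is chosen only for illustration, the input is the strand being synthesized (the read) rather than the template, and the signal is modeled as an exact integer count with no noise.

```python
def simulate_flows(read: str, flow_order: str = "TACG",
                   max_flows: int = 64) -> list[tuple[str, int]]:
    """Return (nucleotide, signal) for each flow until the read is complete.

    Each flow offers one nucleotide species; every consecutive matching base
    at the growing end is incorporated in that single flow, so a homopolymer
    of length n yields a signal of n. A flow with no incorporation yields 0.
    """
    read = read.upper()
    signals = []
    pos = 0
    flow = 0
    while pos < len(read) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        while pos < len(read) and read[pos] == nucleotide:
            incorporated += 1
            pos += 1
        signals.append((nucleotide, incorporated))
        flow += 1
    return signals


# "TT" and "GG" are homopolymers and give a doubled signal in one flow each.
print(simulate_flows("TTAGG"))
```

In practice the measured signal is an analog value, which is why long homopolymers are harder to call accurately than this idealized count suggests.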

[0068] The 454™ GS FLX™ sequencing system (Roche, Germany) employs a light-based detection methodology in a large-scale parallel pyrosequencing system. Pyrosequencing uses DNA polymerization, adding one nucleotide species at a time and detecting and quantifying the number of nucleotides added to a given location through the light emitted by the release of attached pyrophosphates. For use with the 454™ system, adapter-ligated DNA fragments are fixed to small DNA-capture beads in a water-in-oil emulsion and amplified by PCR (emulsion PCR). Each DNA-bound bead is placed into a well on a picotiter plate and sequencing reagents are delivered across the wells of the plate. The four DNA nucleotides are added sequentially in a fixed order across the picotiter plate device during a sequencing run. During the nucleotide flow, millions of copies of DNA bound to each of the beads are sequenced in parallel. When a nucleotide complementary to the template strand is added to a well, the nucleotide is incorporated onto the existing DNA strand, generating a light signal that is recorded by a CCD camera in the instrument.

[0069] Sequencing technology based on reversible dye-terminators: DNA molecules are first attached to primers on a slide and amplified so that local clonal colonies are formed. Four types of reversible terminator bases (RT-bases) are added, and non-incorporated nucleotides are washed away. Unlike pyrosequencing, the DNA can only be extended one nucleotide at a time. A camera takes images of the fluorescently labeled nucleotides, then the dye along with the terminal 3' blocker is chemically removed from the DNA, allowing the next cycle.

[0070] Helicos's single-molecule sequencing uses DNA fragments with added polyA tail adapters, which are attached to the flow cell surface. At each cycle, DNA polymerase and a single species of fluorescently labeled nucleotide are added, resulting in template-dependent extension of the surface-immobilized primer-template duplexes. The reads are performed by the HeliScope sequencer. After acquisition of images tiling the full array, chemical cleavage and release of the fluorescent label permits the subsequent cycle of extension and imaging.

[0071] Sequencing by synthesis (SBS), like the "old style" dye-termination electrophoretic sequencing, relies on incorporation of nucleotides by a DNA polymerase to determine the base sequence. A DNA library with affixed adapters is denatured into single strands and grafted to a flow cell, followed by bridge amplification to form a high-density array of spots onto a glass chip. Reversible terminator methods use reversible versions of dye-terminators, adding one nucleotide at a time, detecting fluorescence at each position by repeated removal of the blocking group to allow polymerization of another nucleotide. The signal of nucleotide incorporation can vary with fluorescently labeled nucleotides, phosphate-driven light reactions and hydrogen ion sensing having all been used. Examples of SBS platforms include Illumina GA and HiSeq 2000. The MiSeq® personal sequencing system (Illumina, Inc.) also employs sequencing by synthesis with reversible terminator chemistry.

[0072] In contrast to the sequencing by synthesis method, the sequencing by ligation method uses a DNA ligase to determine the target sequence. This sequencing method relies on enzymatic ligation of oligonucleotides that are adjacent through local complementarity on a template DNA strand. This technology employs a partition of all possible oligonucleotides of a fixed length, labeled according to the sequenced position. Oligonucleotides are annealed and ligated and the preferential ligation by DNA ligase for matching sequences results in a dinucleotide encoded color space signal at that position (through the release of a fluorescently labeled probe that corresponds to a known nucleotide at a known position along the oligo). This method is primarily used by Life Technologies’ SOLiD™ sequencers. Before sequencing, the DNA is amplified by emulsion PCR. The resulting beads, each containing only copies of the same DNA molecule, are deposited on a solid planar substrate.
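The dinucleotide-encoded color space signal mentioned above can be illustrated with a short Python sketch of a two-base encoding in which each color reports a pair of adjacent bases. The numeric base codes, the 0-3 color numbering, and the helper names are assumptions of this sketch (a commonly described formulation in which identical adjacent bases share one color), not a description of any instrument's software.

```python
# Two-bit codes chosen so that color = code(first) XOR code(second).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = {code: base for base, code in BASE_CODE.items()}


def to_color_space(seq: str) -> list[int]:
    """Encode each pair of adjacent bases as one color (0-3)."""
    seq = seq.upper()
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def from_color_space(first_base: str, colors: list[int]) -> str:
    """Decode colors back to bases, given the known first base."""
    bases = [first_base.upper()]
    code = BASE_CODE[bases[0]]
    for color in colors:
        code ^= color  # XOR is its own inverse, so decoding mirrors encoding
        bases.append(CODE_BASE[code])
    return "".join(bases)


colors = to_color_space("ATGGC")
print(colors)
print(from_color_space("A", colors))
```

Because every base contributes to two neighboring colors, a single miscalled color changes all downstream decoded bases, whereas a true single-base variant changes two adjacent colors; this is the property that makes the encoding useful for error detection.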

[0073] SMRT™ sequencing is based on the sequencing by synthesis approach. The DNA is synthesized in zero-mode wave-guides (ZMWs), small well-like containers with the capturing tools located at the bottom of the well. The sequencing is performed with use of unmodified polymerase (attached to the ZMW bottom) and fluorescently labeled nucleotides flowing freely in the solution. The wells are constructed in a way that only the fluorescence occurring at the bottom of the well is detected. The fluorescent label is detached from the nucleotide at its incorporation into the DNA strand, leaving an unmodified DNA strand.

Methods of the Present Technology

[0074] In one aspect, the present disclosure provides a method for classifying the mycobiome of gastrointestinal (GI) tumors in a patient comprising (a) receiving a whole genome sequencing (WGS) dataset corresponding to a GI tumor sample obtained from the patient, wherein the WGS dataset comprises GI tumor-associated fungal sequence reads corresponding to a plurality of fungal species; (b) mapping the tumor-associated fungal sequence reads in the WGS dataset to a plurality of fungal reference genome libraries using at least one sequence alignment tool to identify the plurality of fungal species present in the GI tumor sample, wherein the plurality of fungal species comprises at least one Candida species and at least one Saccharomyces species; (c) normalizing the tumor-associated fungal sequence reads corresponding to the plurality of fungal species to obtain expected reads per kilobase of genome per million primary reads (eRPKM) values; (d) calculating relative abundance (%) values for each species in the plurality of fungal species in the WGS dataset by scaling the eRPKM values and computing a metric for Candida-to-Saccharomyces abundance for the GI tumor sample; and (e) classifying the GI tumor sample as Candida-dominant (Ca-type) when the metric for Candida-to-Saccharomyces abundance is above a first predetermined threshold or Saccharomyces-dominant (Sa-type) when the metric for Candida-to-Saccharomyces abundance is below a second predetermined threshold. In some embodiments, the metric for Candida-to-Saccharomyces abundance is a log2 Candida-to-Saccharomyces abundance ratio (log2(C/S)), optionally wherein the first predetermined threshold is 1 and/or the second predetermined threshold is -1. Examples of suitable sequence alignment tools include the PathSeq pipeline (Kostic et al., 2011; Walker et al., 2018), Genome Analysis Toolkit (GATK version 4.0.3), Burrows-Wheeler Aligner, AGAMEMNON (https://github.com/ivlachos/agamemnon) and the like.
Additionally or alternatively, in some embodiments, the method further comprises determining that the patient has a poor prognosis when the GI tumor sample is Ca-type.
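
The classification in step (e) can be illustrated with a minimal Python sketch. It assumes the optional thresholds of 1 and -1 on log2(C/S); the function names, the labels for unclassified samples, and the handling of a sample in which only one genus is detected (treated here as an infinite ratio) are illustrative assumptions and are not specified by the disclosure.

```python
import math


def log2_candida_saccharomyces_ratio(candida, saccharomyces):
    """Return log2(C/S) for one sample, or None if neither genus was detected."""
    if candida <= 0 and saccharomyces <= 0:
        return None
    if saccharomyces <= 0:
        return math.inf   # assumption: only Candida detected
    if candida <= 0:
        return -math.inf  # assumption: only Saccharomyces detected
    return math.log2(candida / saccharomyces)


def classify_tumor(candida, saccharomyces, upper=1.0, lower=-1.0):
    """Label a sample from its Candida and Saccharomyces abundances."""
    ratio = log2_candida_saccharomyces_ratio(candida, saccharomyces)
    if ratio is None:
        return "undetected"    # hypothetical label
    if ratio > upper:
        return "Ca-type"
    if ratio < lower:
        return "Sa-type"
    return "unclassified"      # hypothetical label for samples between thresholds


print(classify_tumor(8.0, 2.0))  # Candida is four times as abundant
```

With these thresholds, a sample is assigned a type only when one genus is detected at more than twice the abundance of the other.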

[0075] In some embodiments, the method further comprises applying a prevalence-based decontamination model to remove fungal sequence reads corresponding to fungal species or genera that are not associated with GI tumors. In certain embodiments, the prevalence-based decontamination model is applied to identify and remove (1) fungal species and genera whose presence is associated with specific sequencing batches rather than the tumor type, and (2) samples from multi-well sequencing plates with evidence of contamination.

[0076] In certain embodiments, the method further comprises applying quality control models to eliminate false-positive fungal sequence reads. The quality control models may comprise a Vertical quality control model based on the genome coverage depth and/or a Horizontal quality control model based on the distribution of sequencing reads across the length of each genome. In certain embodiments, the quality control models permit the identification of different categories of false-positive signals such as false-positive alignments or incorrect taxonomic assignments.

[0077] In any of the preceding embodiments of the methods disclosed herein, the Sa-type tumor sample comprises at least one, at least two, at least three, at least four, at least five or more fungal species selected from among S. cerevisiae, S. eubayanus, Cyberlindnera jadinii, Pichia membranifaciens, C. parapsilosis and C. glabrata.

[0078] Additionally or alternatively, in some embodiments the Ca-type tumor sample comprises at least one, at least two, at least three or more fungal species selected from among C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii.

[0079] Additionally or alternatively, in some embodiments, the method further comprises determining that the patient is at risk for metastatic disease when the GI tumor sample is Ca-type.

[0080] In any of the preceding embodiments of the methods disclosed herein, the GI tumor sample is a head-neck tumor, an esophageal tumor, a stomach tumor, a colon tumor, or a rectal tumor.

[0081] In one aspect, the present disclosure provides a method for treating a gastrointestinal (GI) cancer in a patient in need thereof comprising: administering to the patient an effective amount of an antifungal agent that targets at least one Candida species, wherein the patient comprises Candida-dominant (Ca-type) tumors, wherein the Ca-type tumors have an abundance ratio of Candida species relative to Saccharomyces species that is higher than a predetermined threshold, and comprise one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii, and wherein the GI cancer is head-neck cancer, esophageal cancer, stomach cancer, colon cancer, or rectal cancer.

[0082] In some embodiments, the method further comprises administering an effective amount of an anti-cancer therapy, wherein the anti-cancer therapy is selected from among chemotherapy, radiation therapy, immunotherapy, monoclonal antibodies, anti-cancer nucleic acids or proteins, anti-cancer viruses or microorganisms, and any combinations thereof.

[0083] In another aspect, the present disclosure provides a method for prolonging survival of a patient diagnosed with or suffering from gastrointestinal (GI) cancer comprising administering to the patient an effective amount of an antifungal agent that targets at least one Candida species, wherein the patient comprises Candida-dominant (Ca-type) tumors, wherein the Ca-type tumors have an abundance ratio of Candida species relative to Saccharomyces species that is higher than a predetermined threshold, and comprise one or more of C. albicans, C. dubliniensis, C. tropicalis, and C. guilliermondii, and wherein the GI cancer is head-neck cancer, esophageal cancer, stomach cancer, colon cancer, or rectal cancer.

[0084] In some embodiments, the antifungal agent that targets at least one Candida species comprises one or more of echinocandins, caspofungin, micafungin, anidulafungin, fluconazole, amphotericin B, traconazole, voriconazole, Posaconazole, isavuconazole, nystatin, miconazole, clotrimazole, and itraconazole.

[0085] In certain embodiments, the antifungal agent that targets at least one Candida species is administered orally, intravenously, intraperitoneally, subcutaneously, intramuscularly, rectally, or intratumorally.

[0086] Additionally, or alternatively, in some embodiments, the Ca-type tumors are detected by detecting tumor-associated Candida nucleic acids in a biological sample obtained from the patient. In certain embodiments, the tumor-associated Candida nucleic acids are detected via whole genome sequencing, shotgun sequencing, targeted sequencing, RNA sequencing, or any combination thereof. The biological sample may comprise whole blood, plasma, serum, saliva or tumor cells.

[0087] In any and all embodiments of the methods disclosed herein, the GI cancer or tumor is stage 1, stage 2, stage 3, stage 4 or metastatic.

[0088] In any and all embodiments of the methods disclosed herein, the Ca-type tumors exhibit elevated expression levels and/or activity of one or more of IL22, IL24, CARD10, CD44, IL1A, IL1B, IL6, IL8, CXCL1, CXCL2, IL17C, BMP15, PFN3, CCL27, PIP, and SAGE1 relative to a GI tissue sample from a healthy control subject or a predetermined threshold.

[0089] In any and all embodiments of the methods disclosed herein, the Ca-type tumors exhibit reduced expression levels and/or activity of one or more of TP53, CDKN2A, fibronectin (FN1), PTK2B, CDKN2C, NET1, ALAD, FTL, IL17D, CST5, ELN, and TREM2 relative to a GI tissue sample from a healthy control subject or a predetermined threshold.

[0090] In another aspect, the present disclosure provides a method for predicting the prognosis of a patient diagnosed with or suffering from gastrointestinal (GI) cancer comprising detecting the presence of tumor-associated Candida nucleic acids in a biological sample obtained from the patient, and determining that the patient has poor prognosis and/or is at risk for metastatic disease when the abundance/level of tumor-associated Candida nucleic acids in the biological sample is higher relative to a predetermined threshold or a sample obtained from a healthy control subject. In certain embodiments, the tumor-associated Candida nucleic acids are detected via whole genome sequencing, shotgun sequencing, targeted sequencing, RNA sequencing, or any combination thereof. The biological sample may comprise whole blood, plasma, serum, saliva or tumor cells.

[0091] In any and all embodiments of the methods disclosed herein, the patient is human.

EXAMPLES

[0092] The present technology is further illustrated by the following Examples, which should not be construed as limiting in any way.

Example 1: Materials and methods

[0093] Unless otherwise specified, all manufacturer protocols were followed when using any of the discussed assays, reagents, tools, programs, kits, machines, or other experimental components.

Detection and quantification of mycobiomes in TCGA and HCMI sequencing data

[0094] Biospecimens were collected as part of the TCGA project for approximately ten years, including primary tumors, normal tissue, and blood samples from cancer patients both prospectively and retrospectively. Raw TCGA sequencing data and the analyte, sample, and patient metadata (including information on tumor stage, location, metastasis, etc.) associated with each sequencing run were obtained from the NCI Genomic Data Commons (GDC) via the GDC’s application programming interface (API). Raw WGS data are available from the GDC’s legacy archive (https://portal.gdc.cancer.gov/legacy-archive/). Overall, data from 1,759 sequencing runs were analyzed for HNSC (n = 338), ESCA (n = 143), STAD (n = 321), COAD (n = 300), READ (n = 127), BRCA (n = 230), LUSC (n = 100), and LGG (n = 200) projects from TCGA with WGS data available. From HCMI, data from 34 sequencing runs on solid tissue samples from brain (n = 13) and lower GI sites (n = 21) were analyzed.

[0095] All WGS and RNA-seq data from TCGA and HCMI were screened for fungal content using the PathSeq pipeline (Kostic et al., 2011; Walker et al., 2018), which is made available as part of the Broad Institute’s Genome Analysis Toolkit (GATK version 4.0.3) and relies on the Burrows-Wheeler Aligner (BWA-MEM) (Li and Durbin, 2009). Prior to screening for microbial alignments, PathSeq performs multiple, iterative subtractive alignments of previously unaligned reads to a host genome reference (Kostic et al., 2011). The core host reference genome used was GRCh38 (hg38). This host reference is supplemented by (1) highly variable sequences from the major histocompatibility complex (MHC) from the Immuno Polymorphism Database (IPD), (2) cloning vector sequences from NCBI UniVec, (3) mammalian consensus repetitive sequences from RepBase, (4) a curated database of human transcripts (human v25) from Gencode, and (5) human breakpoint sequences from GenBank (KY503218, KY5808060). Reference genomes for this analysis were obtained from the PathSeq resource bundle. These files were accessed via ftp from the Broad Institute (ftp.broadinstitute.org/bundle/beta/PathSeq/). PathSeq was used with default settings, except for the “minClippedReadLength” parameter, which was set to 50 for WGS and 45 for RNA-seq, since the maximum read length for the TCGA RNA-seq data is 50. All sequencing data were analyzed on a local high-performance computing (HPC) cluster with 60 compute nodes, 1,512 CPU cores, and approximately 15TB of RAM.

[0096] To isolate the endogenous fungal composition of these samples, sequencing reads from taxa at the genus and species level were normalized (1) by genome size (i.e. per kilobase of mapped fungal genome), (2) by the expected accuracy of the taxonomic assignment (i.e. weights are divided by the number of ambiguous alignments), and then (3) by the total library size (i.e. per million primary sequencing reads, regardless of alignment). These normalizations produced an “expected reads per kilobase of genome, per million primary reads” statistic (eRPKM). Kingdom- and phylum-level read counts were normalized to the library size (reads per million, RPM), as these alignments are much less prone to ambiguous assignment or significant fluctuations in genome size. Relative abundance (%) values were calculated by scaling eRPKM values, such that the sum of taxa abundances from a given taxonomic rank and sample sum to 100.
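
The three normalization steps and the scaling to relative abundance can be sketched in Python as follows. This is a minimal illustration, not the PathSeq implementation: the input layout (one list of candidate taxa per read), the function names, and the equal splitting of each ambiguous read's weight among its candidate taxa are assumptions made for the example.

```python
def erpkm(read_hits, genome_sizes_bp, primary_reads):
    """Compute eRPKM per taxon.

    read_hits: one list per aligned read, naming the taxa it aligned to
               (hypothetical input layout).
    genome_sizes_bp: mapping of taxon -> genome size in base pairs.
    primary_reads: total primary reads in the library, aligned or not.
    """
    weights = {}
    for taxa in read_hits:
        unique = set(taxa)
        for taxon in unique:
            # Ambiguous reads contribute a fractional weight to each taxon.
            weights[taxon] = weights.get(taxon, 0.0) + 1.0 / len(unique)
    per_million = primary_reads / 1e6
    return {
        taxon: weight / (genome_sizes_bp[taxon] / 1e3) / per_million
        for taxon, weight in weights.items()
    }


def relative_abundance(values):
    """Scale values so that the taxa of one rank and sample sum to 100."""
    total = sum(values.values())
    if total == 0:
        return {taxon: 0.0 for taxon in values}
    return {taxon: 100.0 * value / total for taxon, value in values.items()}


hits = [["A"], ["A", "B"], ["B"], ["B"]]
scores = erpkm(hits, {"A": 1000, "B": 2000}, primary_reads=1_000_000)
print(relative_abundance(scores))
```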

Quality control by removal of fungi associated with TCGA sequencing batches

[0097] To mitigate the possibility of fungal contamination in the mycobiomes, a screen was performed to identify species and genera that showed signs of technical variation, but not biological variation. A two-step prevalence-based decontamination model was designed to identify and remove (1) fungal taxa whose presence was associated with specific sequencing batches and could not be explained by biological variation, and (2) samples from multi-well sequencing plates with strong evidence of contamination.

[0098] To identify contaminant taxa, the prevalence of species and genera was determined, first across each sequencing batch (plate id) and then for each tumor type (TCGA sequencing project), and compared to the expected frequencies assuming a random distribution. Specifically, expected frequency distributions for each species were calculated by multiplying the total number of samples in each project or sequencing plate by the species prevalence across the entire dataset; these values were compared to the observed prevalence across projects or plates. These observed and expected frequencies were used to compute p-values for a Chi-square statistic, which were adjusted for multiple comparisons using the Benjamini-Hochberg false-discovery rate correction (FDR, q-values). Species and genera that were significantly associated with sequencing batch (q < 0.1) but not tumor type (q > 0.1) were classified as potential contaminants and removed from downstream analysis. Lastly, samples were screened to determine if there were sequencing plates with significant evidence of contamination that needed to be excluded from the analysis entirely. This analysis identified a single sequencing plate (A19H), samples from which harbored fungal reads at rates that were around five orders of magnitude greater than samples from different plates, independent of sample type. Overall, this analysis resulted in the identification of 35 contaminant taxa (12 genera, 23 species), and 18 of 29 contaminated sequencing runs from a single plate which were removed from downstream analysis.
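
The prevalence test can be sketched in Python as shown below. The sketch covers the expected counts, the Chi-square statistic, the Benjamini-Hochberg adjustment, and the q < 0.1 / q > 0.1 decision rule; converting the statistic to a p-value is left to a statistics library. All function names are hypothetical, and the statistic is computed only over counts of positive samples, which is a simplifying assumption.

```python
def expected_counts(group_sizes, overall_prevalence):
    """Expected positive samples per plate or project under a random distribution."""
    return {group: n * overall_prevalence for group, n in group_sizes.items()}


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over groups."""
    return sum(
        (observed[group] - e) ** 2 / e for group, e in expected.items() if e > 0
    )


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def is_potential_contaminant(q_batch, q_tumor, alpha=0.1):
    """Associated with sequencing batch but not with tumor type."""
    return q_batch < alpha and q_tumor > alpha


expected = expected_counts({"plate1": 10, "plate2": 30}, overall_prevalence=0.25)
print(chi_square_statistic({"plate1": 8, "plate2": 2}, expected))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```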

Quality control by vertical and horizontal analyses of fungal genome coverage

[0099] To further address the possibility of contamination or false-positive alignments, the genomic coverage of the species which were most frequently found in the PathSeq analysis of WGS data from TCGA was characterized. Species were selected if they were detected in more than 5 sequencing runs (eRPKM > 0) in any of the TCGA sequencing projects (HNSC, ESCA, STAD, COAD, READ, LUSC, BRCA) that remained after the precursory decontamination analysis of sequencing batches; several closely related species with NCBI reference genomes available were also selected. For sequenced tumor samples from each cancer type, the human-subtracted PathSeq BAM file outputs were converted back to their raw, unmapped reads using SAMtools v1.14 (Li et al., 2009). Raw reads were aligned using the Burrows-Wheeler Aligner (BWA) (Li and Durbin, 2009) to each species’ reference genome to create a new BAM containing only reads mapped to that reference. BEDTools (Quinlan and Hall, 2010) genomeCoverageBed was then used to generate coverage results with the -bg flag to output statistics in bedgraph file format. Each tumor type’s bedgraphs were then pooled together and their genome coverage was assessed using the deepTools2 (Ramirez et al., 2016) bamCoverage command. Genome alignments were visualized using pyGenomeTracks (Ramirez et al., 2018).

[0100] The resulting bedgraphs were used to analyze the coverage depth and horizontal read distribution for each genome. Coverage depth (Vertical quality control model) was assessed by calculating the average log10-coverage per-base per-sample. The ratio of average log10-coverage per-base per-sample was calculated between each sequencing project and brain tumor samples to estimate the fraction of reads that could be the result of contamination. To assess horizontal distribution (Horizontal quality control model) for each species and cancer type, a genome-length Boolean vector was generated which indicated whether reads had aligned to each base. The Hamming distance between the vector generated for brain tissues and the vector for each cancer type was calculated to determine the basewise horizontal similarity of alignments across each genome. For the vertical quality control model, species were classified as possible contaminants if the average log10-coverage per-base per-sample for each tumor type was greater than 30% that of brain tumors. For the horizontal quality control model, species were classified as possible false-positive signals if the Hamming distance to brain was less than 0.02. Species which were classified as possible contaminants or false-positive signals by either model were removed from downstream analysis.
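
The horizontal quality control model can be illustrated with a short Python sketch. It assumes that the 0.02 cutoff applies to a Hamming distance normalized by genome length and that coverage intervals are 0-based and half-open, as in bedgraph files; the function names are hypothetical.

```python
def coverage_mask(genome_length, intervals):
    """Boolean vector that is True where at least one read covers the base.

    intervals: (start, end) pairs, 0-based and half-open (assumed layout).
    """
    mask = [False] * genome_length
    for start, end in intervals:
        for pos in range(max(start, 0), min(end, genome_length)):
            mask[pos] = True
    return mask


def normalized_hamming(a, b):
    """Fraction of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def is_possible_false_positive(tumor_mask, brain_mask, threshold=0.02):
    """Flag a species whose tumor coverage pattern closely matches brain."""
    return normalized_hamming(tumor_mask, brain_mask) < threshold


brain = coverage_mask(100, [(0, 10)])
tumor = coverage_mask(100, [(0, 10), (50, 60)])
print(normalized_hamming(tumor, brain))
```

A small distance means reads pile up on the same bases in tumor and in the presumptive negative control, which is the pattern expected from spurious alignments rather than from a genuinely present genome.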

Validation with TaxaTarget

[0101] TaxaTarget (Commichaux et al., 2021) was used to validate the presence of key species using an analysis of eukaryotic marker genes. The human-filtered PathSeq output BAM files from TCGA were converted to their raw, unaligned forward and reverse fastq formats using samtools. They were then screened for marker genes aligning to Homo sapiens to determine the degree of contamination by human DNA, as well as Candida, Saccharomyces, and Malassezia species to validate their presence in TCGA tumor samples.

Targeted analysis and quantification of Candida and Saccharomyces species of interest

[0102] Several species of interest were identified that were abundant across TCGA tissue samples. To better quantify these species, a targeted analysis was performed by mapping fungal genomes to libraries of putative microbial reads for each TCGA sequencing run, generated after stringent filtering of human sequences with PathSeq (Kostic et al., 2011; Walker et al., 2018). Representative genomes for C. albicans (GCA_003454735.1), C. tropicalis (GCA_000633855.1), and S. cerevisiae (GCA_000146045.2) were downloaded from GenBank and mapped to these libraries using STAR (Dobin et al., 2013) without allowing for spliced alignments (--alignIntronMax=1). Raw read counts for each species were then normalized by genome size and total library size as described above, to calculate an empirical reads per kilobase of genome, per million primary reads (RPKM).

Estimation of intra- and inter-kingdom co-abundance groups and associated gene expression signatures

[0103] It is well accepted that microbiome data should be treated as compositional, a characteristic which typically complicates robust calculation of correlations between microbiota (Friedman and Alm, 2012; Gloor et al., 2017). To control for compositional effects, SparCC (Friedman and Alm, 2012) was used to estimate taxa that are frequently found together across each cancer type. This method relies on a bootstrapping procedure to control for spurious results common in microbiome survey data. Prior to calculating correlations, low-abundance samples were filtered out and the 20 most abundant fungal species from each cancer type were selected. The SparCC algorithm was then run for 1000 iterations with default parameters to identify fungal co-abundance groups within head-neck (HNSC), stomach (STAD), and colon (COAD) tumor samples.

[0104] The trans-kingdom analysis was used to identify associations between fungi and bacteria and was performed by comparing the decontaminated fungal compositions generated with decontaminated bacterial compositions from matched samples in TCMA (Dohlman et al., 2020). To accurately quantify associations across kingdoms and control for the significant difference in their respective abundances, a scaling factor was applied to the fungal compositions in order to generate similar distributions for each kingdom and allow robust estimation of co-abundance between fungal and bacterial compositions. The most abundant fungal and bacterial taxa from each cancer type were selected and SparCC was applied again.

Acquisition and analysis of original TCGA tissue samples

[0105] For validation of Candida abundance, original, matched tissue and plasma samples were obtained from three CRC patients from Indivumed, an original TCGA tissue provider. Tumor tissues were minced, homogenized, and treated with 200 U/mL lyticase (Sigma) followed by bead beating, and processed using the Quick-DNA Fungal/Bacterial Kit (Zymo Research) as in (Li et al., 2022). Fungal DNA presence was validated by RT-PCR for fungal 18S, and fungal ITS1-2 regions were amplified by PCR using primers with sample barcodes and sequencing adaptors (Fungal primers: ITS1F- CTTGGTCATTTAGAGGAAGTAA (SEQ ID NO: 1), ITS2R- GCTGCGTTCTTCATCGATGC (SEQ ID NO: 2); Forward overhang: 5’ TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus-specific sequence] (SEQ ID NO: 3); Reverse overhang: 5’ GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus-specific sequence] (SEQ ID NO: 4)).

[0106] ITS amplicons were generated with 35 cycles using Invitrogen AccuPrime PCR reagents (Carlsbad). Amplicons were then used in the second PCR reaction, using Illumina Nextera XT v2 (Illumina) barcoded primers to uniquely index each sample. 2x300 paired-end sequencing was then performed on the Illumina MiSeq (Illumina). DNA was amplified using the following PCR protocol: Initial denaturation at 94°C for 10 min, followed by 40 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s, and elongation at 72°C for 2 min, followed by an elongation step at 72°C for 30 min. All libraries were subjected to quality control using DNA 1000 Bioanalyzer (Agilent), and Qubit (Life Technologies) to validate and quantify library construction prior to preparing a Paired-End flow cell. Samples were randomly divided among flow cells to minimize sequencing bias. Clonal bridge amplification (Illumina) was performed using a cBot (Illumina). 2 x 250 bp sequencing-by-synthesis was performed on Illumina MiSeq platform (Illumina).

Quantification, isolation and characterization of live fungi in primary colorectal tumor samples

[0107] Adenocarcinoma-associated tissues were collected from ascending colon surgical resections and were then weighed, minced, homogenized, diluted in sterile PBS and plated onto Sabouraud dextrose agar (SDA), modified Dixon media (mDixon with glycerol monostearate), and inhibitory mold agar (Hardy Diagnostics), all supplemented with penicillin/streptomycin (Sigma). SDA plates were incubated at 37°C for 48 hours. Inhibitory mold agar plates and modified Dixon media were incubated at 30°C for up to a week. Isolated fungal colonies from each individual subject were identified by matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometry.

Identification of Candida- and Saccharomyces-type TCGA tumor samples and associated signatures

[0108] To identify Candida- and Saccharomyces-associated tumors, a log2 Candida-to-Saccharomyces abundance ratio was calculated (log2(C/S)) across all tumor samples for which either genus was detected. Tumors were classified as Ca-type or Sa-type if they had a log2(C/S) value above 1 or below -1, respectively, i.e. samples for which neither genus was detected at more than twice the rate of the other were excluded. To test associations between gene expression and the presence of Candida and Saccharomyces, differential gene expression analysis was performed using batch-normalized gene expression data from the PanCanAtlas publication page (https://gdc.cancer.gov/about-data/publications/pancanatlas). For each cancer type, log2-fold changes (log2FC) in gene expression were calculated between tumors that were negative for Candida or Saccharomyces (eRPKM < 1E-6) and tumors which were high in Candida or Saccharomyces (eRPKM > 1E-6). All taxonomic abundance profiles were collapsed to the sample level using the geometric mean of taxon abundances across the available tumor sequencing data for each tumor sample. The significance of gene expression changes was then estimated using Student’s independent two-sample t-test. The differential gene expression values generated by this analysis were then used to perform GSEA (Subramanian et al., 2005) and analyze gene expression pathways enriched in Candida- and Saccharomyces-associated cancers based on gene lists obtained from MSigDB v7.1. Using pre-ranked differential gene expression values, a GSEA was run for 1000 iterations to identify KEGG biological pathways (Kanehisa and Goto, 2000) that were enriched in Candida- and Saccharomyces-associated tumors from each cancer type. To compare rates of metastasis in Ca- and Sa-type tumors, the TNM-stage classifications of each TCGA tumor sample were used to determine metastatic (M1) or non-metastatic (M0) status. Samples for which no metastatic information was available (MX) were excluded.
A contingency table was then generated for each cancer type comparing metastatic status (M0/M1) and tumor mycobiome classification (Ca-type vs. Sa-type), and Fisher’s exact test was used to determine whether Ca-type or Sa-type tumors were more likely to be metastatic.
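
The metastasis comparison can be sketched as a two-sided Fisher's exact test on a 2x2 table. The pure-Python version below is a minimal illustration that sums the hypergeometric probabilities of all tables no more likely than the observed one; the function name, the table layout (rows: Ca-type and Sa-type; columns: M1 and M0), and the example counts are assumptions and not data from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are Ca-type and Sa-type tumors,
    columns are metastatic (M1) and non-metastatic (M0).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x metastatic samples in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Illustrative counts only.
print(fisher_exact_two_sided(3, 1, 1, 3))
```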

Differential abundance analysis between tumor and adjacent normal tissue

[0109] Associations between fungal genera and sample type (tumor vs. matched adjacent normal tissue) were calculated in R, using a custom paired analysis function written for metacoder (Foster et al., 2017). For each cancer type analyzed, the 20 most abundant taxa were selected, provided they were present in at least 30 samples overall. Such filters help to remove low-abundance and low-prevalence taxa which frequently have small means and large coefficients of variation, contributing unnecessary noise for downstream differential abundance comparisons. After adding pseudocounts, the relative abundance of fungi for each sequencing run was calculated. Across all patients with matched tumor and normal tissue, the median ratio of each taxon’s relative abundance values in tumor samples compared to matched adjacent normal tissue was calculated. Significance values were calculated for log2 median ratios between transformed relative abundance values, using Wilcoxon’s rank-sums test. Taxa with significant p-values (p < 0.05) were selected for downstream analysis.

Survival analysis

[0110] The survival analyses were performed using the log-rank test, as implemented by the lifelines survival analysis python package (Davidson-Pilon et al., 2020). Data on TCGA patient survival were collected from the PanCanAtlas’ clinical follow-up data (Liu et al., 2018). This analysis was performed at both the species and genus level. For the species-level analysis, normalized fungal abundances from the targeted analysis were used (RPKM for C. albicans, C. tropicalis, and S. cerevisiae). For each species of interest and cancer type, survival was compared between patients whose tumors did not harbor the species (“negative”; 0th percentile) with patients whose tumors were abundant in the species (“high”; top 50th percentile). The genus-level analysis was performed using fungal abundances determined by the PathSeq analysis (eRPKM for Candida and Saccharomyces) and used the same set of criteria for assigning patients as “negative” or “high” as the differential gene expression analysis. All taxonomic abundances were collapsed to the patient level using the geometric mean of taxon abundances across the available tumor sequencing data for each patient.
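
For illustration, the two-group log-rank comparison can be written in a few lines of pure Python. This is a sketch of the standard test, not the lifelines implementation used in the analysis; the function name and the toy survival times are hypothetical. The p-value uses the Chi-square distribution with one degree of freedom, whose survival function equals erfc(sqrt(x/2)).

```python
import math


def logrank_test_sketch(durations_a, events_a, durations_b, events_b):
    """Two-group log-rank test; events are truthy when death was observed."""
    event_times = sorted(
        {t for t, e in zip(durations_a, events_a) if e}
        | {t for t, e in zip(durations_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t.
        n_a = sum(1 for d in durations_a if d >= t)
        n_b = sum(1 for d in durations_b if d >= t)
        # Observed events at time t.
        d_a = sum(1 for d, e in zip(durations_a, events_a) if e and d == t)
        d_b = sum(1 for d, e in zip(durations_b, events_b) if e and d == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    statistic = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Toy data: the "high" group dies early, the "negative" group later.
high = [1, 2, 3, 4, 5]
negative = [10, 11, 12, 13, 14]
print(logrank_test_sketch(high, [1] * 5, negative, [1] * 5))
```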

Random forest classification of cancer types using fungal compositions of tumor and blood samples

[0111] To identify taxa that are predictive of cancer location, a decision-tree based ensemble machine learning method known as random forest classifiers was used (Breiman, 2001), as implemented by the python package sklearn (Abraham et al., 2014). A separate classifier model was trained on the mycobiome compositions of tumor samples from seven TCGA cancer types (HNSC, ESCA, STAD, COAD, READ, LUSC, and BRCA). For each cancer type, a one-versus-all classification strategy was implemented, which sought to identify genera capable of distinguishing a given cancer type (e.g. stomach tumors) from all others (e.g. non-stomach tumors). Prior to classification, taxa that were detected in fewer than 1% of samples were removed. Species abundances were log-normalized after the addition of a pseudocount to achieve a Gaussian distribution. For each classifier, a forest of 400 estimators was used, with a maximum depth of 30 features per tree, and a minimum of 5 samples per split. Default values were used for all other hyperparameters. To bootstrap the estimation of feature importances, a repeated, stratified cross-fold cross validation strategy was used with 10 folds and 10 repeats. Feature importances were estimated by averaging Gini impurity measures for each of the 100 resulting sub-models.

Example 2: Fungal DNA is abundant in gastrointestinal tumor samples from TCGA

[0112] To explore tumor-associated mycobiomes across different cancers, a metagenomic analysis was performed using whole-genome sequencing (WGS) data from multiple tumor samples across different cancers available in TCGA. Cancer types were selected based on previously reported presence or absence of mycobiota (Davar et al., 2021; Dzutsev et al., 2017; Finlay et al., 2020; Garrett, 2019; Gopalakrishnan et al., 2018; Grivennikov et al., 2010; Iida et al., 2013; Matson et al., 2018; Routy et al., 2018; Sharma et al., 2017; Shiao et al., 2021; Sivan et al., 2015; Spencer et al., 2021; Tanoue et al., 2019; Vetizou et al., 2015; Viaud et al., 2013; Wolchok et al., 2013), including gastrointestinal (GI) tissues (head-neck/HNSC, n = 338; esophagus/ESCA, n = 142; stomach/STAD, n = 321; colon/COAD, n = 300; rectum/READ, n = 127), non-GI external sites (breast/BRCA, n = 229), as well as non-GI internal sites (lung/LUSC, n = 100; brain/LGG, n = 183), and PathSeq (Kostic et al., 2011; Walker et al., 2018) was used to determine their fungal composition. The mycobiomes detected in these tissues were then screened and filtered for contamination (See “Identification and removal of contaminant fungi and false-positive signals”).

[0113] This approach led to the detection of fungal sequences across multiple cancer patient’s tissue types, with higher rates of fungal DNA in tissues of the lung and specific sites of the gastrointestinal (GI) tract (FIG. 1A). Across the GI tract, fungal DNA was particularly abundant in tissues from head and neck (HNSC), colorectal (COAD and READ) and stomach (STAD) tissues, and less abundant in the esophagus (ESCA) (FIGs. 1A, 8A). By contrast, few fungal sequences were detected in brain tissue (LGG), consistent with its anatomical positioning away from barrier surfaces where fungi most frequently reside. As the brain is canonically classified as a sterile organ (fungal brain infections are usually lethal), it was determined that bacterial and fungal sequencing reads detected in sequencing data from brain tissue likely represented biological contamination and/or false-positive signals, suggesting that such tissue can be used as a presumptive “negative control” for identifying spurious signals in other datasets (Dohlman et al., 2020). Overall, samples from lower GI tissues harbored a greater density of fungi than upper GI tissues did, in a pattern consistent with bacteria (FIG. 1B). As expected, fungal sequences represented a much smaller proportion of microbial sequences in tissues when compared to bacterial DNA (FIG. 1B), consistent with previous reports of intestinal human samples (Coker et al., 2018; Hoarau et al., 2016; Leonardi et al., 2020; Liguori et al., 2016; Liu et al., Nash et al., 2017; Proctor et al., 2021; Sokol et al., 2017; Zuo et al., 2020).

Example 3: Identification and removal of contaminant fungi and false-positive signals

[0114] The discovery of fungal sequences in multiple tumor types led to an examination of their origin, as contamination is a plausible source of fungal DNA. Microbial contamination is pervasive in metagenomic profiling experiments and can come from a variety of sources, including the laboratory environment or nucleic acid extraction kits (Davis et al., 2018; Eisenhofer et al., 2019). Additionally, the incorrect assignment of microbial or non-microbial sequencing reads can lead to reporting of spurious signals (Ye et al., 2019). Thus, identification and removal of non-endogenous taxa is a necessary step that must precede downstream analyses, particularly in studies of low biomass tissue sites (Glassing et al., 2016). To ensure accurate capture of the endogenous mycobiome of these samples, a rigorous, multi-step batch correction and quality control analysis was applied to identify and remove contaminant fungi and false-positive signals from the dataset, leaving only fungal species for which there was substantial evidence of their involvement in the tissue.

[0115] A prevalence-based decontamination model was applied to identify and remove (1) fungal species and genera whose presence was associated with specific sequencing batches and could not be explained by biological variation, and (2) samples from multi-well sequencing plates with strong evidence of contamination. Briefly, a Chi-squared test was used to determine if taxa were overrepresented in certain multi-well aliquot plates but not in certain tissue types (See Methods). This analysis identified 23 species and 12 genera that met these criteria, including Beauveria and Pochonia spp., which are not known to colonize humans (data not shown). Additionally, 18 samples were removed from a single sequencing plate which displayed evidence of significant fungal contamination (FIG. 8B).
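The plate-versus-tissue test described above can be illustrated with a minimal sketch. It assumes a 2x2 contingency table per comparison (samples on one plate versus all other plates, taxon detected versus not detected) and a Pearson Chi-squared test with one degree of freedom. The function names, the table layout, and the significance threshold `alpha` are hypothetical and are not taken from the Methods.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table.

    a, b: group of interest, taxon detected / not detected
    c, d: all other samples, taxon detected / not detected
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence either way
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def flag_batch_contaminant(plate_table, tissue_table, alpha=0.05):
    """Flag a taxon whose detection tracks a plate but not a tissue type.

    alpha is an illustrative threshold, not a value from the source.
    """
    _, p_plate = chi2_2x2(*plate_table)
    _, p_tissue = chi2_2x2(*tissue_table)
    return p_plate < alpha and p_tissue >= alpha


# Hypothetical counts: a taxon seen in 18 of 20 samples on one plate,
# 10 of 180 samples elsewhere, and evenly spread across tissue types.
plate_counts = (18, 2, 10, 170)
tissue_counts = (14, 86, 14, 86)
is_contaminant = flag_batch_contaminant(plate_counts, tissue_counts)
```

With these illustrative counts the taxon is flagged, because its detection depends strongly on the plate and not on the tissue of origin.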

[0116] While tracking the presence of taxa across sequencing batches can effectively identify contaminants, such a strategy is unable to identify contamination events that span sequencing batches, nor is it capable of identifying signals which may be the result of false-positive alignments or incorrect taxonomic assignments. To address these possibilities, a genome-wide analysis of sequence alignments for each of the fungal species detected in each tumor type was performed (See Methods). For each sequencing project, the genome coverage depth (“Vertical QC model”) as well as the distribution of sequencing reads across the length of each genome (“Horizontal QC model”) were compared. The use of orthogonal models in this case allows for the identification of different categories of false-positive signals. Species that are truly present at the time of sequencing, but not in the original biopsies, are referred to as biological contaminants and are likely to have similar levels of coverage depth across tissue types, yet a random distribution of read alignments across the span of their genome. Conversely, false-positive alignments are likely to occur in conserved genes or highly mobile genes belonging to different fungal or non-fungal DNAs (Delsuc et al., 2005; Ye et al., 2019), generating similar patterns of sequence alignments across tissue types.
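The two quantities compared by the vertical and horizontal QC models can be sketched as follows. This is an illustrative simplification: the binning scheme, the use of a Pearson correlation between binned read profiles as a measure of horizontal similarity, and all function names are assumptions rather than the procedure described in the Methods.

```python
import math


def mean_depth(n_reads, read_length, genome_length):
    """Vertical quantity: average coverage depth of a genome."""
    return n_reads * read_length / genome_length


def binned_profile(read_starts, genome_length, n_bins=20):
    """Horizontal quantity: number of read starts in equal-width genome bins."""
    counts = [0] * n_bins
    for pos in read_starts:
        idx = min(pos * n_bins // genome_length, n_bins - 1)
        counts[idx] += 1
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical read start positions for one species in two sequencing projects.
project_a = binned_profile([10, 20, 30, 950], genome_length=1000, n_bins=10)
project_b = binned_profile([5, 15, 990, 995], genome_length=1000, n_bins=10)
horizontal_similarity = pearson(project_a, project_b)
```

Under these assumptions, a species with comparable `mean_depth` across projects but low `horizontal_similarity` would match the signature of a biological contaminant, whereas high `horizontal_similarity` across unrelated tissues would point toward reads piling up in the same conserved regions, i.e. a false-positive alignment.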

[0117] For example, these analyses found that reads aligning to Malassezia restricta and other Malassezia spp. genomes displayed similar coverage depth across sequencing projects but a horizontal read distribution that was generally random (FIGs. 1F, 8C, data not shown), a signature consistent with biological contamination. Malassezia spp. are frequently found on the skin surface (Findley et al., 2013b; Saheb Kashaf et al., 2022) and were likely transferred to multiple samples during the handling of TCGA biospecimens. Consistently, several Malassezia species were selected by this model as a true signal in skin-adjacent breast tumors (data not shown), a finding consistent with known Malassezia colonization at skin sites (Findley et al., 2013b; Saheb Kashaf et al., 2022). Meanwhile, reads aligning to the genome of Agaricus bisporus (common mushroom or portabello) displayed a consistent horizontal distribution pattern across sequencing projects (FIG. 8D, data not shown). Thus, Malassezia restricta and Agaricus bisporus were respectively removed by the vertical and horizontal QC models (data not shown).

[0118] Overall, the decontamination and quality control analyses resulted in the removal of 97.27% of species detected in GI tumors, 99.26% of species detected in lung tumors, and 95.53% of species detected in breast tumors. Remaining were a set of commensal and pathogenic fungi, including Candida albicans, C. tropicalis, C. dubliniensis, C. glabrata, C. lusitaniae, C. guilliermondii, Cyberlindnera jadinii (FIG. 1C, data not shown) and food-associated Saccharomyces cerevisiae and Pichia membranifaciens (FIG. 1D, data not shown), which were abundant in GI tumors, and Blastomyces dermatitidis/gilchristii (FIG. 1E, data not shown), which were abundant in lung tumors and are known causative agents of blastomycosis, a disease that primarily affects the lungs (Brown et al., 2013; Brown et al., 2012). Many of the species classified as contaminants and/or false-positive signals were not known to colonize humans, including plant pathogens Alternaria alternata, Bipolaris oryzae, and Fusarium verticillioides (data not shown). Notably, Malassezia spp. were classified as probable contaminants in all tumor types except for breast tissue, suggesting that some of these reads may be biologically relevant, as epidermal tissue is often involved in such cancers (FIG. 1F, data not shown) (Silverman et al., 2014). Therefore, the detection of reads from Malassezia spp. in breast tissue may have originated from both endogenous and contaminant sources, as has previously been shown for E. coli in CRC samples (Dohlman et al., 2020). Finally, the abundance of several of these species was validated in a secondary metagenomic analysis using TaxaTarget (Commichaux et al., 2021), a tool specifically designed for the detection of eukaryotic marker genes (FIG. 8E).

Example 4: TCGA tissue samples are composed of disease-specific fungi

[0119] The experimental approach generated species-level resolution data allowing the identification of specific fungi across various tumor types. Principal coordinate analysis (PCoA) and hierarchical clustering of genus abundances across TCGA cancer types revealed that head-neck, colon, and rectal tumors had highly similar fungal compositions, as did stomach and esophageal tumors, while the fungal compositions of non-GI tumors were largely distinct (FIGs. 2A-2B). Differences in the fungal communities observed across GI sites could be affected by variations in pH, oxygen availability, or bacterial biogeography across the GI tract, among a few key factors driving microbial variation. The stomach is known to be a highly acidic environment (pH 1.5 - 3.5), while esophageal tissues are subject to sudden, transient drops in pH (from 7.0 to below 4.0 during reflux) (Sifrim et al., 2004). By contrast, oropharyngeal, colon, and rectal tissues are characterized by a more consistent, neutral pH (6.0 - 7.0).

[0120] It was found that tumor-associated fungal communities were characterized by high abundance and prevalence of Saccharomycetales taxa, including Candida and Saccharomyces, consistent with previous gut mycobiome studies relying on metagenomics (Nash et al., 2017), culturomics, and ITS-amplicon sequencing (Hoarau et al., 2016; Leonardi et al., 2020; Li et al., 2022; Liguori et al., 2016; Proctor et al., 2021; Sokol et al., 2017; Zuo et al., 2020). In addition to these more common fungi, deeper analysis revealed the presence of sequences from multiple fungal species and genera as well as their distribution across different cancer types (FIGs. 2C, 9A, data not shown).

[0121] The growing consensus on the importance of intestinal mycobiota has prompted the investigation of (1) which fungi are capable of surviving, residing, and replicating in the GI tract (fungal symbionts or commensals) to influence the host over a prolonged period, and (2) which are transient passengers, contaminants, or represent environmental fungi (non-commensal fungi) that are normally benign but can impact immunosuppressed individuals (Fiers et al., 2019; Limon et al., 2017). Candida spp. were more abundant across the GI tract as compared to other body sites, consistent with their known commensal status in this part of the body (Kumamoto et al., 2020) and ability to expand (Aggor et al., 2020; Break et al., 2021; Hoarau et al., 2016; Leonardi et al., 2020; Li et al., 2022; Liguori et al., 2016; Sokol et al., 2017) or breach the GI barrier (Fan et al., 2015; Zhai et al., 2020) during disease (FIGs. 2C, 9A). Species-level analysis determined that C. albicans was the most abundant representative of the Candida genus: C. albicans was highly abundant in multiple cancers but was particularly abundant in cancers of the GI tract (FIGs. 2B-2C), consistent with previous studies (Aggor et al., 2020; Break et al., 2021; Leonardi et al., 2020; Li et al., 2022; Liguori et al., 2016; Sokol et al., 2017). C. tropicalis, C. dubliniensis, C. glabrata, C. lusitaniae, C. guilliermondii, C. parapsilosis, and Pichia membranifaciens were also present, but at lower abundance and prevalence across samples (FIGs. 2B-2C, 9A). Saccharomyces spp. were primarily represented by S. cerevisiae (FIGs. 2B-2C) (Liguori et al., 2016; Nash et al., 2017; Sokol et al., 2017). Among fungi broadly assigned as non-commensal, Cyberlindnera jadinii was also detected in multiple GI tissues (FIGs. 2B-2C), a species which is found in processed food products and rarely infects people, presumably arriving via diet (David et al., 2014). Lung tissues carried B. gilchristii/dermatitidis (FIGs. 2B-2C), which are causative agents of blastomycosis (Brown et al., 2013; Brown et al., 2012). Interestingly, evidence of Blastomyces DNA was detected in 6 out of 50 patients with squamous cell lung carcinomas. In the general population, the incidence of blastomycosis is 1-2 cases per 100,000 (Benedict et al., 2012). Together, these findings indicated the presence of biologically meaningful associations linking the presence of fungal DNA to tissues from specific body sites.
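To put the Blastomyces figures on a common scale, the detection rate in the lung cohort can be expressed per 100,000 individuals. The short sketch below does only this unit conversion; the helper name is hypothetical. Note that detection of fungal DNA in a tumor sample and clinical incidence of blastomycosis are different measures, so the comparison is indicative only.

```python
def per_100k(cases, population):
    """Express a count as a rate per 100,000 individuals."""
    return cases * 100_000 / population


# 6 of 50 squamous cell lung carcinoma patients with Blastomyces DNA detected.
cohort_rate = per_100k(6, 50)

# Reported population incidence of blastomycosis: 1-2 cases per 100,000.
incidence_low, incidence_high = 1, 2

# Fold difference relative to the upper end of the reported incidence range.
fold_vs_upper = cohort_rate / incidence_high
```

Under this conversion, 6 of 50 corresponds to 12,000 per 100,000, several thousand-fold above the reported population incidence.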

Example 5: Emergence of Candida and Saccharomyces co-abundance groups is associated with gastrointestinal cancers

[0122] Microbiota participate in a complex web of interspecies ecological interactions and the dynamics of these interaction networks can profoundly influence human health (Dohlman and Shen, 2019; Faust and Raes, 2012). To explore the potential presence of fungal interaction networks and clusters of co-abundant taxa, the bootstrapping procedure SparCC (Friedman and Alm, 2012) was applied to analyze the mycobiota across GI cancers. This analysis uncovered that C. albicans and S. cerevisiae were each at the center of two anti-correlated co-abundance clusters which were observed across GI cancer types (FIG. 3A). The co-abundance group associated with C. albicans included C. dubliniensis, C. tropicalis, and C. guilliermondii, while the co-abundance group associated with S. cerevisiae comprised taxa including S. eubayanus, C. jadinii, P. membranifaciens, as well as C. parapsilosis and C. glabrata. Additionally, it was found that these two co-abundance clusters were largely predictive of host gene expression across head-neck, stomach, and colon cancers (FIGs. 3B-3D). These findings suggested that cancers of the GI tract may segregate into Candida- and Saccharomyces-associated tumors. Notably, many of the species in each of these clusters are taxonomically related; thus the degree to which they are driven by biological or phylogenetic factors (or both) warrants further exploration.
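The idea of co-abundance between taxa in compositional data can be illustrated with a much simpler stand-in for SparCC: a centered log-ratio (CLR) transform followed by a Pearson correlation. This sketch is not SparCC and omits its iterative estimation and bootstrapping; the pseudocount, the function names, and the read counts are all hypothetical.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def clr_correlation(samples, i, j):
    """Correlation between taxa i and j across samples after CLR transform."""
    transformed = [clr(s) for s in samples]
    return pearson([t[i] for t in transformed], [t[j] for t in transformed])


# Hypothetical read counts per sample for three taxa
# (index 0: a Candida species, 1: a Saccharomyces species, 2: another fungus).
samples = [[90, 10, 50], [10, 90, 50], [80, 20, 50], [20, 80, 50]]
r_candida_saccharomyces = clr_correlation(samples, 0, 1)
```

In this toy input the first two taxa alternate in dominance, so their CLR-transformed abundances are negatively correlated, the kind of pattern that would place them in opposing co-abundance clusters.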

Example 6: Trans-kingdom analysis reveals co-abundance groups associated with Candida and Saccharomyces in GI cancers

[0123] To further explore the microbial communities associated with the Candida and Saccharomyces tumor co-abundance clusters and their relevance to disease, the bacterial populations associated with Candida and Saccharomyces were examined and the same correlation approach was applied to identify associations among GI tumor-resident fungi and matched, decontaminated, intratumoral bacterial communities from The Cancer Microbiome Atlas (TCMA) (Dohlman et al., 2020). This analysis identified several interesting bacterial subpopulations that were correlated with Candida and Saccharomyces in each cancer type.

[0124] In head-neck tumors, Candida and Saccharomyces were associated with similar bacteria (FIGs. 3E, 10A). Lactobacillus spp. and especially Lactobacillus gasseri were very frequently found in the presence of Candida and, to a lesser extent, Saccharomyces (FIGs. 10D-10F). This observation is consistent with reports that Lactobacillus spp. interact extensively with Candida to influence its pathogenicity (Ballou et al., 2016; MacAlpine et al., 2021; Zeise et al., 2021). Bifidobacterium, which is known to support intestinal barrier function (Ewaschuk et al., 2008), was also positively associated with Candida in head-neck cancers. In contrast, it was found that species associated with periodontal disease, including Fusobacterium spp. and Prevotella spp., were negatively associated with Candida and Saccharomyces in head-neck tumors.

[0125] In stomach tumors, it was also observed that Candida was strongly associated with Lactobacillus (FIGs. 3F, 10B, 10E). However, unlike in head-neck cancer, Candida and Saccharomyces in stomach tumors were largely associated with dissimilar clusters of bacteria. Most notably, it was observed that Candida-associated tumors were less likely to harbor Helicobacter pylori, which is believed to be a causative agent in many stomach cancers (Polk and Peek, 2010; Suerbaum and Michetti, 2002; Tang et al., 2005). Conversely, Saccharomyces was more likely to be found alongside H. pylori. A similar pattern was identified for the genera Streptococcus and Clostridium, which were positively associated with Candida and negatively associated with Saccharomyces. Together, these results suggest that Candida and Saccharomyces may occupy similar ecological niches among bacterial communities in head-neck tumors but are associated with very different bacterial populations in stomach cancer.

[0126] In lower GI tumors, Candida and Saccharomyces were also co-abundant with distinct bacterial populations (FIGs. 3G, 10C). Unlike upper GI cancers, no association between L. gasseri and Candida in colon tumors was observed (FIG. 10F). However, it was found that among colon cancers, Candida was positively associated with Dialister, and was negatively associated with Ruminococcus, Akkermansia muciniphila, and Barnesiella intestinihominis (FIGs. 3G, 10C). Ruminococcus spp. are known to be less abundant in people with inflammatory bowel disease (Nagao-Kitamoto and Kamada, 2017) and may play a role in degradation of starch in the colon (Ze et al., 2012). Interestingly, A. muciniphila is known for its anti-inflammatory properties (Zhai et al., 2019) and its ability to promote healthy barrier function (Everard et al., 2013), while B. intestinihominis is associated with prolonged cancer survival and has been shown to modulate tumor immunosurveillance (Daillere et al., 2016). Thus, Candida appears to be negatively associated with several commensal species which help to promote anti-inflammatory and anti-cancer pathways in the lower intestine. Saccharomyces was not associated with the same bacteria as Candida, but was instead positively associated with Porphyromonas and Leptotrichia, and negatively correlated with Odoribacter splanchnicus. Interestingly, the presence of Candida and Saccharomyces was also associated with differing Fusobacterium species in colon cancer (FIG. 10C), which have been shown to promote tumor development in colorectal cancer by provoking inflammation and host immune response (Bullman et al., 2017; Flanagan et al., 2014; Kostic et al., 2013).

[0127] Together, these findings demonstrate that Candida and Saccharomyces are associated with highly divergent bacterial communities in stomach and colon tumors, but similar communities in oropharyngeal cancers. In the stomach, Candida and Saccharomyces were predictive of the presence of H. pylori, while in the colon Candida was negatively associated with several bacteria known to promote beneficial host-microbe interactions. Additionally, significant co-occurrence was found between Lactobacillus and both Candida and Saccharomyces in head-neck tumors, while the association with Lactobacillus was comparatively weaker or absent in stomach and colon tumors (FIGs. 10D-10F). In addition to providing insight into tumor-associated microbiomes, such trans-kingdom ecological interactions may be relevant for disease detection and potentially inform strategies for modulating tumor microbiomes for therapeutic benefit.

Example 7: Candida and Saccharomyces are predictive of gene expression patterns in GI cancers

[0128] To better understand the effect of Candida and Saccharomyces co-abundance groups on GI cancers, the rates of Candida and Saccharomyces across GI tumors were compared. Due to compositional effects in metagenomic data, log-ratios between taxa represent a robust and reliable way to estimate biologically meaningful fluctuations in microbiomes (Aitchison, 1982; Gloor et al., 2017). Across cancer types, it was discovered that Candida-to-Saccharomyces ratios displayed striking bimodality, corroborating the observation of Candida and Saccharomyces co-abundance clusters and suggesting that GI tumors could be reliably organized into subgroups of Candida- and Saccharomyces-associated cancers (FIGs. 4A, 11A). To understand the relevance of these two subgroups, GI tumors were divided into Candida-dominant (Ca-type) and Saccharomyces-dominant (Sa-type) clusters and compared.
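A minimal sketch of the log-ratio computation and the resulting two-way split is given below. The base-2 logarithm, the pseudocount of 1 (used so that samples with zero reads for one genus remain defined), and the cut at a log-ratio of 0 are illustrative assumptions; the disclosure does not specify how the Ca-type and Sa-type boundary was chosen.

```python
import math


def candida_saccharomyces_log_ratio(candida_reads, saccharomyces_reads, pseudocount=1.0):
    """log2 ratio of Candida to Saccharomyces read counts (pseudocount is an assumption)."""
    return math.log2((candida_reads + pseudocount) / (saccharomyces_reads + pseudocount))


def tumor_type(log_ratio, threshold=0.0):
    """Assign a tumor to the Candida-dominant or Saccharomyces-dominant group."""
    return "Ca-type" if log_ratio > threshold else "Sa-type"


# Hypothetical read counts (Candida, Saccharomyces) for three tumors.
tumors = {"tumor_1": (7, 0), "tumor_2": (0, 3), "tumor_3": (31, 1)}
labels = {
    name: tumor_type(candida_saccharomyces_log_ratio(c, s))
    for name, (c, s) in tumors.items()
}
```

Because the ratio is taken on a log scale, a tumor with eight-fold more Candida than Saccharomyces and one with eight-fold less sit symmetrically around zero, which is what makes a bimodal distribution straightforward to separate into two groups.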

[0129] To see if Ca-type and Sa-type tumors harbored functional differences, RNA-seq data from TCGA was used to analyze gene expression between tumor samples that were highly abundant in Candida or Saccharomyces with tumors in which these taxa were not detected (FIGs. 4B, 11B). This analysis identified several interesting changes in gene expression that were associated with Candida status. In head-neck cancer, it was found that tumor-suppressors TP53 and CDKN2A were expressed at lower rates in Ca-type tumors, along with fibronectin (FN1), which is a marker of epithelial-to-mesenchymal transition (EMT) in head-neck cancers. Interestingly, IL22, IL24, CARD10, and CD44 were up-regulated in Ca-type tumors. These genes were not differentially expressed in Saccharomyces-associated tumors. Gene-set enrichment analysis (GSEA) of this expression signature demonstrated that the presence of Candida was associated with decreased expression of genes relating to cell adhesion molecules (q < 0.001) in head-neck cancers. In stomach cancers, it was found that several genes related to cytokine interactions, host immunity and inflammation were positively enriched in Ca-type tumors, including IL1A, IL1B, IL6, IL8, CXCL1, CXCL2, and IL17C. This pro-inflammatory immune signature is consistent with previous reports that C. albicans invokes IL-1β, neutrophils and Th17 cell infiltration in the gut (Li et al., 2022). By contrast, these genes were differentially expressed to a lesser extent or were not differentially expressed at all in Sa-type tumors with high rates of Saccharomyces. Genes down-regulated in Ca-type tumors included ALAD, FTL, IL17D, CST5, ELN, and TREM2. Overall, GSEA showed that this gene expression pattern was associated with significant up-regulation of genes involved in cytosolic DNA sensing (q = 0.008), Toll-like receptor (q = 0.033) signaling, Nod-like receptor (q = 0.033) signaling, and cytokine-cytokine receptor interactions (q = 0.035). In colon cancers, it was found that tumor suppressor genes and genes regulating cellular adhesion pathways were down-regulated in Ca-type tumors, including PTK2B, CDKN2C, and NET1, while genes such as BMP15, PFN3, CCL27, PIP, and SAGE1 were up-regulated in Ca-type tumors. Moreover, GSEA identified significant down-regulation of genes involved in ECM-receptor interactions (q = 0.036) and focal adhesion (q = 0.101) pathways in Ca-type colon tumors.

[0130] These findings indicated that the presence of Candida in head-neck and colon tumors is associated with pro-tumorigenic and cellular adhesion-related gene pathways, while Candida appears to be associated with a robust immune response in stomach tumors, consistent with previous reports that C. albicans is linked to immune dysfunction and damage to intestinal macrophages and epithelium during pathophysiological conditions such as inflammatory bowel disease (Basmaciyan et al., 2019; Li et al., 2022).

Example 8: A Candida-to-Saccharomyces ratio is associated with late-stage, metastatic colon cancer

[0131] The observation that the presence of Candida is associated with down-regulation of genes involved in cellular adhesion pathways and epithelial barrier function in head-neck and colon tumors led to an analysis of whether ratios between these two genera were predictive of cancer outcomes. Interestingly, Candida-to-Saccharomyces ratios were generally low among early-stage colon cancers but were dramatically increased in stage IV disease (FIG. 4C). However, Candida-to-Saccharomyces ratios did not vary significantly by stage in head-neck, stomach, or other cancers (FIGs. 4C, 11C). The association with late-stage colon cancer led to an examination of the rates of metastases among Ca-type and Sa-type tumors. Comparing Candida-to-Saccharomyces ratios in metastatic and non-metastatic groups, it was found that Ca-type colon tumors were significantly more likely to be metastatic than tumors with higher rates of Saccharomyces (FIG. 4D; p = 8.49E-3; q = 0.051). Similar analysis did not find significant differences in other cancer types (FIG. 11D). Thus, Candida-to-Saccharomyces ratios may capture a clinically relevant shift in tumor mycobiomes with potential prognostic value for colon cancer.

[0132] The observation that tumor mycobiomes were predictive of metastatic colon cancer and deregulation of genes involved in epithelial barrier function led to an analysis of whether fungi or fungal DNA might transfer into the bloodstream from the barrier surfaces in which these fungi normally reside. The composition of patient-matched tumor and blood samples from cancer types of the lower and upper GI tracts was analyzed. It was found that there were statistically significant similarities in the composition of patient-matched tumor and blood samples from patients with upper GI cancers (p = 3.27E-2) and lower GI cancers (p = 3.72E-5) compared to unmatched samples (FIG. 4E), while the same was not true for other tumors, suggesting that the GI tract might be a possible entrance point for fungi or fungal DNA into the bloodstream. Together, these data indicate that Candida may be associated with loss of epithelial barrier function, metastasis, and the translocation of fungi from the GI tract into the bloodstream.

Example 9: Live, transcriptionally active Candida species are associated with GI tumors

[0133] To further examine the role of Candida, an analysis of fungal abundance distribution across the lower GI tract was performed. Consistent with previous studies focused on fecal mycobiota (Chehoud et al., 2015; Hoarau et al., 2016; Leonardi et al., 2020; Sokol et al., 2017), the Ascomycota phylum was more prevalent in the ascending colon (FIGs. 5A, 12). A targeted, species-level analysis determined that C. albicans is likely driving the abundance of Ascomycota in the ascending colon (FIG. 5B).

[0134] Next, the presence of Candida in lower GI cancer tissues was validated. Three primary colorectal tumor samples were obtained from an original TCGA tissue provider: one of these samples was classified as Candida-positive (TCGA-AG-A002) and two as Candida-negative (TCGA-AG-4015, TCGA-AG-3885). Independent ITS sequencing of these three samples was performed, which confirmed the presence of high rates of Candida in TCGA-AG-A002 (98.89% of reads), while Candida appeared to be much less abundant in TCGA-AG-4015 and TCGA-AG-3885 (<2% of reads) (FIG. 5C).

[0135] Notably, culture-dependent analysis of colorectal adenocarcinomas from a separate cohort determined that live C. albicans, C. lusitaniae and C. tropicalis are present in the mucosa of adenocarcinomas from the ascending colon (FIG. 5D). No live S. cerevisiae, M. sympodialis or M. globosa were isolated from these samples. In a third cohort from the Human Cancer Model Initiative (HCMI), the presence of Candida RNA in solid tumor samples was assayed, finding that the distribution of Candida RNA along the length of the lower GI tract (FIG. 5E) matched the anatomical distribution of Candida DNA in the TCGA cohort (FIG. 5B). No HCMI solid tumor samples were available from the ascending or transverse colon. The detection of live Candida and Candida RNA in GI tumors led to an examination of whether RNA from Candida or other species could be detected in GI tumors profiled by TCGA. Comparing the abundance of fungal sequences from matched tumors analyzed using both WGS and RNA-seq, it was found that rates of genomic Candida DNA were highly correlated with the presence of Candida RNA transcripts (FIG. 5F), indicating that these Candida species were transcriptionally active across GI tumors. In comparison, no such correlations were observed for other species, including S. cerevisiae and C. jadinii, suggesting that DNA and RNA obtained from these species do not represent living fungi in these tumor tissues, consistent with the culture-dependent analysis. Together, these data demonstrate that live, transcriptionally active Candida species are present in tissues associated with GI tumors and that fungal DNA detected in the blood of patients with lower GI tumors may originate from the gut.

Example 10: Targeted analysis of Candida and Saccharomyces spp.

[0136] To further evaluate the prevalence of specific fungal genera across different cancer types, a targeted analysis of C. albicans, C. tropicalis, and S. cerevisiae was performed. This analysis revealed that C. albicans, C. tropicalis, and S. cerevisiae were more prevalent in GI tract tumors than in breast tumors or brain tumor controls (FIGs. 6A-6B).

[0137] Considering the finding that Candida-to-Saccharomyces ratios may be prognostic of GI cancer outcomes (FIGs. 4C-4D), the targeted approach was used to examine associations between specific fungi and tumor stage. Consistent with the observation of Candida-to-Saccharomyces ratios, it was found that C. albicans, C. tropicalis, and S. cerevisiae were significantly associated with stage IV colon cancer (FIGs. 6C-6D). As late-stage CRC is characterized by tumor infiltration of the lymph nodes and lamina propria (Oshima and Miwa, 2016; Yu, 2018), this finding suggests that Candida abundance may be predictive of fungal translocation to the bloodstream, a finding supported by the observation of similar fungal composition of matched blood samples (FIG. 4E). Notably, both C. albicans and C. tropicalis were more abundant in stage I stomach cancer specifically (FIGs. 6C-6D). None of the fungal species that were examined were associated with a specific tumor stage in head-neck samples. These data collectively imply the presence of tumor-associated mycobiomes that may serve as prognostic markers for predicting cancer progression and patient outcomes. Furthermore, this targeted analysis indicated that increased abundance of Candida in late-stage, metastatic colon tumors may be directly or indirectly involved in the deregulation of genes mediating cellular adhesion (FIG. 4B), thereby leading to a deteriorated epithelial barrier, metastasis, and translocation of fungi from the primary tumor site into the bloodstream. Alternatively, increased abundance in late-stage colon tumors might instead be the result of deregulations in the tumor’s immune system, which would allow the unhindered growth of Candida and other pathogens.

Example 11: Cancer-associated mycobiota and clinical outcomes highlight predictive value of Candida

[0138] Having observed that higher rates of Candida were associated with increased expression of immune/inflammatory genes in GI cancers (FIGs. 4B-4D), the associations between specific fungi and GI cancer types were further explored by comparing abundance of Candida between tumor samples and normal tissue. It was found that Candida was significantly and uniquely enriched in stomach tumor samples compared to patient-matched normal tissue (p = 4.23E-3, q = 0.026, FIGs. 7A-7B), while Cyberlindnera was significantly enriched in normal tissue (p = 2.15E-5, q = 1.29E-4). Notably, the same analysis determined that Blastomyces (p = 8.80E-3, q = 0.114) was similarly enriched in lung tumors compared to matched adjacent normal tissue (FIG. 13A).
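The paired p- and q-values reported above reflect a correction for multiple comparisons. The disclosure does not name the adjustment used; the sketch below shows one common choice, the Benjamini-Hochberg procedure, purely as an assumption. The input p-values are hypothetical and are not taken from the analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values for four taxa tested in the same comparison.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = benjamini_hochberg(example_p)
```

Under this procedure each q-value depends on the rank of its p-value among all tests performed, which is why a p-value below 0.05 (such as the Blastomyces result above) can still yield a q-value above conventional thresholds.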

[0139] The analysis of TCGA GI tumor samples suggested the possibility that detection of Candida DNA may have potential as a prognostic biomarker. To examine this possibility, a non-parametric machine learning ensemble method known as a random forest (RF) classifier was employed. Similar machine-learning approaches have been used previously to distinguish between cancer types based on intratumoral and circulating bacterial DNA (Poore et al., 2020). However, RFs are particularly useful for estimating the importance of certain features for prediction. This machine learning approach found that Candida was by far the most important feature for distinguishing GI tumors from other cancer types, followed by Cyberlindnera and Saccharomyces (FIG. 7C). Additional targeted analyses of C. albicans and C. tropicalis revealed that the abundance of both Candida species increased steadily from the proximal to distal stomach, with the lowest abundance in the cardia and the greatest abundance in the antrum (FIG. 7D). These results mirror the colonization pattern of H. pylori, which preferentially infects the antrum (Suerbaum and Michetti, 2002).

[0140] Enrichment of Candida in tumor samples and its predictive power for GI cancer led to an analysis of whether Candida might be prognostic of disease outcomes. Using survival data from TCGA, it was found that high rates of tumor-associated C. tropicalis DNA were significantly associated with decreased survival among stomach cancer patients (p = 1.72E-2; q = 2.58E-2) and head-neck cancer patients (p = 1.37E-2; q = 4.02E-2), indicating that the presence of Candida DNA at the tumor site may be used as a non-invasive biomarker for predicting GI cancer disease outcomes (FIG. 7E). Moreover, it was observed that the presence of Saccharomyces spp. at the tumor site was also predictive of decreased survival in stomach cancer, suggesting these taxa may be prognostic for multiple GI cancers (FIG. 7E).

[0141] As the presence of Candida spp. appeared to be predictive of patient survival in several GI tissues, it was next determined if these associations extended beyond specific cancer types. To explore this possibility, a pan-cancer analysis was performed, incorporating fungal abundance and survival information from all GI cancer types, including head-neck, esophageal, stomach, colon, and rectal cancers. This analysis found that GI cancer patients with high levels of Candida at the tumor site had significantly decreased survival rates compared to patients who were Candida-negative (p = 1.31E-2, q = 3.93E-2, FIG. 7F), while Saccharomyces presence at the tumor site was not associated with survival (FIG. 13B). The associations between Candida, GI cancer, and reduced survival were particularly pronounced in stomach cancer, and were consistent with the results of a pathway analysis using functional KEGG pathways with GSEA, which found that the presence of Candida was associated with the expression of genes involved in cytosolic DNA sensing, Toll-like receptor signaling, and Nod-like receptor signaling in stomach cancers (FIG. 7G). Together, these data not only contribute to a growing body of evidence suggesting that Candida contributes to GI cancer severity, but also indicate that Candida can serve as a promising biomarker for predicting disease outcomes.
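Survival comparisons of this kind are commonly visualized with a Kaplan-Meier estimate computed separately for each group. The sketch below is a generic implementation of that estimator under the assumption that it is an appropriate illustration; the disclosure does not state which survival method was used, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up duration for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up (months) for a Candida-high and a Candida-negative group.
candida_high = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
candida_negative = kaplan_meier([2, 5, 6, 8], [0, 1, 0, 0])
```

Each group yields a step function, and a formal comparison between the two curves would require a separate test (for example a log-rank test), which is not shown here.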

Example 12: C. albicans isolate from colorectal adenocarcinoma mucosa promotes augmented tumor progression in a mouse model of colorectal cancer (AOM-DSS model)

[0142] Animals were divided into three experimental groups: Group-1: 8 B6 WT mice + 2% DSS drinking water; Group-2: 8 B6 WT mice + 2% DSS drinking water + gavage of 4N-3SO2 C. albicans (1×10^8); Group-3: 8 B6 WT mice + 2% DSS drinking water + gavage of C. lusitaniae strain (1×10^8). 4N-3SO2 C. albicans was isolated from an 80-year-old female patient with adenocarcinoma of the ascending colon. The colonies were isolated from non-tumor associated mucosa. Colonies were smooth in texture, white, and circular, and their % LDH (damaging capacity) relative to WT SC5314 in vitro was 84.03%.

[0143] FIG. 14A shows the overall experimental design in a mouse model of colorectal cancer. The azoxymethane (AOM)/dextran sodium sulfate (DSS) mouse model of inflammatory colorectal cancer is described in detail in Parang et al., Methods Mol Biol. 2016; 1422: 297-307. FIG. 14B demonstrates that the C. albicans 3SO2 isolate from colorectal adenocarcinoma mucosa, but not the C. lusitaniae isolate, promotes augmented tumor progression in the AOM-DSS model.

EQUIVALENTS

[0144] The present technology is not to be limited in terms of the particular embodiments described in this application, which are intended as single illustrations of individual aspects of the present technology. Many modifications and variations of this present technology can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. Functionally equivalent methods and apparatuses within the scope of the present technology, in addition to those enumerated herein, will be apparent to those skilled in the art from the foregoing descriptions. Such modifications and variations are intended to fall within the scope of the present technology. It is to be understood that this present technology is not limited to particular methods, reagents, compounds, compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

[0145] In addition, where features or aspects of the disclosure are described in terms of Markush groups, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.

[0146] As will be understood by one skilled in the art, for any and all purposes, particularly in terms of providing a written description, all ranges disclosed herein also encompass any and all possible subranges and combinations of subranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as “up to,” “at least,” “greater than,” “less than,” and the like, include the number recited and refer to ranges which can be subsequently broken down into subranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member. Thus, for example, a group having 1-3 cells refers to groups having 1, 2, or 3 cells. Similarly, a group having 1-5 cells refers to groups having 1, 2, 3, 4, or 5 cells, and so forth.

[0147] All patents, patent applications, provisional applications, and publications referred to or cited herein are incorporated by reference in their entirety, including all figures and tables, to the extent they are not inconsistent with the explicit teachings of this specification.

REFERENCES

Abdulamir, A.S., Hafidh, R.R., and Abu Bakar, F. (2011). The association of Streptococcus bovis/gallolyticus with colorectal tumors: the nature and the underlying mechanisms of its etiological role. J Exp Clin Cancer Res 30, 11. 10.1186/1756-9966-30-11.

Abraham, A., Pedregosa, F., Eickenberg, M., Gervais, P., Mueller, A., Kossaifi, J., Gramfort, A., Thirion, B., and Varoquaux, G. (2014). Machine learning for neuroimaging with scikit-learn. Front Neuroinform 8, 14. 10.3389/fninf.2014.00014.

Aggor, F.E.Y., Break, T.J., Trevejo-Nunez, G., Whibley, N., Coleman, B.M., Bailey, R.D., Kaplan, D.H., Naglik, J.R., Shan, W., Shetty, A.C., et al. (2020). Oral epithelial IL-22/STAT3 signaling licenses IL-17-mediated immunity to oral mucosal candidiasis. Sci Immunol 5. 10.1126/sciimmunol.aba0570.

Aitchison, J. (1982). The Statistical Analysis of Compositional Data. Journal of the Royal Statistical Society: Series B (Methodological) 44, 139-160. https://doi.org/10.1111/j.2517-6161.1982.tb01195.x.

Alam, A., Levanduski, E., Denz, P., Villavicencio, H.S., Bhatta, M., Alhorebi, L., Zhang, Y., Gomez, E.C., Morreale, B., Senchanthisai, S., et al. (2022). Fungal mycobiome drives IL-33 secretion and type 2 immunity in pancreatic cancer. Cancer Cell 40, 153-167 e111. 10.1016/j.ccell.2022.01.003.

Arthur, J.C., Perez-Chanona, E., Muhlbauer, M., Tomkovich, S., Uronis, J.M., Fan, T.J., Campbell, B.J., Abujamel, T., Dogan, B., Rogers, A.B., et al. (2012). Intestinal inflammation targets cancer-inducing activity of the microbiota. Science 338, 120-123. 10.1126/science.1224820.

Ballou, E.R., Avelar, G.M., Childers, D.S., Mackie, J., Bain, J.M., Wagener, J., Kastora, S.L., Panea, M.D., Hardison, S.E., Walker, L.A., et al. (2016). Lactate signalling regulates fungal beta-glucan masking and immune evasion. Nat Microbiol 2, 16238. 10.1038/nmicrobiol.2016.238.

Banerjee, S., Schlaeppi, K., and van der Heijden, M.G.A. (2018). Keystone taxa as drivers of microbiome structure and functioning. Nat Rev Microbiol 16, 567-576. 10.1038/s41579-018-0024-1.

Basmaciyan, L., Bon, F., Paradis, T., Lapaquette, P., and Dalle, F. (2019). "Candida Albicans Interactions With The Host: Crossing The Intestinal Epithelial Barrier". Tissue Barriers 7, 1612661. 10.1080/21688370.2019.1612661.

Benedict, K., Roy, M., Chiller, T., and Davis, J.P. (2012). Epidemiologic and Ecologic Features of Blastomycosis: A Review. Current Fungal Infection Reports 6, 327-335. 10.1007/s12281-012-0110-1.

Bongomin, F., Gago, S., Oladele, R.O., and Denning, D.W. (2017). Global and Multi-National Prevalence of Fungal Diseases-Estimate Precision. J Fungi (Basel) 3. 10.3390/jof3040057.

Bradford, L.L., and Ravel, J. (2017). The vaginal mycobiome: A contemporary perspective on fungi in women's health and diseases. Virulence 8, 342-351. 10.1080/21505594.2016.1237332.

Break, T.J., Oikonomou, V., Dutzan, N., Desai, J.V., Swidergall, M., Freiwald, T., Chauss, D., Harrison, O.J., Alejo, J., Williams, D.W., et al. (2021). Aberrant type 1 immunity drives susceptibility to mucosal fungal infections. Science 371. 10.1126/science.aay5731.

Breiman, L. (2001). Random Forests. Machine Learning 45, 5-32. 10.1023/A:1010933404324.

Brown, E.M., McTaggart, L.R., Zhang, S.X., Low, D.E., Stevens, D.A., and Richardson, S.E. (2013). Phylogenetic analysis reveals a cryptic species Blastomyces gilchristii, sp. nov. within the human pathogenic fungus Blastomyces dermatitidis. PLoS One 8, e59237. 10.1371/journal.pone.0059237.

Brown, G.D., Denning, D.W., Gow, N.A., Levitz, S.M., Netea, M.G., and White, T.C. (2012). Hidden killers: human fungal infections. Sci Transl Med 4, 165rv13. 4/165/165rv13 [pii] 10.1126/scitranslmed.3004404.

Bullman, S., Pedamallu, C.S., Sicinska, E., Clancy, T.E., Zhang, X., Cai, D., Neuberg, D., Huang, K., Guevara, F., Nelson, T., et al. (2017). Analysis of Fusobacterium persistence and antibiotic response in colorectal cancer. Science 358, 1443-1448. 10.1126/science.aal5240.

Byrd, A.L., Belkaid, Y., and Segre, J.A. (2018). The human skin microbiome. Nat Rev Microbiol 16, 143-155. 10.1038/nrmicro.2017.157.

Castellarin, M., Warren, R.L., Freeman, J.D., Dreolini, L., Krzywinski, M., Strauss, J., Barnes, R., Watson, P., Allen-Vercoe, E., Moore, R.A., and Holt, R.A. (2012). Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma. Genome Res 22, 299-306. 10.1101/gr.126516.111.

Chang, F., Syrjanen, S., Wang, L., and Syrjanen, K. (1992). Infectious agents in the etiology of esophageal cancer. Gastroenterology 103, 1336-1348. 10.1016/0016-5085(92)91526-a.

Chehoud, C., Albenberg, L.G., Judge, C., Hoffmann, C., Grunberg, S., Bittinger, K., Baldassano, R.N., Lewis, J.D., Bushman, F.D., and Wu, G.D. (2015). Fungal Signature in the Gut Microbiota of Pediatric Patients With Inflammatory Bowel Disease. Inflamm Bowel Dis 21, 1948-1956. 10.1097/MIB.0000000000000454.

Coker, O.O., Nakatsu, G., Dai, R.Z., Wu, W.K.K., Wong, S.H., Ng, S.C., Chan, F.K.L., Sung, J.J.Y., and Yu, J. (2018). Enteric fungal microbiota dysbiosis and ecological alterations in colorectal cancer. Gut. 10.1136/gutjnl-2018-317178.

Commichaux, S., Javkar, K., Muralidharan, H.S., Ramachandran, P., Ottesen, A., Rand, H., and Pop, M. (2021). TaxaTarget: Fast, Sensitive, and Precise Classification of Microeukaryotes in Metagenomic Data. Research Square.

Daillere, R., Vetizou, M., Waldschmitt, N., Yamazaki, T., Isnard, C., Poirier-Colame, V., Duong, C.P.M., Flament, C., Lepage, P., Roberti, M.P., et al. (2016). Enterococcus hirae and Barnesiella intestinihominis Facilitate Cyclophosphamide-Induced Therapeutic Immunomodulatory Effects. Immunity 45, 931-943. 10.1016/j.immuni.2016.09.009.

Davar, D., Dzutsev, A.K., McCulloch, J.A., Rodrigues, R.R., Chauvin, J.M., Morrison, R.M., Deblasio, R.N., Menna, C., Ding, Q., Pagliano, O., et al. (2021). Fecal microbiota transplant overcomes resistance to anti-PD-1 therapy in melanoma patients. Science 371, 595-602. 10.1126/science.abf3363.

David, L.A., Maurice, C.F., Carmody, R.N., Gootenberg, D.B., Button, J.E., Wolfe, B.E., Ling, A.V., Devlin, A.S., Varma, Y., Fischbach, M.A., et al. (2014). Diet rapidly and reproducibly alters the human gut microbiome. Nature 505, 559-563. 10.1038/nature12820.

Davis, N.M., Proctor, D.M., Holmes, S.P., Relman, D.A., and Callahan, B.J. (2018). Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data. Microbiome 6, 226. 10.1186/s40168-018-0605-2.

de Klerk, N., Maudsdotter, L., Gebreegziabher, H., Saroj, S.D., Eriksson, B., Eriksson, O.S., Roos, S., Linden, S., Sjolinder, H., and Jonsson, A.B. (2016). Lactobacilli Reduce Helicobacter pylori Attachment to Host Gastric Epithelial Cells by Inhibiting Adhesion Gene Expression. Infect Immun 84, 1526-1535. 10.1128/IAI.00163-16.

Delsuc, F., Brinkmann, H., and Philippe, H. (2005). Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet 6, 361-375. 10.1038/nrg1603.

Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21. 10.1093/bioinformatics/bts635.

Dohlman, A.B., Arguijo Mendoza, D., Ding, S., Gao, M., Dressman, H., Iliev, I.D., Lipkin, S.M., and Shen, X. (2020). The cancer microbiome atlas: a pan-cancer comparative analysis to distinguish tissue-resident microbiota from contaminants. Cell Host Microbe. 10.1016/j.chom.2020.12.001.

Dohlman, A.B., and Shen, X. (2019). Mapping the microbial interactome: Statistical and experimental approaches for microbiome network inference. Exp Biol Med (Maywood) 244, 445-458. 10.1177/1535370219836771.

Doron, I., Mesko, M., Li, X.V., Kusakabe, T., Leonardi, I., Shaw, D.G., Fiers, W.D., Lin, W.Y., Bialt-DeCelie, M., Roman, E., et al. (2021). Mycobiota-induced IgA antibodies regulate fungal commensalism in the gut and are dysregulated in Crohn's disease. Nat Microbiol 6, 1493-1504. 10.1038/s41564-021-00983-z.

Dzutsev, A., Badger, J.H., Perez-Chanona, E., Roy, S., Salcedo, R., Smith, C.K., and Trinchieri, G. (2017). Microbes and Cancer. Annu Rev Immunol 35, 199-228. 10.1146/annurev-immunol-051116-052133.

Eisenhofer, R., Minich, J.J., Marotz, C., Cooper, A., Knight, R., and Weyrich, L.S. (2019). Contamination in Low Microbial Biomass Microbiome Studies: Issues and Recommendations. Trends Microbiol 27, 105-117. 10.1016/j.tim.2018.11.003.

Everard, A., Belzer, C., Geurts, L., Ouwerkerk, J.P., Druart, C., Bindels, L.B., Guiot, Y., Derrien, M., Muccioli, G.G., Delzenne, N.M., et al. (2013). Cross-talk between Akkermansia muciniphila and intestinal epithelium controls diet-induced obesity. Proc Natl Acad Sci U S A 110, 9066-9071. 10.1073/pnas.1219451110.

Ewaschuk, J.B., Diaz, H., Meddings, L., Diederichs, B., Dmytrash, A., Backer, J., Looijer-van Langen, M., and Madsen, K.L. (2008). Secreted bioactive factors from Bifidobacterium infantis enhance epithelial cell barrier function. Am J Physiol Gastrointest Liver Physiol 295, G1025-1034. 10.1152/ajpgi.90227.2008.

Fan, D., Coughlin, L.A., Neubauer, M.M., Kim, J., Kim, M.S., Zhan, X., Simms-Waldrip, T.R., Xie, Y., Hooper, L.V., and Koh, A.Y. (2015). Activation of HIF-1alpha and LL-37 by commensal bacteria inhibits Candida albicans colonization. Nat Med 21, 808-814. 10.1038/nm.3871.

Faust, K., and Raes, J. (2012). Microbial interactions: from networks to models. Nat Rev Microbiol 10, 538-550. 10.1038/nrmicro2832.

Fiers, W.D., Gao, I.H., and Iliev, I.D. (2019). Gut mycobiota under scrutiny: fungal symbionts or environmental transients? Curr Opin Microbiol 50, 79-86. 10.1016/j.mib.2019.09.010.

Findley, K., Oh, J., Yang, J., Conlan, S., Deming, C., Meyer, J.A., Schoenfeld, D., Nomicos, E., Park, M., Kong, H.H., and Segre, J.A. (2013a). Topographic diversity of fungal and bacterial communities in human skin. Nature 498, 367-370. nature12171 [pii] 10.1038/nature12171.

Findley, K., Oh, J., Yang, J., Conlan, S., Deming, C., Meyer, J.A., Schoenfeld, D., Nomicos, E., Park, M., Program, N.I.H.I.S.C.C.S., et al. (2013b). Topographic diversity of fungal and bacterial communities in human skin. Nature 498, 367-370. 10.1038/nature12171.

Finlay, B.B., Goldszmid, R., Honda, K., Trinchieri, G., Wargo, J., and Zitvogel, L. (2020). Can we harness the microbiota to enhance the efficacy of cancer immunotherapy? Nat Rev Immunol 20, 522-528. 10.1038/s41577-020-0374-6.

Flanagan, L., Schmid, J., Ebert, M., Soucek, P., Kunicka, T., Liska, V., Bruha, J., Neary, P., Dezeeuw, N., Tommasino, M., et al. (2014). Fusobacterium nucleatum associates with stages of colorectal neoplasia development, colorectal cancer and disease outcome. Eur J Clin Microbiol Infect Dis 33, 1381-1390. 10.1007/s10096-014-2081-3.

Foster, Z.S., Sharpton, T.J., and Grunwald, N.J. (2017). Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. PLoS Comput Biol 13, e1005404. 10.1371/journal.pcbi.1005404.

Francescone, R., Hou, V., and Grivennikov, S.I. (2014). Microbiome, inflammation, and cancer. Cancer J 20, 181-189. 10.1097/PPO.0000000000000048.

Friedman, J., and Alm, E.J. (2012). Inferring correlation networks from genomic survey data. PLoS Comput Biol 8, e1002687. 10.1371/journal.pcbi.1002687.

Garrett, W.S. (2019). The gut microbiota and colon cancer. Science 364, 1133-1135. 10.1126/science.aaw2367.

Glassing, A., Dowd, S.E., Galandiuk, S., Davis, B., and Chiodini, R.J. (2016). Inherent bacterial DNA contamination of extraction and sequencing reagents may affect interpretation of microbiota in low bacterial biomass samples. Gut Pathog 8, 24. 10.1186/s13099-016-0103-7.

Gloor, G.B., Macklaim, J.M., Pawlowsky-Glahn, V., and Egozcue, J.J. (2017). Microbiome Datasets Are Compositional: And This Is Not Optional. Front Microbiol 8, 2224. 10.3389/fmicb.2017.02224.

Gopalakrishnan, V., Spencer, C.N., Nezi, L., Reuben, A., Andrews, M.C., Karpinets, T.V., Prieto, P.A., Vicente, D., Hoffman, K., Wei, S.C., et al. (2018). Gut microbiome modulates response to anti-PD-1 immunotherapy in melanoma patients. Science 359, 97-103. 10.1126/science.aan4236.

Grivennikov, S.I., Greten, F.R., and Karin, M. (2010). Immunity, inflammation, and cancer. Cell 140, 883-899. 10.1016/j.cell.2010.01.025.

Helmink, B.A., Khan, M.A.W., Hermann, A., Gopalakrishnan, V., and Wargo, J.A. (2019). The microbiome, cancer, and cancer therapy. Nat Med 25, 377-388. 10.1038/s41591-019-0377-7.

Hoarau, G., Mukherjee, P.K., Gower-Rousseau, C., Hager, C., Chandra, J., Retuerto, M.A., Neut, C., Vermeire, S., Clemente, J., Colombel, J.F., et al. (2016). Bacteriome and Mycobiome Interactions Underscore Microbial Dysbiosis in Familial Crohn's Disease. Mbio 7, e01250-01216. 10.1128/mBio.01250-16.

Hofman, P., and Vouret-Craviari, V. (2012). Microbes-induced EMT at the crossroad of inflammation and cancer. Gut Microbes 3, 176-185. 10.4161/gmic.20288.

Huffnagle, G.B., and Noverr, M.C. (2013). The emerging world of the fungal microbiome. Trends Microbiol 21, 334-341. 10.1016/j.tim.2013.04.002.

Iida, N., Dzutsev, A., Stewart, C.A., Smith, L., Bouladoux, N., Weingarten, R.A., Molina, D.A., Salcedo, R., Back, T., Cramer, S., et al. (2013). Commensal bacteria control cancer response to therapy by modulating the tumor microenvironment. Science 342, 967-970. 10.1126/science.1240527.

Jain, T., Sharma, P., Are, A.C., Vickers, S.M., and Dudeja, V. (2021). New Insights Into the Cancer-Microbiome-Immune Axis: Decrypting a Decade of Discoveries. Front Immunol 12, 622064. 10.3389/fimmu.2021.622064.

Jawhara, S., Thuru, X., Standaert-Vitse, A., Jouault, T., Mordon, S., Sendid, B., Desreumaux, P., and Poulain, D. (2008). Colonization of mice by Candida albicans is promoted by chemically induced colitis and augments inflammatory responses through galectin-3. J Infect Dis 197, 972-980. 10.1086/528990.

Kanehisa, M., and Goto, S. (2000). KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28, 27-30. 10.1093/nar/28.1.27.

Knutsen, A.P., Bush, R.K., Demain, J.G., Denning, D.W., Dixit, A., Fairs, A., Greenberger, P.A., Kariuki, B., Kita, H., Kurup, V.P., et al. (2012). Fungi and allergic lower respiratory tract diseases. J Allergy Clin Immunol 129, 280-291; quiz 292-283. 10.1016/j.jaci.2011.12.970.

Kostic, A.D., Chun, E., Robertson, L., Glickman, J.N., Gallini, C.A., Michaud, M., Clancy, T.E., Chung, D.C., Lochhead, P., Hold, G.L., et al. (2013). Fusobacterium nucleatum potentiates intestinal tumorigenesis and modulates the tumor-immune microenvironment. Cell Host Microbe 14, 207-215. 10.1016/j.chom.2013.07.007.

Kostic, A.D., Ojesina, A.I., Pedamallu, C.S., Jung, J., Verhaak, R.G., Getz, G., and Meyerson, M. (2011). PathSeq: software to identify or discover microbes by deep sequencing of human tissue. Nat Biotechnol 29, 393-396. 10.1038/nbt.1868.

Kumamoto, C.A., Gresnigt, M.S., and Hube, B. (2020). The gut, the bad and the harmless: Candida albicans as a commensal and opportunistic pathogen in the intestine. Curr Opin Microbiol 56, 7-15. 10.1016/j.mib.2020.05.006.

Leonardi, I., Gao, I.H., Lin, W.Y., Allen, M., Li, X.V., Fiers, W.D., De Celie, M.B., Putzel, G.G., Yantiss, R.K., Johncilla, M., et al. (2022). Mucosal fungi promote gut barrier function and social behavior via Type 17 immunity. Cell 185, 831-846 e814. 10.1016/j.cell.2022.01.017.

Leonardi, I., Paramsothy, S., Doron, I., Semon, A., Kaakoush, N.O., Clemente, J.C., Faith, J.J., Borody, T.J., Mitchell, H.M., Colombel, J.F., et al. (2020). Fungal Trans-kingdom Dynamics Linked to Responsiveness to Fecal Microbiota Transplantation (FMT) Therapy in Ulcerative Colitis. Cell Host Microbe 27, 823-829 e823. 10.1016/j.chom.2020.03.006.

Lewis, J.D., Chen, E.Z., Baldassano, R.N., Otley, A.R., Griffiths, A.M., Lee, D., Bittinger, K., Bailey, A., Friedman, E.S., Hoffmann, C., et al. (2015). Inflammation, Antibiotics, and Diet as Environmental Stressors of the Gut Microbiome in Pediatric Crohn's Disease. Cell Host Microbe 18, 489-500. 10.1016/j.chom.2015.09.008.

Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-1760. 10.1093/bioinformatics/btp324.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and Genome Project Data Processing, S. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079. 10.1093/bioinformatics/btp352.

Li, X.V., Leonardi, I., Putzel, G.G., Semon, A., Fiers, W.D., Kusakabe, T., Lin, W.Y., Gao, I.H., Doron, I., Gutierrez-Guerrero, A., et al. (2022). Immune regulation by fungal strain diversity in inflammatory bowel disease. Nature 603, 672-678. 10.1038/s41586-022-04502-w.

Liguori, G., Lamas, B., Richard, M.L., Brandi, G., da Costa, G., Hoffmann, T.W., Di Simone, M.P., Calabrese, C., Poggioli, G., Langella, P., et al. (2016). Fungal Dysbiosis in Mucosa-associated Microbiota of Crohn's Disease Patients. Journal of Crohn's & colitis 10, 296-305. 10.1093/ecco-jcc/jjv209.

Limon, J.J., Skalski, J.H., and Underhill, D.M. (2017). Commensal Fungi in Health and Disease. Cell Host Microbe 22, 156-165. 10.1016/j.chom.2017.07.002.

Liu, J., Lichtenberg, T., Hoadley, K.A., Poisson, L.M., Lazar, A.J., Cherniack, A.D., Kovatich, A.J., Benz, C.C., Levine, D.A., Lee, A.V., et al. (2018). An Integrated TCGA PanCancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell 173, 400-416 e411. 10.1016/j.cell.2018.02.052.

Liu, N.N., Jiao, N., Tan, J.C., Wang, Z., Wu, D., Wang, A.J., Chen, J., Tao, L., Zhou, C., Fang, W., et al. (2022). Multi-kingdom microbiota analyses identify bacterial-fungal interactions and biomarkers of colorectal cancer across cohorts. Nat Microbiol 7, 238-250. 10.1038/s41564-021-01030-7.

MacAlpine, J., Daniel-Ivad, M., Liu, Z., Yano, J., Revie, N.M., Todd, R.T., Stogios, P.J., Sanchez, H., O'Meara, T.R., Tompkins, T.A., et al. (2021). A small molecule produced by Lactobacillus species blocks Candida albicans filamentation by inhibiting a DYRK1-family kinase. Nat Commun 12, 6151. 10.1038/s41467-021-26390-w.

Malik, A., Sharma, D., Malireddi, R.K.S., Guy, C.S., Chang, T.C., Olsen, S.R., Neale, G., Vogel, P., and Kanneganti, T.D. (2018). SYK-CARD9 Signaling Axis Promotes Gut Fungi-Mediated Inflammasome Activation to Restrict Colitis and Colon Cancer. Immunity 49, 515-530 e515. 10.1016/j.immuni.2018.08.024.

Martin, T.A., and Jiang, W.G. (2009). Loss of tight junction barrier function and its role in cancer metastasis. Biochim Biophys Acta 1788, 872-891. 10.1016/j.bbamem.2008.11.005.

Matson, V., Fessler, J., Bao, R., Chongsuwat, T., Zha, Y., Alegre, M.L., Luke, J.J., and Gajewski, T.F. (2018). The commensal microbiome is associated with anti-PD-1 efficacy in metastatic melanoma patients. Science 359, 104-108. 10.1126/science.aao3290.

Nagao-Kitamoto, H., and Kamada, N. (2017). Host-microbial Cross-talk in Inflammatory Bowel Disease. Immune Netw 17, 1-12. 10.4110/in.2017.17.1.1.

Nash, A.K., Auchtung, T.A., Wong, M.C., Smith, D.P., Gesell, J.R., Ross, M.C., Stewart, C.J., Metcalf, G.A., Muzny, D.M., Gibbs, R.A., et al. (2017). The gut mycobiome of the Human Microbiome Project healthy cohort. Microbiome 5, 153. 10.1186/s40168-017-0373-4.

Oshima, T., and Miwa, H. (2016). Gastrointestinal mucosal barrier function and diseases. J Gastroenterol 51, 768-778. 10.1007/s00535-016-1207-z.

Pleguezuelos-Manzano, C., Puschhof, J., Rosendahl Huber, A., van Hoeck, A., Wood, H.M., Nomburg, J., Guijao, C., Manders, F., Dalmasso, G., Stege, P.B., et al. (2020). Mutational signature in colorectal cancer caused by genotoxic pks(+) E. coli. Nature 580, 269-273. 10.1038/s41586-020-2080-8.

Polk, D.B., and Peek, R.M., Jr. (2010). Helicobacter pylori: gastric cancer and beyond. Nat Rev Cancer 10, 403-414. 10.1038/nrc2857.

Poore, G.D., Kopylova, E., Zhu, Q., Carpenter, C., Fraraccio, S., Wandro, S., Kosciolek, T., Janssen, S., Metcalf, J., Song, S.J., et al. (2020). Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature 579, 567-574. 10.1038/s41586-020-2095-1.

Proctor, D.M., Dangana, T., Sexton, D.J., Fukuda, C., Yelin, R.D., Stanley, M., Bell, P.B., Baskaran, S., Deming, C., Chen, Q., et al. (2021). Integrated genomic, epidemiologic investigation of Candida auris skin colonization in a skilled nursing facility. Nat Med. 10.1038/s41591-021-01383-w.

Qin, J., Li, R., Raes, J., Arumugam, M., Burgdorf, K.S., Manichanh, C., Nielsen, T., Pons, N., Levenez, F., Yamada, T., et al. (2010). A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59-65. 10.1038/nature08821.

Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841-842. 10.1093/bioinformatics/btq033.

Ramirez, F., Bhardwaj, V., Arrigoni, L., Lam, K.C., Gruning, B.A., Villaveces, J., Habermann, B., Akhtar, A., and Manke, T. (2018). High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat Commun 9, 189. 10.1038/s41467-017-02525-w.

Ramirez, F., Ryan, D.P., Gruning, B., Bhardwaj, V., Kilpert, F., Richter, A.S., Heyne, S., Dundar, F., and Manke, T. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44, W160-165. 10.1093/nar/gkw257.

Ramirez-Garcia, A., Rementeria, A., Aguirre-Urizar, J.M., Moragues, M.D., Antoran, A., Pellon, A., Abad-Diaz-de-Cerio, A., and Hernando, F.L. (2016). Candida albicans and cancer: Can this yeast induce cancer development or progression? Crit Rev Microbiol 42, 181-193. 10.3109/1040841X.2014.913004.

Ricciardi, M., Zanotto, M., Malpeli, G., Bassi, G., Perbellini, O., Chilosi, M., Bifari, F., and Krampera, M. (2015). Epithelial-to-mesenchymal transition (EMT) induced by inflammatory priming elicits mesenchymal stromal cell-like immune-modulatory properties in cancer cells. Br J Cancer 112, 1067-1075. 10.1038/bjc.2015.29.

Robinson, K.M., Crabtree, J., Mattick, J.S., Anderson, K.E., and Dunning Hotopp, J.C. (2017). Distinguishing potential bacteria-tumor associations from contamination in a secondary data analysis of public cancer genome sequence data. Microbiome 5, 9. 10.1186/s40168-016-0224-8.

Routy, B., Le Chatelier, E., Derosa, L., Duong, C.P.M., Alou, M.T., Daillere, R., Fluckiger, A., Messaoudene, M., Rauber, C., Roberti, M.P., et al. (2018). Gut microbiome influences efficacy of PD-1-based immunotherapy against epithelial tumors. Science 359, 91-97. 10.1126/science.aan3706.

Saheb Kashaf, S., Proctor, D.M., Deming, C., Saary, P., Holzer, M., Program, N.C.S., Taylor, M.E., Kong, H.H., Segre, J.A., Almeida, A., and Finn, R.D. (2022). Integrating cultivation and metagenomics for a multi-kingdom view of skin microbiome diversity and functions. Nat Microbiol 7, 169-179. 10.1038/s41564-021-01011-w.

Sears, C.L., Geis, A.L., and Housseau, F. (2014). Bacteroides fragilis subverts mucosal biology: from symbiont to colon carcinogenesis. J Clin Invest 124, 4166-4172. 10.1172/JCI72334.

Sender, R., Fuchs, S., and Milo, R. (2016). Revised Estimates for the Number of Human and Bacteria Cells in the Body. PLoS Biol 14, e1002533. 10.1371/journal.pbio.1002533.

Sharma, P., Hu-Lieskovan, S., Wargo, J.A., and Ribas, A. (2017). Primary, Adaptive, and Acquired Resistance to Cancer Immunotherapy. Cell 168, 707-723. 10.1016/j.cell.2017.01.017.

Shiao, S.L., Kershaw, K.M., Limon, J.J., You, S., Yoon, J., Ko, E.Y., Guarnerio, J., Potdar, A.A., McGovern, D.P.B., Bose, S., et al. (2021). Commensal bacteria and fungi differentially regulate tumor responses to radiation therapy. Cancer Cell 39, 1202-1213 e1206. 10.1016/j.ccell.2021.07.002.

Sifrim, D., Castell, D., Dent, J., and Kahrilas, P.J. (2004). Gastro-oesophageal reflux monitoring: review and consensus report on detection and definitions of acid, non-acid, and gas reflux. Gut 53, 1024-1031. 10.1136/gut.2003.033290.

Silverman, D., Ruth, K., Sigurdson, E.R., Egleston, B.L., Goldstein, L.J., Wong, Y.N., Boraas, M., and Bleicher, R.J. (2014). Skin involvement and breast cancer: are T4b lesions of all sizes created equal? J Am Coll Surg 219, 534-544. 10.1016/j.jamcollsurg.2014.04.003.

Sivan, A., Corrales, L., Hubert, N., Williams, J.B., Aquino-Michaels, K., Earley, Z.M., Benyamin, F.W., Lei, Y.M., Jabri, B., Alegre, M.L., et al. (2015). Commensal Bifidobacterium promotes antitumor immunity and facilitates anti-PD-L1 efficacy. Science 350, 1084-1089. 10.1126/science.aac4255.

Sokol, H., Leducq, V., Aschard, H., Pham, H.P., Jegou, S., Landman, C., Cohen, D., Liguori, G., Bourrier, A., Nion-Larmurier, I., et al. (2017). Fungal microbiota dysbiosis in IBD. Gut 66, 1039-1048. 10.1136/gutjnl-2015-310746.

Soler, A.P., Miller, R.D., Laughlin, K.V., Carp, N.Z., Klurfeld, D.M., and Mullin, J.M. (1999). Increased tight junctional permeability is associated with the development of colon cancer. Carcinogenesis 20, 1425-1431. 10.1093/carcin/20.8.1425.

Spencer, C.N., McQuade, J.L., Gopalakrishnan, V., McCulloch, J.A., Vetizou, M., Cogdill, A.P., Khan, M.A.W., Zhang, X., White, M.G., Peterson, C.B., et al. (2021). Dietary fiber and probiotics influence the gut microbiome and melanoma immunotherapy response. Science 374, 1632-1640. 10.1126/science.aaz7015.

Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., and Mesirov, J.P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545-15550. 10.1073/pnas.0506580102.

Suerbaum, S., and Michetti, P. (2002). Helicobacter pylori infection. N Engl J Med 347, 1175-1186. 10.1056/NEJMra020542.

Tang, Y.L., Gan, R.L., Dong, B.H., Jiang, R.C., and Tang, R.J. (2005). Detection and location of Helicobacter pylori in human gastric carcinomas. World J Gastroenterol 11, 1387-1391. 10.3748/wjg.v11.i9.1387.

Tanoue, T., Morita, S., Plichta, D.R., Skelly, A.N., Suda, W., Sugiura, Y., Narushima, S., Vlamakis, H., Motoo, I., Sugita, K., et al. (2019). A defined commensal consortium elicits CD8 T cells and anti-cancer immunity. Nature 565, 600-605. 10.1038/s41586-019-0878-z.

Tipton, L., Muller, C.L., Kurtz, Z.D., Huang, L., Kleerup, E., Morris, A., Bonneau, R., and Ghedin, E. (2018). Fungi stabilize connectivity in the lung and skin microbial ecosystems. Microbiome 6, 12. 10.1186/s40168-017-0393-0.

Tjalsma, H., Boleij, A., Marchesi, J.R., and Dutilh, B.E. (2012). A bacterial driver-passenger model for colorectal cancer: beyond the usual suspects. Nat Rev Microbiol 10, 575-582. 10.1038/nrmicro2819.

Vergara, D., Simeone, P., Damato, M., Maffia, M., Lanuti, P., and Trerotola, M. (2019). The Cancer Microbiota: EMT and Inflammation as Shared Molecular Mechanisms Associated with Plasticity and Progression. J Oncol 2019, 1253727. 10.1155/2019/1253727.

Vetizou, M., Pitt, J.M., Daillere, R., Lepage, P., Waldschmitt, N., Flament, C., Rusakiewicz, S., Routy, B., Roberti, M.P., Duong, C.P., et al. (2015). Anticancer immunotherapy by CTLA-4 blockade relies on the gut microbiota. Science 350, 1079-1084. 10.1126/science.aad1329.

Viaud, S., Saccheri, F., Mignot, G., Yamazaki, T., Daillere, R., Hannani, D., Enot, D.P., Pfirschke, C., Engblom, C., Pittet, M.J., et al. (2013). The intestinal microbiota modulates the anticancer immune effects of cyclophosphamide. Science 342, 971-976. 10.1126/science.1240537.

Vogtmann, E., and Goedert, J.J. (2016). Epidemiologic studies of the human microbiome and cancer. Br J Cancer 114, 237-242. 10.1038/bjc.2015.465.

Walker, M.A., Pedamallu, C.S., Ojesina, A.I., Bullman, S., Sharpe, T., Whelan, C.W., and Meyerson, M. (2018). GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts. Bioinformatics 34, 4287-4289. 10.1093/bioinformatics/bty501.

Wang, T., Fan, C., Yao, A., Xu, X., Zheng, G., You, Y., Jiang, C., Zhao, X., Hou, Y., Hung, M.C., and Lin, X. (2018). The Adaptor Protein CARD9 Protects against Colon Cancer by Restricting Mycobiota-Mediated Expansion of Myeloid-Derived Suppressor Cells. Immunity 49, 504-514 e504. 10.1016/j.immuni.2018.08.018.

Wolchok, J.D., Kluger, H., Callahan, M.K., Postow, M.A., Rizvi, N.A., Lesokhin, A.M., Segal, N.H., Ariyan, C.E., Gordon, R.A., Reed, K., et al. (2013). Nivolumab plus ipilimumab in advanced melanoma. N Engl J Med 369, 122-133. 10.1056/NEJMoa1302369.

Yang, C.S. (1980). Research on esophageal cancer in China: a review. Cancer Res 40, 2633-2644.

Ye, S.H., Siddle, K.J., Park, D.J., and Sabeti, P.C. (2019). Benchmarking Metagenomics Tools for Taxonomic Classification. Cell 178, 779-794. 10.1016/j.cell.2019.07.010.

Yu, L.C. (2018). Microbiota dysbiosis and barrier dysfunction in inflammatory bowel disease and colorectal cancers: exploring a common ground hypothesis. J Biomed Sci 25, 79. 10.1186/s12929-018-0483-8.

Zangl, I., Pap, I.J., Aspock, C., and Schuller, C. (2019). The role of Lactobacillus species in the control of Candida via biotrophic interactions. Microb Cell 7, 1-14. 10.15698/mic2020.01.702.

Ze, X., Duncan, S.H., Louis, P., and Flint, H.J. (2012). Ruminococcus bromii is a keystone species for the degradation of resistant starch in the human colon. ISME J 6, 1535-1543. 10.1038/ismej.2012.4.

Zeise, K.D., Woods, R.J., and Huffnagle, G.B. (2021). Interplay between Candida albicans and Lactic Acid Bacteria in the Gastrointestinal Tract: Impact on Colonization Resistance, Microbial Carriage, Opportunistic Infection, and Host Immunity. Clin Microbiol Rev 34, e0032320. 10.1128/CMR.00323-20.

Zhai, B., Ola, M., Rolling, T., Tosini, N.L., Joshowitz, S., Littmann, E.R., Amoretti, L.A., Fontana, E., Wright, R.J., Miranda, E., et al. (2020). High-resolution mycobiota analysis reveals dynamic intestinal translocation preceding invasive candidiasis. Nat Med. 10.1038/s41591-019-0709-7.

Zhai, R., Xue, X., Zhang, L., Yang, X., Zhao, L., and Zhang, C. (2019). Strain-Specific Anti-inflammatory Properties of Two Akkermansia muciniphila Strains on Chronic Colitis in Mice. Front Cell Infect Microbiol 9, 239. 10.3389/fcimb.2019.00239.

Zuo, T., Wong, S.H., Cheung, C.P., Lam, K., Lui, R., Cheung, K., Zhang, F., Tang, W., Ching, J.Y.L., Wu, J.C.Y., et al. (2018). Gut fungal dysbiosis correlates with reduced efficacy of fecal microbiota transplantation in Clostridium difficile infection. Nat Commun 9, 3663. 10.1038/s41467-018-06103-6.

Zuo, T., Zhan, H., Zhang, F., Liu, Q., Tso, E.Y.K., Lui, G.C.Y., Chen, N., Li, A., Lu, W., Chan, F.K.L., et al. (2020). Alterations in Fecal Fungal Microbiome of Patients With COVID-19 During Time of Hospitalization until Discharge. Gastroenterology 159, 1302-1310 e1305. 10.1053/j.gastro.2020.06.048.

Bolyen, E., Rideout, J.R., Dillon, M.R., Bokulich, N.A., Abnet, C.C., Al-Ghalith, G.A., Alexander, H., Alm, E.J., Arumugam, M., Asnicar, F., et al. (2019). Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol 37, 852-857. 10.1038/s41587-019-0209-9.

Davidson-Pilon, C., Kalderstam, J., Jacobson, N., Zivich, P., Kuhn, B., Williamson, M., Sean-Reed, JK, A., Fiore-Gartland, A., Datta, D., et al. (2020). CamDavidsonPilon/lifelines: 0.24.6 (Zenodo).

Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21. 10.1093/bioinformatics/bts635.

Foster, Z.S., Sharpton, T.J., and Grunwald, N.J. (2017). Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. PLoS Comput Biol 13, e1005404. 10.1371/journal.pcbi.1005404.

Friedman, J., and Alm, E.J. (2012). Inferring correlation networks from genomic survey data. PLoS Comput Biol 8, e1002687. 10.1371/journal.pcbi.1002687.

Kostic, A.D., Ojesina, A.I., Pedamallu, C.S., Jung, J., Verhaak, R.G., Getz, G., and Meyerson, M. (2011). PathSeq: software to identify or discover microbes by deep sequencing of human tissue. Nat Biotechnol 29, 393-396. 10.1038/nbt.1868.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and Genome Project Data Processing, S. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079. 10.1093/bioinformatics/btp352.

McMurdie, P.J., and Holmes, S. (2013). phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One 8, e61217. 10.1371/journal.pone.0061217.

Subramanian, A., Kuehn, H., Gould, J., Tamayo, P., and Mesirov, J.P. (2007). GSEA-P: a desktop application for Gene Set Enrichment Analysis. Bioinformatics 23, 3251-3253. 10.1093/bioinformatics/btm369.

Walker, M.A., Pedamallu, C.S., Ojesina, A.I., Bullman, S., Sharpe, T., Whelan, C.W., and Meyerson, M. (2018). GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts. Bioinformatics 34, 4287-4289. 10.1093/bioinformatics/bty501.