

Title:
MICROBIOME FINGERPRINTS, DIETARY FINGERPRINTS, AND MICROBIOME ANCESTRY, AND METHODS OF THEIR USE
Document Type and Number:
WIPO Patent Application WO/2021/186047
Kind Code:
A1
Abstract:
A deep metagenomic sequencing of more than 1000 individual gut microbiomes, coupled with detailed long-term diet, fasting, and same-meal postprandial cardiometabolic blood markers analyses, is described. Strong associations between a set of microbes and specific nutrients, foods, food groups, and general dietary indices are demonstrated. Microbial biomarkers of obesity were reproducible across cohorts, but blood markers of cardiovascular disease and impaired glucose tolerance were more strongly associated with microbiome structures. Panels of intestinal microbial species associated with different conditions and/or habits are identified, enabling stratification of the gut microbiome into generalizable health levels among individuals even without clinically manifest disease.

Inventors:
WOLF JONATHAN THOMAS (GB)
SEGATA NICOLA (GB)
Application Number:
PCT/EP2021/057116
Publication Date:
September 23, 2021
Filing Date:
March 19, 2021
Assignee:
ZOE GLOBAL LTD (GB)
International Classes:
A23L33/135; A61K35/742; C12Q1/06
Domestic Patent References:
WO2019014714A12019-01-24
WO2018064165A22018-04-05
Foreign References:
US20190209626A12019-07-11
US20120238468A12012-09-20
US20180122510A12018-05-03
US20190252058A12019-08-15
Other References:
DOROTTYA NAGY-SZAKAL ET AL: "Fecal metagenomic profiles in subgroups of patients with myalgic encephalomyelitis/chronic fatigue syndrome", MICROBIOME, BIOMED CENTRAL LTD, LONDON, UK, vol. 5, no. 1, 26 April 2017 (2017-04-26), pages 1 - 17, XP021244448, DOI: 10.1186/S40168-017-0261-Y
ASNICAR FRANCESCO ET AL: "Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals", NATURE MEDICINE, vol. 27, no. 2, 11 January 2021 (2021-01-11), pages 321 - 332, XP037370419, ISSN: 1078-8956, DOI: 10.1038/S41591-020-01183-8
BROWN & HAZEN, NAT. REV. MICROBIOL., vol. 16, 2018, pages 171 - 181
MOZAFFARIAN, CIRCULATION, vol. 133, 2016, pages 187 - 225
MUSSO ET AL., ANNU. REV. MED., vol. 62, 2011, pages 361 - 380
LE CHATELIER ET AL., NATURE, vol. 500, 2013, pages 541 - 546
GILBERT ET AL., NAT. MED., vol. 24, 2018, pages 392 - 400
ZEEVI ET AL., CELL, vol. 163, 2015, pages 1079 - 1094
MENDES-SOARES ET AL., AM. J. CLIN. NUTR., vol. 110, 2019, pages 63 - 75
ZHERNAKOVA ET AL., SCIENCE, vol. 352, no. 6285, 2016, pages 560 - 569
DE FILIPPIS ET AL., CELL HOST MICROBE, vol. 25, 2019, pages 444 - 264.e10
SCHIRMER ET AL., CELL, vol. 167, 2016, pages 1897
FU ET AL., CIRC. RES., vol. 117, 2015, pages 817 - 824
ORG ET AL., GENOME BIOL., vol. 18, 2017, pages 70
PASOLLI ET AL., NAT. METHODS, vol. 14, 2017, pages 1023 - 1024
COSTEA ET AL., MOL. SYST. BIOL., vol. 13, 2017, pages 960
DHAKAN ET AL., GIGASCIENCE, vol. 8, 2019
FERRETTI ET AL., CELL HOST & MICROBE, vol. 24, no. 1, 2018, pages 133 - 145
HANSEN ET AL., NAT. COMMUN., vol. 9, 2018, pages 4630
JIE ET AL., NAT. COMMUN., vol. 8, 2017, pages 845
NIELSEN ET AL., NAT. BIOTECHNOL., vol. 32, 2014, pages 822 - 828
OBREGON-TITO ET AL., NATURE COMMUNICATIONS, vol. 6, no. 1, 2015, pages 1 - 9
RAYMOND ET AL., ISME J., vol. 10, no. 3, 2016, pages 707 - 720
YE ET AL., MICROBIOME, vol. 6, no. 1, 2018, pages 135
ZELLER ET AL., MOL. SYST. BIOL., vol. 10, 2014
ASNICAR ET AL.: "Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals", NAT MED., vol. 27, 2021, pages 321 - 323
TRUONG ET AL., NATURE METHODS, vol. 12, no. 10, 2015, pages 902 - 3
ABUBUCKER ET AL., PLOS COMPUTATIONAL BIOLOGY, vol. 8, no. 6, 2012
FRANZOSA ET AL., NATURE METHODS, vol. 15, no. 11, 2018, pages 962
TRUONG ET AL., GENOME RESEARCH, vol. 27, no. 4, 2017, pages 626 - 38
LI ET AL., BIOINFORMATICS, vol. 31, no. 10, 2015, pages 1674 - 1676
KANG ET AL., PEERJ, vol. 7, 2019, pages e7359
PARKS ET AL., GENOME RESEARCH, vol. 25, no. 7, 2015, pages 1043 - 55
DAVIDSON & EPPERSON, METHODS MOL. BIOL., vol. 1706, 2018, pages 77 - 90
NAGPAL ET AL., FRONT MICROBIOL., vol. 8, pages 2897
NAGPAL ET AL., SCI REP., vol. 8, no. 1, 2018, pages 12649
PASOLLI ET AL., CELL, vol. 176, 2019, pages 649 - 662.e20
WU ET AL., GUT, vol. 65, no. 1, 2016, pages 63 - 72
TRUONG ET AL., NAT. METHODS, vol. 12, 2015, pages 902 - 903
BERRY ET AL., PROTOCOL EXCHANGE, 2020
MCIVER ET AL., BIOINFORMATICS, vol. 34, 2018, pages 1235 - 1237
FRANZOSA ET AL., NAT. METHODS, vol. 15, 2018, pages 962 - 968
PARKS ET AL., GENOME RES., vol. 25, 2015, pages 1043 - 1055
LANGMEAD & SALZBERG, NAT METHODS, vol. 9, no. 4, 2012, pages 357 - 359
QUINCE ET AL., NAT. BIOTECHNOL., vol. 35, 2017, pages 833 - 844
BINGHAM ET AL., PUBLIC HEALTH NUTR., vol. 4, 2001, pages 847 - 858
HOLLAND ET AL.: "McCance and Widdowson's The Composition of Foods", 1991, ROYAL SOCIETY OF CHEMISTRY
RIMM ET AL., AM J EPIDEMIOL, vol. 135, no. 10, 1992, pages 1114 - 1126
FRANKENFIELD ET AL., J. AM. DIET. ASSOC., vol. 98, 1998, pages 439 - 445
VADIVELOO ET AL., BR. J. NUTR., vol. 112, 2014, pages 1562 - 1574
SATIJA ET AL., J. AM. COLL. CARDIOL., vol. 70, 2017, pages 411 - 422
FUNG ET AL., AM. J. CLIN. NUTR., vol. 82, 2005, pages 163 - 173
MATTHEWS ET AL., BMJ, vol. 300, 1990, pages 230 - 235
PEDREGOSA ET AL., J. MACH. LEARN. RES., vol. 12, 2011, pages 2825 - 2830
THOMAS ET AL., NAT. MED., vol. 25, 2019, pages 667 - 678
PASOLLI ET AL., PLOS COMPUT. BIOL., vol. 12, 2016, pages e1004977
XIE ET AL., CELL SYST., vol. 3, 2016, pages 572 - 584.e3
RAVEL ET AL., PROC. NATL. ACAD. SCI. U. S. A., vol. 108, no. 1, 2011, pages 4680 - 4687
ATABAKI-PASDAR ET AL., GENETIC AND GENOMIC MEDICINE, 2020
TURNBAUGH ET AL., NATURE, vol. 457, 2009, pages 480 - 484
DAVID ET AL., NATURE, vol. 505, 2014, pages 559 - 563
WURTZ ET AL., CIRCULATION, vol. 131, 2015, pages 774 - 785
AHOLA-OLLI ET AL., DIABETOLOGIA, vol. 62, 2019, pages 2298 - 2309
VOJINOVIC ET AL., NAT. COMMUN., vol. 10, 2019, pages 5813
DUPREZ ET AL., CLIN. CHEM., vol. 62, 2016, pages 1020 - 1031
SATIJA ET AL., PLOS MED., vol. 13, 2016, pages e1002039
VADIVELOO ET AL., J. NUTR., vol. 145, 2015, pages 564 - 571
KIM HYUNJU ET AL., J. AM. HEART ASSOC., vol. 8, 2019, pages e012865
REEDY ET AL., J. NUTR., vol. 144, 2014, pages 881 - 889
MITROU ET AL., ARCH. INTERN. MED., vol. 167, 2007, pages 2461 - 2468
REDONDO-USEROS ET AL., NUTRIENTS, vol. 11, 2019
SAKAMOTO ET AL., INT. J. SYST. EVOL. MICROBIOL., vol. 68, 2018, pages 2074 - 2081
ESLINGER ET AL., NUTR. RES., vol. 34, 2014, pages 714 - 722
MONTEIRO ET AL., PUBLIC HEALTH NUTR., vol. 21, 2018, pages 5 - 17
SAKAMOTO ET AL., GENOME ANNOUNC., vol. 6, 2018
HOSNY ET AL., NEW MICROBES NEW INFECT, vol. 14, 2016, pages 85 - 92
SZE & SCHLOSS, MBIO, vol. 7, 2016
BEAUMONT ET AL., GENOME BIOL., vol. 17, 2016, pages 189
TRUONG ET AL., GENOME RES., vol. 27, 2017, pages 626 - 638
CUI ET AL., ARCH. INTERN. MED., vol. 161, 2001, pages 1413 - 1419
D'AGOSTINO ET AL., CIRCULATION, vol. 117, 2008, pages 743 - 753
KETTUNEN ET AL., CIRC GENOM PRECIS MED, vol. 11, 2018, pages e002234
HREBICEK ET AL., J. CLIN. ENDOCRINOL. METAB., vol. 87, 2002, pages 144 - 147
SKEGGS ET AL., J. LIPID RES., vol. 43, 2002, pages 1264 - 1274
HODSON ET AL., PROG. LIPID RES., vol. 47, 2008, pages 348 - 380
LI ET AL., AM. J. CLIN. NUTR., 2020
MARKLUND ET AL., CIRCULATION, vol. 139, 2019, pages 2422 - 2436
CHOWDHURY ET AL., ANN. INTERN. MED., vol. 160, 2014, pages 398 - 406
ZONG ET AL., BMJ, vol. 355, 2016, pages i5796
WU ET AL., NAT. REV. CARDIOL., vol. 16, 2019, pages 581 - 601
ZONG ET AL., AM. J. CLIN. NUTR., vol. 107, 2018, pages 445 - 453
COHN, CAN. J. CARDIOL., vol. 14, no. B, 1998, pages 18B - 27B
"Nature", vol. 486, 2012, HUMAN MICROBIOME PROJECT CONSORTIUM, pages: 207 - 214
ARUMUGAM ET AL., NATURE, vol. 473, 2011, pages 174 - 180
CANI, GUT, vol. 67, 2018, pages 1716 - 1725
LEY, NAT. REV. GASTROENTEROL. HEPATOL., vol. 13, 2016, pages 69 - 70
KOVATCHEVA-DATCHARY ET AL., CELL METAB., vol. 22, 2015, pages 971 - 982
DE VADDER ET AL., CELL METAB., vol. 24, 2016, pages 151 - 157
PEDERSEN ET AL., NATURE, vol. 535, 2016, pages 376 - 381
CLARK ET AL., ADV. PARASITOL., vol. 82, 2013, pages 1 - 32
ALFELLANI ET AL., ACTA TROP., vol. 126, 2013, pages 11 - 18
LUKES ET AL., PLOS PATHOG., vol. 11, 2015, pages e1005039
BEGHINI ET AL., ISME J., vol. 11, 2017, pages 2848 - 2863
ANDERSEN ET AL., FEMS MICROBIOL. ECOL., vol. 91, 2015
QIN ET AL., NATURE, vol. 464, 2010, pages 59 - 65
MACHIELS ET AL., GUT, vol. 63, 2014, pages 1275 - 1283
SOKOL ET AL., PROC. NATL. ACAD. SCI. U. S. A., vol. 105, 2008, pages 16731 - 16736
HALL ET AL., GENOME MED., vol. 9, 2017, pages 103
AZZOUZ ET AL., ANN. RHEUM. DIS., vol. 78, 2019, pages 947 - 956
NI ET AL., GASTROENTEROLOGY, vol. 152, 2017, pages S214
VALLES-COLOMER ET AL., NAT MICROBIOL, vol. 4, 2019, pages 623 - 632
GUPTA ET AL., MSYSTEMS, vol. 4, 2019
JIANG ET AL., BRAIN BEHAV. IMMUN., vol. 48, 2015, pages 186 - 194
KIM ET AL., J. NUTR., vol. 148, 2018, pages 624 - 631
MESLIER ET AL., GUT, 2020
MCDONALD ET AL., MSYSTEMS, vol. 3, 2018
CADE ET AL., NUTR. RES. REV., vol. 17, 2004, pages 5 - 22
KURILSHIKOV ET AL., CIRC. RES., vol. 124, 2019, pages 1808 - 1820
KO ET AL., NAT. REV. GASTROENTEROL. HEPATOL., 2020
SCHOLZ ET AL., NAT. METHODS, vol. 13, 2016, pages 435 - 438
ROWLAND ET AL., EUR. J. NUTR., vol. 57, 2018, pages 1 - 24
"Oxford Dictionary of Biochemistry and Molecular Biology", 2004, OXFORD UNIVERSITY PRESS
Attorney, Agent or Firm:
PATENTGRUPPEN A/S (DK)
Claims:
LISTING OF CLAIMS

What is claimed is:

1. A method of using a group of microbes to determine a health condition in a human subject, wherein the group of microbes comprises: at least two pro-health indicator microbes; or at least two poor health indicator microbes; or at least two pro-health indicator microbes and at least two poor health indicator microbes; wherein at least one of the pro-health indicator microbes is selected from the group consisting of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and wherein at least one of the poor health indicator microbes is selected from the group consisting of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

2. The method of claim 1, comprising: obtaining a biological sample from the human subject; and analyzing the biological sample to determine presence, absence, or abundance of the at least two pro-health indicator microbes and/or the at least two poor health indicator microbes.

3. The method of any of the preceding claims, comprising: obtaining a biological sample from the human subject; identifying in the biological sample at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 125, at least 150, at least 175, at least 200, or more than 200 different microbes in the biological sample; and determining the health condition of the human subject based on presence, absence, and/or absolute or relative abundance of the identified microbes in the biological sample.

4. The method of any of the preceding claims, wherein the group of microbes comprises: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

5. The method of any of the preceding claims, wherein the group of microbes comprises: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

6. The method of any of the preceding claims, wherein the group of microbes comprises Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum.

7. The method of any of the preceding claims, wherein the group of microbes comprises P. copri and Blastocystis spp.

8. The method of any of the preceding claims, wherein the health condition comprises at least one of: overall good health, overall poor health, obesity, BMI, diabetes risk, cardiometabolic risk, cardiovascular disease risk, or postprandial response to food intake.

9. The method of any of the preceding claims, comprising detecting the presence, absence, or relative abundance of at least one of the microbes in a microbiome sample from the human subject.

10. The method of claim 9, wherein the detecting comprises one or more of: sequencing one or more nucleic acids of a pro-health or poor health microbe, hybridizing a nucleic acid probe to a nucleic acid of a pro-health or poor health microbe, detecting one or more proteins from a pro-health or poor health microbe, or measuring activity of one or more proteins of a pro-health or poor health microbe.

11. The method of claim 9 or 10, wherein the detecting comprises shotgun metagenomics.

12. The method of any of the preceding claims, wherein the biological sample comprises a stool sample.

13. A method of predicting a health condition in a subject, comprising: determining presence, absence, or relative abundance of at least three pro-health indicator microbes in a microbiome of the subject; determining presence, absence, or relative abundance of at least three poor health indicator microbes in a microbiome of the subject; and predicting the health condition of the subject, based on the presence, absence, or relative abundance of the pro-health and/or poor health indicator microbes in the microbiome of the subject; wherein at least one of the pro-health indicator microbes is selected from the group consisting of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and wherein at least one of the poor health indicator microbes is selected from the group consisting of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

14. The method of claim 13, wherein: the health condition comprises at least one of obesity, increased cardiometabolic risk, diabetes risk, or overall poor health; and the health condition is predicted by the presence and/or abundance of more poor health indicator microbes than pro-health indicator microbes; and/or the health condition comprises at least one of overall good health or absence of obesity, reduced cardiometabolic risk, or reduced diabetes risk; and the health condition is predicted by the presence and/or abundance of more pro-health indicator microbes than poor health indicator microbes.

15. A method to predict overall good or poor general health in a non-diseased human subject, comprising: obtaining a microbiome sample from the human subject; isolating a nucleic acid fraction from the microbiome sample; detecting, within the nucleic acid fraction, presence, absence, or relative abundance of at least one unique marker sequence indicative of: a pro-health indicator microbe selected from the group consisting of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; or a poor health indicator microbe selected from the group consisting of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii; and at least one of predicting the human subject has overall good general health if the pro-health indicator microbes outnumber or are relatively more abundant than the poor-health indicator microbes; or predicting the human subject has overall poor general health if the poor health indicator microbes outnumber or are relatively more abundant than the pro-health indicator microbes.

16. The method of claim 15, further comprising providing to the human subject a dietary recommendation based on the presence, absence, or relative abundance of one or more poor health indicator microbes and/or one or more pro-health indicator microbes.

17. An assay, comprising: subjecting nucleic acid extracted from a test sample of a human subject to a genotyping assay that detects at least one of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, the test sample comprising microbiota from a gut of the subject; determining a relative abundance of the at least one of Prevotella copri, Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica that is below a predetermined abundance; and selecting, when the relative abundance is below the predetermined abundance, a treatment regimen that comprises at least one of:

(i) modifying microbiota of the gut of the subject using at least one of a prebiotic, probiotic, or pharmaceutical, or

(ii) altering the diet of the human subject.

18. An assay, comprising: subjecting nucleic acid extracted from a test sample of a human subject to a genotyping assay that detects at least one of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli, the test sample comprising microbiota from a gut of the subject; determining a relative abundance of the at least one Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli that is above a predetermined abundance; and selecting, when the relative abundance is above the predetermined abundance, a treatment regimen that comprises at least one of:

(i) modifying microbiota of the gut of the subject using at least one of a prebiotic, probiotic, or pharmaceutical, or

(ii) altering the diet of the human subject.

19. A method of diagnosing a human subject as having a healthy diet, comprising detecting in a microbiome sample from the subject the presence of Firmicutes CAG95 and/or the absence of Firmicutes CAG94.

20. A method of diagnosing a human subject as having an unhealthy diet, comprising detecting in a microbiome sample from the subject the presence of Firmicutes CAG94 and/or the absence of Firmicutes CAG95.

21. A microbial signature for good health, comprising presence or relatively high abundance of at least three microbes selected from the group consisting of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, and/or absence or relatively low abundance of at least three microbes selected from the group consisting of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

22. A microbial signature for poor health, comprising absence or relatively low abundance of at least three microbes selected from the group consisting of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, and/or presence or relatively high abundance of at least three microbes selected from the group consisting of R. gnavus, F. plautii, C. innocuum, C. symbiosum, C. bolteae, A. colihominis, C. intestinalis, B. obeum, R. inulinivorans, E. ventriosum, B. hydrogenotrophica, Clostridium CAG 58, E. lenta, C. bolteae CAG 59, C. spiroforme, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

23. The microbial signature of claim 21, wherein the signature comprises: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

24. The microbial signature of claim 21 or 23, wherein the group of microbes comprises P. copri and Blastocystis spp.

25. The microbial signature of claim 22, wherein the group of microbes comprises: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

26. The microbial signature of claim 22 or 25, wherein the group of microbes comprises Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum.

27. Use of the microbial signature of any one of claims 21-26, to guide treatment decisions for a human subject.

28. The use of claim 27, wherein the treatment decision comprises selecting one or more of: modifying overall diet, increasing intake of at least one specified food or supplement, decreasing intake of at least one specified food or supplement, administration of a probiotic composition, administration of a prebiotic composition, or administration of an antibiotic compound.

29. A method for targeting a microbiome of a human subject to promote health, comprising: detecting in a microbiome sample from the human subject one or more pro-health indicator microbes selected from the group consisting of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and administering to the human a composition that increases growth or survival of the pro-health indicator microbe(s); and/or detecting in a microbiome sample from the human subject one or more poor health indicator microbes selected from the group consisting of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli; and administering to the human a composition that decreases growth or survival of the poor health indicator microbe(s).

30. The method of claim 29, comprising detecting: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

31. The method of claim 29 or claim 30, wherein the pro-health indicator microbes comprise P. copri and Blastocystis spp.

32. The method of any of claims 29-31, comprising detecting: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

33. The method of any of claims 29-32, wherein the poor health indicator microbes comprise Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum.

34. A probiotic composition for ingestion by a human subject, comprising at least one of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica.

35. The probiotic composition of claim 34, comprising at least three, at least five, at least seven, at least 9, at least 10, at least 12, at least 14, or all of the listed microbes.

36. The probiotic composition of claim 34 or claim 35, comprising Prevotella copri or Blastocystis spp. or both.

37. A method of altering abundance of one or more microbes in gut microflora of a subject, comprising administering the probiotic composition of any of claims 34-36 to the subject.

38. A system to assay a biological condition in a subject, comprising: a nucleic acid sample isolation device, which is adapted to isolate a nucleic acid sample from the subject; a sequencing device, which is connected to the nucleic acid sample isolation device and adapted to sequence the nucleic acid sample, thereby obtaining a sequencing result; and an alignment device, which is connected to the sequencing device and adapted to align the sequencing result against sequence from one or more microbes in order to determine presence or absence of the microbe(s) based on the alignment result, wherein the microbes comprise one or more of: pro-health indicator microbes selected from the group consisting of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and/or poor health indicator microbes selected from the group consisting of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.
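
As an informal illustration only, and not part of the claims, the counting rule recited in claims 14 and 15 (predicting good health when pro-health indicator microbes outnumber poor health indicator microbes, and poor health in the reverse case) might be sketched in Python as follows. The function names, the input format (a mapping from species name to relative abundance), the `detection_threshold` parameter, and the "indeterminate" outcome for ties are all assumptions made for this sketch; the claims do not specify them. The species lists are copied from claims 1 and 13.

```python
# Illustrative sketch only; names, input format and tie handling are assumptions.

# Pro-health indicator microbes as listed in claims 1 and 13.
PRO_HEALTH = frozenset({
    "Prevotella copri", "Blastocystis spp.", "Haemophilus parainfluenzae",
    "Firmicutes bacterium CAG 95", "Bifidobacterium animalis",
    "Oscillibacter sp 57 20", "Roseburia sp CAG 182", "Veillonella dispar",
    "Eubacterium eligens", "Firmicutes bacterium CAG 170",
    "Rothia mucilaginosa", "Veillonella infantium", "Roseburia hominis",
    "Oscillibacter sp PC13", "Clostridium sp CAG 167",
    "Ruminococcaceae bacterium D5", "Paraprevotella xylaniphila",
    "Faecalibacterium prausnitzii", "Romboutsia ilealis",
    "Veillonella atypica",
})

# Poor health indicator microbes as listed in claims 1 and 13.
POOR_HEALTH = frozenset({
    "Eubacterium ventriosum", "Roseburia inulinivorans",
    "Clostridium spiroforme", "Clostridium bolteae CAG 59",
    "Eggerthella lenta", "Clostridium bolteae", "Collinsella intestinalis",
    "Clostridium innocuum", "Blautia obeum", "Clostridium symbiosum",
    "Clostridium sp CAG 58", "Blautia hydrogenotrophica",
    "Anaerotruncus colihominis", "Ruminococcus gnavus",
    "Flavonifractor plautii", "Clostridium leptum",
    "Ruthenibacterium lactatiformans", "Escherichia coli",
})


def count_indicators(abundances, detection_threshold=0.0):
    """Return (pro_count, poor_count) for species detected in one sample.

    `abundances` maps species name -> relative abundance. A species counts
    as present when its abundance exceeds `detection_threshold`
    (a hypothetical cut-off, not a value taken from the disclosure).
    """
    present = {name for name, value in abundances.items()
               if value > detection_threshold}
    return len(present & PRO_HEALTH), len(present & POOR_HEALTH)


def predict_general_health(abundances, detection_threshold=0.0):
    """Apply the 'outnumber' rule: more pro-health species -> 'good',
    more poor health species -> 'poor', equal counts -> 'indeterminate'."""
    pro, poor = count_indicators(abundances, detection_threshold)
    if pro > poor:
        return "good"
    if poor > pro:
        return "poor"
    return "indeterminate"


if __name__ == "__main__":
    # Made-up abundances for demonstration; not data from the study.
    sample = {
        "Prevotella copri": 0.12,
        "Faecalibacterium prausnitzii": 0.08,
        "Ruminococcus gnavus": 0.01,
        "Bacteroides vulgatus": 0.30,  # not an indicator species; ignored
    }
    print(count_indicators(sample), predict_general_health(sample))
```

The sketch uses presence counts only. The claims also allow the comparison to be made on relative abundance, which would replace the set-size comparison with a comparison of summed abundances.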

Description:
MICROBIOME FINGERPRINTS, DIETARY FINGERPRINTS, AND MICROBIOME ANCESTRY, AND METHODS OF THEIR USE

FIELD OF THE DISCLOSURE

[0001] The present disclosure relates generally to microbiome analyses, as well as methods of modifying the microbiome of an individual, methods of diagnosis, and compositions based on such analyses.

BACKGROUND OF THE DISCLOSURE

[0002] Dietary contributions to health, and particularly to long-term chronic conditions such as obesity, metabolic syndrome, and cardiac events, are of universal importance. This is especially true as obesity and associated mortality and morbidity have risen dramatically over the past decades and continue to do so worldwide. The reasons for this relatively rapid change have remained unclear, with the gut microbiome implicated as one of several potentially causal human-environmental interactions (Brown & Hazen, Nat. Rev. Microbiol. 16:171-181, 2018; Mozaffarian, Circulation 133:187-225, 2016; Musso et al., Annu. Rev. Med. 62:361-380, 2011; Le Chatelier et al., Nature 500:541-546, 2013). Surprisingly, the details of the microbiome’s role in obesity and cardiometabolic health have proven difficult to define reproducibly in large, diverse human populations - contrary to their behavior in mice - likely due to the complexity of habitual diets, the difficulty of measuring them at scale, and the highly personalized nature of the microbiome (Gilbert et al., Nat. Med. 24:392-400, 2018).

[0003] Today, individuals can measure a large number of health characteristics without having to go to a lab or clinic. For example, individuals may obtain an analysis of their microbiome by mailing a sample, collected at home, to a company for analysis. Generally, a microbiome analysis includes determining the composition and function of a community of microbes in a particular location, such as within the gut of an individual. The gut microbiome is made up of trillions of microorganisms that live in the intestinal tract, including bacteria, archaea (archaebacteria), viruses, and microeukaryotes, together with their genetic material.

[0004] These microorganisms appear to be an important part of digesting food, assisting with absorbing and synthesizing nutrients, regulating metabolism, body weight, and immune regulation, as well as contributing to regulating brain functions and mood. Microbiomes of different individuals, however, vary greatly. For instance, it is estimated that only ten to thirty percent of the bacterial species in a microbiome are common across different individuals. Much of this diversity of microbiomes remains unexplained, yet diet, environment, and host genetics appear to play a part. Determining how to utilize the results of the microbiome analysis, however, can be challenging.

[0005] Growing evidence also implicates the gut microbiome as a factor in the development of a number of disease processes, including inflammatory bowel diseases, atherosclerosis, obesity, diabetes, and colon cancer. The association of these disease processes with an altered microbial community structure suggests that interventions that restore the normal resilient gut microbial community might be an innovative intervention, as well as a way to influence overall health and wellness.

SUMMARY OF THE DISCLOSURE

[0006] Described herein is the Personalized Responses to Dietary Composition Trial (PREDICT 1) observational and interventional study of diet-microbiome interactions in metabolic health. PREDICT 1 included over 1,000 participants in the United Kingdom (UK) and the United States (US) who were profiled pre- and post-standardized dietary challenges using a combination of intensive in-clinic biometric and blood measures, nutritionist-administered free-living dietary recall and logging, habitual dietary data collection, continuous glucose monitoring, and stool shotgun metagenomic sequencing. The study was inspired by and generally concordant with previous large-scale diet-microbiome interaction profiles, identifying both overall gut microbiome configurations and specific microbial taxa and functions associated with postprandial glucose responses (Zeevi et al., Cell 163:1079-1094, 2015; Mendes-Soares et al., Am. J. Clin. Nutr. 110, 63-75, 2019), obesity-associated biometrics such as body mass index (BMI) and adiposity (Falony et al., Science 352, 560-564, 2016; Zhernakova et al., Science 352, 565-569, 2016; Thingholm et al., Cell Host Microbe 26, 252-264.e10, 2019), and blood lipids and inflammatory markers (Schirmer et al., Cell 167:1897, 2016; Fu et al., Circ. Res. 117:817-824, 2015; Org et al., Genome Biol. 18:70, 2017). By combining PREDICT’s extensive dietary and blood biomarker measures with high-precision microbiome analysis, these findings were able to extend to specific beneficial (e.g. Faecalibacterium prausnitzii) and detrimental (e.g. Ruminococcus gnavus) organisms, as well as to a highly-reproducible gut microbial signature of overall health that reproduced across multiple blood and dietary measures within PREDICT and in several previously published cohorts (Pasolli et al., Nat. Methods 14:1023-1024, 2017).

[0007] The current disclosure provides methods of using a group of microbes to determine a health condition in a human subject, wherein the group of microbes includes: at least two pro-health indicator microbes; or at least two poor health indicator microbes; or at least two pro-health indicator microbes and at least two poor health indicator microbes; wherein at least one of the pro-health indicator microbes is selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila; and wherein at least one of the poor health indicator microbes is selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii. In another embodiment, at least one of the pro-health indicator microbes is selected from the group including Firmicutes bacterium CAG 95, Haemophilus parainfluenzae, Oscillibacter sp 57 20, Firmicutes bacterium CAG 170, Roseburia sp CAG 182, Clostridium sp CAG 167, Oscillibacter sp PC13, Eubacterium eligens, Prevotella copri, Veillonella dispar, Veillonella infantium, Faecalibacterium prausnitzii, Bifidobacterium animalis, Romboutsia ilealis, and Veillonella atypica; and at least one of the poor health indicator microbes is selected from the group including Clostridium leptum, Ruthenibacterium lactatiformans, Collinsella intestinalis, Escherichia coli, Blautia hydrogenotrophica, Clostridium sp CAG 58, Eggerthella lenta, Ruminococcus gnavus, Clostridium spiroforme, Clostridium bolteae CAG 59, Clostridium innocuum, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae, and Flavonifractor plautii.

[0008] In embodiments of the invention the methods comprise: obtaining a biological sample from the human subject; and analyzing the biological sample to determine presence, absence, or abundance of the at least two pro-health indicator microbes and/or the at least two poor health indicator microbes.

[0009] In embodiments of the invention the methods comprise: obtaining a biological sample from the human subject; identifying in the biological sample at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 125, at least 150, at least 175, at least 200, or more than 200 different microbes in the biological sample; and determining the health condition of the human subject based on presence, absence, and/or absolute or relative abundance of the identified microbes in the biological sample.

[0010] In embodiments of the invention, the group of microbes comprises: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

[0011] In embodiments of the invention, the group of microbes comprises: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

[0012] In embodiments of the invention, the group of microbes comprises Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum.

[0013] In embodiments of the invention, the group of microbes comprises P. copri and Blastocystis spp.

[0014] In embodiments of the invention, the health condition comprises at least one of: overall good health, overall poor health, obesity, BMI, diabetes risk, cardiometabolic risk, cardiovascular disease risk, or postprandial response to food intake.

[0015] In embodiments of the invention the methods comprise detecting the presence, absence, or relative abundance of at least one of the microbes in a microbiome sample from the human subject.

[0016] In embodiments of the invention, the detecting comprises one or more of: sequencing one or more nucleic acids of a pro-health or poor health microbe, hybridizing a nucleic acid probe to a nucleic acid of a pro-health or poor health microbe, detecting one or more proteins from a pro-health or poor health microbe, or measuring activity of one or more proteins of a pro-health or poor health microbe.

[0017] In embodiments of the invention, the detecting comprises shotgun metagenomics.

[0018] In embodiments of the invention, the biological sample comprises a stool sample.

[0019] Another aspect provides methods of predicting a health condition in a subject, the method including: determining presence, absence, or relative abundance of at least three pro-health indicator microbes in a microbiome of the subject; determining presence, absence, or relative abundance of at least three poor health indicator microbes in a microbiome of the subject; and predicting the health condition of the subject, based on the presence, absence, or relative abundance of the pro-health and/or poor health indicator microbes in the microbiome of the subject; wherein at least one of the pro-health indicator microbes is selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila; and wherein at least one of the poor health indicator microbes is selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii. In another embodiment, at least one of the pro-health indicator microbes is selected from the group including Firmicutes bacterium CAG 95, Haemophilus parainfluenzae, Oscillibacter sp 57 20, Firmicutes bacterium CAG 170, Roseburia sp CAG 182, Clostridium sp CAG 167, Oscillibacter sp PC13, Eubacterium eligens, Prevotella copri, Veillonella dispar, Veillonella infantium, Faecalibacterium prausnitzii, Bifidobacterium animalis, Romboutsia ilealis, and Veillonella atypica; and at least one of the poor health indicator microbes is selected from the group including Clostridium leptum, Ruthenibacterium lactatiformans, Collinsella intestinalis, Escherichia coli, Blautia hydrogenotrophica, Clostridium sp CAG 58, Eggerthella lenta, Ruminococcus gnavus, Clostridium spiroforme, Clostridium bolteae CAG 59, Clostridium innocuum, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae, and Flavonifractor plautii.

[0020] In embodiments of the invention, the health condition comprises at least one of obesity, increased cardiometabolic risk, diabetes risk, or overall poor health; and the health condition is predicted by the presence and/or abundance of more poor health indicator microbes than pro-health indicator microbes; and/or the health condition comprises at least one of overall good health or absence of obesity, reduced cardiometabolic risk, or reduced diabetes risk; and the health condition is predicted by the presence and/or abundance of more pro-health indicator microbes than poor health indicator microbes.

[0021] Also provided are methods to predict overall good or poor general health in a non-diseased human subject, which methods include: obtaining a microbiome sample from the human subject; isolating a nucleic acid fraction from the microbiome sample; detecting, within the nucleic acid fraction, presence, absence, or relative abundance of at least one unique marker sequence indicative of: a pro-health indicator microbe selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila; or a poor health indicator microbe selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii; and at least one of predicting the human subject has overall good general health if the pro-health indicator microbes outnumber or are relatively more abundant than the poor health indicator microbes; or predicting the human subject has overall poor general health if the poor health indicator microbes outnumber or are relatively more abundant than the pro-health indicator microbes. In another example of this embodiment, at least one of the pro-health indicator microbes is selected from the group including Firmicutes bacterium CAG 95, Haemophilus parainfluenzae, Oscillibacter sp 57 20, Firmicutes bacterium CAG 170, Roseburia sp CAG 182, Clostridium sp CAG 167, Oscillibacter sp PC13, Eubacterium eligens, Prevotella copri, Veillonella dispar, Veillonella infantium, Faecalibacterium prausnitzii, Bifidobacterium animalis, Romboutsia ilealis, and Veillonella atypica; and at least one of the poor health indicator microbes is selected from the group including Clostridium leptum, Ruthenibacterium lactatiformans, Collinsella intestinalis, Escherichia coli, Blautia hydrogenotrophica, Clostridium sp CAG 58, Eggerthella lenta, Ruminococcus gnavus, Clostridium spiroforme, Clostridium bolteae CAG 59, Clostridium innocuum, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae, and Flavonifractor plautii.
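
The comparison described in paragraphs [0020] and [0021] (pro-health indicator microbes outnumbering, or being relatively more abundant than, poor health indicator microbes) can be illustrated with a short sketch. This is a minimal example under stated assumptions and not the disclosed implementation: the function name, the detection threshold, the rule of comparing counts first and summed abundance second, and the abbreviated panels are all hypothetical choices made for illustration.

```python
# Illustrative sketch only. The panels are abbreviated subsets of the lists
# in the text; the threshold and the tie-break order are assumptions.
PRO_HEALTH = {
    "Prevotella copri",
    "Blastocystis spp.",
    "Eubacterium eligens",
    "Roseburia hominis",
}
POOR_HEALTH = {
    "Ruminococcus gnavus",
    "Flavonifractor plautii",
    "Clostridium innocuum",
    "Eggerthella lenta",
}


def predict_general_health(abundances, detection_threshold=0.0):
    """Return "good", "poor", or "indeterminate" for one sample.

    abundances maps species name -> relative abundance (any consistent unit).
    A species counts as present when its abundance exceeds the threshold.
    """
    pro = {s: a for s, a in abundances.items()
           if s in PRO_HEALTH and a > detection_threshold}
    poor = {s: a for s, a in abundances.items()
            if s in POOR_HEALTH and a > detection_threshold}

    # First criterion (assumed order): which group has more detected species.
    if len(pro) != len(poor):
        return "good" if len(pro) > len(poor) else "poor"

    # Equal counts: fall back to the summed relative abundance of each group.
    pro_total, poor_total = sum(pro.values()), sum(poor.values())
    if pro_total == poor_total:
        return "indeterminate"
    return "good" if pro_total > poor_total else "poor"
```

For example, a sample with two detected pro-health species and one detected poor health species is classed as "good" under this sketch, regardless of abundance; a different ordering of the two criteria would be equally consistent with the text.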

[0022] In embodiments of the invention, the methods comprise providing to the human subject a dietary recommendation based on the presence, absence, or relative abundance of one or more poor health indicator microbes and/or one or more pro-health indicator microbes.

[0023] This disclosure further provides an assay, which includes: subjecting nucleic acid extracted from a test sample of a human subject to a genotyping assay that detects at least one of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, the test sample including microbiota from a gut of the subject; determining a relative abundance of the at least one of Prevotella copri, Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica that is below a predetermined abundance; and selecting, when the relative abundance is below the predetermined abundance, a treatment regimen that includes at least one of: (i) modifying microbiota of the gut of the subject using at least one of a prebiotic, probiotic, or pharmaceutical, or (ii) altering the diet of the human subject.

[0024] Another aspect is an assay, which includes: subjecting nucleic acid extracted from a test sample of a human subject to a genotyping assay that detects at least one of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli, the test sample including microbiota from a gut of the subject; determining a relative abundance of the at least one Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli that is above a predetermined abundance; and selecting, when the relative abundance is above the predetermined abundance, a treatment regimen that includes at least one of: (i) modifying microbiota of the gut of the subject using at least one of a prebiotic, probiotic, or pharmaceutical, or (ii) altering the diet of the human subject.
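
The threshold step shared by the assays of paragraphs [0023] and [0024] can be sketched as below. This is a hypothetical illustration: the function name, the cutoff parameters, and the choice to treat an undetected species as having zero abundance are assumptions, and the disclosure does not specify numerical cutoffs.

```python
def flag_for_intervention(abundances, pro_panel, poor_panel,
                          low_cutoff, high_cutoff):
    """Apply predetermined-abundance cutoffs to one sample.

    abundances: dict of species name -> relative abundance.
    pro_panel / poor_panel: iterables of species names to check.
    low_cutoff: a pro-health species below this value is flagged.
    high_cutoff: a poor health species above this value is flagged.
    Species missing from abundances are treated as 0.0 (an assumption).
    """
    low_pro = sorted(s for s in pro_panel
                     if abundances.get(s, 0.0) < low_cutoff)
    high_poor = sorted(s for s in poor_panel
                       if abundances.get(s, 0.0) > high_cutoff)
    return {
        "pro_health_below_cutoff": low_pro,
        "poor_health_above_cutoff": high_poor,
        # A regimen (prebiotic, probiotic, pharmaceutical, or diet change)
        # would be selected only when at least one species is flagged.
        "select_regimen": bool(low_pro or high_poor),
    }
```

The function only reports which species crossed a cutoff; choosing among the listed treatment regimens is left outside the sketch because the text does not describe how that choice is made.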

[0025] Yet another aspect is a method of diagnosing a human subject as having a healthy diet, including detecting in a microbiome sample from the subject the presence of Firmicutes CAG95 and/or the absence of Firmicutes CAG94.

[0026] Another aspect is a method of diagnosing a human subject as having an unhealthy diet, including detecting in a microbiome sample from the subject the presence of Firmicutes CAG94 and/or the absence of Firmicutes CAG95.

[0027] Also described herein are microbial signatures (fingerprints) for good health, which include presence or relatively high abundance of at least three microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, and/or absence or relatively low abundance of at least three microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

[0028] In some embodiments, the signature comprises: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

[0029] In some embodiments, the group of microbes comprises P. copri and Blastocystis spp.

[0030] This disclosure also describes microbial signatures (fingerprints) for poor health, including absence or relatively low abundance of at least three microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, and/or presence or relatively high abundance of at least three microbes selected from the group including R. gnavus, F. plautii, C. innocuum, C. symbiosum, C. bolteae, A. colihominis, C. intestinalis, B. obeum, R. inulinivorans, E. ventriosum, B. hydrogenotrophica, Clostridium CAG 58, E. lenta, C. bolteae CAG 59, C. spiroforme, C. leptum, R. lactatiformans, and E. coli.

[0031] In some embodiments, the group of microbes comprises: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

[0032] In some embodiments, the group of microbes comprises Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum.

[0033] An aspect of the disclosure further relates to use of microbial signatures of the disclosure to guide treatment decisions for a human subject.

[0034] In some embodiments, the treatment decision comprises selecting one or more of: modifying overall diet, increasing intake of at least one specified food or supplement, decreasing intake of at least one specified food or supplement, administration of a probiotic composition, administration of a prebiotic composition, or administration of an antibiotic compound.

[0035] Another aspect provides methods for targeting a microbiome of a human subject to promote health, which methods include: (A) detecting in a microbiome sample from the human subject one or more pro-health indicator microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and administering to the human a composition that increases growth or survival of the pro-health indicator microbe(s); and/or (B) detecting in a microbiome sample from the human subject one or more poor health indicator microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli; and administering to the human a composition that decreases growth or survival of the poor health indicator microbe(s).

[0036] In some embodiments, the methods comprise detecting: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

[0037] In some embodiments, the pro-health indicator microbes comprise P. copri and Blastocystis spp.

[0038] In some embodiments, the methods comprise detecting: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

[0039] In some embodiments, the poor health indicator microbes comprise Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum.

[0040] Also described are probiotic compositions for ingestion by a human subject, which include at least one of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica. Also provided are methods of altering abundance of one or more microbes in gut microflora of a subject, which include administering such a probiotic composition to the subject.

[0041] In some embodiments, the probiotic compositions comprise at least three, at least five, at least seven, at least 9, at least 10, at least 12, at least 14, or all of the listed microbes.

[0042] In some embodiments, the probiotic compositions comprise Prevotella copri or Blastocystis spp. or both.

[0043] Another aspect relates to methods of altering abundance of one or more microbes in gut microflora of a subject, comprising administering the probiotic composition of the disclosure to the subject.

[0044] Yet another embodiment is a system to assay a biological condition in a subject, which system includes: a nucleic acid sample isolation device, which is adapted to isolate a nucleic acid sample from the subject; a sequencing device, which is connected to the nucleic acid sample isolation device and adapted to sequence the nucleic acid sample, thereby obtaining a sequencing result; and an alignment device, which is connected to the sequencing device and adapted to align the sequencing result against sequence from one or more microbes in order to determine presence or absence of the microbe(s) based on the alignment result, wherein the microbes include one or more of: pro-health indicator microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and/or poor health indicator microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

[0045] Methods, assays, microbial signatures, use of microbial signatures, probiotic compositions, and/or systems of the disclosure may for example permit improved, more efficient, easier, cheaper, or more rapid diagnosis, prediction or determination of health conditions. Further, health of human subjects may be improved.

BRIEF DESCRIPTION OF THE DRAWINGS

[0046] FIG. 1 is a block diagram depicting an illustrative operating environment in which microbiome data is analyzed to generate microbiome fingerprints, dietary fingerprints, and microbiome ancestry for users.

[0047] FIG. 2 is a block diagram depicting an illustrative operating environment in which a data ingestion service receives and processes test data associated with at-home tests and sample collections.

[0048] FIG. 3 is a flow diagram showing a process illustrating aspects of a mechanism disclosed herein for obtaining and utilizing microbiome data for a user to generate microbiome fingerprints, dietary fingerprints, and microbiome ancestry for users.

[0049] FIG. 4 is a flow diagram showing a process illustrating aspects of a mechanism disclosed herein for generating a microbiome fingerprint for a user.

[0050] FIG. 5 is a flow diagram showing a process illustrating aspects of a mechanism disclosed herein for generating a dietary fingerprint for a user.

[0051] FIG. 6 is a flow diagram showing a process illustrating aspects of a mechanism disclosed herein for generating a microbiome ancestry for a user.

[0052] FIG. 7 is a flow diagram showing a process illustrating aspects of a mechanism disclosed herein for obtaining test data, including microbiome data, that may be utilized for generating microbiome fingerprints, dietary fingerprints, and microbiome ancestry for users.

[0053] FIG. 8 is a computer architecture diagram showing one illustrative computer hardware architecture for implementing a computing device that might be utilized to implement aspects of the various examples presented herein.

[0054] FIGs. 9A, 9B. The PREDICT 1 study associates gut microbiome structure with habitual diet and blood cardiometabolic markers. (FIG. 9A) The PREDICT 1 study assessed the gut microbiome of 1,098 volunteers from the UK and US via metagenomic sequencing of stool samples. Phenotypic data obtained through in-person assessment, blood/biospecimen collection, and the return of validated study questionnaires queried a range of relevant host/environmental factors including (1) personal characteristics, such as age, BMI, and estimated visceral fat; (2) habitual dietary intake using semi-quantitative food frequency questionnaires (FFQs); (3) fasting; and (4) postprandial cardiometabolic blood and inflammatory markers, total lipid and lipoprotein concentrations, lipoprotein particle sizes, apolipoproteins, derived metabolic risk scores, glycemic-mediated metabolites, and metabolites related to fatty acid metabolism. (FIG. 9B) Overall microbiome alpha diversity, estimated as the total number of confidently identified microbial species in a given sample (richness), was correlated with HDL-D (high-density lipoprotein density; positive) and estimated hepatic steatosis (negative). Up to ten strongest absolute Spearman correlations are reported for each category with q<0.05. Top species based on Shannon diversity are reported in FIG. 11A.

[0055] FIG. 10 Distributions of BMI in each curatedMetagenomicData dataset. The figure shows the distributions of BMI values for the datasets available in curatedMetagenomicData. This was used to further select those datasets with a comparable range of values (interquartile range between 3.5 and 7.5) as the one in the PREDICT 1 UK dataset (IQR of 5.5), to be used as validation datasets for the associations found. Along the x-axis (labeled “Dataset_name”), the dataset names are: A - “CosteaPI_2017” (Costea et al., Mol. Syst. Biol. 13:960, 2017), B - “DhakanDB_2019” (Dhakan et al., Gigascience 8, 2019), C - “FerrettiP_2018” (Ferretti et al., Cell Host & Microbe, 24(1), 133-145, 2018), D - “HansenLBS_2018” (Hansen et al., Nat. Commun. 9, 4630, 2018), E - “JieZ_2017” (Jie et al., Nat. Commun. 8, 845, 2017), F - “NielsenHB_2014” (Nielsen et al., Nat. Biotechnol. 32, 822-828, 2014), G - “Obregon-TitoAJ_2015” (Obregon-Tito et al., Nature communications, 6 (1), 1-9, 2015), H - “RaymondF_2016” (Raymond et al., ISME J. 10(3)707-720, 2016), I - “SchirmerM_2016” (Schirmer et al., Cell 167, 1897, 2016), J - “YeZ_2018” (Ye et al., Microbiome 6(1):135, 2018), K - “ZellerG_2014” (Zeller et al., Mol. Syst. Biol. 10, 2014), and L - Zoe (described herein).

[0056] FIGs. 11A-11D Alpha diversity linked with personal factors, habitual diet, fasting, and postprandial markers. (FIG. 11A) Microbiome alpha diversity computed using the Shannon index correlated with markers from the four categories: personal, habitual diet, fasting, and postprandial. Reported are the top-ten strongest absolute Spearman correlations for each category with p<0.05. The y-axis reads (from top to bottom): ASCVD_10yr_risk, person_md_age, person_clinic_bmi, ROE, PEACHES, BACON, WHOLEMEAL_BREAD, SPREAD_OLIVE_OIL, CEREAL_SUGAR_TOPPED, BROWN_RICE, KETCHUP, HFD, XL_HDL_L_0, LDL_size_0, IDL_L_0, L_HDL_L_0, HDL_size_0, IL-6_0, XXL_VLDL_L_0, VLDL_size_0, GlycA_0, MUFA_pct_0, IDL_L_360, XL_HDL_L_360, XS_VLDL_L_360, Total_C_360, HDL_size_360, and VLDL_size_360. (FIG. 11B) Inter-sample microbiome distances (beta-diversity) were substantially lower, i.e. closer, among samples from the same individuals (two weeks apart) compared to those amongst different individuals. Gut microbial communities in monozygotic twins were slightly more similar than in dizygotic twins (Mann-Whitney U test p=0.06), which, in turn, were more similar than unrelated individuals (p<1e-12), even after adjusting for age (p<1e-12). (FIG. 11C) After excluding twin status (i.e. non-twin, vs. mono vs. dizygotic twins) from the model, personal factors still accounted for the greatest proportion of variance explained in overall microbial diversity, followed by dietary habits, fasting and postprandial cardiometabolic blood markers (by cumulative stepwise dbRDA). (FIG. 11D) Cumulative distributions for each metadata variable based on Aitchison distance and Bray-Curtis dissimilarity are reported in FIGs. 13A-13C, 14A and 14B.
The labels along the x-axis from left to right are: bristo_stool_score_average_last_3_months, FAw6.FA_0, person_clinic_weigth, XS.VLDL.C_360, abx_courses_last_12_months, bowel_movements_last_7_days, AcAce_0, person_md_age, visceral_fat, Healthy_PDI_Score_sum, maltose_g_kcal, starch_g_kcal, LDL.D_360, M.VLDL.C_360_rise, pulse, Meal_JJ_Hospital_meal_insulin_120_iacu, quicki_score, and cigarettes_a_day.
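For readers unfamiliar with the diversity measures named for FIGs. 11A-11D, the following sketch gives textbook definitions of the Shannon index, the Bray-Curtis dissimilarity, and the Aitchison distance. The function names and the pseudocount used to handle zero abundances are assumptions made for illustration and are not the pipeline used to produce the figures.

```python
import math


def shannon_index(abundances):
    """Shannon alpha diversity: -sum(p * ln p) over taxa with non-zero abundance."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def aitchison_distance(u, v, pseudocount=1e-6):
    """Euclidean distance between centred log-ratio (CLR) transformed vectors.

    The pseudocount (an assumed value) avoids taking the logarithm of zero.
    """
    def clr(x):
        logs = [math.log(a + pseudocount) for a in x]
        mean_log = sum(logs) / len(logs)
        return [value - mean_log for value in logs]

    return math.dist(clr(u), clr(v))
```

The Aitchison distance is insensitive to the overall scale of each sample, which is one reason it is often reported alongside Bray-Curtis for compositional data.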

[0057] FIGs. 12A-12F. Food quality, regardless of source, is linked to overall and feature-level composition of the gut microbiome. (FIG. 12A, in two parts) Specific components of habitual diet including foods, nutrients, and dietary indices are linked to the composition of the gut microbiome with variable strengths as estimated by machine learning regression and classification models. Boxplots report the correlation between the real value of each component and the value predicted by regression models across 100 training/testing folds (Methods). Circles denote median area-under-the-curve (AUC) values across 100 folds for a corresponding binary classifier between the highest and lowest quartiles (Methods). (FIG. 12B) The association between the gut microbiome and coffee consumption in UK participants is dose-dependent, i.e. stronger when assessing heavy (e.g. >4 cups/d) vs. never drinkers and was validated in the US cohort when applying the UK model. (FIG. 12C) Among general dietary patterns and indices, the Healthy Food Diversity index (HFD) and the (FIG. 12D) Alternate Mediterranean Diet score (aMED) were validated in the US cohort, thus showing consistency between the two populations on these two important dietary indices. Other validations of the UK model applied to the US cohort are reported in FIGs. 13A-13C. (FIG. 12E) Number of significant positive and negative associations (Spearman’s correlation p<0.2) between foods and taxa categorized by more and less healthy plant-based foods and more and less healthy animal-based foods according to the PDI. Taxa shown are the 20 species with the highest total number of significant associations regardless of category. (FIG. 12F, in two parts) Single Spearman correlations adjusted for BMI and age between microbial species and components of habitual diet with asterisks denoting significant associations (FDR q<0.2). 
The 30 microbial species with the highest number of significant associations across habitual diet categories are reported. All indices of dietary patterns are reported, whereas only food groups and nutrients (energy-adjusted) with at least 7 associations among the top 30 microbial species are reported. Full heatmaps of foods and unadjusted nutrients are reported in FIGs. 14A, 14B, and the full set of correlations is provided in Table 3. The species listed on the y-axis from top to bottom include: R. hominis, Roseburia CAG 182, A. butyriciproducens, A. hadrus, Clostridium CAG 167, R. lactaris, Firmicutes CAG 95, E. eligens, Oscillibacter sp 57 20, H. parainfluenzae, B. animalis, S. thermophilus, B. adolescentis, B. longum, C. leptum, B. bifidum, B. catenulatum, L. asaccharolyticus, Clostridium CAG 58, R. lactatiformans, C. innocuum, C. symbiosum, A. colihominis, F. plautii, P. merdae, Pseudoflavonifractor An184, Anaeromassilibacillus An250, Firmicutes CAG 94, C. saccharolyticum, and C. spiroforme. The x-axis from left to right reads: Meat, Desserts, Sugary drinks, Potatoes, Animal-based, Tea & coffee, Alcohol, Whole grain, Fruits, Legumes, Eggs, Vegetables, Nuts, Lactose, Maltose, Carbohydrates, Sucrose, Starch, Galactose, Vit. B2, Calcium, Vit. B12, Potassium, Phosphorus, Zinc, Selenium, Fructose, Vitamin B1, Folate, Vit. C, Carotene equiv., Beta-carotene, NSP, Manganese, Magnesium, Iron, Vit. E equiv., PUFAs, Copper, U-plant (n), U-plant (%), uPDI, Tot. plants (n), Tot. PDI, Tot. plant (%), H-plant (%), H-plant (n), aMED, hPDI, HEI, Animal score, and HFD. Positive Spearman correlation values are enclosed in dashed outline; asterisks indicate statistical significance.

[0058] FIGs. 13A-13C Top foods, food groups, nutrients, and dietary patterns validated in the PREDICT 1 US cohort. The figures show the application of the RF regression model trained on the PREDICT 1 UK cohort to the PREDICT 1 US participants, validating the associations with food-related variables found in the PREDICT 1 UK cohort.

[0059] FIGs. 14A, 14B Species-level correlation with single foods. The figure shows the species-level correlations (Spearman) with single food quantities as estimated from the food frequency questionnaires. Only foods with at least 5 significant associations (q-value<0.2) are displayed. Species are sorted by the number of significant associations, and the top 30 are reported in the figure. The species listed along the y-axis from top to bottom are: Bifidobacterium animalis, Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Oscillibacter sp 57 20, Ruminococcus lactaris, Oscillibacter sp PC13, Eubacterium eligens, Faecalibacterium prausnitzii, Agathobaculum butyriciproducens, Anaerostipes hadrus, Roseburia hominis, Roseburia sp CAG 182, Harryflintia acetispora, Clostridium saccharolyticum, Clostridium sp CAG 58, Clostridium spiroforme, Pseudoflavonifractor sp An184, Anaeromassilibacillus sp An250, Firmicutes bacterium CAG 94, Clostridium leptum, Bifidobacterium bifidum, Bifidobacterium catenulatum, Alistipes finegoldii, Ruthenibacterium lactatiformans, Clostridium bolteae, Anaerotruncus colihominis, Flavonifractor plautii, Eggerthella lenta, Clostridium innocuum, and Clostridium symbiosum.

[0060] FIGs. 15A-15D. Random forest machine learning models based on microbial or functional profiles are capable of predicting obesity phenotypic markers, even when tested against separate, independent cohorts. (FIG. 15A) Whole-microbiome machine learning models can assess personal factors with RF regression (boxplots and left-side vertical axis) using only taxonomic or functional (i.e. pathway) microbiome features. Classification models (circles and right-side vertical axis) exceed AUC 0.65 except for waist-to-hip ratio (WHR) and smoking. (FIG. 15B) The highest correlations were observed between the relative abundance of microbial species and age, BMI, and visceral fat. The link between microbial features and visceral fat was of greater effect and more often significant than with traditional BMI. (FIG. 15C) Using several independent datasets (Pasolli et al., Nat Methods, 14, 1023-1024, 2017), correlations were confirmed between single microbial species and BMI, with blue points denoting significant associations at p<0.05. (FIG. 15D) The machine learning model for BMI trained on PREDICT 1 data is reproducible in several external datasets (FIG. 10), achieving correlations with true values exceeding those obtained in cross-validation of a single given dataset in five of seven cases. When the PREDICT 1 microbiome model is expanded to include other datasets (excluding those used for testing, i.e. a leave-one-dataset-out/LODO approach), the performance remains comparable, affirming the generalizability of the PREDICT 1 model on obesity-related indicators.
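The leave-one-dataset-out (LODO) evaluation mentioned for FIG. 15D can be sketched as a generator over train/test splits. Only the splitting logic is shown; the function name and the toy data are hypothetical, and model training and scoring are deliberately left out.

```python
def leave_one_dataset_out(samples_by_dataset):
    """Yield (held_out_name, training_samples, test_samples) for each dataset.

    Each dataset is held out once for testing while the samples of all
    remaining datasets are pooled for training.
    """
    for held_out in samples_by_dataset:
        training = [
            sample
            for name, samples in samples_by_dataset.items()
            if name != held_out
            for sample in samples
        ]
        yield held_out, training, list(samples_by_dataset[held_out])


# Toy sample identifiers, for illustration only.
toy = {"A": [1, 2], "B": [3], "C": [4, 5]}
for name, train, test in leave_one_dataset_out(toy):
    print(name, train, test)
```

Because no sample from the held-out dataset reaches the training pool, the resulting score reflects transfer to an unseen cohort rather than within-cohort fit.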

[0061] FIGs. 16A-16H. Fasting and postprandial cardiometabolic responses to standardized test meals associated with the microbiome. (FIG. 16A) The strongest observed links according to correlation of the predicted versus collected measures between the gut microbiome and fasting metabolic blood markers. For measures of lipid concentration in lipoproteins, only the five strongest correlations were reported. Indices are grouped in nine distinct categories, and boxplots report the correlation between the prediction of RF regression models trained on microbial taxa or pathway abundances across 100 training/testing folds. Circles denote AUC values for RF classification, while stars report regressor performance when trained on the UK cohort and evaluated on the independent US validation cohort. (FIG. 16B) RF regression and classification performance in predicting postprandial metabolic responses for clinic Meal 1 (breakfast) measured as iAUC at 6h for triglycerides (TG) and iAUC at 2h for glucose, C-peptide, and insulin. (FIG. 16C) Glycemic-mediated postprandial iAUCs at 2h for the other meals, and (FIG. 16D) glycemic-mediated markers absolute levels vs. rise. (FIG. 16E) Postprandial inflammatory measures (concentration and rise). (FIG. 16F) RF microbiome-based model performance with postprandial changes (concentrations and rise) in lipoprotein concentration, composition, and size. (FIG. 16G) Spearman’s correlation for regression and classification of US validation studies. (FIG. 16H) Fasting and postprandial performance indices (correlation of the regressors’ outputs) were more tightly linked to gut community structure than were their corresponding postprandial rises. (FIGs. 16B-16F) Performance of the microbiome-based ML model in estimating postprandial absolute levels and postprandial increases in cardiometabolic markers. Stars denote regression model results in the US validation cohort for postprandial measurements (not rises; FIGs. 18 and 19).

[0062] FIG. 17 Performance for Random Forest regression and classification on microbiome functional potential in predicting fasting measurements. FIG. 17 shows the performance of both RF regression and classification tasks trained on microbiome gene families’ profiles in predicting the fasting measurements presented in FIG. 16A. Boxplots show the distribution of the Spearman correlations (left axis) between real and predicted values using RF regression. Circles show the median AUC (right axis) of RF classification in predicting the bottom quartile of the distribution vs. the top quartile. Fasting measurements are sorted as in FIG. 16A.
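The two performance measures reported in these figures, the Spearman correlation between real and predicted values and the AUC for separating the bottom from the top quartile, can be computed as in the sketch below. This is a simplified illustration under stated assumptions: it scores a single set of predictions rather than a separately trained classifier, the function names are hypothetical, and no Random Forest model is fitted here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def spearman(real, predicted):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(real), average_ranks(predicted))


def quartile_auc(real, scores):
    """AUC for separating the top quartile of `real` from the bottom quartile.

    Computed as the probability that a top-quartile sample receives a higher
    score than a bottom-quartile sample, with ties counted as one half.
    """
    k = len(real) // 4
    if k == 0:
        raise ValueError("need at least four samples to form quartiles")
    order = sorted(range(len(real)), key=real.__getitem__)
    bottom, top = order[:k], order[-k:]
    wins = sum(
        (scores[t] > scores[b]) + 0.5 * (scores[t] == scores[b])
        for t in top
        for b in bottom
    )
    return wins / (len(top) * len(bottom))
```

An AUC of 0.5 corresponds to scores that carry no information about quartile membership, so values reported in the figures are best read against that baseline.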

[0063] FIG. 18 Random Forest regression and classification performances for total cholesterol in different lipoproteins. The figure shows the performances of both RF regression and classification tasks in predicting the total cholesterol in different size lipoproteins. For each lipoprotein, its concentration values were considered at both fasting and postprandial (6h), and the difference (rise) between the post-prandial concentration and the fasting one. Boxplots show the distribution of the Spearman correlations (left axis) between real and predicted values using RF regression. Circles show the median AUC (right axis) of RF classification in predicting the bottom quartile of the distribution vs. the top quartile. Lipoproteins are sorted descending according to the median of the RF regression for the fasting measure.

[0064] FIG. 19 Random Forest regression and classification performances for triglycerides in different lipoproteins. The figure shows the performances of both RF regression and classification tasks in predicting triglycerides in different size lipoproteins. For each lipoprotein, its concentration values were considered at both fasting and postprandial (6h), and also the difference (rise) between the postprandial concentration and the fasting one. Boxplots show the distribution of the Spearman correlations (left axis) between real and predicted values using RF regression. Circles show the median AUC (right axis) of RF classification in predicting the bottom quartile of the distribution vs. the top quartile. Lipoproteins are sorted descending according to the median of the RF regression for the fasting measure.

[0065] FIGs. 20A-20D. Species-level segregation into healthy and unhealthy microbial signatures of fasting and postprandial cardiometabolic markers. (FIG. 20A) Associations (Spearman correlation, q<0.2 marked with stars) between single microbial species and fasting clinical risk measures and (FIG. 20B, in two parts) glycemic, inflammatory, and lipemic indices. (FIG. 20C) Correlation between microbial species and the iAUC for glucose and C-peptide estimations based on clinical measurements before and after standardized meals. The 30 species with the highest number of significant correlations with distinct fasting and postprandial indices are shown. In each of FIGs. 20A-20C, positive Spearman correlation values are enclosed in dashed outline; asterisks indicate statistical significance. (FIG. 20D) Microbe-metabolite correlations are very consistent when evaluated for fasting versus postprandial (6h) conditions (left panel). Associations with postprandial variations (rise) conversely often show opposing relationships, with several species positively correlated with fasting measures being negatively correlated with postprandial variation of the same metabolite (or vice versa, central panel). This was mitigated somewhat when comparing absolute postprandial responses with rise (right panel).

[0066] FIG. 21 (in two parts) Species-level correlations with total lipids in lipoproteins. The heatmap shows the species-level correlations with total lipids in lipoprotein variables at fasting, postprandial (6h) and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR<0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using t-test, corrected with FDR with q<0.2. The species listed along the y-axis from top to bottom are: Ruminococcus gnavus, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae sp_CAG_58, Clostridium innocuum, Prevotella copri, Firmicutes bacterium_CAG_170, Roseburia sp_CAG_182, Firmicutes bacterium_CAG_95, Haemophilus parainfluenzae, Coprobacter secundus, Oscillibacter sp_PC13, Faecalibacterium prausnitzii, Veillonella parvula, Turicibacter sanguinis, Oscillibacter sp 57 20, Clostridium disporicum, and Firmicutes bacterium CAG 110. Positive Spearman correlation values are enclosed in dashed outline; asterisks indicate statistical significance.

[0067] FIG. 22 (in two parts) Species-level correlations with total cholesterol in lipoproteins. The heatmap shows the species-level correlations with total cholesterol in lipoprotein variables at fasting, postprandial (6h) and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR<0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using t-test, corrected with FDR with q<0.2. The species listed along the y-axis from top to bottom are: Clostridium citroniae, Hungatella hathewayi, Clostridium sp_CAG_58, Gemella sanguinis, Blautia hydrogenotrophica, Eggerthella lenta, Bacteroides uniformis, Eisenbergiella tayi, Ruthenibacterium lactatiformans, Clostridium spiroforme, Flavonifractor plautii, Clostridium bolteae, Ruminococcus gnavus, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae_CAG_59, Clostridium innocuum, Prevotella copri, Firmicutes bacterium CAG 170, Roseburia sp CAG 182, Firmicutes bacterium CAG 95, Haemophilus parainfluenzae, Coprobacter secundus, Oscillibacter sp_PC13, Faecalibacterium prausnitzii, Veillonella parvula, Turicibacter sanguinis, Oscillibacter sp_57_20, Clostridium disporicum, and Firmicutes bacterium_CAG_110. Positive Spearman correlation values are enclosed in dashed outline; asterisks indicate statistical significance.

[0068] FIG. 23 (in two parts) Species-level correlations with triglycerides in lipoproteins. The heatmap shows the species-level correlations with triglycerides in lipoprotein variables at fasting, postprandial (6h) and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR<0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using t-test, corrected with FDR with q<0.2. The species listed along the y-axis from top to bottom are the same as those listed in FIG. 22. Positive Spearman correlation values are enclosed in dashed outline; asterisks indicate statistical significance.

[0069] FIG. 24 (in two parts) Gene families’ correlations with clinical and metabolic risk scores, glycemic and inflammatory measures, and lipoproteins. The heatmap shows gene families’ correlations with the set of metadata presented in FIGs. 20A-20C, reporting the top 2,000 genes, selected from those with at least 20% prevalence, ranked by their number of significant correlations (q<0.2). Gene families’ correlations show the same clusters as the species-level correlations in FIGs. 20A-20C. A color version of this Figure can be found in Asnicar et al. (Nat Med. 27:321-332, 2021).

[0070] FIG. 25 (in two parts) Pathway abundances correlations with clinical and metabolic risk scores, glycemic and inflammatory measures, and lipoproteins. The heatmap shows pathway abundances correlations with the set of metadata presented in FIGs. 20A-20C, reporting all the pathways at 20% prevalence (349 in total). Pathway abundances correlations show the same cluster structure as the species-level correlations in FIGs. 20A-20C. A color version of this Figure can be found in Asnicar et al. (Nat Med. 27:321-332, 2021).

[0071] FIGs. 26A-26F Concordance of Random Forest scores with species-level partial correlations. Volcano plots of the scores assigned to each species by Random Forest and their partial correlation, showing an overall concordance between the two independent approaches. The top 5 metadata variables were considered for the six metadata categories: (FIG. 26A) Foods, bacon (g) (corr. 0.496), unsalted nuts (g) (0.466), pork (g) (0.424), dark chocolate (g) (0.41), and garlic (g) (0.401). (FIG. 26B) Food groups, nuts (0.436), legumes (0.403), meat (0.393), sweets and desserts (0.369), and potatoes (0.323). (FIG. 26C) Nutrients, polyunsaturated fatty acids (FAs) (g) (0.524), vitamin B12 (pg) (0.406), niacin equivalent (mg) (0.406), cis-polyunsaturated FAs (g) (0.358), and starch (g) (0.351). (FIG. 26D) Nutrients normalized by energy intake, polyunsaturated FAs (g %E) (0.528), fat (g %E) (0.512), vitamin B12 (pg %E) (0.48), niacin equivalent (mg %E) (0.462), and cis-polyunsaturated FAs (g %E) (0.436). (FIG. 26E) Dietary patterns, healthy PDI (0.528), unhealthy PDI (0.381), healthy plant percentage (0.373), unhealthy plants number (0.363), and total PDI (0.361). (FIG. 26F) Lipoproteins, ApoA1 6h rise (0.493), XL-VLDL-TG 6h (0.413), VLDL-D 6h (0.396), M-HDL-TG 6h (0.393), and M-VLDL-TG 6h (0.387). VLDL = very low density lipoprotein. Key - filled dots are those for which the correlation coefficient is statistically significant.

[0072] FIGs. 27A-27E Prevotella copri and/or Blastocystis spp. presence are indicators of a more favorable postprandial glucose response to meals. (FIGs. 27A-27C) Differential analysis of visceral fat, HFD and glucose iAUC 2h after standardized breakfast according to presence-absence of one or both of P. copri and Blastocystis spp. The analysis reveals that both these species are indicators of reduced visceral fat, good cholesterol and meal-driven increase of glucose. (FIGs. 27D-27E) Differential analysis of C-peptide and triglycerides at different time points according to presence-absence of one or both of P. copri and Blastocystis spp. The distributions of the concentrations for C-peptide and triglycerides were typically lower when one or both are absent. An asterisk between two boxplots represents a significant p-value (p<0.05) according to the Mann-Whitney U test (Table 4). In FIGs. 27A-27E, the left bar of each pair is “Absent”; the right bar of each pair is “Present”.

[0073] FIG. 28 (in two parts) The panel of 30 species showing the strongest overall correlations with a selection of markers of nutritional and cardiometabolic health. The 30 species with the highest and lowest average ranks with diverse positive and negative health indicators, respectively, are shown here. The rank of each microbe’s correlation with individual health indicators is written within cells when significant (p<0.05). For each of the main categories of indices, up to five representative quantitative markers were selected (for “Personal” only four were considered as the remaining were highly correlated with visceral fat or not relevant in this context). Indices can be considered “positive” and “negative” depending on whether higher or lower values are a proxy for more or less healthy conditions. A color version of this Figure can be found in Asnicar et al. (Nat Med. 27:321-332, 2021).

[0074] Several of FIGs. 9-28, or versions thereof, were published in Asnicar et al. (Nat Med. 27:321-332, 2021, Epub 11 January 2021; which is incorporated herein by reference for all it teaches); at least some of these Figures may be clearer in color, as they are depicted in Asnicar et al., and Applicant considers that color information to be included in this filing.

DETAILED DESCRIPTION

[0075] Using the technologies described herein, microbiome data associated with an individual and other data are analyzed to generate a microbiome fingerprint, a dietary fingerprint, and microbiome ancestry data for a user. As used herein, a “microbiome fingerprint” is data that uniquely identifies the microbiome of a user at a particular point in time, and a “dietary fingerprint” is data that identifies how the microbiome of a user at a particular point in time is associated with one or more different indexes associated with a diet and/or health characteristics. The indexes may include, but are not limited to, a Mediterranean diet index, a vegetarian diet index, a fast food index, an internal fat index, a fat-digesting index, a carbohydrate-digesting index, a health index, a fasting index, a ketogenic index, and the like. According to some configurations, one or more computers of a microbiome service generate a score, such as from 0-100 (or some other indicator), that indicates how closely the microbiome of the user is associated with a particular index.

[0076] As an example, the Mediterranean diet index score for a user indicates how closely the microbiome of the user resembles the typical microbiome of someone on a Mediterranean diet. The vegetarian diet index score indicates how closely the microbiome of the user resembles that of someone on a vegetarian diet. The fast food index score indicates how closely the microbiome of the user resembles that of someone on a fast food diet. The internal fat index score indicates how closely the microbiome of the user resembles that of someone with high or low visceral fat. The fat-digesting index score indicates how closely the microbiome of the user resembles that of someone with low postprandial triacylglycerol (TAG) rises. The carbohydrate-digesting index score indicates how closely the microbiome of the user resembles that of someone with low postprandial glucose rises. The health index score indicates how closely the microbiome of the user resembles that of someone who is healthy. The fasting index score indicates how closely the microbiome of the user resembles that of someone who fasts regularly. The ketogenic index score indicates how closely the microbiome of the user resembles that of someone on a ketogenic diet.
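One way such a 0-100 index score could be computed is sketched below. The description does not specify the scoring method, so everything here is an assumption made for illustration: the hypothetical `index_score` function, the use of a single reference abundance profile per index, and the choice of one minus the Bray-Curtis dissimilarity as the similarity measure.

```python
def index_score(user_profile, reference_profile):
    """Map similarity to a hypothetical reference profile onto a 0-100 scale.

    Both arguments are dicts of taxon name -> relative abundance. The score is
    100 * (1 - Bray-Curtis dissimilarity), rounded to the nearest integer.
    """
    taxa = set(user_profile) | set(reference_profile)
    difference = sum(
        abs(user_profile.get(t, 0.0) - reference_profile.get(t, 0.0)) for t in taxa
    )
    total = sum(
        user_profile.get(t, 0.0) + reference_profile.get(t, 0.0) for t in taxa
    )
    if total == 0:
        return 0  # no abundance information available to compare
    return round(100 * (1 - difference / total))


# Hypothetical profiles, for illustration only.
mediterranean_reference = {"taxon_a": 1.0}
user = {"taxon_a": 0.5, "taxon_b": 0.5}
print(index_score(user, mediterranean_reference))
```

A trained model, such as a classifier whose predicted probability is rescaled to 0-100, would be an equally plausible choice; the sketch only shows the shape of the mapping from a microbiome profile to a bounded score.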

[0077] The microbiome service may utilize microbiome data generated from a microbiome sample and/or other data to generate a microbiome fingerprint, dietary fingerprint, and/or microbiome ancestry data for a user, or for a delegate of a user. For example, the microbiome service may perform an analysis of the microbiome data associated with a microbiome sample to identify the microbial composition (e.g., the species, genes, taxa, and the like); such identification may include the unique, detailed characterization of each and every microbial strain in the sample, but it is not necessary to identify every strain present in the sample. For instance, the analysis of the microbiome data may identify as few as 2% of the strains in the sample; as few as 5%, as few as 8%, as few as 10%, as few as 15%, as few as 20%, or more than 30% of the strains in the sample. In certain embodiments, the characterization will identify more than 25% of the strains; for instance, more than 30%, more than 35%, more than 40%, more than 45%, more than 50%, more than 55%, more than 60%, more than 65%, more than 70%, more than 75%, more than 80%, more than 85%, more than 90%, or even more than 95% of the strains in the sample.

[0078] In some examples, some or all of the analysis of the microbiome service may be performed by a service provider that is external to the microbiome service. The microbiome service may obtain this portion of the microbiome data from the external service provider(s). The microbiome service may also generate reconstructed microbial genomes, determine a diversity of the microbiome, identify functions of the microbiome, identify a uniqueness of the microbiome, identify interesting species, and the like.

[0079] In some examples, the microbiome data of the user is utilized with other data that is gathered about the user, as well as other users. For instance, users may provide responses to questionnaires, data about food that is eaten, data about supplements or medicines that are taken, sleep habits, and the like.

[0080] Among other uses, data in addition to the microbiome data may be utilized to assist in determining a “microbiome ancestry” of a user. A “microbiome ancestry” for a user indicates that the user has relationships with other users and/or locations based on a similarity of the microbiome data (e.g., the microbiome fingerprint) for a particular user with other users.

[0081] In some examples, the microbiome service generates a microbiome ancestry by analyzing the microbiome data of the user and determining how closely the microbiome of the user is related to one or more other users and/or locations. For instance, the microbiome service may determine a number of other users to which the microbiome of the user is most closely related. In some configurations, the microbiome service compares the microbiome data, such as the microbiome fingerprint, of the user to microbiome data, such as the microbiome fingerprints, of other users to determine whether the user is related to any of the other users.
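The comparison of one fingerprint against those of other users can be sketched as a ranking by similarity. In this illustration a fingerprint is reduced to a set of strain identifiers and similarity is the Jaccard index; both choices, along with the function names and example data, are assumptions and not the method of the microbiome service.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of strain identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_users(user_strains, other_users, n=3):
    """Return the n most similar users as (name, similarity), best first."""
    ranked = sorted(
        other_users.items(),
        key=lambda item: jaccard(user_strains, item[1]),
        reverse=True,
    )
    return [(name, jaccard(user_strains, strains)) for name, strains in ranked[:n]]


# Hypothetical strain identifiers, for illustration only.
others = {
    "user_1": {"s1", "s2", "s3"},
    "user_2": {"s1"},
    "user_3": {"s9"},
}
print(closest_users({"s1", "s2", "s3"}, others, n=2))
```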

[0082] As briefly discussed, the microbiome service may also identify one or more locations with which the microbiome of the user is associated. For example, the microbiome service may identify the countries the microbiome of the user is associated with (e.g. 75% North America, 25% Mexico). This identification may be based on microbiome data of users at different locations and/or different populations (e.g., English, American, French, Mexican, Italian, ...). For instance, the microbiome service may determine that the microbiome fingerprint of the user is more similar to a microbiome of a user in France even though the user is from England.
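A percentage breakdown by location, of the kind given in the example above, could be derived from the locations of the most similar reference users. The sketch below is one hypothetical approach: the `location_breakdown` function, the nearest-neighbour vote, the Jaccard similarity, and the value of k are all assumptions, since the description does not state how the percentages are obtained.

```python
from collections import Counter


def location_breakdown(user_strains, reference_users, k=4):
    """Percentage of the k most similar reference users found at each location.

    reference_users is a list of (location, strain_set) pairs.
    """
    user = set(user_strains)

    def similarity(strains):
        union = user | set(strains)
        return len(user & set(strains)) / len(union) if union else 0.0

    nearest = sorted(
        reference_users, key=lambda entry: similarity(entry[1]), reverse=True
    )[:k]
    counts = Counter(location for location, _strains in nearest)
    return {
        location: 100.0 * count / len(nearest)
        for location, count in counts.items()
    }


# Hypothetical reference users, for illustration only.
references = [
    ("France", {"a", "b", "c"}),
    ("France", {"a", "b"}),
    ("France", {"a", "c"}),
    ("England", {"a"}),
    ("Italy", {"z"}),
]
print(location_breakdown({"a", "b", "c"}, references, k=4))
```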

[0083] According to some configurations, a user may “opt-in” to allow use of the microbiome data and/or other data associated with a user. In some examples, the user “opts-in” to participate in a social network and/or some other communication mechanism to discuss issues related to the microbiome data such as a microbiome ancestry (e.g., compare diets and background with other users). The microbiome service may also compare the microbiome of the user with other family members, and/or other users when the users have “opted-in” to allow this. For instance, the microbiome service may identify how many strains they share (with respect to sharing with unrelated persons) and overall how similar they are compared to the average.
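The opt-in condition and the strain-sharing comparison can be sketched together. The `UserRecord` type, its field names, and the use of a mean over unrelated users as the baseline are hypothetical choices made for illustration; the comparison is only performed when both users have opted in.

```python
from dataclasses import dataclass, field


@dataclass
class UserRecord:
    """Hypothetical record holding a user's opt-in flag and detected strains."""
    name: str
    opted_in: bool
    strains: set = field(default_factory=set)


def shared_strain_count(a, b):
    """Number of strains shared by two users, or None unless both opted in."""
    if not (a.opted_in and b.opted_in):
        return None
    return len(a.strains & b.strains)


def sharing_vs_unrelated(user, relative, unrelated):
    """Return (strains shared with relative, mean shared with unrelated users).

    Unrelated users who have not opted in are skipped. Returns None when the
    comparison is not permitted or no baseline is available.
    """
    pair = shared_strain_count(user, relative)
    baseline = [
        count
        for count in (shared_strain_count(user, other) for other in unrelated)
        if count is not None
    ]
    if pair is None or not baseline:
        return None
    return pair, sum(baseline) / len(baseline)
```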

[0084] In some examples, the microbiome service may provide a user interface (UI), such as a graphical user interface (GUI), for a user to view and interact with microbiome data and/or other data associated with the microbiome fingerprints, dietary fingerprints, and microbiome ancestry. For instance, the GUI may display microbiome fingerprint data that shows various characteristics of the microbiome fingerprint, dietary fingerprint data that shows various characteristics of the dietary fingerprint, microbiome ancestry data that shows various characteristics of the microbiome ancestry, recommendation data that identifies one or more recommendations relating to changing the microbiome of the user, and the like.

[0085] As an example, the microbiome service may provide recommendations to increase the diversity of foods eaten, as there is no one good food for a healthy microbiome. The recommendations may include to eat different gut-healthy foods, eat fermented foods, minimize highly processed foods (things like emulsifiers and artificial sweeteners may affect the microbiome), consume prebiotic substances, administer a probiotic preparation, or any combination thereof. The microbiome service may base the recommendations on data obtained from the user, from other users, or from both.

[0086] The microbiome service may also track the state of the microbiome of the user over time. For example, the microbiome service may provide data related to different microbiome analyses. In this way, the user may see how changes made by the user (e.g., eating different foods, changing exercise patterns, consuming prebiotic substance(s), taking a probiotic preparation, and so forth) have affected the microbiome.

[0087] Additional details regarding the various components and processes described above relating to generating microbiome fingerprints, dietary fingerprints, and microbiome ancestry are presented below with regard to FIGS. 1-8.

[0088] It will be appreciated that the subject matter presented herein may be implemented as a computer process, a computer-controlled apparatus, a computing system, or an article of manufacture, such as a computer-readable storage medium. While the subject matter described herein is presented in the general context of program modules that execute on one or more computing devices, those skilled in the art will recognize that other implementations may be performed in combination with other types of program modules. Generally, program modules include routines, programs, components, data structures and other types of structures that perform particular tasks or implement particular abstract data types.

[0089] Those skilled in the art will also appreciate that aspects of the subject matter described herein may be practiced on or in conjunction with other computer system configurations beyond those described herein, including multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, handheld computers, personal digital assistants, e-readers, mobile telephone devices, tablet computing devices, special-purposed hardware devices, network appliances and the like.

[0090] In the following detailed description, references are made to the accompanying drawings that form a part hereof, and that show, by way of illustration, specific examples. The drawings herein are not drawn to scale. Like numerals represent like elements throughout the several figures (which may be referred to herein as a “FIG.” or “FIGs.”).

[0091] Provided below is additional description in support of this technology, which is organized in the following sections: (I) Generation, Collection, and Analysis of Microbiome Data; (II) Representative Computer Architecture; (III) Detection and Identification of Individual Microbes; (IV) Methods of Use; (V) Kits and Arrays; (VI) Systems; (VII) Exemplary Embodiments; (VIII) Example(s); (IX) Incorporation of Appendix I; and (X) Closing Paragraphs.

[0092] (I) Generation, Collection, and Analysis of Microbiome Data.

[0093] FIG. 1 is a block diagram depicting an illustrative operating environment 100 in which microbiome data is analyzed to generate microbiome fingerprints, dietary fingerprints, and microbiome ancestry for users. An individual, such as an individual interested in obtaining microbiome fingerprints, dietary fingerprints, and microbiome ancestry information, may communicate with the nutritional environment 106 using a computing device 102 and possibly other computing devices, such as mobile electronic devices.

[0094] In some configurations, an individual may generate and provide data 108, such as microbiome data, test data, and/or other data. According to some examples, the user may utilize a variety of at home biological collection devices, which collect a biological sample. These devices may include but are not limited to “At Home Blood Tests” which use blood extraction devices such as finger pricks which in some examples are used with dried blood spot cards, button operated blood collection devices using small needles and vacuum to collect liquid capillary blood and the like. In some examples there may be home biological collection devices such as a stool test which is then assayed to produce biomarker test data such as gut microbiome data. As exemplified herein, the subject from which the biological sample is obtained may be a human subject. Other animal subjects are also contemplated, including non-human primates, companion animals, domestic animals, livestock, endangered and threatened animals, laboratory animals, and so forth.

[0095] A computing device, such as a mobile phone or a tablet computing device, can also be used to improve the accuracy of the measurements. For instance, instead of relying on an individual to accurately record the time a test was taken or a sample was obtained, the computing device 102 can record information that is associated with the event. The computing device 102 may also be utilized to capture the timing data associated with the test or sample (e.g., the time the test was performed or the sample was collected, ...), and provide that data to a data ingestion service 110. As an example, a clock (or some other timing device) of the computing device 102 may be used to record the time the measurement(s) were collected and/or samples were obtained.
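By way of a non-limiting illustration, the device-side time capture described above might be sketched as follows. The record layout and the names SampleEvent and record_event are hypothetical and are not part of the disclosure; the sketch only assumes that the device clock can supply a timezone-aware timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SampleEvent:
    """One test or sample-collection event (hypothetical record layout)."""
    user_id: str
    event_type: str        # e.g. "stool_sample" or "blood_test"
    recorded_at: datetime  # read from the device clock, stored in UTC


def record_event(user_id: str, event_type: str,
                 now: Optional[datetime] = None) -> SampleEvent:
    """Stamp the event with the device clock instead of a user-entered time."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    # Normalize to UTC so events from different devices are comparable.
    return SampleEvent(user_id, event_type, timestamp.astimezone(timezone.utc))
```

A record produced this way could then be submitted with the other data 108; how the record is transmitted is left open here.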

[0096] As illustrated in FIG. 1, the operating environment 100 includes one or more computing devices 102, in communication with a nutritional environment 106. In some examples, the nutritional environment 106 may be associated with and/or implemented by resources provided by a service provider network such as provided by a cloud computing company. The nutritional environment 106 includes a data ingestion service 110, a microbiome service 120, a nutritional service 132, and a data store 140. The nutritional service 132 can be utilized to generate personalized nutritional recommendations. For example, the personalized nutritional recommendations can be generated using techniques described in U.S. Patent Publication No. US 2019-0252058 A1, published August 15, 2019. According to some examples, the nutritional service 132 may provide recommendations based on the microbiome fingerprint, dietary fingerprint, microbiome ancestry data and/or other data.

[0097] The nutritional environment 106 may include a collection of computing resources (e.g., computing devices such as servers). The computing resources may include a number of computing, networking and storage devices in communication with one another. In some examples, the computing resources may correspond to physical computing devices and/or virtual computing devices implemented by one or more physical computing devices.

[0098] It should be appreciated that the nutritional environment 106 may be implemented using fewer or more components than are illustrated in FIG. 1. For example, all or a portion of the components illustrated in the nutritional environment 106 may be provided by a service provider network (not shown). In addition, the nutritional environment 106 could include various Web services and/or peer to peer network configurations. Thus, the depiction of the nutritional environment 106 in FIG. 1 should be taken as illustrative and not limiting to the present disclosure.

[0099] The data ingestion service 110 facilitates submission of data utilized by the microbiome service 120 and, in some configurations, the nutritional service 132. Accordingly, utilizing a computing device 102, an electronic collection device, or an at home biological collection device, or via in clinic biological collection, an individual may submit data 108 to the nutritional environment 106 via the data ingestion service 110. Some of the data 108 may be sample data or biomarker test data, and some of the data 108 may be non-biomarker test data such as photos, barcode scans, timing data, and the like.

[0100] A “biomarker” or biological marker generally refers to one or more measurable indicators (that may be combined using various techniques) of some biological state or condition associated with an individual. Stated another way, a biomarker may be anything that can be used as an indicator of particular disease, condition, health, state, or some other physiological state of an organism. A biomarker typically can be measured accurately (either objectively and/or subjectively) and the measurement is reproducible. By way of example, the following are considered biomarkers: blood glucose, triglycerides (TG), insulin, c-peptides, ketone body ratios, IL-6 inflammation markers, the expression of any specified gene or protein, hunger, fullness, body mass index (BMI), composition of a microbiome (including not only what strains are present, but the relative abundance of two or more strains in a microbiome), and the like. In practice, a good biomarker is often a combination of two or more measurable indicators combined in a simple or complex way; in some cases, the combination of more than one measurable indicator makes the biomarker more closely linked to the disease, condition, health, state, or some other physiological state of an organism.

[0101] The measured biomarkers can include many different types of health data such as microbiome data which may be referred to herein as “microbiome data”, blood data, glucose data, lipid data, nutrition data, wearable data, genetic data, biometric data, questionnaire data, psychological data (e.g., hunger, sleep quality, mood, ...), objective health data (e.g., age, sex, height, weight, medical history, ...), as well as other types of data. Generally, “health data” refers to any psychological, subjective, and/or objective data that relates to and is associated with one or more individuals. The health data might be obtained through testing, self-reporting, and the like. Some biomarkers change in response to eating food, such as blood glucose, insulin, c-peptides, and triglycerides and their lipoprotein components.

[0102] To understand the differences in nutritional responses for different users, dynamic changes in biomarkers caused by eating food such as a standardized meal (“postprandial responses”) can be measured. By understanding an individual’s nutritional responses, in terms of blood biomarkers such as glucose, insulin, and triglyceride levels, or non-blood biomarkers such as the microbiome, a nutritional service may be able to choose or recommend food(s) that is/are more suited for that particular person.
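As a non-limiting illustration of how a postprandial response might be summarized as a single number, the following Python sketch computes an incremental area under the curve by the trapezoidal rule. This particular summary is an assumption introduced for illustration and is not stated in the description above; the function name incremental_auc is hypothetical, the first reading is assumed to be the fasting baseline, and readings below baseline are clipped to zero as a simplification.

```python
from typing import Sequence


def incremental_auc(times_min: Sequence[float], values: Sequence[float]) -> float:
    """Trapezoidal area of the rise above the first (baseline) reading.

    times_min: measurement times in minutes, strictly increasing.
    values:    biomarker readings (e.g. blood glucose) at those times.
    """
    if len(times_min) != len(values) or len(values) < 2:
        raise ValueError("need at least two paired readings")
    baseline = values[0]
    # Clip dips below baseline to zero (a simplifying assumption).
    rises = [max(v - baseline, 0.0) for v in values]
    area = 0.0
    for i in range(1, len(rises)):
        dt = times_min[i] - times_min[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += 0.5 * (rises[i] + rises[i - 1]) * dt
    return area
```

For readings of 5.0, 7.0, 6.0, and 5.0 taken at 0, 30, 60, and 120 minutes, the function returns 105.0 in units of the reading multiplied by minutes.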

[0103] Data may also be obtained by the data ingestion service 110 from other data sources, such as data source(s) 150. For example, the data source(s) 150 can include, but are not limited to microbiome data associated with one or more users, nutritional data (e.g., nutrition of particular foods, nutrition associated with the individual, and the like), health data records associated with the individual and/or other individuals, and the like.

[0104] The data, such as data 108, or data obtained from one or more data sources 150, may then be processed by the data manager 112 and/or the microbiome manager 122 and included in a memory, such as the data store 140. As illustrated, the data store 140 can be configured to store user microbiome data 140A, other users’ microbiome data 140A2, and other data 140B (see FIG. 2 for more details on the data ingestion service 110). In some examples, the user microbiome data 140A and other users’ microbiome data 140A2 include microbiome data.

[0105] As discussed in more detail below (see FIGs. 3-7 for more details), the microbiome service 120, utilizing the microbiome manager 122, the microbiome analyzer 124, the microbiome finger printer 126, the microbiome dietary finger printer 128, and the microbiome ancestry manager 130, analyzes the data 108 associated with a user and generates a microbiome fingerprint, a dietary fingerprint, and microbiome ancestry data for the user. According to some configurations, the microbiome service 120 utilizes both data 108 associated with the user and data from other users.

[0106] In some examples, the microbiome manager 122 may utilize one or more machine learning mechanisms. For example, the microbiome manager 122 can use a classifier to classify the microbiome within a classification category (e.g., associate with a particular dietary index, a geographic location, ...). In other examples, the microbiome manager 122 may use a scorer to generate scores that may provide an indication of the dietary index associated with a user, how closely related the user is to other users based on the microbiome data, and the like.

[0107] The data ingestion service 110 and/or the microbiome service 120 can generate one or more user interfaces, such as a user interface 104 and/or user interface 104B, through which an individual, utilizing the computing device 102, or some other computing device, may provide data to, or receive data from, the nutritional environment 106. For example, the data ingestion service 110 may provide a user interface 104 that allows an individual using the computing device 102A to submit data 108 to the nutritional environment 106.

[0108] In some cases, the individual can also provide biological samples to a lab for testing, for instance using a biological collection device. According to some configurations, this will include At Home Blood Tests. According to some configurations, individuals can provide a sample (such as a stool sample) for microbiome analysis. As an example, metagenomic testing can be performed using the sample to allow the DNA of the microbes in the microbiome of an individual to be digitized. Generally, a microbiome analysis includes determining the composition and functional potential (here called just “function”) of a community of microbes in a particular location, such as within the gut of an individual. An individual’s microbiome appears to have a strong relationship to metabolism, weight, and health, yet only ten to thirty percent of the bacterial species in a microbiome are estimated to be common across different individuals. Embodiments described herein combine different techniques to assist in improving the accuracy of the data captured outside of a clinical setting, such as calculating accurate glucose responses to individual meals, which can then be linked to measures like the microbiome.

[0109] According to some configurations, individuals can provide a sample or samples of their stool for microbiome analysis as part of the at home biological collection. In some cases, this sample may be collected without using a chemical buffer. The sample can then be used to culture live microbes, or for chemical analysis such as for metabolites or for genetic related analysis such as metagenomic or metatranscriptomic sequencing. In such cases, the sample may suffer from changes in microbial composition due to causes including microbial blooming from oxygen in the period between collection and receipt at the lab, where it usually will be immediately assayed or frozen. In some cases, to avoid this change in bacterial composition after collection, the sample obtained at home may be frozen at low temperatures very rapidly after collection. The sample can then be used to culture live bacteria, or for chemical analysis or for metagenomic sequencing. This collection can be done as part of an in clinic biological collection or at home where the collection kit is configured to deliver such low temperatures and maintain them until a courier has taken the sample to a lab.

[0110] A stool sample may be combined with a chemical preservation buffer, such as ethanol, as part of the at home collection process to stop further microbial activity, which allows a sample to be kept at room temperature before being received at the lab where the assay is done. In some examples, the buffer may be a proprietary chemical product sold and validated by another company for the task of halting microbial activity while still allowing the sample to be processed for metagenomic sequencing. A buffer allows for such a sample to be posted in the mail without (or with minimal) issues of microbial blooming or other continuing changes in microbial composition. The buffer may, however, prevent some biochemical analyses from being done, and because preservation buffers are likely to kill a large fraction of the microbial population, it is unlikely that samples conserved in preservation buffers can be used for cultivation assays.

[0111] In some cases, a user may do multiple stool tests over time, so that changes in the microbiome over time can be measured, or changes in the microbiome in response to meals, or changes in the microbiome in response to other clinical or lifestyle variations.

[0112] In some examples, the stool sample is collected using a scoop or swab from a stool that is collected by the user using a stool collection kit that prevents the stool from contamination, such as for instance the contamination that would occur from stool falling into a toilet. Because there is a very high microbial load in the gut microbiome compared, for example, to the skin microbiome, it is also possible that in some cases the stool sample is taken from paper that is used to clean the user’s behind after they have passed a stool. This is only possible if the quantity of stool is large enough that the microbes from the stool greatly exceed the microbes that will be picked up from the user’s skin or environmental contaminants. In any of these cases the scoop, swab, or tissue may be placed inside a collection device, such as a vial that contains a buffer solution. If the user ensures the stool comes into contact with the buffer, for example by shaking, then further microbial activity is stopped and the solution can be kept at room temperature without a significant change in microbial composition.

[0113] In some cases, a sterile synthetic tissue is used that, unlike paper, does not have biological origins, so that when the DNA of the sample is extracted there is no contamination from DNA originating in the tissue.

[0114] According to some examples, the tissue is impregnated with a liquid to help capture more stool from the user’s skin, where the liquid does not interfere with the results of the stool test and is not potentially dangerous for the human body.

[0115] In some cases, the timing and quality of the stool sample can be recorded using the computing device 102, for example using a camera. Where there are multiple stool tests the computing device 102 can use a barcode (or some other identifier) to confirm the timing and identity of that particular sample. Other data can also be collected. For example, data about how the sample was stored, how long the sample was stored before being supplied to the lab for analysis, and the like.

[0116] While the data ingestion service 110, the microbiome service 120, and the nutritional service 132 are illustrated separately, all or a portion of these services may be located in other locations or together with other components. For example, the data ingestion service 110 may be located within the microbiome service 120. Similarly, the microbiome manager 122 may be part of a different service, and the like.

[0117] According to some examples, some individuals may be asked to visit a clinic to combine at home data with data collected at a clinic. The purpose of the clinic visit is to allow much higher accuracy of measurement for a subset of the individual’s data, which can then be combined with the lower quality at home data. This may be used by the microbiome service 120 to improve the quality of the at home data.

[0118] According to some examples, the day before the visit to the clinic, the individuals are asked to avoid taking part in any strenuous exercise and to limit the intake of alcohol. In some configurations, the microbiome service 120 can analyze the data 108, such as data obtained from an activity tracker, to determine whether the individual followed the instructions of avoiding strenuous exercise. Similarly, the nutritional service 132, or some other device or component, may analyze the foods eaten by the individual by analyzing food data that indicates the foods eaten by the user. Individuals may be provided with instructions for the tests (e.g., avoid eating high fat or high fiber meals that may interfere with test results, fasting, drinking water, ...).

[0119] As described in more detail below with regard to FIGs. 4 and 5, the microbiome service 120 may use the microbiome manager 122 to generate a microbiome fingerprint, and a dietary fingerprint for a user. As discussed above, a “microbiome fingerprint” is data that uniquely identifies the microbiome of a user at a particular point in time. According to some configurations, the microbiome finger printer 126 generates a microbiome fingerprint from a user based on different profiles generated from the microbiome data, such as but not limited to quantitative taxonomic profiles, quantitative functional potential profiles, and strain-level genomic profiles. In some examples, the profiles are generated by the microbiome finger printer 126 and/or the microbiome analyzer 124.

[0120] According to some configurations, the microbiome fingerprint is a combination of descriptors, including, but not limited to (1) the quantitative (i.e. relative abundance) taxonomic profiles (i.e., the names or more generally identifiers (IDs) in case of unknown entities of microbial species or other taxonomic units), (2) the quantitative (i.e. relative abundance) functional potential profiles, (i.e., the names or generally identifiers (IDs) in case of unknown entities of microbial gene families, microbial pathways, and microbial functional modules), and (3) the strain-level genomic profiles (i.e., the reconstruction of the genomes or part of the genomes of as many microbes present in the microbiome as possible).
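As a non-limiting illustration, the three descriptors listed above could be held together in a single record. The following Python sketch is one possible layout under stated assumptions: the class name MicrobiomeFingerprint, its field names, and the helper to_relative_abundance are hypothetical, and the profiles are assumed to be simple mappings from an identifier to a value.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def to_relative_abundance(counts: Dict[str, float]) -> Dict[str, float]:
    """Scale raw per-feature counts so that the values sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return {feature: value / total for feature, value in counts.items()}


@dataclass
class MicrobiomeFingerprint:
    """Hypothetical container for the three descriptor types."""
    sample_id: str
    # (1) taxon ID -> relative abundance
    taxonomic_profile: Dict[str, float]
    # (2) gene family / pathway / module ID -> relative abundance
    functional_profile: Dict[str, float]
    # (3) taxon ID -> reconstructed genome fragments (sequences)
    strain_genomes: Dict[str, List[str]] = field(default_factory=dict)
```

Keeping identifiers rather than names as keys accommodates the unknown entities mentioned above, which have an ID but no name.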

[0121] The microbiome fingerprint may be generated by the microbiome finger printer 126 using various techniques and methods. In some configurations, generation of the microbiome fingerprint includes obtaining the microbiome sample, generating DNA from the sample, preprocessing the raw sequencing data to generate quality-screened sequencing data, and transforming the sequencing data into the numerical and genomic sets for the descriptors utilized to generate the microbiome fingerprint (e.g., quantitative taxonomic profiles, quantitative functional potential profiles, and strain-level genomic profiles).

[0122] The microbiome analyzer 124 may also be configured to perform processing associated with the microbiome data. For example, the microbiome analyzer 124 may be configured to generate and/or process sequencing data associated with the microbiome of the user. See FIG. 4 for more details on generating the profiles. After generating the profiles, the microbiome finger printer 126 may generate the microbiome fingerprint for the user. In some examples, the dietary finger printer 128 combines the data associated with the different profiles generated.

[0123] The dietary finger printer 128 is configured to generate a dietary fingerprint for the user. As discussed above, the “dietary fingerprint” of a user indicates how the microbiome of a user is associated with one or more different indexes that may be associated with a particular diet and/or a health characteristic. The indexes may include, but are not limited to, a Mediterranean diet index, a vegetarian diet index, a fast food index, an internal fat index, a fat-digesting index, a carbohydrate-digesting index, a health index, a fasting index, a ketogenic index, and the like.

[0124] According to some configurations, the dietary finger printer 128 generates a score for each of the different indexes, such as from 0-100 (or some other indicator), to indicate how closely the microbiome of the user is associated with a particular index. For example, the dietary finger printer 128 may generate a score for each of the indexes based on how closely the microbiome of the user resembles a typical microbiome of someone that is known to follow a specific diet. For example, a score of 100 may indicate that the microbiome is strongly correlated with a particular diet, a score of 0 would indicate no correlation, and a score between 0 and 100 would indicate an intermediate degree of correlation. According to some configurations, the dietary finger printer 128 generates a Mediterranean diet index score, a vegetarian diet index score, a fast food index score, an internal fat index score, a fat-digesting index score, a carbohydrate-digesting index score, a health index score, a fasting index score, a ketogenic index score, and the like.
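As a non-limiting illustration of a 0-100 index score, the following Python sketch compares a user’s taxonomic profile with a reference profile taken to be typical of a given diet. The similarity measure used here (Bray-Curtis similarity) is an assumption chosen for illustration and is not stated above to be the measure used; the function names and the reference profile are likewise hypothetical.

```python
from typing import Dict


def bray_curtis_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Return 1 minus the Bray-Curtis dissimilarity of two abundance profiles."""
    total = sum(a.values()) + sum(b.values())
    if total <= 0:
        raise ValueError("profiles must contain positive abundances")
    taxa = set(a) | set(b)
    # Abundance shared by both profiles, taxon by taxon.
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    return 2.0 * shared / total


def index_score(user_profile: Dict[str, float],
                reference_profile: Dict[str, float]) -> float:
    """Map similarity to a score from 0 (nothing shared) to 100 (identical)."""
    return round(100.0 * bray_curtis_similarity(user_profile, reference_profile), 1)
```

Calling index_score once per reference profile (Mediterranean, vegetarian, and so on) would yield one score per index. A trained scorer could be substituted for the similarity measure without changing the 0-100 output convention.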

[0125] The Mediterranean diet index score for a user indicates how closely the microbiome of the user resembles the typical microbiome of someone on a Mediterranean diet. The vegetarian diet index score indicates how closely the microbiome of the user resembles that of someone on a vegetarian diet. The fast food index score indicates how closely the microbiome of the user resembles that of someone on a fast food diet. The internal fat index score indicates how closely the microbiome of the user resembles that of someone with high or low visceral fat. The fat-digesting index score indicates how closely the microbiome of the user resembles that of someone with low postprandial triacylglycerol (TAG) rises. The carbohydrate-digesting index score indicates how closely the microbiome of the user resembles that of someone with low postprandial glucose rises. The health index score indicates how closely the microbiome of the user resembles that of someone who is healthy. The fasting index score indicates how closely the microbiome of the user resembles that of someone who fasts regularly. The ketogenic index score indicates how closely the microbiome of the user resembles that of someone who follows a ketogenic diet.

[0126] In other configurations, the dietary finger printer 128, or some other service or component may utilize different mechanisms to determine whether the microbiome of the user resembles a particular diet and/or group. For instance, the dietary finger printer 128 may utilize a machine learning mechanism to classify the microbiome of the user within a classification and/or generate a score, or some other indicator that indicates how closely the microbiome data of the user matches the microbiome data of a representative user associated with the particular index.

[0127] The microbiome ancestry manager 130 is configured to generate microbiome ancestry data for a user. A “microbiome ancestry” refers to microbiome data that indicates that the user has relationships with other users and/or locations. In some examples, the microbiome service analyzes the microbiome data of the user and determines how closely the microbiome of the user is related to other users, and/or locations. For instance, the microbiome service may determine a number of other users to which the microbiome of the user is most closely related. In some configurations, the microbiome ancestry manager 130 compares the microbiome data of the user to microbiome data of other users to identify a relationship. Similar to the generation of scores for the different indexes by the dietary finger printer 128, the microbiome ancestry manager 130 may generate a score for each comparison between the user and the other users. Other users whose scores indicate a close relationship with the user (e.g., scores above a specified value) may be identified as related.
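As a non-limiting illustration of the pairwise comparison and thresholding described above, the following Python sketch scores each other user against the user and keeps those above a specified value. The use of a Jaccard index over the sets of detected taxa or strains is an assumption made for illustration; the function names and the default threshold are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Fraction of detected features shared by two microbiomes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_users(user_features: Set[str],
                  others: Dict[str, Set[str]],
                  threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Score every other user and return those above the threshold, closest first."""
    scored = [(uid, jaccard(user_features, feats)) for uid, feats in others.items()]
    related = [(uid, score) for uid, score in scored if score > threshold]
    return sorted(related, key=lambda pair: pair[1], reverse=True)
```

A strain-level comparison of reconstructed genomes would be a finer-grained alternative to presence or absence of features, at a higher computational cost.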

[0128] The microbiome service may also identify one or more locations with which the microbiome of the user is associated. For example, the microbiome service may identify the countries the microbiome of the user is associated with (e.g. 75% North America, 25% Mexico). This identification may be based on microbiome data of users at different locations and/or different populations (e.g., English, American, French, Mexican, Italian, ...). See FIG. 7 for additional details for generating the microbiome ancestry data.
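As a non-limiting illustration of how a percentage breakdown by location might be derived, the following Python sketch takes the most similar other users and reports the share of each location among them. This nearest-neighbor approach is an assumption introduced for illustration and is not stated above to be the method used; the function name location_breakdown and the parameter k are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def location_breakdown(neighbors: List[Tuple[str, float]], k: int = 4) -> Dict[str, float]:
    """Percentage of each location among the k most similar other users.

    neighbors: (location, similarity score) pairs, one per other user.
    """
    top = sorted(neighbors, key=lambda n: n[1], reverse=True)[:k]
    if not top:
        return {}
    counts = Counter(location for location, _ in top)
    return {location: 100.0 * n / len(top) for location, n in counts.items()}
```

With four neighbors retained, three from one location and one from another, the breakdown is 75% and 25%.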

[0129] The microbiome analyzer 124, or some other device or component, may analyze the microbiome data of a user before/after generating the microbiome fingerprint, dietary fingerprint, and/or microbiome ancestry for a user. For example, the microbiome analyzer 124 may perform an analysis of the microbiome data to identify the microbial composition of the microbiome (e.g., the species, genes, taxa, and the like). The microbiome service may also generate reconstructed microbial genomes, determine a diversity of the microbiome, identify functions of the microbiome, identify a uniqueness of the microbiome, identify interesting species, and the like.
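As a non-limiting illustration of determining a diversity of the microbiome, the following Python sketch computes the Shannon index from per-species abundances. The choice of the Shannon index is an assumption made for illustration, since the description above does not name a particular diversity measure; the function name is hypothetical.

```python
import math
from typing import Iterable


def shannon_diversity(abundances: Iterable[float]) -> float:
    """Shannon index H = -sum(p * ln p) over species with non-zero abundance."""
    values = [x for x in abundances if x > 0]
    total = sum(values)
    if total <= 0:
        raise ValueError("at least one abundance must be positive")
    h = 0.0
    for x in values:
        p = x / total  # proportion of this species in the sample
        h -= p * math.log(p)
    return h
```

A sample dominated by a single species scores 0, and the value grows as abundance is spread more evenly across more species.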

[0130] In some examples, the microbiome data of the user is compared (e.g., by the microbiome service 120) with other data that is gathered about the user, as well as other users. For instance, users may provide responses to questionnaires, data about food that is eaten, sleep habits, and the like. Among other uses, this data may be utilized to determine a “microbiome ancestry” of a user.

[0131] In some examples, the microbiome service may provide a user interface (UI), such as a graphical user interface (GUI) 104 for a user to view and interact with data associated with the microbiome fingerprints, dietary fingerprints, and microbiome ancestry. For instance, the GUI may display microbiome fingerprint data that shows various characteristics of the microbiome fingerprint, dietary fingerprint data that shows various characteristics of the dietary fingerprint, microbiome ancestry data that shows various characteristics of the microbiome ancestry, recommendation data that identifies one or more recommendations relating to changing the microbiome of the user, and the like. In some configurations, the user may utilize an application 130 on the computing device 102 to interact with the nutritional environment. In some configurations, the application 130 may include functionality relating to processing at least a portion of the data 108.

[0132] As an example, the microbiome service 120 may provide recommendations generated by the nutritional service 132 to increase the diversity of foods eaten, as there is no one good food for a microbiome. The recommendations may include eating different gut-healthy foods, eating fermented foods, and minimizing highly processed foods (ingredients such as emulsifiers and artificial sweeteners may affect the microbiome). The microbiome service may base the recommendations on data obtained from the user, and other users.

[0133] The microbiome service 120 may also track the state of the microbiome of the user over time. For example, the microbiome service may provide data related to different microbiome analyses. In this way, the user may see how changes made by the user (e.g., eating different foods, changing exercise patterns, ...) have affected the microbiome.

[0134] FIG. 2 is a block diagram depicting an illustrative operating environment 200 in which a data ingestion service 110 receives and processes data associated with at home tests and sample collections. As illustrated in FIG. 2, the operating environment 200 includes the data ingestion service 110 that may be utilized in ingesting data utilized by the microbiome service 120.

[0135] In some configurations, the data manager 112 is configured to receive data such as health data 202, which can include, but is not limited to, microbiome data 206A, triglycerides data 206B, glucose data 206C, blood data 206D, wearable data 206E, questionnaire data 206F, psychological data (e.g., hunger, sleep quality, mood, ...) 206G, objective health data (e.g., height, weight, medical history, ...) 206H, nutritional data 140B, and other data 140C.

[0136] According to some examples, the microbiome data 206A includes data about the gut microbiome of an individual. The gut microbiome can host a large number of microbial species (e.g., > 1000) that together have millions of genes. Microbial species include bacteria, fungi, parasites, viruses, and archaea. Imbalance of the normal gut microbiome has been linked with gastrointestinal conditions such as inflammatory bowel disease (IBD) and irritable bowel syndrome (IBS), and wider systemic manifestations of disease such as obesity and type 2 diabetes (T2D). The microbes of the gut undertake a variety of metabolic functions and are able to produce a variety of vitamins, synthesize essential and nonessential amino acids, and provide other functions. Amongst other functions, the microbiome of an individual provides biochemical pathways for the metabolism of non-digestible carbohydrates; some oligosaccharides that escape digestion; unabsorbed sugars and alcohols from the diet; and host-derived mucins.

[0137] The triglycerides data 206B may include data about triglycerides for an individual. In some examples, the triglycerides data 206B can be determined from an At Home Blood Test which in some cases is a finger prick onto a dried blood spot card.

[0138] The glucose data 206C includes data about blood glucose. The glucose data 206C may be determined from various testing mechanisms, including at home measurements, such as a continuous glucose meter.

[0139] The blood data 206D may include blood tests relating to a variety of different biomarkers. As discussed above, at least some blood tests can be performed at home. In some configurations, the blood data 206D is associated with measuring blood sugar, insulin, c-peptides, triglycerides, IL-6 inflammation, ketone bodies, nutrient levels, allergy sensitivities, iron levels, blood count levels, HbA1c, and the like.

[0140] The wearable data 206E can include any data received from a computing device associated with an individual. For instance, an individual may wear an electronic data collection device 103, such as an activity-monitoring device, that monitors motion, heart rate, determines how much an individual has slept, the number of calories burned, activities performed, blood pressure, body temperature, and the like. The individual may also wear a continuous glucose meter that monitors blood glucose levels.

[0141] The questionnaire data 206F can include data received from one or more questionnaires, and/or surveys received from one or more individuals. The psychological data 206G, that may be subjectively obtained, may include data received from the individual and/or a computing device that generates data or input based on a subjective determination (e.g., the individual states that they are still hungry after a meal, or a device estimates sleep quality based on the movement of the user at night perhaps combined with heart rate data). The objective health data 206H includes data that can be objectively measured, such as but not limited to height, weight, medical history, and the like.

[0142] The nutritional data 140B can include data about food, which is referred to herein as “food data”. For example, the nutritional data can include nutritional information about different food(s) such as their macronutrients and micronutrients or the bioavailability of their nutrients under different conditions (raw vs cooked, or whole vs ground up). In some examples, the nutritional data 140C can include data about a particular food. For instance, before an individual consumes a particular meal, information about that food can be determined. As briefly discussed, the user might scan a barcode on the food item(s) being consumed and/or take one or more pictures of the food to determine the food, as well as the amount of food, being consumed.

[0143] The nutritional data can include food data that identifies foods consumed, a quantity of the foods consumed, food nutrition (e.g., obtained from a nutritional database), food state (e.g., cooked, reheated, frozen, etc.), food timing data (e.g., what time was the food consumed, how long did it take to consume, ...), and the like. The food state can be relevant for foods such as carbohydrates (e.g., pasta, bread, potatoes or rice), since carbohydrates may be altered by processes such as starch retrogradation. The food state can also be relevant for quantity estimation of the foods, since foods can change weight dramatically during cooking. In some instances, the user may also take a picture before and/or after consuming a meal to determine what food was consumed as well as how much of the food was consumed. The picture can also provide an indication as to the food state.

[0144] The other data 142B can include other data associated with the individual. For example, the other data 142B can include data that can be received directly from a computer application that logs information for an individual (e.g., food eaten, sleep, ...) and/or from the user via a user interface.

[0145] In some examples, different computing devices 102 associated with different users provide application data 204 to the data manager 112 for ingestion by the data ingestion service 110. As illustrated, computing device 102A provides app data 204A to the data manager 112, computing device 104B provides app data 204B to the data manager 112, and computing device 104N provides app data 204N to the data manager 112. There may be any number of computing devices utilized.

[0146] As discussed briefly above, the data manager 112 receives data from different data sources, processes the data when needed (e.g., cleans up the data for storage in a uniform manner), and stores the data within one or more data stores, such as the data store 140.

[0147] The data manager 112 can be configured to perform processing on the data before storing the data in the data store 140. For example, the data manager 112 may receive data for ketone bodies and then use that data to generate ketone body ratios. Similarly, the data manager 112 may process data about food eaten and generate meal calories, number of carbohydrates, fat to carbohydrate ratios, how much fiber was consumed during a time period, and the like. The data stored in the data store 140, or some other location, can be utilized by the microbiome service 120 to determine an accuracy of at home measurements of nutritional responses performed by users. The data outputted by the microbiome service 120 to the nutritional service may therefore contain different values than are stored in the data store 140, for example if a food quantity is adjusted.
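As an illustration only, the kind of derived food values mentioned above (meal calories, carbohydrate totals, fat to carbohydrate ratios, and fiber consumed) could be computed along the following lines. The `FoodItem` record, the `summarize_meal` function, and the field names are hypothetical and are not part of the described system; the calorie estimate assumes the general Atwater factors (4 kcal/g for carbohydrate and protein, 9 kcal/g for fat).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# General Atwater factors in kcal per gram (an assumed approximation).
KCAL_PER_G_CARBOHYDRATE = 4.0
KCAL_PER_G_PROTEIN = 4.0
KCAL_PER_G_FAT = 9.0


@dataclass
class FoodItem:
    """Hypothetical record for one logged food, in grams as eaten."""
    name: str
    carbohydrate_g: float
    protein_g: float
    fat_g: float
    fiber_g: float


def summarize_meal(items: List[FoodItem]) -> Dict[str, Optional[float]]:
    """Aggregate logged foods into derived values for one meal."""
    carbs = sum(i.carbohydrate_g for i in items)
    protein = sum(i.protein_g for i in items)
    fat = sum(i.fat_g for i in items)
    fiber = sum(i.fiber_g for i in items)
    calories = (carbs * KCAL_PER_G_CARBOHYDRATE
                + protein * KCAL_PER_G_PROTEIN
                + fat * KCAL_PER_G_FAT)
    # The ratio is undefined for a meal without carbohydrate.
    ratio = fat / carbs if carbs > 0 else None
    return {
        "calories_kcal": calories,
        "carbohydrate_g": carbs,
        "fiber_g": fiber,
        "fat_to_carbohydrate_ratio": ratio,
    }
```

Returning `None` for the ratio when no carbohydrate was logged avoids a division by zero and leaves the handling of that case to the caller.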

[0148] FIGs. 3-7 are flow diagrams showing processes 300, 400, 500, 600, and 700, respectively that illustrate aspects of generating microbiome fingerprints, dietary fingerprints, and microbiome ancestry data in accordance with examples described herein. It should be appreciated that at least some of the logical operations described herein with respect to FIGs. 3-7, and the other FIGs., may be implemented (1) as a sequence of computer implemented acts or program modules running on a computing system and/or (2) as interconnected machine logic circuits or circuit modules within the computing system.

[0149] The implementation of the various components described herein is a matter of choice dependent on the performance and other requirements of the computing system. Accordingly, the logical operations described herein are referred to variously as operations, structural devices, acts, or modules. These operations, structural devices, acts, and modules may be implemented in software, in firmware, in special purpose digital logic and any combination thereof. It should also be appreciated that more or fewer operations may be performed than shown in the FIGs. and described herein. These operations may also be performed in parallel, or in a different order than those described herein.

[0150] FIG. 3 is a flow diagram showing a process 300 illustrating aspects of a mechanism disclosed herein for obtaining and utilizing microbiome data for a user to generate microbiome fingerprints, dietary fingerprints, and microbiome ancestry for users.

[0151] The process 300 may begin at 302, where microbiome sample/data is obtained from a user. As discussed above, a user may provide one or more microbiome samples that may be obtained at home or in a clinical setting. For example, the user may provide a sample or samples of their stool for microbiome analysis as part of the at home biological collection, and/or the sample(s) may be collected in a lab, or other clinical setting. In some configurations, the user may also provide other data that may be utilized when processing the sample. For instance, the user may provide timing data indicating when the sample was taken, conditions under which the sample was obtained, and/or other health data.

[0152] At 304, the microbiome data is processed. As discussed above, microbiome service 120 may generate DNA data from the sample. In some examples, the DNA is extracted from the cells of the microbiome sample and purified. Different techniques that are commercially available can be utilized for DNA extraction from the microbiome sample. Generally, the use of different extraction techniques may result in different biases that may affect an accurate microbial representation.

[0153] At 306, the microbial composition of the microbiome sample may be identified. According to some configurations, the microbiome service 120, or some other device or component, identifies the microbial composition of the microbiome (e.g., the species, genes, taxa, and the like). The microbiome service 120 may also generate reconstructed microbial genomes, determine a diversity of the microbiome, identify functions of the microbiome, identify a uniqueness of the microbiome, identify interesting species, and the like.

[0154] At 308, the diversity of the microbiome may be determined. As discussed above, the microbiome service 120 may determine the diversity of the microbiome associated with a user. In some examples, the diversity determined by the microbiome service 120 is the number of individual bacteria from each of the bacterial species present in the microbiome. Having a more diverse microbiome may have health benefits. According to some configurations, the microbiome service 120 may provide this data, possibly along with recommendations, to the user via a UI, or some other interface.
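A minimal sketch of how per-species counts might be summarized into diversity values is given below. The description does not name a particular diversity measure; species richness and the Shannon index are used here as two common choices, and the function names and input layout (a mapping from species name to count) are assumptions made for illustration.

```python
import math
from typing import Dict


def richness(counts: Dict[str, float]) -> int:
    """Number of species observed with a non-zero count."""
    return sum(1 for c in counts.values() if c > 0)


def shannon_index(counts: Dict[str, float]) -> float:
    """Shannon diversity H = -sum(p * ln p) over observed species."""
    total = sum(c for c in counts.values() if c > 0)
    if total <= 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h
```

Richness only counts which species are present, while the Shannon index also reflects how evenly the counts are spread across them, so the two values can rank the same samples differently.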

[0155] At 310, reconstructed microbial genomes are generated. The microbiome service 120, or some other component or device may generate the reconstructed microbial genomes. Reconstruction of DNA fragments into genomes may utilize different techniques and methods and generally incorporates sequence assembly and sorting/clustering of assembled sequences into different bins associated with characteristics of a genome.

[0156] At 312, the functions of a microbiome may be determined. As discussed above, the microbiome service 120, or some other device or component, may determine the functions of a microbiome. Different techniques and methods may be utilized to determine the functions. Generally, the microbiome service 120 may map the sequencing reads against sequences of DNA (or amino acids) representing known genes (or proteins) and gene families (or protein families) to determine the functional potential of the microbiome.

[0157] At 314, other data associated with the microbiome of the user may be determined. As discussed above, the microbiome service 120, or some other device or component, may determine data such as the uniqueness of the microbiome (e.g., compared to the microbiome of other users), species identified as interesting, and the like.

[0158] At 316, the microbiome data associated with the user is stored. As discussed above, the microbiome service 120, or some other device or component, may store the microbiome data in a data store, such as user microbiome data 140A within data store 140.

[0159] At 318, the microbiome data associated with the user is utilized to generate microbiome fingerprints, dietary fingerprints, and microbiome ancestry for the user. As discussed above, the microbiome service 120, or some other device or component, may perform these tasks. See FIGs. 4-6 and related discussion for more details.

[0160] FIG. 4 is a flow diagram showing a process 400 illustrating aspects of a mechanism disclosed herein for generating a microbiome fingerprint for a user. As discussed above, the microbiome fingerprint may be generated using various techniques and methods. The following process is an example of generating a microbiome fingerprint.

[0161] At 402, microbiome data for a particular user is accessed. As discussed above, the microbiome service 120, or some other device or component, may access the microbiome data 140A within data store 140 to obtain the microbiome data for a user. In other examples, the microbiome data may be obtained/accessed using some other technique (e.g., accessing a different memory, receiving the data from some other source, such as data source(s) 150, and the like).

[0162] At 404, the microbiome data may be preprocessed to generate screened microbiome data. As discussed above, the microbiome service 120, or some other device or component, may process the sequencing data to generate screened sequencing data. Using the screened sequencing data may make the generation of the different profiles described below more accurate.

[0163] At 406, the quantitative taxonomic profiles are generated. As discussed above, the microbiome service 120, or some other device or component, may generate the quantitative taxonomic profiles. The quantitative taxonomic profiles can be obtained by mapping the sequencing reads (i.e., matching their sequences) against sequences representing the known microbial organisms. The mapping is then processed to produce relative abundances of the reference microbes. Many open source algorithms and corresponding implementations are available for this step, including for example, the techniques as described by Truong et al. (Nature Methods 12 (10): 902-3, 2015) and the newer versions of the associated software.
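The step from a read mapping to relative abundances can be sketched as follows. This is a simplified illustration and not the algorithm of Truong et al.: it assumes that the number of reads mapped to each reference microbe and the total length of that microbe's reference sequences are already known, and the function and parameter names are hypothetical.

```python
from typing import Dict


def relative_abundances(read_counts: Dict[str, int],
                        reference_lengths: Dict[str, int]) -> Dict[str, float]:
    """Convert mapped-read counts into relative abundances (percent).

    Counts are divided by reference length first, so that a microbe with
    longer reference sequences is not over-represented simply because it
    offers more positions for reads to map to.
    """
    coverage = {}
    for taxon, n_reads in read_counts.items():
        length = reference_lengths.get(taxon, 0)
        if n_reads > 0 and length > 0:
            coverage[taxon] = n_reads / length
    total = sum(coverage.values())
    if total == 0:
        return {}
    return {taxon: 100.0 * c / total for taxon, c in coverage.items()}
```

With this normalization, a microbe with 300 reads over a 3,000-base reference and another with 100 reads over a 1,000-base reference receive the same relative abundance.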

[0164] At 408, the quantitative functional potential profiles are generated. As discussed above, the microbiome service 120, or some other device or component, may generate the quantitative functional potential profiles. The quantitative functional potential profiles can be obtained by mapping the sequencing reads against sequences of DNA (or amino acids) representing known genes (or proteins) and gene families (or protein families). Based on the number of reads matching each gene or gene family, the presence and abundance of the gene families and pathways are inferred. Several open source algorithms and corresponding implementations are available for this step, including for example the technique HUMAnN2 as described by Abubucker et al. (PLoS Computational Biology 8 (6), 2012) and Franzosa et al. (Nature Methods, 15(11), 962, 2018) and any newer versions of the associated software.

[0165] At 410, the strain-level genomic profiles are generated. As discussed above, the microbiome service 120, or some other device or component, may generate the strain-level genomic profiles. The strain-level genomic profiles, or the third descriptor, can be obtained with reference-based and assembly-based approaches. For reference-based approaches the methods use specific genetic markers against which the reads are mapped, and single-nucleotide polymorphisms are inferred. The combinations of single-nucleotide polymorphisms provide strain-specific profiles. Some open source algorithms and implementations for this step are available, including for example the techniques described by Truong et al. (Genome Research 27 (4): 626-38, 2017). In assembly-based approaches, reads may be first concatenated to form longer contiguous sequences such as described by Li et al. (Bioinformatics 31 (10): 1674-76, 2015).

[0166] Contigs may then be clustered in bins representing the sequences of whole genomes, such as described by Kang et al. (PeerJ 7: e7359, 2019). The resulting draft genomes may be quality controlled using for example the techniques described by Parks et al. (Genome Research 25 (7): 1043-55, 2015). The quality-controlled genomes represent single strains in the microbiome.

[0167] At 412, the microbiome fingerprint for the user is generated. As discussed above, the microbiome service 120, or some other device or component, may combine the data associated with the different indexes generated at 406, 408, and 410 to generate the microbiome fingerprint for the user.
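One way to picture the combination at 412 is a single record that holds the three descriptors side by side. The sketch below is an assumption about a possible data layout, not the structure used by the microbiome service 120; the class name, field names, and the flattened `feature_vector` representation are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MicrobiomeFingerprint:
    """Hypothetical container combining the three descriptors."""
    user_id: str
    taxonomic: Dict[str, float]   # species -> relative abundance (406)
    functional: Dict[str, float]  # gene family or pathway -> abundance (408)
    strains: Dict[str, str]       # species -> strain identifier (410)

    def feature_vector(self) -> Dict[str, float]:
        """Flatten all descriptors into one namespaced feature mapping."""
        features: Dict[str, float] = {}
        for taxon, value in self.taxonomic.items():
            features[f"taxon:{taxon}"] = value
        for function, value in self.functional.items():
            features[f"function:{function}"] = value
        # Strain identity is categorical, so it is encoded as presence (1.0).
        for species, strain_id in self.strains.items():
            features[f"strain:{species}:{strain_id}"] = 1.0
        return features
```

Prefixing each key with its descriptor type keeps a species name from colliding with a pathway name when fingerprints are later compared feature by feature.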

[0168] FIG. 5 is a flow diagram showing a process 500 illustrating aspects of a mechanism disclosed herein for generating a dietary fingerprint for a user.

[0169] The process 500 may begin at 502, where microbiome data for a particular user are accessed. As discussed above, the microbiome service 120, or some other device or component, may access the microbiome data 140A within data store 140 to obtain the microbiome data for a user. In other examples, the microbiome data may be obtained/accessed using some other technique (e.g., accessing a different memory, receiving the data from some other source, such as data source(s) 150, and the like).

[0170] At 504, dietary fingerprint data is generated. As discussed above, the microbiome service 120, or some other device or component, may generate the dietary fingerprint data. A “dietary fingerprint” is data that identifies how the microbiome of a user is associated with one or more different indexes. The indexes may include, but are not limited to, a Mediterranean diet index, a vegetarian diet index, a fast food index, an internal fat index, a fat-digesting index, a carbohydrate-digesting index, a health index, a fasting index, a ketogenic index, and the like. According to some configurations, one or more computers of a microbiome service generate a score, such as from 0-100 (or some other indicator), that indicates how closely the microbiome of the user is associated with a particular index.

[0171] As an example, the Mediterranean diet index score for a user indicates how closely the microbiome of the user resembles the typical microbiome of someone on a Mediterranean diet. The vegetarian diet index score indicates how closely the microbiome of the user resembles someone on a vegetarian diet. The fast food index score indicates how closely the microbiome of the user resembles someone on a fast food diet. The internal fat index score indicates how closely the microbiome of the user resembles someone with high or low visceral fat. The fat-digesting index score indicates how closely the microbiome of the user resembles someone with low postprandial triacylglycerol (TAG) rises. The carbohydrate-digesting index score indicates how closely the microbiome of the user resembles someone with low postprandial glucose rises. The health index score indicates how closely the microbiome of the user resembles someone that is healthy. The fasting index score indicates how closely the microbiome of the user resembles someone that fasts regularly. The ketogenic index score indicates how closely the microbiome of the user resembles someone who is ketogenic.

[0172] At 506, a determination is made as to whether another dietary index is to be compared. As discussed above, there may be a variety of dietary indexes, including but not limited to a Mediterranean diet index, a vegetarian diet index, a fast food index, an internal fat index, a fat-digesting index, a carbohydrate-digesting index, a health index, a fasting index, a ketogenic index, and the like. When there is another index, the process 500 returns to 504. When there is not another index, the process 500 moves to 508.

[0173] At 508, the dietary index(es) associated with the user are identified. As discussed above, the microbiome service 120, or some other device or component, may identify one or more diets that resemble the microbiome of the user. In some examples, the microbiome service 120 identifies the closest dietary index (e.g., based on a score). In other examples, the microbiome service 120 may rank the dietary indexes.
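The loop over indexes at 504 and 506 and the ranking at 508 can be sketched as below. The description does not specify how a score is computed; this example assumes, purely for illustration, that each index is represented by a reference abundance profile and that the score is the cosine similarity between the user's profile and that reference, scaled to 0-100. The function names and profile layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Profile = Dict[str, float]  # species -> non-negative relative abundance


def cosine_similarity(a: Profile, b: Profile) -> float:
    """Cosine similarity between two sparse abundance profiles."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_indexes(user: Profile,
                  references: Dict[str, Profile]) -> List[Tuple[str, float]]:
    """Score the user against every index and rank, closest first."""
    scores = {
        name: round(100.0 * cosine_similarity(user, reference), 1)
        for name, reference in references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because abundances are non-negative, the cosine similarity lies between 0 and 1, so the scaled score stays within the 0-100 range. The first entry of the returned list corresponds to the closest dietary index.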

[0174] At 510, the dietary fingerprint data may be utilized. As discussed above, the microbiome service 120, or some other device or component, may utilize the dietary fingerprint data when providing data to the user, when generating the microbiome ancestry data, generating recommendations for the user (e.g., nutritional), and/or performing some other task.

[0175] FIG. 6 is a flow diagram showing a process 600 illustrating aspects of a mechanism disclosed herein for generating a microbiome ancestry for a user.

[0176] The process 600 may begin at 602, where microbiome data for a particular user is accessed. As discussed above, the microbiome service 120, or some other device or component, may access the microbiome data 140A within data store 140 to obtain the microbiome data for a user. In other examples, the microbiome data may be obtained/accessed using some other technique (e.g., accessing a different memory, receiving the data from some other source, such as data source(s) 150, and the like).

[0177] At 604, the microbiome data is compared to microbiome data from other users. As discussed above, the microbiome service 120, or some other device or component, may utilize the microbiome data, such as the microbiome fingerprint data of a particular user, and compare it to the microbiome fingerprint data of other users. According to some configurations, the microbiome service 120 may generate one or more indicators that identify how close another user is to the user based on a similarity of the microbiome data.

[0178] At 606, one or more other users are identified based on a similarity of the microbiome data between the users. As discussed above, the microbiome service 120, or some other device or component, may identify the related users based on a score generated by the microbiome service 120, or on some other indicator.
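The comparison at 604 and the identification at 606 can be illustrated with a pairwise dissimilarity and a nearest-neighbor lookup. The description does not name a similarity measure; Bray-Curtis dissimilarity, a measure commonly applied to abundance profiles, is assumed here, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, float]  # species -> relative abundance


def bray_curtis(a: Profile, b: Profile) -> float:
    """Bray-Curtis dissimilarity: 0.0 for identical, 1.0 for disjoint."""
    keys = set(a) | set(b)
    numerator = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)
    denominator = sum(a.get(k, 0.0) + b.get(k, 0.0) for k in keys)
    return numerator / denominator if denominator > 0 else 0.0


def most_similar_users(target: Profile,
                       others: Dict[str, Profile],
                       k: int = 3) -> List[Tuple[str, float]]:
    """Return the k other users with the lowest dissimilarity to target."""
    ranked = sorted(
        ((user_id, bray_curtis(target, profile))
         for user_id, profile in others.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]
```

The returned dissimilarity values can serve directly as the indicator of how close each other user is, with lower values meaning more similar microbiomes.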

[0179] At 608, the geographic region(s) that are commonly associated with the microbiome data of a user are identified. As discussed above, the microbiome service 120, or some other device or component, may identify that different geographic regions are more closely linked to certain microbiomes.
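A minimal sketch of one way to associate geographic regions with a user's microbiome data is shown below. It assumes, for illustration only, that a set of users with similar microbiomes has already been identified and that a region label is recorded for at least some of them; the region is then estimated by the share of similar users from each region. The function name and inputs are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def region_shares(similar_user_ids: List[str],
                  region_by_user: Dict[str, str]) -> List[Tuple[str, float]]:
    """Share of similar users per region, most common region first.

    Users without a recorded region are ignored rather than guessed.
    """
    regions = [region_by_user[u] for u in similar_user_ids if u in region_by_user]
    if not regions:
        return []
    counts = Counter(regions)
    total = len(regions)
    return [(region, n / total) for region, n in counts.most_common()]
```

Reporting shares rather than a single label keeps the output honest when the similar users are spread over several regions.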

[0180] At 610, the microbiome ancestry data may be utilized. As discussed above, the microbiome service 120, or some other device or component, may utilize the microbiome ancestry data when providing data to the user, when generating the microbiome ancestry data, generating recommendations for the user (e.g., nutritional), and/or performing some other task.

[0181] FIG. 7 is a flow diagram showing a process 700 illustrating aspects of a mechanism disclosed herein for obtaining test data, including microbiome data, to be utilized for generating microbiome fingerprints, dietary fingerprints, and microbiome ancestry for users.

[0182] At 702, food(s) for at home measurements of nutritional responses may be selected. As briefly discussed above, different foods may be selected for a user to eat before a test is performed in order to evoke a desired response. The foods can include foods for a series of standardized meals, a single food, or some other combination of foods.

[0183] At 704, food data is received. As discussed above, the food data is associated with foods that are utilized to evoke a nutritional response. The food data can include foods for a series of standardized meals, a single food, or some other combination of foods. The food data can include data such as foods consumed, a quantity of the foods consumed, food nutrition (e.g., obtained from a nutritional database), food state (e.g., cooked, reheated, frozen, etc.), food timing data (e.g., what time was the food consumed, how long did it take to consume, ...), and the like. The food state can be relevant for foods such as carbohydrates (e.g., pasta, bread, potatoes or rice), since carbohydrates may be altered by processes such as starch retrogradation. The food state can also be relevant for quantity estimation of the foods, since foods can change weight dramatically during cooking.

[0184] At 706, at home test(s) are performed. The tests may include at home tests as described above and/or the collection of one or more samples (e.g., stool for microbiome analysis).

[0185] At 708, test data associated with the at home tests including microbiome data is received. As discussed above, microbiome data may be associated with one or more tests. In some configurations, the microbiome data includes a stool sample, timing data for the sample (e.g., when collected, how long stored before providing to a lab), data associated with collection of the sample (e.g., how was sample stored, was the sample contaminated), as well as other data. For example, a user may be instructed to take a picture of the sample and provide the image to the service.

[0186] At 710, the test data is utilized to generate microbiome fingerprints, dietary fingerprints, and microbiome ancestry. In some examples, the test data is used by the microbiome service 120 to generate the microbiome fingerprints, dietary fingerprints, and microbiome ancestry. The nutritional service 132 may also use the test data to generate nutritional recommendations that are personalized for a particular user.

[0187] (II) Representative Computer Architecture.

[0188] FIG. 8 shows an example computer architecture for a computer 800 capable of executing program components for generating microbiome fingerprints, dietary fingerprints, and microbiome ancestry for users in the manner described above. The computer architecture shown in FIG. 8 illustrates a conventional server computer, workstation, desktop computer, laptop, tablet, network appliance, digital cellular phone, smart watch, or other computing device, and may be utilized to execute any of the software components presented herein. For example, the computer architecture shown in FIG. 8 may be utilized to execute software components for performing operations as described above. The computer architecture shown in FIG. 8 might also be utilized to implement a computing device 102, or any other of the computing systems described herein.

[0189] The computer 800 includes a baseboard 802, or “motherboard,” which is a printed circuit board to which a multitude of components or devices may be connected by way of a system bus or other electrical communication paths. In one illustrative example, one or more central processing units (CPUs) 804 operate in conjunction with a chipset 806. The CPUs 804 may be standard programmable processors that perform arithmetic and logical operations necessary for the operation of the computer 800.

[0190] The CPUs 804 perform operations by transitioning from one discrete, physical state to the next through the manipulation of switching elements that differentiate between and change these states. Switching elements may generally include electronic circuits that maintain one of two binary states, such as flip-flops and electronic circuits that provide an output state based on the logical combination of the states of one or more other switching elements, such as logic gates. These basic switching elements may be combined to create more complex logic circuits, including registers, adders-subtractors, arithmetic logic units, floating-point units and the like.

[0191] The chipset 806 provides an interface between the CPUs 804 and the remainder of the components and devices on the baseboard 802. The chipset 806 may provide an interface to a random-access memory (RAM) 808, used as the main memory in the computer 800. The chipset 806 may further provide an interface to a computer-readable storage medium such as a read-only memory (ROM) 810 or non-volatile RAM (NVRAM) for storing basic routines that help to startup the computer 800 and to transfer information between the various components and devices. The ROM 810 or NVRAM may also store other software components necessary for the operation of the computer 800 in accordance with the examples described herein.

[0192] The computer 800 may operate in a networked environment using logical connections to remote computing devices and computer systems through a network, such as the network 820. The chipset 806 may include functionality for providing network connectivity through a network interface controller (NIC) 812, such as a mobile cellular network adapter, WiFi network adapter or gigabit Ethernet adapter. The NIC 812 is capable of connecting the computer 800 to other computing devices over the network 820. It should be appreciated that multiple NICs 812 may be present in the computer 800, connecting the computer to other types of networks and remote computer systems.

[0193] The computer 800 may be connected to a mass storage device 818 that provides non-volatile storage for the computer. The mass storage device 818 may store system programs, application programs, other program modules and data, which have been described in greater detail herein. The mass storage device 818 may be connected to the computer 800 through a storage controller 814 connected to the chipset 806. The mass storage device 818 may include one or more physical storage units. The storage controller 814 may interface with the physical storage units through a serial attached SCSI (SAS) interface, a serial advanced technology attachment (SATA) interface, a fiber channel (FC) interface, or other type of interface for physically connecting and transferring data between computers and physical storage units.

[0194] The computer 800 may store data on the mass storage device 818 by transforming the physical state of the physical storage units to reflect the information being stored. The specific transformation of physical state may depend on various factors, in different implementations of this description. Examples of such factors may include, but are not limited to, the technology used to implement the physical storage units, whether the mass storage device 818 is characterized as primary or secondary storage and the like.

[0195] For example, the computer 800 may store information to the mass storage device 818 by issuing instructions through the storage controller 814 to alter the magnetic characteristics of a particular location within a magnetic disk drive unit, the reflective or refractive characteristics of a particular location in an optical storage unit, or the electrical characteristics of a particular capacitor, transistor, or other discrete component in a solid-state storage unit. Other transformations of physical media are possible without departing from the scope and spirit of the present description, with the foregoing examples provided only to facilitate this description. The computer 800 may further read information from the mass storage device 818 by detecting the physical states or characteristics of one or more particular locations within the physical storage units.

[0196] In addition to the mass storage device 818 described above, the computer 800 may have access to other computer-readable storage media to store and retrieve information, such as program modules, data structures, or other data. It should be appreciated by those skilled in the art that computer-readable storage media is any available media that provides for the non-transitory storage of data and that may be accessed by the computer 800.

[0197] By way of example, and not limitation, computer-readable storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology. Computer-readable storage media includes, but is not limited to, RAM, ROM, erasable programmable ROM (EPROM), electrically-erasable programmable ROM (EEPROM), flash memory or other solid-state memory technology, compact disc ROM (CD-ROM), digital versatile disk (DVD), high definition DVD (HD-DVD), BLU-RAY, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information in a non-transitory fashion.

[0198] The mass storage device 818 may store an operating system 830 utilized to control the operation of the computer 800. According to one example, the operating system includes the LINUX® (Linus Torvalds, Boston, MA) operating system. According to another example, the operating system includes the WINDOWS® SERVER® (Microsoft Corporation, Redmond, WA) operating system from MICROSOFT® (Microsoft Corporation, Seattle, WA). According to another example, the operating system includes the iOS® (Cisco Technology Inc., San Jose, CA) operating system from Apple® (Apple Inc., Cupertino, CA). According to another example, the operating system includes the Android® (Google LLC, Mountain View, CA) operating system from Google® (Google LLC) or its ecosystem partners. According to further examples, the operating system may include the UNIX® (The Open Group Limited, Reading, Berkshire, England) operating system. It should be appreciated that other operating systems may also be utilized. The mass storage device 818 may store other system or application programs and data utilized by the computer 800, such as components that include the data manager 122, the microbiome manager 122 and/or any of the other software components and data described above. The mass storage device 818 might also store other programs and data not specifically identified herein.

[0199] In one example, the mass storage device 818 or other computer-readable storage media is encoded with computer-executable instructions that, when loaded into the computer 800, create a special-purpose computer capable of implementing the examples described herein. These computer-executable instructions transform the computer 800 by specifying how the CPUs 804 transition between states, as described above. According to one example, the computer 800 has access to computer-readable storage media storing computer-executable instructions which, when executed by the computer 800, perform the various processes described above with regard to FIGs. 4-8. The computer 800 might also include computer-readable storage media for performing any of the other computer-implemented operations described herein.

[0200] The computer 800 may also include one or more input/output controllers 816 for receiving and processing input from a number of input devices, such as a keyboard, a mouse, a touchpad, a touch screen, an electronic stylus, or other type of input device. Similarly, the input/output controller 816 may provide output to a display, such as a computer monitor, a flat-panel display, a digital projector, a printer, a plotter, or other type of output device. It will be appreciated that the computer 800 may not include all of the components shown in FIG. 8, may include other components that are not explicitly shown in FIG. 8, or may utilize an architecture completely different than that shown in FIG. 8.

[0201] (III) Detection and Identification of Individual Microbes.

[0202] Described herein are specific methods for detecting and identifying individual member microbes in the microbiome of a subject, as well as methods for identifying and quantifying (in relative or absolute terms) the members of a microbiome. It will be understood, however, that other methods known to those of skill in the art can also be used with the methods described herein. See, for instance: Davidson & Epperson (Methods Mol. Biol., 1706:77-90, 2018), Nagpal et al. (Front Microbiol., 8:2897, doi:10.3389/fmicb.2018.02897, 2018), Nagpal et al. (Sci Rep. 8(1):12649, 2018), The Integrative HMP (iHMP) Research Network Consortium (Nature 569:641-648, 2019; and publications cited therein), Wu et al. (Gut. 65(1):63-72, 2016). Additional resources are available online, for instance, through the NIH Human Microbiome Project (at hmpdacc.org), including tools and protocols related to Microbial Reference Genomes, Sampling, Sequence & Analysis of 16S RNA, and Sampling, Sequencing & Analysis of Whole Metagenomic Sequence.

[0203] Having provided in this disclosure specific individual microbes and sets of microbes associated with and/or linked to poor health and others associated with and/or linked to pro-health conditions, profiles can now be detected without needing to sequence or otherwise assay the entire microbiome of the subject. For instance, the following are pro-health linked/indicator microbes: Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and the following are poor health linked/indicator microbes: Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli. These strains can be further identified by their respective NCBI Taxonomy ID Number (see ncbi.nlm.nih.gov/taxonomy), as shown in Table 6. Additional specific taxonomic information can be found, for instance, using MetaPhlAn2 (Metagenomic Phylogenetic Analysis; version 2.9.21 and marker database release 2.9.4; Truong et al., Nat. Methods 12, 902-903, 2015).

Table 6: NCBI Taxonomy Identification Numbers for Select Indicator Microbes

[0204] A collection of two or more microbes described or illustrated herein as associated with a biological status or condition can be referred to as a microbial signature, or a microbiome fingerprint. For instance, any two, any three, any four, any five, any six, any seven, any eight, any nine, any 10, any 11, any 12, any 13, any 14, any 15, or more microbes listed in Table 6 may be included in a microbial signature for a biological status or condition. Such microbes may be selected from the Pro-Health or the Poor Health indicators, or some from both. All seventeen of the listed pro-health indicator microbes for instance may be included in a single microbial signature. Similarly, all fifteen poor health indicator microbes may be included in a single microbial signature. Additional microbes useful in the assembling of a microbial signature, or microbiome fingerprint, are provided for instance in Table 5, and are discussed more fully in Example 1.

[0205] (IV) Methods of Use.

[0206] Based on the research reported herein, including specifically in Example 1, there are now enabled a number of methods of using the results of the microbiome metagenomic analyses.

[0207] For instance, one embodiment is a method of using a group of microbes to determine a health condition in a human subject. By way of example, the group of microbes includes: at least two pro-health indicator microbes; or at least two poor health indicator microbes; or at least two pro-health indicator microbes and at least two poor health indicator microbes. Lists of pro-health and poor health indicator microbes are described herein, for instance in Example 1 and Table 6. By way of example, in some embodiments the pro-health indicator microbes are selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila. By way of further example, in some embodiments the poor health indicator microbes are selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii. In another example embodiment, at least one of the pro-health indicator microbes is selected from the group including Firmicutes bacterium CAG 95, Haemophilus parainfluenzae, Oscillibacter sp 57 20, Firmicutes bacterium CAG 170, Roseburia sp CAG 182, Clostridium sp CAG 167, Oscillibacter sp PC13, Eubacterium eligens, Prevotella copri, Veillonella dispar, Veillonella infantium, Faecalibacterium prausnitzii, Bifidobacterium animalis, Romboutsia ilealis, and Veillonella atypica; and at least one of the poor health indicator microbes is selected from the group including Clostridium leptum, Ruthenibacterium lactatiformans, Collinsella intestinalis, Escherichia coli, Blautia hydrogenotrophica, Clostridium sp CAG 58, Eggerthella lenta, Ruminococcus gnavus, Clostridium spiroforme, Clostridium bolteae CAG 59, Clostridium innocuum, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae, and Flavonifractor plautii.

[0208] In further examples of such methods, the method of using a group of microbes to determine a health condition in a human subject includes obtaining a biological sample from the human subject (for instance, a microbiome sample, such as a stool sample); and analyzing the biological sample to determine presence, absence, or abundance of the at least two pro-health indicator microbes and/or the at least two poor health indicator microbes.
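By way of illustration only, the analyzing step might be sketched as follows in Python. This is a minimal sketch and not the implementation of this disclosure: the function name indicator_status, the min_abundance cutoff, the short indicator subsets, and the example abundance values are all hypothetical, and the profile is assumed to be a mapping from species name to relative abundance such as a metagenomic profiler might report.

```python
# Hypothetical subsets of the indicator microbes named in this disclosure.
PRO_HEALTH = {"Prevotella copri", "Eubacterium eligens", "Faecalibacterium prausnitzii"}
POOR_HEALTH = {"Eggerthella lenta", "Ruminococcus gnavus", "Flavonifractor plautii"}


def indicator_status(profile, indicators, min_abundance=0.0):
    """Return {species: (present, abundance)} for each indicator species.

    profile maps species name -> relative abundance (assumed to be in percent).
    A species counts as present when its abundance exceeds min_abundance,
    which is an assumed cutoff and not a value taken from the disclosure.
    """
    status = {}
    for species in sorted(indicators):
        abundance = profile.get(species, 0.0)  # absent species default to 0
        status[species] = (abundance > min_abundance, abundance)
    return status


# Made-up example profile for a single stool sample.
profile = {"Prevotella copri": 12.5, "Eggerthella lenta": 0.3, "Escherichia coli": 1.1}
pro = indicator_status(profile, PRO_HEALTH)
poor = indicator_status(profile, POOR_HEALTH)
```

Each returned entry carries both a presence call and the underlying abundance, so that either can be used in a downstream determination.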

[0209] In additional examples of such methods, the method of using a group of microbes to determine a health condition in a human subject includes obtaining a biological sample from the human subject; identifying in the biological sample at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 125, at least 150, at least 175, at least 200, or more than 200 different microbes in the biological sample; and determining the health condition of the human subject based on presence, absence, and/or absolute or relative abundance of the identified microbes in the biological sample.

[0210] In any of these methods using a group of microbes to determine a health condition in a human subject, the group of microbes may include at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 pro-health indicator microbes. Optionally, the group of microbes includes all of the following pro-health indicator microbes: Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila. In another example, the group of microbes includes all of the following pro-health indicator microbes: Firmicutes bacterium CAG 95, Haemophilus parainfluenzae, Oscillibacter sp 57 20, Firmicutes bacterium CAG 170, Roseburia sp CAG 182, Clostridium sp CAG 167, Oscillibacter sp PC13, Eubacterium eligens, Prevotella copri, Veillonella dispar, Veillonella infantium, Faecalibacterium prausnitzii, Bifidobacterium animalis, Romboutsia ilealis, and Veillonella atypica.

[0211] In any of these methods using a group of microbes to determine a health condition in a human subject, the group of microbes may include: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 poor health indicator microbes. Optionally, the group of microbes includes all of the following poor health indicator microbes: Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii. In another example, the group of microbes includes all of the following poor health indicator microbes: Clostridium leptum, Ruthenibacterium lactatiformans, Collinsella intestinalis, Escherichia coli, Blautia hydrogenotrophica, Clostridium sp CAG 58, Eggerthella lenta, Ruminococcus gnavus, Clostridium spiroforme, Clostridium bolteae CAG 59, Clostridium innocuum, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae, and Flavonifractor plautii.

[0212] In exemplary method embodiments, the group of microbes includes Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum. In exemplary method embodiments, the group of microbes includes P. copri and Blastocystis spp.

[0213] In any of these methods of using a group of microbes to determine a health condition in a human subject, the health condition may include at least one of: overall good health, overall poor health, obesity, BMI, diabetes risk, cardiometabolic risk, cardiovascular disease risk, or postprandial response to food intake.

[0214] Optionally, any of the provided methods of using a group of microbes to determine a health condition in a human subject may include detecting the presence, absence, or relative abundance of at least one of the microbes in a microbiome sample from the human subject. For instance, in this context the detecting may include one or more of: sequencing one or more nucleic acids of a pro-health or poor health microbe, hybridizing a nucleic acid probe to a nucleic acid of a pro-health or poor health microbe, detecting one or more proteins from a pro-health or poor health microbe, or measuring activity of one or more proteins of a pro-health or poor health microbe. For instance, the detecting may include shotgun metagenomics.

[0215] Also provided herein are methods of predicting a health condition in a subject. Such methods involve determining presence, absence, or relative abundance of at least three pro-health indicator microbes in a microbiome of the subject; determining presence, absence, or relative abundance of at least three poor health indicator microbes in a microbiome of the subject; and predicting the health condition of the subject, based on the presence, absence, or relative abundance of the pro-health and/or poor health indicator microbes in the microbiome of the subject. By way of example, in some such methods the pro-health indicator microbes are selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica. By way of further example, in some such methods the poor health indicator microbes are selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

[0216] It is contemplated that in some methods of predicting a health condition in a subject, the health condition includes at least one of obesity, increased cardiometabolic risk, diabetes risk, or overall poor health; and the health condition is predicted by the presence and/or abundance of more poor health indicator microbes than pro-health indicator microbes; and/or the health condition includes at least one of overall good health or absence of obesity, reduced cardiometabolic risk, or reduced diabetes risk; and the health condition is predicted by the presence and/or abundance of more pro-health indicator microbes than poor health indicator microbes.
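The comparison of pro-health and poor health indicator microbes described above might be sketched as follows in Python. This is an illustrative sketch under stated assumptions, not a scoring method specified by this disclosure: the function name predict_health, the presence threshold, the tie-break on summed relative abundance, the output labels, and the short indicator subsets are all hypothetical.

```python
# Hypothetical subsets of the indicator microbes named in this disclosure.
PRO_HEALTH = {"Prevotella copri", "Eubacterium eligens"}
POOR_HEALTH = {"Eggerthella lenta", "Ruminococcus gnavus"}


def predict_health(profile, pro_health, poor_health, threshold=0.0):
    """Label a profile 'good', 'poor', or 'indeterminate'.

    profile maps species name -> relative abundance. Indicator species with
    abundance above threshold (an assumed cutoff) are counted as present.
    """
    pro_count = sum(1 for s in pro_health if profile.get(s, 0.0) > threshold)
    poor_count = sum(1 for s in poor_health if profile.get(s, 0.0) > threshold)
    if pro_count > poor_count:
        return "good"
    if poor_count > pro_count:
        return "poor"
    # Equal counts: fall back to summed relative abundance (assumed tie-break).
    pro_abundance = sum(profile.get(s, 0.0) for s in pro_health)
    poor_abundance = sum(profile.get(s, 0.0) for s in poor_health)
    if pro_abundance > poor_abundance:
        return "good"
    if poor_abundance > pro_abundance:
        return "poor"
    return "indeterminate"


# Made-up example: two pro-health indicators present, one poor health indicator.
example = {"Prevotella copri": 5.0, "Eubacterium eligens": 1.2, "Eggerthella lenta": 0.4}
label = predict_health(example, PRO_HEALTH, POOR_HEALTH)
```

Counting present species first and using abundance only to break ties is one of several reasonable readings of "presence and/or abundance"; a weighted score would be an equally plausible alternative.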

[0217] Another embodiment is a method to predict overall good or poor general health in a non-diseased human subject. In examples of such methods, the methods involve obtaining a microbiome sample (for instance, a stool sample) from the human subject; isolating a nucleic acid fraction from the microbiome sample; detecting, within the nucleic acid fraction, presence, absence, or relative abundance of at least one unique marker sequence indicative of: a pro-health indicator microbe selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; or a poor health indicator microbe selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli; and at least one of predicting the human subject has overall good general health if the pro-health indicator microbes outnumber or are relatively more abundant than the poor-health indicator microbes; or predicting the human subject has overall poor general health if the poor health indicator microbes outnumber or are relatively more abundant than the pro-health indicator microbes.

[0218] Examples of the methods to predict overall good or poor general health in a non-diseased human subject further include providing to the human subject a dietary recommendation based on the presence, absence, or relative abundance of one or more poor health indicator microbes and/or one or more pro-health indicator microbes. Such dietary recommendation may be provided as a prescription. Optionally, the method may further include administering to the subject one or more compounds or substances intended to alter the presence or quantity or relative proportion of at least one pro-health indicator microbe or at least one poor health indicator microbe in the subject.

[0219] Also enabled by this disclosure are methods for targeting a microbiome of a human subject to promote health, which methods include (A) detecting in a microbiome sample from the human subject one or more pro-health indicator microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and administering to the human a composition that increases growth or survival of the pro-health indicator microbe(s); and/or (B) detecting in a microbiome sample from the human subject one or more poor health indicator microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli; and administering to the human a composition that decreases growth or survival of the poor health indicator microbe(s).

[0220] Examples of such methods for targeting a microbiome of a human subject to promote health involve detecting: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than ten pro-health indicator microbes. All of the following pro-health indicator microbes are detected in some embodiments: Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila. Alternatively, the indicator microbes include at least P. copri and Blastocystis spp. Alternatively, the indicator microbes include all of: Prevotella copri, Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Veillonella infantium, Oscillibacter sp PC13, Clostridium sp CAG 167, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica.

[0221] Further examples of such methods for targeting a microbiome of a human subject to promote health involve detecting: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than ten poor health indicator microbes. All of the following poor health indicator microbes are detected in some embodiments: Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii. Alternatively, the indicator microbes include Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, and C. saccharolyticum. Alternatively, the indicator microbes include all of: Clostridium leptum, Ruthenibacterium lactatiformans, Collinsella intestinalis, Escherichia coli, Blautia hydrogenotrophica, Clostridium sp CAG 58, Eggerthella lenta, Ruminococcus gnavus, Clostridium spiroforme, Clostridium bolteae CAG 59, Clostridium innocuum, Anaerotruncus colihominis, Clostridium symbiosum, Clostridium bolteae, and Flavonifractor plautii.

[0222] Also provided are methods of altering abundance of one or more microbes in gut microflora of a subject, including administering to the subject a probiotic composition, or administering to the subject a prebiotic composition, or administering to the subject an antibiotic composition.

[0223] (V) Kits and Arrays.

[0224] Also provided herein are various different types of kits. Examples of such kits include kits useful to gather data or information from a subject, for instance. Examples of the information/data-gathering kits include one or more device(s) in/with which to collect a microbiome sample (for instance, a stool sample collection device, surface swab, etc.), and optionally one or more devices in/with which to collect biological samples (such as blood samples; for instance, a device for the collection of blood spots). Optionally, the kits will also include instructions for how the subject, or a health care provider, is to collect the samples; how those samples are to be treated and/or stored before they are forwarded for analysis; and additional instructions regarding recording information other than biological samples that can inform or influence the interpretation of results from analyses of the biological sample(s). For instance, kits may include instructions on how to install or access computer software useful to collect information from the subject, such as food intake, exercise, and other objective or subjective information.

[0225] In some kit embodiments, the kit will further include a device or system for monitoring blood glucose of the subject. By way of example, such device may be a continuous blood glucose monitor. Alternatively, the kit may provide a system for intermittently monitoring blood glucose, for instance through periodic blood sampling and analysis such as is routine for monitoring the blood glucose of Type 1 diabetics.

[0226] It is also contemplated that some kit embodiments will include instructions to enable the subject being tested to undergo one or more additional sampling or testing procedures, for instance at a laboratory or other device outside of their home. For instance, some kits may include instructions for how to provide a fasting blood sample, or more generally a blood sample useful to detect or measure metabolic action.

[0227] Additional kit embodiments are provided for the analysis of samples collected from a subject. By way of example, such testing kits include one or more marker molecules capable of detecting the presence (and/or quantity) of at least one indicator microbe in a sample (e.g., a stool or other microbiome sample) from a subject. For instance, marker molecules are nucleic acids (e.g., oligonucleotides) or amino acids (e.g., peptides) specific for a single indicator microbe. Such marker molecules may optionally be attached to a solid surface, such as an array. Marker molecules may optionally be labeled for ease of detection.

[0228] A kit can include a device as described herein, and optionally additional components such as buffers, reagents, and instructions for carrying out the methods described herein. The choice of buffers and reagents will depend on the particular application, e.g., setting of the assay (point-of-care, research, clinical), analyte(s) to be assayed, the detection moiety used, the detection system used, etc.

[0229] The kit can also include informational material, which can be descriptive, instructional, marketing, or other material that relates to the methods described herein and/or the use of the devices for the methods described herein. In embodiments, the informational material can include information about production of the device, physical properties of the device, date of expiration, batch or production site information, and so forth.

[0230] Also contemplated are arrays of biological macromolecules (markers), such as nucleic acids (e.g., oligonucleotides) or amino acids (e.g., peptides or proteins), that enable the detection and/or quantification of microbes from a microbiome of a subject, such as a human subject. With the provision herein of lists of specific pro-health and specific poor health indicator microbes, arrays can be prepared that specifically can detect and/or quantify such indicator microbes. By way of example, an array may include markers specific for individual pro-health or poor health microbes. Such examples may be genomic sequence determined to be or recognized as being specific for an individual microbe listed, for instance, in Table 5.

[0231] Specific arrays are pro-health indicator detection arrays, which contain two or more markers each of which is specific for a pro-health indicator microbe as described herein, including for instance microbes indicated to be associated with generally good health of the subject from which the microbe is isolated. By way of example, such pro-health indicator microbes may include: Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica. Thus, contemplated herein are pro-health indicator arrays that include at least one marker for each of at least two of these listed pro-health indicator microbes; each of at least three; each of at least four; each of at least five; each of at least six; each of at least seven; each of at least eight; each of at least nine; each of at least ten; or more than ten of these listed pro-health indicator microbes. Some arrays will include all seventeen of the listed pro-health indicator microbes. Optionally, any of these pro-health indicator arrays may also include markers for additional microbes; these may be other pro-health indicator microbes or poor health indicator microbes, for instance.

[0232] Additional specific arrays are poor health indicator detection arrays, which contain two or more markers each of which is specific for a poor health indicator microbe as described herein, including for instance microbes indicated to be associated with generally poor health of the subject from which the microbe is isolated. By way of example, such poor health indicator microbes include: Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli. Thus, contemplated herein are poor health indicator arrays that include at least one marker for each of at least two of these listed poor health indicator microbes; each of at least three; each of at least four; each of at least five; each of at least six; each of at least seven; each of at least eight; each of at least nine; each of at least ten; or more than ten of these listed poor health indicator microbes. Some arrays will include all fifteen of the listed poor health indicator microbes. Optionally, any of these poor health indicator arrays may also include markers for additional microbes; these may be other poor health indicator microbes or pro-health indicator microbes, for instance.

[0233] The arrays may be utilized in myriad applications. For example, the arrays in some embodiments are used in methods for detecting association between a behavior (such as a food choice, or more generally, a diet) and a health condition. For instance, such a health condition may include balance (or imbalance) of the normal gut microbiome; gastrointestinal conditions such as inflammatory bowel disease (IBD) and irritable bowel syndrome (IBS); wider systemic manifestations of disease or disorder, such as obesity, type 2 diabetes (T2D), diabetes risk, metabolic syndrome, and prediabetes; as well as overall good health, overall poor health, BMI, cardiometabolic risk, cardiovascular disease risk, and postprandial response to food intake. This method typically includes incubating a sample from a subject (e.g., from the microbiome of the subject) with the array under conditions such that biomolecules in the sample may associate with marker biomolecules attached to the array. The association is then detected, using means commonly known in the art. In this context, the term association may include hybridization, covalent binding, or ionic binding, for instance. A skilled artisan will appreciate that conditions under which association occurs will vary depending on the biomolecules, the markers, the substrate, and the detection method utilized. As such, suitable conditions can be optimized for each individual array created or assay carried out with an array.

[0234] In yet another embodiment, the array is used as a tool in a method to determine whether a compound or composition is effective to modify a biological condition, such as the balance or imbalance of the microbiome in a subject, or for a treatment of a disease or disorder in a subject.

[0235] In another embodiment, the array is used as a tool in a method to determine whether a compound increases or decreases the relative abundance in a subject of any of the pro-health or poor health indicator microbes described herein. Typically, such methods include comparing the presence, absence, and/or quantity of one or more indicator microbes in a subject’s microbiome before and after administration of a compound or composition. If the abundance of biomolecule(s) associated with at least one pro-health microbe increases after treatment, or the abundance of biomolecule(s) associated with at least one poor health microbe decreases, or if the relative abundance of biomolecule(s) shifts to be more similar to a “healthy” profile or fingerprint discussed herein, the compound or composition may be effective in improving the health of the subject.
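The before-and-after comparison can be sketched in Python as shown below. This is a hedged illustration only: the function name may_be_effective, the short indicator subsets, and the abundance values are hypothetical, and the sketch covers only the first two criteria (an increase in at least one pro-health indicator, or a decrease in at least one poor health indicator). The third criterion, a shift toward a "healthy" profile, would require a reference profile and a similarity measure that are not assumed here.

```python
# Hypothetical subsets of the indicator microbes named in this disclosure.
PRO_HEALTH = {"Faecalibacterium prausnitzii", "Roseburia hominis"}
POOR_HEALTH = {"Ruminococcus gnavus", "Clostridium bolteae"}


def may_be_effective(before, after, pro_health, poor_health):
    """Return True if any pro-health indicator increased in abundance
    or any poor health indicator decreased in abundance.

    before and after map species name -> measured abundance for samples
    taken before and after administration; missing species count as 0.
    """
    pro_increased = any(
        after.get(s, 0.0) > before.get(s, 0.0) for s in pro_health
    )
    poor_decreased = any(
        after.get(s, 0.0) < before.get(s, 0.0) for s in poor_health
    )
    return pro_increased or poor_decreased


# Made-up measurements for one subject.
before = {"Faecalibacterium prausnitzii": 2.0, "Ruminococcus gnavus": 1.5}
after = {"Faecalibacterium prausnitzii": 3.1, "Ruminococcus gnavus": 1.5}
result = may_be_effective(before, after, PRO_HEALTH, POOR_HEALTH)
```

In practice a minimum change or a statistical test would likely be needed to separate a real shift from measurement noise; the strict comparison here is kept only for clarity.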

[0236] (VI) Systems.

[0237] Also provided are systems to assay a biological condition in a subject, such as a human or other mammalian subject. By way of example, such a system includes: a nucleic acid sample isolation device, which is adapted to isolate a nucleic acid sample from the subject; a sequencing device, which is connected to the nucleic acid sample isolation device and adapted to sequence the nucleic acid sample, thereby obtaining a sequencing result; and an alignment device, which is connected to the sequencing device and adapted to align the sequencing result against sequence from one or more microbes in order to determine presence or absence of the microbe(s) based on the alignment result. In examples of such systems, the microbes include one or more of: pro-health indicator microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and/or poor health indicator microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.
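The final step performed by the alignment device, calling presence or absence from an alignment result, might look like the following Python sketch. It assumes, hypothetically, that an upstream aligner has already assigned each read either to a species or to no species; the function name call_presence, the min_reads cutoff, and the example hits are illustrative assumptions and do not describe any particular aligner or the system of this disclosure.

```python
from collections import Counter


def call_presence(read_hits, targets, min_reads=2):
    """Call each target species present or absent from per-read assignments.

    read_hits is an iterable with one entry per sequenced read: the species
    the read aligned to, or None if it did not align to any reference.
    A species is called present when at least min_reads reads align to it
    (an assumed cutoff meant to reduce spurious single-read calls).
    """
    counts = Counter(hit for hit in read_hits if hit is not None)
    return {species: counts[species] >= min_reads for species in sorted(targets)}


# Made-up alignment result for four reads.
hits = ["Prevotella copri", "Prevotella copri", None, "Escherichia coli"]
targets = {"Prevotella copri", "Escherichia coli", "Eggerthella lenta"}
calls = call_presence(hits, targets)
```

A read-count cutoff is only one possible decision rule; coverage of marker sequences or normalized abundance could be used instead.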

[0238] Optionally, the systems may further include an information delivery device capable of delivering to the subject information about the results of the alignment. Such information may include one or more of: the identity and/or relative or absolute quantity of one or more microbes, such as microbes found or not found in the microbiome of the subject; information on the subject’s gut microbiome health; information on the health of the subject, for instance based on the presence, absence, or relative abundance of one or more microbes in the subject’s microbiome; one or more recommendations for how to modify the subject’s diet; a specific recommendation for a food to eat, or a food to avoid; information on general diet plan(s); options for lifestyle choices; and so forth.

[0239] The Exemplary Embodiments and Example(s) below are included to demonstrate particular embodiments of the disclosure. Those of ordinary skill in the art will recognize in light of the present disclosure that many changes can be made to the specific embodiments disclosed herein and still obtain a like or similar result without departing from the spirit and scope of the disclosure.

[0240] (VII) Exemplary Embodiments.

1. A method of using a group of microbes to determine a health condition in a human subject, wherein the group of microbes includes: at least two pro-health indicator microbes; or at least two poor health indicator microbes; or at least two pro-health indicator microbes and at least two poor health indicator microbes; wherein the pro-health indicator microbes are selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and wherein the poor health indicator microbes are selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

2. The method of embodiment 1, including: obtaining a biological sample from the human subject; and analyzing the biological sample to determine presence, absence, or abundance of the at least two pro-health indicator microbes and/or the at least two poor health indicator microbes.

3. The method of embodiment 1, including: obtaining a biological sample from the human subject; identifying in the biological sample at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 125, at least 150, at least 175, at least 200, or more than 200 different microbes; and determining the health condition of the human subject based on presence, absence, and/or absolute or relative abundance of the identified microbes in the biological sample.

4. The method of embodiment 1, wherein the group of microbes includes: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

5. The method of embodiment 1, wherein the group of microbes includes: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

6. The method of embodiment 1, wherein the group of microbes includes Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, C. saccharolyticum.

7. The method of embodiment 1 , wherein the group of microbes includes P. copri and Blastocystis spp.

8. The method of any one of embodiments 1-3, wherein the health condition includes at least one of: overall good health, overall poor health, obesity, BMI, diabetes risk, cardiometabolic risk, cardiovascular disease risk, or postprandial response to food intake.

9. The method of any one of embodiments 1-8, including detecting the presence, absence, or relative abundance of at least one of the microbes in a microbiome sample from the human subject.

10. The method of embodiment 9, wherein the detecting includes one or more of: sequencing one or more nucleic acids of a pro-health or poor health microbe, hybridizing a nucleic acid probe to a nucleic acid of a pro-health or poor health microbe, detecting one or more proteins from a pro-health or poor health microbe, or measuring activity of one or more proteins of a pro-health or poor health microbe.

11. The method of embodiment 9, wherein the detecting includes shotgun metagenomics.

12. The method of any one of embodiments 1-10, wherein the biological sample includes a stool sample.

13. A method of predicting a health condition in a subject, including: determining presence, absence, or relative abundance of at least three pro-health indicator microbes in a microbiome of the subject; determining presence, absence, or relative abundance of at least three poor health indicator microbes in a microbiome of the subject; and predicting the health condition of the subject, based on the presence, absence, or relative abundance of the pro-health and/or poor health indicator microbes in the microbiome of the subject; wherein the pro-health indicator microbes are selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and wherein the poor health indicator microbes are selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

14. The method of embodiment 13, wherein: the health condition includes at least one of obesity, increased cardiometabolic risk, diabetes risk, or overall poor health; and the health condition is predicted by the presence and/or abundance of more poor health indicator microbes than pro-health indicator microbes; and/or the health condition includes at least one of overall good health or absence of obesity, reduced cardiometabolic risk, or reduced diabetes risk; and the health condition is predicted by the presence and/or abundance of more pro-health indicator microbes than poor health indicator microbes.

15. A method to predict overall good or poor general health in a non-diseased human subject, including: obtaining a microbiome sample from the human subject; isolating a nucleic acid fraction from the microbiome sample; detecting, within the nucleic acid fraction, presence, absence, or relative abundance of at least one unique marker sequence indicative of: a pro-health indicator microbe selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila; or a poor health indicator microbe selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli; and at least one of: predicting the human subject has overall good general health if the pro-health indicator microbes outnumber or are relatively more abundant than the poor health indicator microbes; or predicting the human subject has overall poor general health if the poor health indicator microbes outnumber or are relatively more abundant than the pro-health indicator microbes.

16. The method of embodiment 15, further including providing to the human subject a dietary recommendation based on the presence, absence, or relative abundance of one or more poor health indicator microbes and/or one or more pro-health indicator microbes.

17. An assay, including: subjecting nucleic acid extracted from a test sample of a human subject to a genotyping assay that detects at least one of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, the test sample including microbiota from a gut of the subject; determining a relative abundance of the at least one of Prevotella copri, Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica that is below a predetermined abundance; and selecting, when the relative abundance is below the predetermined abundance, a treatment regimen that includes at least one of: (i) modifying microbiota of the gut of the subject using at least one of a prebiotic, probiotic, or pharmaceutical, or (ii) altering the diet of the human subject.

18. An assay, including: subjecting nucleic acid extracted from a test sample of a human subject to a genotyping assay that detects at least one of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli, the test sample including microbiota from a gut of the subject; determining a relative abundance of the at least one of Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli that is above a predetermined abundance; and selecting, when the relative abundance is above the predetermined abundance, a treatment regimen that includes at least one of: (i) modifying microbiota of the gut of the subject using at least one of a prebiotic, probiotic, or pharmaceutical, or (ii) altering the diet of the human subject.

19. A method of diagnosing a human subject as having a healthy diet, including detecting in a microbiome sample from the subject the presence of Firmicutes CAG95 and/or the absence of Firmicutes CAG94.

20. A method of diagnosing a human subject as having an unhealthy diet, including detecting in a microbiome sample from the subject the presence of Firmicutes CAG94 and/or the absence of Firmicutes CAG95.

21. A microbial signature (fingerprint) for good health, including presence or relatively high abundance of at least three microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, and/or absence or relatively low abundance of at least three microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

22. A microbial signature for poor health, including absence or relatively low abundance of at least three microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica, and/or presence or relatively high abundance of at least three microbes selected from the group including R. gnavus, F. plautii, C. innocuum, C. symbiosum, C. bolteae, A. colihominis, C. intestinalis, B. obeum, R. inulinivorans, E. ventriosum, B. hydrogenotrophica, Clostridium CAG 58, E. lenta, C. bolteae CAG 59, C. spiroforme, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.

23. The microbial signature of embodiment 21, wherein the signature includes: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

24. The microbial signature of embodiment 21, wherein the group of microbes includes P. copri and Blastocystis spp.

25. The microbial signature of embodiment 22, wherein the group of microbes includes: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

26. The microbial signature of embodiment 22, wherein the group of microbes includes Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, C. saccharolyticum.

27. Use of the microbial signature of any one of embodiments 2-26, to guide treatment decisions for a human subject.

28. The use of embodiment 27, wherein the treatment decision includes selecting one or more of: modifying overall diet, increasing intake of at least one specified food or supplement, decreasing intake of at least one specified food or supplement, administration of a probiotic composition, administration of a prebiotic composition, or administration of an antibiotic compound.

29. A method for targeting a microbiome of a human subject to promote health, including: (A) detecting in a microbiome sample from the human subject one or more pro-health indicator microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, and Paraprevotella xylaniphila; and administering to the human a composition that increases growth or survival of the pro-health indicator microbe(s); and/or (B) detecting in a microbiome sample from the human subject one or more poor health indicator microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, and Flavonifractor plautii; and administering to the human a composition that decreases growth or survival of the poor health indicator microbe(s).

30. The method of embodiment 29, including detecting: at least three pro-health indicator microbes; at least five pro-health indicator microbes; at least ten pro-health indicator microbes; or more than 10 listed pro-health indicator microbes.

31. The method of embodiment 29 or embodiment 30, wherein the indicator microbes include P. copri and Blastocystis spp.

32. The microbial signature of embodiment 29, including detecting: at least three poor health indicator microbes; at least five poor health indicator microbes; at least ten poor health indicator microbes; or more than 10 listed poor health indicator microbes.

33. The microbial signature of embodiment 29, wherein the indicator microbes include Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, C. saccharolyticum.

34. A probiotic composition for ingestion by a human subject, including at least one of Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica.

35. The probiotic composition of embodiment 34, including at least three, at least five, at least seven, at least 9, at least 10, at least 12, at least 14, or all of the listed microbes.

36. The probiotic composition of embodiment 34 or embodiment 35, including Prevotella copri or Blastocystis spp. or both.

37. A method of altering abundance of one or more microbes in gut microflora of a subject, including administering the probiotic composition of embodiment 34 to the subject.

38. A system to assay a biological condition in a subject, including: a nucleic acid sample isolation device, which is adapted to isolate a nucleic acid sample from the subject; a sequencing device, which is connected to the nucleic acid sample isolation device and adapted to sequence the nucleic acid sample, thereby obtaining a sequencing result; and an alignment device, which is connected to the sequencing device and adapted to align the sequencing result against sequences from one or more microbes in order to determine presence or absence of the microbe(s) based on the alignment result, wherein the microbes include one or more of: pro-health indicator microbes selected from the group including Prevotella copri, Blastocystis spp., Haemophilus parainfluenzae, Firmicutes bacterium CAG 95, Bifidobacterium animalis, Oscillibacter sp 57 20, Roseburia sp CAG 182, Veillonella dispar, Eubacterium eligens, Firmicutes bacterium CAG 170, Rothia mucilaginosa, Veillonella infantium, Roseburia hominis, Oscillibacter sp PC13, Clostridium sp CAG 167, Ruminococcaceae bacterium D5, Paraprevotella xylaniphila, Faecalibacterium prausnitzii, Romboutsia ilealis, and Veillonella atypica; and/or poor health indicator microbes selected from the group including Eubacterium ventriosum, Roseburia inulinivorans, Clostridium spiroforme, Clostridium bolteae CAG 59, Eggerthella lenta, Clostridium bolteae, Collinsella intestinalis, Clostridium innocuum, Blautia obeum, Clostridium symbiosum, Clostridium sp CAG 58, Blautia hydrogenotrophica, Anaerotruncus colihominis, Ruminococcus gnavus, Flavonifractor plautii, Clostridium leptum, Ruthenibacterium lactatiformans, and Escherichia coli.
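By way of illustration only, the comparison described in embodiments 14 and 15 (pro-health indicator microbes outnumbering poor health indicator microbes, or the reverse) can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function name, the detection threshold, the "indeterminate" outcome for ties, and the use of abbreviated indicator panels are all hypothetical choices made for the example.

```python
# Abbreviated, illustrative subsets of the indicator lists in embodiment 1.
# The full panels are longer; these subsets are an assumption of this sketch.
PRO_HEALTH = {
    "Prevotella copri",
    "Blastocystis spp.",
    "Faecalibacterium prausnitzii",
    "Eubacterium eligens",
}
POOR_HEALTH = {
    "Ruminococcus gnavus",
    "Flavonifractor plautii",
    "Clostridium innocuum",
    "Escherichia coli",
}


def predict_general_health(abundances, detection_threshold=0.0):
    """Compare how many pro-health vs. poor health indicators are detected.

    abundances: dict mapping species name -> relative abundance.
    detection_threshold: hypothetical cutoff; a species counts as present
    only if its relative abundance is strictly above this value.
    """
    pro = sum(1 for s in PRO_HEALTH if abundances.get(s, 0.0) > detection_threshold)
    poor = sum(1 for s in POOR_HEALTH if abundances.get(s, 0.0) > detection_threshold)
    if pro > poor:
        return "good"
    if poor > pro:
        return "poor"
    return "indeterminate"  # tie handling is an assumption of this sketch


sample = {"Prevotella copri": 2.1, "Blastocystis spp.": 0.4, "Ruminococcus gnavus": 0.3}
print(predict_general_health(sample))
```

Embodiment 15 also allows the comparison to be made on relative abundance rather than on counts; summing the abundances of each panel instead of counting detected species would be one way to sketch that variant.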

[0241] (VIII) Example(s).

[0242] Example 1: Microbiome connections with host metabolism and habitual diet from the PREDICT 1 metagenomic study

[0243] The gut microbiome is shaped by diet and influences host metabolism, but these links remain poorly characterized, are complex and can be unique to each individual. This example describes the deep metagenomic sequencing of more than 1,100 gut microbiomes from individuals with detailed long-term diet information, as well as hundreds of fasting and same-meal postprandial cardiometabolic blood markers. Strong associations were found between microbes and specific nutrients, foods, food groups, and general dietary indices, driven especially by the presence and diversity of healthy and plant-based foods. Microbial biomarkers of obesity were reproducible across cohorts, and blood markers of cardiovascular disease and impaired glucose tolerance were more strongly associated with microbiome structures. Although some microbes, such as Prevotella copri and Blastocystis spp., were indicators of reduced postprandial glucose metabolism, several species were more directly predictive for postprandial triglycerides and C-peptide. The panel of intestinal species associated with healthy dietary habits overlapped with those associated with favorable cardiometabolic and postprandial markers, indicating this large-scale resource can potentially stratify the gut microbiome into generalizable health levels among individuals without clinically manifest disease. At least some of the material described in this Example was published as Asnicar et al. (“Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals”, Nat Med. 27:321-332, 2021; associated metagenomes deposited in European Bioinformatics Institute European Nucleotide Archive under accession no. PRJEB39223; all of which is incorporated herein by reference for all it teaches).

[0244] INTRODUCTION

[0245] Dietary contributions to health, and particularly to long-term chronic conditions such as obesity, metabolic syndrome, and cardiac events, are of universal importance. This is especially true as obesity and associated mortality and morbidity have risen dramatically over the past decades, and continue to do so worldwide. The reasons for this relatively rapid change have remained unclear, with the gut microbiome implicated as one of several potentially causal human-environmental interactions (Brown & Hazen, Nat. Rev. Microbiol. 16:171-181, 2018; Mozaffarian, Circulation 133:187-225, 2016; Musso et al., Annu. Rev. Med. 62, 361-380, 2011; Le Chatelier et al., Nature 500:541-546, 2013). Surprisingly, the details of the microbiome’s role in obesity and cardiometabolic health have proven difficult to define reproducibly in large, diverse human populations, contrary to their behavior in mice. This is likely due to the complexity of habitual diets, the difficulty of measuring them at scale, and the highly personalized nature of the microbiome (Gilbert et al., Nat. Med. 24:392-400, 2018).

[0246] This example describes the Personalized Responses to Dietary Composition Trial (PREDICT 1) observational and interventional study of diet-microbiome interactions in metabolic health. PREDICT 1 included over 1,000 participants in the United Kingdom (UK) and the United States (US) who were profiled pre- and post-standardized dietary challenges using a combination of intensive in-clinic biometric and blood measures, nutritionist-administered free-living dietary recall and logging, habitual dietary data collection, continuous glucose monitoring, and stool shotgun metagenomic sequencing. This study was inspired by and generally concordant with previous large-scale diet-microbiome interaction profiles, identifying both overall gut microbiome configurations and specific microbial taxa and functions associated with postprandial glucose responses (Zeevi et al., Cell 163:1079-1094, 2015; Mendes-Soares et al., Am. J. Clin. Nutr. 110, 63-75, 2019), obesity-associated biometrics such as body mass index (BMI) and adiposity (Falony et al., Science 352, 560-564, 2016; Zhernakova et al., Science 352, 565-569, 2016; Thingholm et al., Cell Host Microbe 26, 252-264.e10, 2019), and blood lipids and inflammatory markers (Schirmer et al., Cell 167:1897, 2016; Fu et al., Circ. Res. 117:817-824, 2015; Org et al., Genome Biol. 18:70, 2017). By combining PREDICT’s extensive dietary and blood biomarker measures with high-precision microbiome analysis, these findings were able to extend to specific beneficial (e.g. Faecalibacterium prausnitzii) and detrimental (e.g. Ruminococcus gnavus) organisms, as well as to a highly-reproducible gut microbial signature of overall health that is validated across multiple blood and dietary measures within PREDICT and in several previously published cohorts (Pasolli et al., Nat. Methods 14:1023-1024, 2017).

[0247] MATERIALS AND METHODS

[0248] The PREDICT 1 study

[0249] The PREDICT 1 clinical trial (NCT03479866) aimed to quantify and predict individual variations in metabolic responses to standardized meals. Data was integrated from a cohort of twins and unrelated adults from the UK to explore genetic, metabolic, microbiome composition, meal composition and meal context data to distinguish predictors of individual responses to meals. These predictions were then validated in an independent cohort of adults from the US. The trial was a single arm, single-blinded intervention study that commenced in June 2018 and completed in May 2019.

[0250] For full protocol, see Berry et al. (Protocol Exchange, 2020). In brief, 1,002 generally healthy adults from the United Kingdom (UK; non-twins, and identical [monozygotic; MZ] and non-identical [dizygotic; DZ] twins) and 100 healthy adults from the United States (US) (non-twins; validation cohort) were enrolled in the study and completed baseline clinic measurements. The study included a 1-day clinical visit at baseline followed by a 13-day at-home period. At baseline (Day 1), participants arrived fasted and were given a standardized metabolic challenge meal for breakfast (0 h; 86 g carbohydrate, 53 g fat) and lunch (4 h; 71 g carbohydrate, 22 g fat). Fasting and postprandial (9 timepoints; 0-6 h) venous blood was collected to determine serum concentrations of glucose, triglycerides (TG), insulin, C-peptide (as a surrogate for insulin) and metabolomics (by NMR). Stool samples, anthropometry, and a questionnaire querying habitual diet, lifestyle and medical health were obtained at baseline. During the home-phase (Days 2-14), participants consumed standardized test meals in duplicate varying in sequence and macronutrient composition, while wearing digital devices to continuously monitor their blood glucose (continuous glucose monitor; CGM), physical activity and sleep. Capillary blood was collected using dried blood spot cards, during the clinic visit and at home, to analyze fasting and postprandial concentrations of TG and C-peptide. Participants were supported throughout the study with reminders and communication from study staff delivered through the ZOE® (Zoe Global Limited, London, England) study app. A second stool sample was collected at home by participants following completion of the study and all devices and samples were mailed back to study staff. To monitor compliance, all test meals consumed by participants were logged in the ZOE® (Zoe Global Limited) app (with an accompanying picture) and reviewed in real-time by the study nutritionists. Only test meals that were consumed according to the standardized meal protocol were included in the analysis.

[0251] The recruitment criteria, meal intervention challenges, outcome variables, and sample collection and analysis procedures relevant to this paper are described elsewhere (Berry et al., Protocol Exchange, 2020). The trial was approved in the UK by the Research Ethics Committee and Integrated Research Application System (IRAS 236407) and in the US by the Partners Healthcare Institutional Review Board (IRB 2018P002078). The core characteristics of study participants at baseline were not significantly different between UK and US cohorts.

[0252] Overview of microbiome sequencing and profiling

[0253] Deep shotgun metagenomic sequencing was performed (mean 8.8±2.2 gigabases/sample) in stool samples from a total of 1,098 PREDICT 1 participants (UK n=1,001; US n=97). From a random subset of these participants (n=70), fecal metagenomes were sequenced from a second stool sample collected 14 days after the first collection (FIG. 9A) for a total of 1,168 metagenomes. Computational analysis was performed using the bioBakery suite of tools (McIver et al., Bioinformatics 34, 1235-1237, 2018) to obtain species-level microbial abundances for the 769 taxa identified using the newly updated MetaPhlAn 2.96 tool (version 2.14; Kang et al., PeerJ 7, e7359, 2019), functional potential profiling of >1.91 M microbial gene families and 445 KEGG pathways with HUMAnN 2.0 (version 0.11.2 and UniRef database release 2014-07; Franzosa et al., Nat. Methods 15, 962-968, 2018), and reconstruction of 48,181 metagenome-assembled genomes (MAGs) of medium or high-quality using the validated pipeline (Pasolli et al., Cell 176, 649-662.e20, 2019), which includes assembly with MegaHIT (Li et al., Bioinformatics 31, 1674-1676, 2015), binning with MetaBAT2 (Kang et al., PeerJ 7, e7359, 2019), and quality-control with CheckM (version 1.0.18; Parks et al., Genome Res. 25:1043-1055, 2015).

[0254] Microbiome sample collection

[0255] Participants were mailed a pre-visit study pack with a stool collection kit and relevant questionnaires and asked to collect an at-home stool sample at two timepoints (one day prior to their in-person clinical visit on day 0 and the next at the conclusion of their home-phase, day 14). Those who did not collect a sample prior to their in-person, baseline visit completed the collection as soon as possible during the home-phase. Baseline samples in the UK were collected using the EasySampler collection kit (ALPCO, NH), whereas post-study samples, as well as the entire US collection, were collected using the Fecotainer collection kit (Excretas Medical BV, Enschede, the Netherlands). For baseline samples, one fresh unfixed sample was deposited into a sterile universal collection container (Sarstedt, Australia, Cat #L0263-10) and one into a tube containing DNA/RNA Shield buffer (Zymo Research, CA, US, Cat #R1101). Samples were stored at ambient temperature until return to the study staff. Follow-up samples were collected similarly, but only sampled into a DNA/RNA Shield buffer tube and sent by standard mail to study staff. Upon receipt in the laboratory, samples were homogenized, aliquoted, and stored at -80°C in Qiagen PowerBeads 1.5 mL tubes (Qiagen, Germany). This sample collection procedure was tested and validated internally by comparing different storage conditions (fresh, frozen, buffer), different DNA extraction kits (PowerSoilPro, FastDNA, ProtocolQ, Zymo), and different sequencing technologies (16S rRNA, shotgun metagenomics, and arrays).

[0256] DNA extraction and sequencing

[0257] DNA was isolated by QIAGEN Genomic Services using DNeasy® (Qiagen) 96 PowerSoil® (Qiagen) Pro from all Day 0 (baseline) DNA/RNA shield fixed microbiome samples. A random subset of Day 14 (end of at-home phase) samples (n=70) were also extracted. Optical density measurement was done using Spectrophotometer Quantification (Tecan Infinite 200). Before library preparation and sequencing, the quality and quantity of the samples were assessed using the Fragment Analyzer (Agilent Technologies, Inc., Santa Clara, CA) according to manufacturer's guidelines. Samples with a high-quality DNA profile were further processed. The NEBNext® (New England Biolabs, Ipswich, MA) Ultra II FS DNA module (Cat# NEB #E7810S/L) was used for DNA fragmentation, end-repair, and A-tailing. For adapter ligation, the NEBNext® (New England Biolabs) Ultra II Ligation module (Cat# NEB #E7595S/L) was used. The quality and yield after sample preparation were measured with the Fragment Analyzer. The size of the resulting product was consistent with the expected size of 500-700 bp. Libraries were sequenced for 300 bp paired-end reads using the Illumina NovaSeq® (Illumina, San Diego, CA) 6000 platform according to manufacturer's protocols. 1.1 nM library was used for flow cell loading. NovaSeq® (Illumina) control software NCS v1.5 was used. Image analysis, base calling, and the quality check were performed with the Illumina data analysis pipeline RTA3.3.5 and Bcl2fastq v2.20.

[0258] Metagenome quality control and pre-processing

[0259] All sequenced metagenomes were quality-controlled using the pre-processing pipeline as implemented in the BiotBucket Computational Metagenomics Lab, available online at https://github.com/SegataLab/preprocessing. Pre-processing includes three main steps: (1) read-level quality control; (2) screening of contaminant (i.e., host) sequences; and (3) splitting and sorting of cleaned reads. Initial quality control involves the removal of low-quality reads (quality score <Q20), fragmented short reads (<75 bp), and reads with >2 ambiguous nucleotides. Contaminant DNA was identified using Bowtie 2 (Langmead & Salzberg, Nat Methods 9(4):357-359, 2012) with the --sensitive-local parameter, allowing confident removal of the phiX174 Illumina spike-in and human-associated reads (hg19). Sorting and splitting allowed for the creation of standard forward, reverse, and unpaired reads output files for each metagenome.
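
The read-level filter in step (1) can be sketched as follows. This is a minimal illustration, not the SegataLab pipeline itself: the `Read` type and the `passes_read_qc` function are hypothetical names, and interpreting "quality score <Q20" as the mean per-read Phred score is an assumption, since the text does not say how the score is aggregated.

```python
from typing import NamedTuple, Sequence


class Read(NamedTuple):
    """Hypothetical container for one sequencing read."""
    name: str
    sequence: str
    qualities: Sequence[int]  # Phred score for each base


def passes_read_qc(read, min_mean_quality=20.0, min_length=75, max_ambiguous=2):
    """Apply the three thresholds stated in the text to a single read.

    Assumption: "quality score <Q20" is read as the mean Phred score
    over the whole read being below 20.
    """
    # Discard fragmented short reads (<75 bp).
    if len(read.sequence) < min_length:
        return False
    # Discard reads with more than two ambiguous nucleotides (N).
    if read.sequence.upper().count("N") > max_ambiguous:
        return False
    # Discard low-quality reads.
    mean_quality = sum(read.qualities) / len(read.qualities)
    return mean_quality >= min_mean_quality
```

Reads that pass this filter would then go on to contaminant screening with Bowtie 2, which is outside the scope of this sketch.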

[0260] Microbiome taxonomic and functional potential profiling

[0261] The metagenomic analysis was performed following the general guidelines described by Quince et al. (Nat. Biotechnol. 35, 833-844, 2017) and relying on the bioBakery computational environment (McIver et al., Bioinformatics 34, 1235-1237, 2018). Taxonomic profiling and quantification of organisms’ relative abundances in all metagenomic samples were performed using MetaPhlAn2 (Metagenomic Phylogenetic Analysis; version 2.9.21 and marker database release 2.9.4; Truong et al., Nat. Methods 12, 902-903, 2015). The updated species-specific database of markers was built using 99,237 reference genomes representing 16,797 species retrieved from GenBank (January 2019). From this set of reference genomes, a total of 1,077,785 markers were extracted and 10,586 species were profiled. Compared to the previous version of the MetaPhlAn2 database (mpa_v20_m200), the updated database is able to profile 8,102 more species. Metagenomes were mapped internally in MetaPhlAn2 against the marker genes database with Bowtie2 version 2.3.4.3 with the parameter “very-sensitive”. The resulting alignments were filtered to remove reads aligned with a MAPQ value <5, where MAPQ reflects the estimated probability that an alignment is correct.

[0262] For estimating the microbiome species richness of an individual, from the taxonomic profiles of the PREDICT 1 participants, two alpha diversity measures were computed: the number of species found in the microbiome ("observed richness"), and the Shannon entropy estimation. Microbiome dissimilarity between participants (beta diversity) was computed using the Bray-Curtis dissimilarity and the Aitchison distance on microbiome taxonomic profiles.
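
The four measures named above can be written out in a few lines. The following is a sketch under stated assumptions rather than the code used in the study: the function names are illustrative, Shannon entropy is computed with the natural logarithm, and the Aitchison distance uses a small pseudocount to handle zero abundances, a choice the text does not specify.

```python
import math


def observed_richness(abundances):
    """Number of species with non-zero abundance in one sample."""
    return sum(1 for a in abundances if a > 0)


def shannon_entropy(abundances):
    """Shannon entropy (natural log) of one taxonomic profile."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two profiles over the same species."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator


def aitchison_distance(x, y, pseudocount=1e-6):
    """Euclidean distance between centred log-ratio (CLR) transformed profiles.

    Assumption: zeros are replaced by adding a small pseudocount.
    """
    def clr(values):
        logs = [math.log(v + pseudocount) for v in values]
        mean_log = sum(logs) / len(logs)
        return [value - mean_log for value in logs]

    cx, cy = clr(x), clr(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))
```

Richness and entropy are computed per sample, whereas the two distances are computed for every pair of samples.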

[0263] Functional potential analysis of the metagenomic samples was performed using HUMAnN2 (version 0.11.2 and UniRef database release 2014-07; Franzosa et al., Nat Methods 15, 962-968, 2018) that computed pathway profiles and gene-family abundances.

[0264] Metagenomic assembly

[0265] Metagenomic samples were processed to obtain metagenome-assembled genomes (MAGs) following the procedure used elsewhere (Pasolli et al., Cell 176, 649-662.e20, 2019). In brief, MEGAHIT (version 1.2.9; Li et al., Bioinformatics 31, 1674-1676, 2015) was used with parameters “--k-max 127” for assembly and assembled contigs >1.5 kb were considered for the binning step performed using MetaBAT2 (version 2.14; Kang et al., PeerJ 7, e7359, 2019) with parameters: “-m 1500 --unbinned”. Quality control of the obtained MAGs was performed using CheckM (version 1.0.18; Parks et al., Genome Res. 25:1043-1055, 2015) using default parameters. High-quality and medium-quality microbial genomes were integrated into the existing database of >150,000 human MAGs.

[0266] Collection and processing of habitual diet information

[0267] Habitual diet information was collected using food frequency questionnaires (FFQ). For the UK, the European Prospective Investigation into Cancer and Nutrition (EPIC) FFQ was used and in the US, the Harvard semi-quantitative FFQ was used.

[0268] For the UK, the 131-item EPIC FFQ, which was developed for EPIC Norfolk and validated against pre-established nutrient biomarkers, was used (Bingham et al., Public Health Nutr. 4, 847-858, 2001). The questionnaire captured average intakes in the past year. Nutrient intakes were determined via consultation with McCance and Widdowson's 6th edition, an established nutrient database (Holland et al., McCance and Widdowson’s The Composition of Foods (Royal Society of Chemistry, 1991)). US participants completed the Harvard 2007 Grid 131-item FFQ previously validated against two-week dietary records (Rimm et al., Am J Epidemiol 135(10):1114-1126, 1992).

[0269] Nutrient intakes were estimated using the Harvard Nutrient Database.

[0270] Submitted FFQs were excluded if more than 10 food items were left unanswered, or if the total energy intake estimate derived from the FFQ as a ratio of the subject’s estimated basal metabolic rate (determined by the Harris-Benedict equation; Frankenfield et al., J. Am. Diet. Assoc. 98, 439-445, 1998) was more than two standard deviations outside the mean of this ratio (<0.52 or >2.58).

[0271] The following dietary indices were calculated as described below and according to categorization listed in Tables 1 and 3:

[0272] Healthy Food Diversity Index: The Healthy Food Diversity (HFD) index considers the number, distribution, and health value of consumed foods. To obtain this index, food frequency questionnaire foods were first aggregated into 15 food groups according to the HFD (Vadiveloo et al., Br. J. Nutr. 112, 1562-1574, 2014). Health values were then derived from the German Nutrition Society (DGE) dietary guidelines (available online at dge.de/en/), and the weight of each food group was multiplied by its corresponding health value (hv). Scores were divided by the maximum (hv=0.26) to bound values between 0 and 1 before multiplication with the Berry-Index. The original HFD was used instead of the US-HFD for the following reasons: the original HFD gives greater emphasis to plant-based foods and less to meat than the US-HFD, which more closely aligns with hypothesized microbiome-plant food/fibre interactions, and converting UK g/serving to US volume measures (as required for the US-HFD) would introduce additional error to the FFQ estimates.
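
One plausible reading of this calculation is sketched below. It rests on two assumptions that the text does not spell out: the "weight" of a food group is taken to be its share of total intake, and the Berry-Index is taken in its usual form, one minus the sum of squared shares. The function name and the example values are illustrative only.

```python
def healthy_food_diversity(group_amounts, health_values, max_health_value=0.26):
    """Sketch of an HFD-style score for one participant.

    group_amounts: food group name -> amount consumed (e.g., g/day)
    health_values: food group name -> health value (hv) for that group
    Assumption: shares are fractions of total intake across all groups.
    """
    total = sum(group_amounts.values())
    shares = {group: amount / total for group, amount in group_amounts.items()}

    # Berry-Index: higher when intake is spread evenly over many groups.
    berry_index = 1.0 - sum(share ** 2 for share in shares.values())

    # Share-weighted health value, scaled by the maximum hv to lie in 0-1.
    health_component = sum(
        shares[group] * health_values[group] for group in shares
    ) / max_health_value

    return berry_index * health_component
```

For two equally consumed groups with health values 0.26 and 0.13, the Berry-Index is 0.5 and the scaled health component is 0.75, giving a score of 0.375.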

[0273] The plant-based diet index: Three versions of the plant-based diet index (Satija et al., J. Am. Coll. Cardiol. 70, 411-422, 2017) were considered: the original plant-based diet index (PDI), the healthy plant-based index (h-PDI) and the unhealthy plant-based index (u-PDI). Eighteen food groups (amalgamated from the FFQ food groups; Table 1) were assigned either positive or reverse scores after segregation into quintiles, as outlined in Table 3 (Part 1) and Satija et al. (J. Am. Coll. Cardiol. 70, 411-422, 2017). Participants with an intake above the highest quintile for the positive score received a score of 5. Those below the lowest quintile intake received a score of 1. A reverse value was applied for the reverse scores. The scores for each participant were summed to create the final score. For the PDI, a positive score was applied to the “healthy” and “less-healthy/unhealthy” plant foods, and a reverse score applied to the animal-based foods. For the h-PDI, positive scores were applied to the “healthy” plant foods, and a reverse score to the “less-healthy/unhealthy” plant foods and the animal-based foods. For the u-PDI, a positive score was applied to the “less-healthy/unhealthy” plant foods and a reverse score applied to the “healthy” plant foods and the animal-based foods.
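
The quintile scoring can be illustrated with a short sketch. The function names and food group labels are hypothetical, and quintiles are assigned by rank with ties broken by input order, which is an assumption; the actual group assignments are those of Table 3.

```python
def quintile_scores(intakes, reverse=False):
    """Score each participant 1-5 by intake quintile for one food group.

    Positive scoring gives 5 to the highest quintile; reverse scoring
    gives 5 to the lowest. Assumption: quintiles are rank-based.
    """
    n = len(intakes)
    order = sorted(range(n), key=lambda i: intakes[i])
    scores = [0] * n
    for rank, index in enumerate(order):
        quintile = rank * 5 // n + 1  # 1 (lowest intake) to 5 (highest)
        scores[index] = 6 - quintile if reverse else quintile
    return scores


def plant_diet_index(intakes_by_group, positive_groups, reverse_groups):
    """Sum per-group quintile scores into one index value per participant."""
    n = len(next(iter(intakes_by_group.values())))
    totals = [0] * n
    for group, intakes in intakes_by_group.items():
        if group in positive_groups:
            group_scores = quintile_scores(intakes)
        elif group in reverse_groups:
            group_scores = quintile_scores(intakes, reverse=True)
        else:
            continue  # group not used by this index
        totals = [t + s for t, s in zip(totals, group_scores)]
    return totals
```

The three indices differ only in which groups are passed as positive or reverse. For the h-PDI, for example, the "healthy" plant groups would be positive and both the "less-healthy/unhealthy" plant groups and the animal-based groups would be reverse.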

[0274] Animal score: The animal-based score categorized animal foods into “healthy” and “less-healthy/unhealthy” categories according to previous epidemiological studies. A similar approach to the PDI scoring was applied to the animal-based food groups, with either a positive (“healthy”) or reverse (“less-healthy/unhealthy”) quintile scoring; Tables 1 and 3.

[0275] The aMED score (Mediterranean Diet): Adherence to the aMED diet was calculated by following the method outlined by Fung et al. (Am. J. Clin. Nutr. 82, 163-173, 2005). Nine food/nutrient categories were included (Table 3, Part 5) and the score ranged from 0 to 9 (“least” to “most” Mediterranean). To form groups, weekly intake frequencies for assigned foods were first multiplied by the amount in grams per serving and then divided by 7 to determine grams per day. Next, food gram amounts were summed to make the final category total. For all food categories as well as the fatty acid intake ratio, the median intake of each category was calculated. A score of 0 (no aMED) or 1 (aMED) was given for each category depending on whether the twin was above or below the median intake. For alcohol intake, a range was used for score assignment: intakes of 5-25 g/d for females and 10-50 g/d for males were assigned a score of 1, while those above or below this range were assigned a score of 0. Finally, the aMED was generated by summation of the category scores.
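
A compact sketch of this scoring follows. The category names are placeholders for the nine categories of Table 3, Part 5, the direction of each category (whether intake above or below the median earns the point) must be supplied by the caller, and awarding no point to an intake exactly at the median is an assumption.

```python
from statistics import median

# Alcohol ranges (g/day) that earn one point, as stated in the text.
ALCOHOL_RANGE_G_PER_DAY = {"female": (5.0, 25.0), "male": (10.0, 50.0)}


def grams_per_day(weekly_frequency, grams_per_serving):
    """Convert a weekly intake frequency into grams per day."""
    return weekly_frequency * grams_per_serving / 7.0


def median_component_scores(intakes, higher_is_better=True):
    """Score 1 or 0 per participant relative to the cohort median."""
    cutoff = median(intakes)
    if higher_is_better:
        return [1 if value > cutoff else 0 for value in intakes]
    return [1 if value < cutoff else 0 for value in intakes]


def alcohol_score(grams, sex):
    """Score 1 if alcohol intake falls inside the sex-specific range."""
    low, high = ALCOHOL_RANGE_G_PER_DAY[sex]
    return 1 if low <= grams <= high else 0


def amed_scores(components, higher_is_better, alcohol_grams, sexes):
    """Sum the median-based category scores and the alcohol score."""
    totals = [alcohol_score(g, s) for g, s in zip(alcohol_grams, sexes)]
    for name, intakes in components.items():
        scores = median_component_scores(intakes, higher_is_better[name])
        totals = [t + s for t, s in zip(totals, scores)]
    return totals
```

With the eight median-scored categories plus alcohol, the total ranges from 0 to 9, matching the range given above.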

[0276] Food groups: For individual analyses of food groups-microbe interaction, food groups were formed by aggregation of FFQ foods into the 18 PDI food groups plus margarine and alcohol (Table 3, Part 1).

[0277] Percentage of plants within diet: The percentage of plants within diet was calculated as weight in grams of plant foods within total weight (g) of diet after adjustment of FFQ foods into quantities (g) per week.

[0278] Number of plant foods: For the number of plant foods, each plant food item within the FFQ above the value of 0 g was allocated a score of 1 and summed for each participant. For the total number of plants and the number of “healthy” and “unhealthy” plants, FFQ food items were allocated into groups according to the PDI food groupings.

[0279] Collection and processing of fasting and postprandial markers

[0280] Venous blood samples were collected as described in Berry et al. (Protocol Exchange, 2020). In brief, participants were cannulated and venous blood was collected at fasting (prior to a test breakfast) and at 9 timepoints postprandially (15, 30, 60, 120, 180, 240, 270, 300, and 360 minutes). Plasma glucose and serum C-peptide and insulin were measured at all timepoints. Serum TG was measured at hourly intervals and serum metabolomics (NMR by Nightingale Health, Helsinki, Finland) at 0, 4, and 6 h. Fasting samples were analyzed for lipid profile, thyroid-stimulating hormone, alanine aminotransferase, liver function panel, and complete blood count (CBC) analysis.

[0281] Continuous glucose monitoring (CGM) was performed on days 2-14, with glucose measured every 15 minutes using Freestyle Libre Pro continuous glucose monitors (Abbott, Abbott Park, IL, US) fitted on the upper, non-dominant arm at participants’ baseline clinical visit. Because the CGM device requires time to calibrate once fitted to a participant, only CGM data collected 12 hours or more after activating the device were used for analysis.

[0282] Dry blood spot (DBS) analysis of TG and C-peptide was completed by participants on the first four days of the home-phase while consuming test meals. The timepoints were dependent on the test meal as described elsewhere (Berry et al., Protocol Exchange, 2020). Test cards were stored in aluminum sachets with desiccant once completed and placed in the refrigerator at the end of the study day or until participants mailed them back to the study site. DBS cards were frozen at -80 °C upon receipt in the laboratory until being shipped to Vitas for analysis (Vitas Analytical Services, Oslo, Norway).

[0283] Specific timepoints and increments for TG, glucose, insulin, and C-peptide were selected for the current analysis to reflect the different pathophysiological processes for each measure as described in the protocol (Berry et al., Protocol Exchange, 2020). The incremental area under the postprandial TG (0-6h), glucose (0-2h), and insulin (0-2h) curves (iAUC) were computed using the trapezium rule (Matthews et al., BMJ 300, 230-235, 1990).
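
The trapezium-rule calculation can be sketched as follows. This is an illustration with an assumed convention: the fasting (time 0) value is used as the baseline and the net area relative to that baseline is returned, so dips below baseline subtract from the total. The text does not state whether negative increments were handled this way.

```python
def incremental_auc(times, values, window_end):
    """Net incremental area under the curve by the trapezium rule.

    times: sampling times in ascending order, starting at 0 (fasting)
    values: measurements at those times; values[0] is the baseline
    window_end: last time included (e.g., 120 min for glucose 0-2h)
    Assumption: areas below baseline count as negative.
    """
    baseline = values[0]
    points = [
        (t, v - baseline) for t, v in zip(times, values) if t <= window_end
    ]
    area = 0.0
    for (t0, d0), (t1, d1) in zip(points, points[1:]):
        # Each trapezium: interval width times mean increment.
        area += (t1 - t0) * (d0 + d1) / 2.0
    return area
```

For example, glucose values of 5, 7, 8, 6, and 5 measured at 0, 15, 30, 60, and 120 minutes give increments of 0, 2, 3, 1, and 0 and a 0-2h iAUC of 15 + 37.5 + 60 + 30 = 142.5 in units of concentration times minutes.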

[0284] For a detailed description of sample collection, processing and analysis see Berry et al., Protocol Exchange, 2020.

[0285] Machine learning

[0286] The machine learning (ML) framework employed is based on the scikit-learn Python package (Pedregosa et al., J. Mach. Learn. Res. 12, 2825-2830, 2011). The ML algorithms used for the prediction and classification of personal, habitual diet, fasting, and postprandial metadata are based on Random Forest (RF) regression and classification. RF-based methods were selected a priori as they have been repeatedly shown to be particularly suitable for, and robust to, the statistical challenges inherent to microbiome abundance data (Thomas et al., Nat. Med. 25, 667-678, 2019; Pasolli et al., PLoS Comput. Biol. 12, e1004977, 2016). For both the regression and classification tasks, a cross-validation approach was implemented, based on 100 bootstrap iterations and an 80/20 random split of training and testing folds. To specifically avoid overfitting as a result of the twin population and their shared factors, any twin was removed from the training fold if their twin was present in the test fold.
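
The twin-aware split can be sketched in a few lines. The function name and the `twin_of` mapping are hypothetical, and the sketch covers a single iteration; the procedure described above would repeat it 100 times with different random draws.

```python
import random


def twin_aware_split(sample_ids, twin_of, test_fraction=0.2, rng=None):
    """One random 80/20 split that keeps co-twins out of the training fold.

    sample_ids: iterable of participant identifiers
    twin_of: participant id -> id of their co-twin (absent for non-twins)
    Returns (train_ids, test_ids).
    """
    if rng is None:
        rng = random.Random()
    ids = list(sample_ids)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    test = set(ids[:n_test])
    # Drop a training participant whenever their co-twin is in the test fold.
    train = [s for s in ids[n_test:] if twin_of.get(s) not in test]
    return train, sorted(test)
```

Because co-twins of test participants are dropped rather than moved, the training fold can be slightly smaller than 80% in some iterations.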

[0287] For the regression task, an RF regressor was trained to learn the feature to predict, and simple linear regression to calibrate the output for the test folds on the range of values in the training folds. From the scikit-learn package, the RandomForestRegressor was used with “n_estimators=1000, criterion='mse'” parameters and LinearRegression with default parameters. For the classification task, the continuous features were divided into two classes: the top and bottom quartiles. From the scikit-learn package, the RandomForestClassifier function was used with “n_estimators=1000” parameter.

[0288] RF classification and regression on both species-level taxonomic relative abundance and functional potential profiles were used. For taxonomic abundances, the relative abundances of MetaPhlAn2 (see above) were used with all the abundances of all microbial clades from phylum to species normalized using the arcsin-sqrt transformation for compositional data. For functional profiles, both raw relative abundance estimates of single microbial gene families as well as pathway-level relative abundance as provided by HUMAnN2 were considered.
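
The transformation and the regression step might look like the sketch below. The function names are illustrative. The calibration step reflects one reading of the description (a linear model fitted on the RF's training-fold predictions and then applied to the test-fold predictions), which is an assumption. The `criterion` argument is left at its default, which is the squared-error criterion that older scikit-learn releases named 'mse'.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def arcsin_sqrt(relative_abundance):
    """Arcsin-sqrt transform of abundances expressed as proportions in [0, 1].

    Profiles reported as percentages would need dividing by 100 first.
    """
    values = np.asarray(relative_abundance, dtype=float)
    return np.arcsin(np.sqrt(values))


def fit_predict_calibrated(X_train, y_train, X_test,
                           n_estimators=1000, random_state=0):
    """Train an RF regressor and linearly calibrate its test predictions.

    Assumption: calibration maps RF predictions on the training fold to
    the observed training values, then rescales the test predictions.
    """
    forest = RandomForestRegressor(
        n_estimators=n_estimators, random_state=random_state
    )
    forest.fit(X_train, y_train)

    calibrator = LinearRegression()
    calibrator.fit(forest.predict(X_train).reshape(-1, 1), y_train)
    return calibrator.predict(forest.predict(X_test).reshape(-1, 1))
```

For the classification task, the same transformed features would be passed to `RandomForestClassifier(n_estimators=1000)` after keeping only participants in the top and bottom quartiles of the target.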

[0289] As an additional control, it was verified that when the target labels (classification) or values (regression) were randomly swapped, performance reflected random prediction: an AUC very close to 0.5 for classification, and a non-significant correlation approaching 0 between predicted and actual values for regression.

[0290] Statistical analysis

[0291] Spearman’s correlations (reported with “ρ” in the text) and partial correlations have been computed using cor.test from the stats R package and a modified version of pcor.test from the ppcor package (available online at yilab.gatech.edu/pcor.R) that allows controlling for a set of covariates rather than a single one, respectively. Correlations and p-values were computed for each pair of metadata and species, and p-values were corrected using FDR through the Benjamini-Hochberg procedure; the corrected values are reported in the text as q-values. Correlations with q<0.2 were considered significant. Significant species have been selected by ranking them according to their number of significant associations for the panel of metadata considered, and the top thirty unique species were then considered for each panel of metadata. In the heatmaps for partial correlations, the asterisk indicates that the correlation index for the corresponding species-metadata pair is significant at FDR<0.2.
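
The Benjamini-Hochberg correction can be written out directly. The sketch below is a plain Python illustration rather than the R code used in the study; the function name is an assumption, and the p-values would come from the correlation tests described above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input.

    Each p-value is scaled by (number of tests / rank), then made
    monotone by taking the running minimum from the largest p-value down.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values
```

Species-metadata pairs would then be retained when their q-value is below the 0.2 threshold, for example with `[i for i, q in enumerate(q_values) if q < 0.2]`.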

[0292] The contribution of metadata variables to microbiota community variation was determined by distance-based redundancy analysis (dbRDA) on species-level Bray-Curtis dissimilarity and Aitchison distance with the capscale function in the vegan R package. Correction for multiple testing (Benjamini-Hochberg, FDR) was applied and significance was defined at FDR <0.1. The cumulative contribution of metadata variables or metadata categories was determined by forward model selection on dbRDA (stepwise dbRDA) with the ordiR2step function in vegan, with variables that showed a significant contribution to microbiota community variation in the previous step. Only metadata variables with <15% missing data and without high collinearity with other variables (Spearman’s rho <0.8) were used as input in the stepwise model.

[0293] Data validation on the US cohort and on the cMD datasets

[0294] As independent validation, the publicly available datasets collected in the curatedMetagenomicData version 1.16.0 R package (cMD; Pasolli et al., Nat. Methods 14, 1023-1024, 2017) were considered. Of the 57 datasets available, those that have samples with the following characteristics were selected: (1) gut samples collected from healthy adult individuals at first collection (“days_from_first_collection”=0 or NA), (2) samples with age and BMI data available and BMI interquartile range (IQR) of these samples between 3.5 and 7.5 (± 2 with respect to the PREDICT 1 UK IQR of 5.5, FIG. 10). Among the datasets with samples meeting the above criteria, only those with at least 50 such samples were considered: CosteaPI_2017 (84 samples out of 279), DhakanDB_2019 (88 samples out of 110), HansenLBS_2018 (58 samples out of 208), JieZ_2017 (157 samples out of 385), SchirmerM_2016 (396 samples out of 471), and ZellerG_2014 (59 samples out of 199).

[0295] The previously selected validation datasets from cMD were used in two analyses: one based on machine learning to verify the reproducibility of the ML model trained using the PREDICT 1 UK samples, and the second to verify the species-level correlations found in the PREDICT 1 UK cohort. For the first task, a regression algorithm was applied to predict BMI and age. Three different cross-validation approaches were used. First, each dataset was used independently in 100 bootstrap iterations with an 80/20 random split of training and testing folds. Second, one more iteration was performed using the PREDICT 1 UK dataset as training fold and each dataset as testing fold. Third, a final prediction was made using Leave-One-Dataset-Out cross-validation (LODO), meaning that all datasets (PREDICT 1 UK, PREDICT 1 US, and the cMD datasets) were considered together and each validation dataset was successively used as the test fold while all others were used for training. An additional validation using the cMD datasets was performed by applying a pairwise Spearman correlation for each species in each cMD dataset against BMI and age. For each correlation, the top associated species were selected in PREDICT 1 UK (FDR q<=0.05) and their correlation was reported in cMD. For those species also found in the PREDICT 1 US, their correlation was reported as well.
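
The Leave-One-Dataset-Out scheme reduces to generating one train/test partition per held-out dataset. The sketch below uses hypothetical names and plain index lists, so that any regressor can be fitted on the training indices and evaluated on the test indices.

```python
def leave_one_dataset_out(dataset_labels, held_out_datasets):
    """Yield (held_out, train_indices, test_indices) for LODO validation.

    dataset_labels: dataset name for each sample, in sample order
    held_out_datasets: datasets that each take a turn as the test fold
    All samples from every other dataset form the training fold.
    """
    for held_out in held_out_datasets:
        test_indices = [
            i for i, label in enumerate(dataset_labels) if label == held_out
        ]
        train_indices = [
            i for i, label in enumerate(dataset_labels) if label != held_out
        ]
        yield held_out, train_indices, test_indices
```

scikit-learn's `LeaveOneGroupOut` splitter gives equivalent partitions when every dataset takes a turn as the test fold; the explicit version above makes it easy to restrict the held-out turns to the validation datasets only.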

[0296] RESULTS AND DISCUSSION

[0297] Large metagenomically-profiled cohorts with rich clinical, cardiometabolic, and dietary information

[0298] A multi-national, single-arm (pre-post) intervention study of diet-microbiome-cardiometabolic interactions was performed, including a discovery cohort based in the United Kingdom (UK) and a validation population in the United States (US). The UK cohort recruited 1,002 generally healthy adults (non-twins, identical [monozygotic; MZ] and non-identical [dizygotic; DZ] twins), with detailed demographic information, quantitative habitual diet data, cardiometabolic blood biomarkers, and assessed postprandial responses to standardized test meals both in the clinic and in a free-living setting (Berry et al., Protocol Exchange, 2020; FIG. 9A). At-home collection of stool by the validated protocol (Methods) yielded 1,001 baseline samples for gut microbiome analysis. The US population employed the same enrollment and biospecimen collection protocols for 100 healthy, unrelated individuals (97 stool samples received). In total, baseline stool samples were available from 1,098 PREDICT 1 participants (UK n=1,001; US n=97). The data from the US cohort was analyzed separately from the UK data to test the machine learning models trained in the UK cohort and independently validate microbiome-feature correlations. From a randomly selected subset of UK participants (n=70), fecal metagenomes were additionally sequenced from a second stool sample collected 14 days after the first collection (FIG. 9A) for a total of 1,168 metagenomes. All metagenomes were shotgun sequenced, taxonomically and functionally profiled, and assembled to provide metagenome-assembled genomes (MAGs). Computational analysis was performed using the bioBakery suite of tools (McIver et al., Bioinformatics 34, 1235-1237, 2018) to obtain species-level microbial abundances for the 769 taxa identified using an updated version of MetaPhlAn2 (Truong et al., Nat. Methods 12, 902-903, 2015), functional potential profiling of >1.91 M microbial gene families and 445 KEGG pathways with HUMAnN2 (Franzosa et al., Nat. Methods 15, 962-968, 2018), and reconstruction of 48,181 MAGs of medium or high quality using the validated pipeline (Pasolli et al., Cell 176, 649-662.e20, 2019), which includes assembly with MEGAHIT (Li et al., Bioinformatics 31, 1674-1676, 2015), binning with MetaBAT2 (Kang et al., PeerJ 7, e7359, 2019), and quality control with CheckM (Parks et al., Genome Res. 25, 1043-1055, 2015). Collectively, these UK and US-based results constitute the PREDICT 1 study.

[0299] Microbial diversity and composition are linked with diet and fasting and postprandial biomarkers

[0300] A unique subpopulation of the study, comprising 480 twins, was first leveraged to disentangle the confounding effects of shared genetics from other factors on microbiome composition. The data confirmed that host genetics influences microbiome composition only to a small extent (Xie et al., Cell Syst. 3, 572-584.e3, 2016), as intra-twin pair microbiome similarities were significantly greater than those among unrelated individuals (p<1e-12, FIG. 11B), and monozygotic twins showed slightly more similar microbiomes than dizygotic twins (p=0.06). Intra twin-pair microbiome similarity, regardless of zygosity, remained substantially lower than intra-subject longitudinal sampling (day 0 vs. day 14, p<1e-12, FIG. 11B), a testament to the highly personalized nature of the gut microbiome attributable to a variable extent to non-genetic factors (FIGs. 11C, 11D).

[0301] The overall intra-sample (alpha) diversity of the gut microbiome as a broad summary statistic of microbiome structure (Ravel et al., Proc. Natl. Acad. Sci. U.S.A. 108 Suppl 1, 4680-4687, 2011) was investigated. In the cohort of healthy individuals, links were found between alpha diversity (specifically species richness) and personal characteristics (e.g. age and anthropometry), habitual diet, and metabolic indices (FIG. 9B) with 109 significant associations (p<0.05) among the total 295 Spearman’s correlation tests, and 56 after FDR-correction (q<0.05). Participant BMI, absorptiometry-based visceral fat measurements, and probability of fatty liver (using a validated prediction model; Atabaki-Pasdar et al., Genetic and Genomic Medicine, doi:10.1101/2020.02.10.20021147, 2020) were inversely associated with species richness. Consistent with previous findings for BMI (Le Chatelier et al., Nature 500, 541-546, 2013; Turnbaugh et al., Nature 457, 480-484, 2009), the findings suggest that the link between the microbiome and body habitus may be mediated in part by hepatic insulin resistance, particularly given the gut microbiome’s strong association with liver disease and activity observed in this cohort and previously (Qin et al., Nature 513, 59-64, 2014). With respect to habitual dietary factors, 18 of 126 correlations were nominally significant (p<0.05; 5 at q<0.05, FIG. 9B).

[0302] Among clinical circulating measures, HDL cholesterol (HDL-C) was positively correlated with species richness. However, emerging cardiometabolic biomarkers with strong associations with cardiometabolic diseases (Würtz et al., Circulation 131, 774-785, 2015; Ahola-Olli et al., Diabetologia 62, 2298-2309, 2019; Vojinovic et al., Nat. Commun. 10, 5813, 2019; Duprez et al., Clin. Chem. 62, 1020-1031, 2016) that are not routinely used clinically, including lipoprotein particle size (diameter, “-D”), lipoprotein composition (cholesterol “-C” and TG “-TG”), apo-lipoproteins and GlycA (inflammatory biomarker; glycoprotein acetyls), were even more strongly associated with richness than the remaining traditional clinical measures (TG, Total-C, LDL-C and fasting glucose). LDL stands for low density lipoprotein and VLDL stands for very low density lipoprotein. These emerging biomarkers of reduced risk of chronic disease were positively associated with microbial diversity (e.g., extra-large and large HDL-C, HDL-D, Apolipoprotein-A1) both at fasting and postprandially, whilst those associated with increased risk of chronic disease were inversely correlated with microbial diversity (e.g. GlycA, VLDL-D, small-HDL-TG). These results for species richness provide initial evidence that the microbiome is modestly, but significantly, associated with some key classical and emerging cardiometabolic health indicators and diet, motivating more detailed investigations of the links between cardiometabolic health, diet, and specific gut microbiome components.

[0303] Diversity of healthy plant-based foods in habitual diet shapes gut microbiome composition

[0304] Links between habitual diet (over the past year) and the microbiome in PREDICT 1 using detailed, validated semi-quantitative food frequency questionnaires (FFQs) were assessed. These links were quantified using random forest (RF) regression and classification models, each trained on the whole set of quantitative microbiome features to predict one habitual diet feature (with training/testing via repeated bootstrapping, Methods). The performance of the models was evaluated with receiver operating characteristic (ROC) AUCs for classification and with correlation between predicted and collected values for regression, thus quantifying the degree to which each dietary feature could be estimated based on microbiome composition.

[0305] Dietary features assessed in this manner included individual food items, food groups, nutrients (energy adjusted and non-adjusted), and dietary patterns (FIGs. 12A-12F). Individual foods and food groups were assessed, the latter after collapsing items into bins according to Plant-based Diet Index (PDI; Satija et al., PLoS Med. 13, e1002039, 2016) groupings (Table 1). Several foods and food groups exceeded 0.15 median Spearman’s correlation over bootstrap folds (denoted as “ρ”) between predicted and FFQ-estimated values (20/165 or 12.1%) and AUC>0.65 (14/165, 8.5%; FIG. 12A). The strongest association among food items was coffee (ρ=0.45), which appeared to be dose-dependent (FIG. 12B) and validated in the US cohort when the model trained in the UK cohort was applied in the US. Particularly tight coupling was found between energy-adjusted derived nutrients and the taxonomic composition of the microbiome, especially compared to foods and food groups (FIG. 12A). Almost one-third of the energy-normalized nutrients (Table 1) had correlations above 0.3 (14/47) with the highest correlations achieved for saturated fatty acids (SFAs, ρ=0.46, AUC 0.82), zinc (ρ=0.39, AUC 0.76), and starch (ρ=0.39, AUC 0.75).

[0306] Because of the complex and interacting nature of dietary intake, as well as to offer practical recommendations, constituent foods and food groups were summarized into several established dietary indices (Table 1), including the Healthy Food Diversity index (HFD; Vadiveloo et al., Br. J. Nutr. 112, 1562-1574, 2014), the Healthy and Unhealthy Plant-based Dietary Indices (H-PDI and U-PDI), and the Alternate Mediterranean Diet score (aMED; Fung et al., Am. J. Clin. Nutr. 82, 163-173, 2005). The HFD, unlike the other food scores, incorporates a measure of dietary diversity (greater is considered better) and food quality according to dietary guidelines, whereas the PDI characterizes a given diet on the basis of type and quantity of the plant-based foods categorized as ‘more-healthy/healthy’ or ‘less-healthy/unhealthy’ based on epidemiological evidence (Satija et al., PLoS Med. 13, e1002039, 2016). These scores have been associated with lower cardiovascular disease risk, type 2 diabetes (T2D) risk (Satija et al., PLoS Med. 13, e1002039, 2016), metabolic syndrome (Vadiveloo et al., J. Nutr. 145, 564-571, 2015), and all-cause mortality (Kim et al., J. Am. Heart Assoc. 8, e012865, 2019). The aMED dietary score is based on dietary patterns in Mediterranean countries and has been associated with reduced risk of chronic disease and mortality (Reedy et al., J. Nutr. 144, 881-889, 2014; Mitrou et al., Arch. Intern. Med. 167, 2461-2468, 2007). Tight correlations were demonstrated between values predicted from gut microbial composition and all the indices (HFD, H-PDI, U-PDI, and aMED) in the UK (ρ=0.36, 0.34, 0.33, and 0.23, respectively) and in the US validation cohort (ρ=0.39, 0.23, 0.31, and 0.38, respectively; FIG. 12A and FIGs. 13A-13C), highlighting the relationship between the microbiome and healthy dietary patterns. Additionally, these results indicate that diet-microbiome associations are consistent and generalizable from UK to US populations, adding confidence to the suggested biological targets explored below and alleviating concerns of overfitting.

[0307] Microbial species segregate into groups associated with more healthy and less healthy plant- and animal-based foods

[0308] Feature-level testing to identify the specific microbial taxa most responsible for these diet-based community associations (FIG. 12F) was undertaken. By focusing on prevalent species (i.e., those detected in >20% of samples) and adjusting for age and BMI, 30 species (17%) were found to be significantly correlated with at least five defined dietary exposures at False Discovery Rate (FDR) q<0.2 (Table 3). This included a confirmation of expected associations (FIGs. 14A, 14B), such as the relative enrichment of the probiotic taxa Bifidobacterium animalis (Redondo-Useros et al., Nutrients 11, 2019) and Streptococcus thermophilus with greater full-fat yogurt consumption (ρ=0.22 and 0.20 respectively). The strongest food/microbe association was between the recently characterized butyrate-producing Lawsonibacter asaccharolyticus (Sakamoto et al., Int. J. Syst. Evol. Microbiol. 68, 2074-2081, 2018) and coffee consumption (FIG. 12F).

[0309] However, due to the low precision of dietary data collected by FFQ, the complexity of dietary patterns, nutrient-nutrient interactions, and clustering of ‘healthy/less-healthy’ food items within diets, it is challenging to disentangle the independent associations of single nutrients and single foods with microbial species. Indeed, considering the top 30 species most strongly associated with various dietary determinants (based on number of significant correlations; FIG. 12F), a clear segregation of species into two distinct clusters was found with either more healthy plant-based foods (e.g. spinach, seeds, tomatoes, broccoli) or with less healthy plant-based (e.g. juices, sweetened beverages, and refined grains) and animal-based foods, as defined by the PDI (Satija et al., J. Am. Coll. Cardiol. 70, 411-422, 2017; Table 3).

[0310] Taxa linked to diets rich in more healthy plant-based foods (FIGs. 12F, 12E and FIGs. 14A, 14B) mostly included butyrate producers, such as Roseburia hominis, Agathobaculum butyriciproducens, Faecalibacterium prausnitzii, and Anaerostipes hadrus, as well as other uncultivated species from clades typically capable of butyrate production (Roseburia CAG 182) or predicted to have this metabolic capability (Firmicutes CAG 95, with 92% of its 166 MAGs encoding for butyrate kinases). Clades correlating with several ‘less-healthy’ plant-based and animal-based foods included several Clostridium species (Clostridium innocuum, C. symbiosum, C. spiroforme, C. leptum, C. saccharolyticum). The relationship between C. leptum and the intake of unhealthy foods is particularly worth noting, as prior experimental evidence has demonstrated that its counts can be modulated by diet in mice (Eslinger et al., Nutr. Res. 34, 714-722, 2014). The segregation of species according to animal-based ‘healthy’ foods (e.g. eggs, white and oily fish) or animal-based ‘less-healthy’ foods (e.g. meat pies, bacon and dairy desserts), using a novel categorization developed for this analysis based on epidemiological evidence outlined in Methods, was also distinct and was similar to taxa linked to patterns for ‘healthy’ and ‘less-healthy’ plant foods (FIG. 12E and FIGs. 14A, 14B). The few food items that did not fit into the ‘healthy’ cluster despite being categorized as ‘healthy plant’ foods were (ultra) processed foods according to the NOVA classification (Monteiro et al., Public Health Nutr. 21:5-17, 2018; e.g. sauces, tomato ketchup, and baked beans; Group 4 and 3, respectively; FIGs. 14A, 14B). This emphasizes the importance of food quality (e.g. highly processed vs. unprocessed), food source (e.g. plant vs. animal), and food heterogeneity (i.e. not all plant foods are healthy and animal foods unhealthy, nor vice versa) both in overall health and in microbiome ecology.

[0311] Poorly characterized microbes drive the strongest microbiome-habitual diet associations

[0312] Many of the strongest microbial associations with food items, food groups, and dietary indices occurred with only recently isolated organisms or still uncultured taxa including, for example, five species defined using co-abundance gene groups (CAGs) from metagenomics (Nielsen et al., Nat. Biotechnol. 32, 822-828, 2014). Among indices, the HFD, which prioritizes diversity of all food items while considering dietary guidelines, was most tightly coupled to feature-level abundances (FIG. 12A) and was significantly correlated with 41 of the 174 prevalent species (i.e. those found in >20% of samples), highlighting the synergistic impact of dietary diversity, dietary quality, and gut microbial responsiveness. Among species whose abundance was highly correlated to the HFD (FIG. 12F) were taxa also associated with ‘healthy’ or ‘less-healthy’ foods, such as Firmicutes CAG 94 (ρ=-0.25) and Roseburia CAG 182 (ρ=0.13). The highest correlation was observed for Lawsonibacter asaccharolyticus (ρ=-0.29), the aforementioned and recently characterized (Sakamoto et al., Int. J. Syst. Evol. Microbiol. 68, 2074-2081, 2018) and sequenced species (Sakamoto et al., Genome Announc. 6, 2018). This microbe has two additional known genomes with the conflicting species name of Clostridium phoceensis (Hosny et al., New Microbes New Infect 14, 85-92, 2016), and it is predicted from metagenome-assembled genomes to encode butyrate-producing enzymes (Pasolli et al., Cell 176, 649-662.e20, 2019; 49 of the 53 MAGs in the L. asaccharolyticus SGB15154 encode butyrate kinase EC 2.7.2.7). The link between the HFD and L. asaccharolyticus is particularly noteworthy and not likely a consequence of the previously observed association with coffee, as the HFD index does not include non-caloric beverages (including coffee, mineral water, and tea) or alcoholic beverages. This may suggest alternative and complementary strategies to modulate this microbe through both coffee intake and adherence to a diverse diet.
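By way of illustration only, the prevalence filter (species found in >20% of samples) and a rank correlation between species abundance and a dietary index can be sketched as follows. This is a minimal sketch and not the implementation used in the study: the function names `prevalent_species`, `tie_ranks`, and `spearman` are hypothetical, detection is assumed to mean a non-zero relative abundance, and the correlation is assumed to be Spearman's, computed here with tied values given their average rank.

```python
from statistics import mean


def prevalent_species(abundance, min_prevalence=0.20):
    """Keep species detected (abundance > 0) in more than min_prevalence of samples.

    abundance maps a species name to a list of per-sample relative abundances.
    """
    kept = {}
    for species, values in abundance.items():
        prevalence = sum(v > 0 for v in values) / len(values)
        if prevalence > min_prevalence:
            kept[species] = values
    return kept


def tie_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (inputs must not be constant)."""
    rx, ry = tie_ranks(x), tie_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In such a sketch, each retained species would be correlated in turn with the index of interest (for example the HFD), and the resulting p-values would then be corrected for multiple testing.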

[0313] Among other dietary indices and nutrients, general concordance with the two sets of microbes associated with healthy and less-healthy foods was observed. A greater animal-based food score, which is derived based on the relative amount of ‘healthy’ (positive score) and ‘less-healthy’ (inverse score) animal foods consumed (Table 3), was associated with the 'healthy' cluster, suggesting that a diet rich in healthier animal-based foods is associated with the more favorable diet-microbiome signature, although this likely also reflects an overall healthier dietary pattern by healthy animal-based food consumers. The healthy and unhealthy PDI, which have been shown to differentially affect disease risk (Satija et al., PLoS Med. 13, e1002039, 2016; Satija et al., J. Am. Coll. Cardiol. 70, 411- 422, 2017) also had distinct clusters, again emphasizing the oversimplification of conventional plant and animal-based food groupings. The strongest representatives for the two clusters (i.e. taxa with the highest correlations) are Firmicutes CAG 95 and Firmicutes CAG 94 for healthy and unhealthy diet, respectively, and the lack of cultivated representatives for these two candidate species may explain why these links were previously overlooked even in large analyses (Zeevi et al., Cell 163, 1079-1094, 2015; Zhernakova et al., Science 352, 565-569, 2016). The PREDICT 1 validation cohort in the US generally confirmed these associations despite its comparatively smaller sample size: among the subset of derived pattern/index scores shared between the UK and US cohorts, of the 52 associations that were significant both in the UK cohort (FDR q<0.2) and in the US cohort (p<0.05), 78.8% were concordant for the direction of the correlation.
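As an illustration of the directional concordance reported above, the following minimal sketch computes the percentage of associations whose correlation has the same sign in two cohorts. The function name `sign_concordance` and the input layout (pairs of UK and US correlation coefficients, already restricted to associations significant in both cohorts) are assumptions made for this example only.

```python
def sign_concordance(pairs):
    """Percentage of (rho_uk, rho_us) pairs whose correlations share a sign.

    pairs is assumed to contain only associations that passed the
    significance filters in both cohorts; zero correlations count as discordant.
    """
    same = sum(1 for rho_uk, rho_us in pairs if rho_uk * rho_us > 0)
    return 100.0 * same / len(pairs)
```

For reference, a concordance of 78.8% among 52 associations corresponds to 41 concordant pairs (41/52 ≈ 0.788).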

[0314] Microbial indicators of obesity are reproducible across varied populations

[0315] Microbiome links to obesity have attracted much interest although results have varied in human populations (Le Chatelier et al., Nature 500, 541-546, 2013; Sze & Schloss, MBio 7, 2016). They were explored in the PREDICT 1 populations with RF regression and classification (as above, Methods) using either taxonomic or functional features. Visceral fat measured by DEXA scan was found to be more strongly linked to gut microbial composition than BMI (Beaumont et al., Genome Biol. 17, 189, 2016), a finding validated in the US participants when applying UK-trained models (FIG. 15A). Some obesity-associated taxa - assessed either by BMI or visceral fat - were also associated with poor dietary patterns after controlling for BMI (e.g. Clostridium CAG 58, Flavonifractor plautii), whereas markers of healthier low visceral fat mass (e.g. Faecalibacterium prausnitzii) were more strongly linked to healthier foods and patterns of intake, illustrating that diet and obesity signatures overlap but are not identical (FIG. 15B).

[0316] Microbiome models to predict BMI developed and trained on the UK-based cohort were validated not only in the PREDICT US cohort, but also in six additional independent datasets (Schirmer et al., Cell 167, 1897, 2016; Zeller et al., Mol. Syst. Biol. 10, 2014; Hansen et al., Nat. Commun. 9, 4630, 2018; Costea et al., Mol. Syst. Biol. 13, 960, 2017; Jie et al., Nat. Commun. 8, 845, 2017; Dhakan et al., Gigascience 8, 2019) that have been uniformly pre-processed and harmonized using curatedMetagenomicData (Pasolli et al., Nat. Methods 14, 1023-1024, 2017; cMD), lending credence and generalizability to the findings. Despite substantial differences (Falony et al., Science, 352(6285): 560-4, 2016; Truong et al., Genome Res. 27, 626-638, 2017) in the microbiomes among people from different populations, the PREDICT 1 UK model improved cohort-specific cross-validation accuracy in the majority of cases, on par with the leave-one-out approach that notably also includes the UK cohort (FIG. 15D). Interestingly, BMI was not predictable at all for two included datasets when using just their own samples. However, predictions and classification improved when using the PREDICT 1 UK model. Of the 17 species surpassing the FDR threshold of q<0.05, three had an (absolute) p>0.1 in the smaller US cohort and two of these three were concordant with those in the UK cohort (/. butyriciproducens negatively and R. torques positively correlated with BMI; FIG. 15C). Across the harmonized independent cMD datasets, all but two median association estimates were consistent with the PREDICT 1 UK signatures, and 12 of the 14 were concordant despite different sample collection and DNA extraction methods.

[0317] Fasting cardiometabolic markers associated with specific microbiome structures

[0318] To explore the connections between the gut microbiome and markers of cardiometabolic health, fine-scale evaluations of microbial community membership and their biochemical functions against established clinical and emerging cardiometabolic biomarkers were performed. ML prediction models were built for each of these outcomes using both species-level taxonomic abundances and functional potential profiles, and were tested for how accurately they estimated host biomarkers.

[0319] Modest concordance was found between microbiome classifiers and several traditional clinical fasting cardiometabolic biomarkers (FIG. 16A). These include near-term metrics, such as systolic and diastolic blood pressure, heart rate, lipids (TG, TC, HDL-C, LDL-C) and fasting glucose, as well as glycosylated hemoglobin (HbA1c), a widely-used clinical test reflecting mean glucose levels over weeks-to-months. Notably, the difference between total and high-density lipoprotein (HDL) cholesterol (i.e. non-HDL), recently considered a clinically useful aggregate count of atherogenic cholesterol fractions (Cui et al., Arch. Intern. Med. 161, 1413-1419, 2001), was also linked to gut microbial features (ρ=0.17; AUC 0.61). These associations were largely recapitulated in a clinical prediction model incorporating most of these factors to estimate latent 10-year risk of heart disease or stroke using the Atherosclerotic Cardiovascular Disease (ASCVD) algorithm (D’Agostino et al., Circulation 117, 743-753, 2008).

[0320] From the remaining compendium of blood biomarkers (FIG. 9A), stronger correlations were found between the microbiome and an inflammatory surrogate (glycoprotein acetyls, GlycA, FIG. 16A), as well as various emerging lipid measures linked to host health, such as HDL and VLDL particle size (HDL-D and VLDL-D, ρ=0.3 and 0.28 respectively), the lipid content of lipoprotein subfractions (including XL-HDL-L and L-HDL-L, ρ=0.39 and 0.37 respectively), and circulating omega-6 [FAω6/FA] and polyunsaturated fatty acid (PUFA) [PUFA/FA] to total fatty acid ratios (ρ=0.31 for both). GlycA (Duprez et al., Clin. Chem. 62, 1020-1031, 2016) and VLDL-D have been strongly associated with increased risk for the metabolic syndrome, CVD, and T2D, whereas HDL-D and its lipid constituents, omega-6, and PUFA have strong inverse associations (Würtz et al., Circulation 131, 774-785, 2015; Ahola-Olli et al., Diabetologia 62, 2298-2309, 2019; Kettunen et al., Circ Genom Precis Med 11, e002234, 2018). The strongest association for all circulating markers was observed for large HDL particle lipid concentrations (XL-HDL-L and L-HDL-L, with ρ=0.41 and 0.38, and AUC=0.70 and 0.69, respectively), which also have the strongest inverse association with CVD and T2D of all the lipid measures (Würtz et al., Circulation 131, 774-785, 2015; Ahola-Olli et al., Diabetologia 62, 2298-2309, 2019; Kettunen et al., Circ Genom Precis Med 11, e002234, 2018). Similarly, the majority of glycemic indicators such as insulin, C-peptide (a surrogate of insulin secretion), and to a much lesser extent, impaired glucose tolerance (IGT) were also coupled to human gut microbiome composition (FIG. 16A). Derived predictors of insulin sensitivity (Quantitative Insulin sensitivity Check Index or QUICKI; Hrebicek et al., J. Clin. Endocrinol. Metab. 87, 144-147, 2002) and hepatic steatosis (Liver Fat Probability) were also reasonably captured using microbiome-based ML classifiers (ρ=0.22 and 0.18; AUC 0.66 and 0.64 respectively).

[0321] Species-based predictors proved more accurate for RF-based learning tasks than pathway abundance profiles (FIG. 17), consistent with other microbiome-wide training exercises (Thomas et al., Nat Med 25,667-678, 2019). Despite a smaller study population and a more restricted panel of fasting circulating metabolites, the primary findings were generally replicated in the US validation cohort (FIG. 16A), corroborating the existence of a strong, previously overlooked link between the gut microbiome and surrogate markers of cardiometabolic health.

[0322] The gut microbiome is a better predictor of postprandial triglycerides and insulin concentrations than of glucose levels

[0323] Fasting blood assays are the standard for most research and clinical investigations; however, in free-living conditions, individuals consume multiple meals throughout the day and therefore spend most of their waking hours in the postprandial state. Mixed nutrient meals (carbohydrate, fat and protein) result in person-specific food-induced elevations in triglycerides (TG), glucose, insulin, and other related metabolites, impacting personalized cardiometabolic responses and downstream health outcomes. Whilst prior efforts have demonstrated that postprandial glucose responses may, in part, be predicted by the gut microbiome (Zeevi et al., Cell 163, 1079-1094, 2015), the relationship between the microbiome and ‘real-life’ variations in both postprandial lipid and glucose-mediated metabolites has not been explored. Postprandial metabolic responses to foods of varying nutrient composition were therefore assessed in the clinic and free-living settings by considering the overall magnitude of the response by iAUC, as well as its peak concentrations, and its change from fasting (i.e. rise).

[0324] Firstly, postprandial TGs, glucose, C-peptide, insulin, and circulating metabolite concentrations were measured at regular intervals (0-6h) in the clinic after the administration of two formulated, sequential test meals (890 kcal, 50g fat and 85g carb at 0h [breakfast] and 500 kcal, 22g fat and 71g carb at 4h [lunch]; FIGs. 16B, 16C). Notably, it was found that the magnitude of postprandial TG (0-6h iAUC), insulin, and C-peptide (both 0-2h iAUC) responses were more strongly associated with the gut microbiome (ρ=0.15, 0.19, and 0.21, respectively; AUC >0.63 for each) compared with postprandial glucose (0-2h iAUC) responses (ρ=0.12 and AUC 0.59, FIG. 16B), findings replicated in the US validation cohort (FIG. 16B).
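The three response summaries named above (iAUC, peak, and rise from fasting) can be illustrated with the following minimal sketch. It is not the computation used in the study: the function names are hypothetical, the first measurement is assumed to be the fasting baseline, the iAUC is approximated with the trapezoidal rule after setting values below baseline to zero (one of several conventions in use), and the rise is taken here as peak minus fasting.

```python
def incremental_auc(times, values):
    """Trapezoidal area of the response above the fasting baseline (values[0]).

    Values below baseline are clipped to zero before integration, which is a
    simplification; it does not interpolate the exact baseline crossing point.
    """
    baseline = values[0]
    increments = [max(v - baseline, 0.0) for v in values]
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (increments[i - 1] + increments[i]) * width
    return area


def peak_and_rise(values):
    """Return (peak concentration, peak minus fasting value)."""
    peak = max(values)
    return peak, peak - values[0]
```

For a hypothetical glucose series of 5, 7, and 5 mmol/L at 0, 1, and 2 hours, this gives an iAUC of 2.0 mmol·h/L, a peak of 7, and a rise of 2.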

[0325] Following the in-person clinic day, glucose concentrations were also measured via continuous glucose monitoring over the subsequent 13-day at-home period (Berry et al., Protocol Exchange, 2020) that included responses to isocaloric standardized meals, in duplicate, with different macronutrient compositions (fat, carbohydrate, protein and fiber; Table 2). However, contrary to the clinic meal responses (FIG. 16B) and previous work (Zeevi et al., Cell 163, 1079-1094, 2015), the glucose 0-2h iAUCs following these meals did not achieve high correlations with the microbiome regardless of their macronutrient composition (all ρ<0.11 and AUC<0.58, FIG. 16C). Whilst this may be due to the lower energy, fat, and carbohydrate dose in at-home isocaloric meals (500 kcal) compared to the successive clinic meals (total 1,390 kcal for breakfast and lunch), reducing discrimination between interindividual responses, Zeevi et al. (Cell 163, 1079-1094, 2015) found associations using meals of <500 kcal. However, the stool sample in this study was collected within 24h of the metabolic clinic meal(s), whereas the standardized at-home meals were consumed (in random order) between days 2-13 post-home stool collection, introducing additional variability due to short-term fluctuations in microbiome composition (David et al., Nature 505, 559-563, 2014). Taken together, these results suggest that the microbiome is a stronger predictor of postprandial lipemia (TG) than glycaemia, with the strength of association for glycemic responses influenced by overall metabolic load and short-term variations in microbial composition rather than differences in macronutrient composition.

[0326] Postprandial rises in lipid- and glucose-mediated measures are differentially predicted by the microbiome compared with fasting levels

[0327] Postprandial measures (iAUC and peak) depend both on the corresponding fasting measure and the meal-induced rise. Therefore, the prediction accuracy of the gut microbiome was compared across fasting levels, postprandial (peak) total levels, and postprandial rises (FIG. 16H). When looking at lipid and glucose-mediated metabolites from the clinic day measures, despite a similar strength of association between peak (6h), magnitude (iAUC) and fasting TG concentrations, the rise (6-0h) was not similarly correlated (FIGs. 16A, 16E, 16F). In contrast, the microbiome associations with glycemic measures were comparable between fasting, peak, and rise (FIGs. 16A, 16D).

[0328] Of particular interest were the lipoprotein subfraction concentrations, composition, and size (FIGs. 18 and 19), which are remodeled postprandially, resulting in the generation of atherogenic lipoproteins (e.g. large VLDL particles and TG-enriched LDL and HDL particles). These atherogenic particles were predicted at comparable accuracy for both fasting and postprandial peak 6h concentrations (FIGs. 16A, 16F, 16H), and notably, HDL and VLDL size (“-D”, key lipoproteins associated with cardiometabolic risk) achieved modestly stronger correlations (ρ=0.32 and 0.31, respectively) postprandially (FIG. 16F). However, as with TG, the microbiome was substantially less predictive for the postprandial rise (6h - fasting) in all lipid metabolite measures compared with fasting and postprandial 6h peak concentration (FIGs. 16A, 16F, 16H). For example, HDL-D is closely associated with gut microbial composition at fasting and 6h postprandially (ρ=0.30 and 0.32; AUC 0.71 and 0.72 respectively; FIGs. 16A, 16F, 16H), but not with the rise (FIG. 16F).

[0329] These differential associations suggest that the microbiome may influence postprandial lipid-mediated measures via effects on fasting measures but may impact the postprandial glucose rise more independently of fasting levels.

[0330] Distinct microbial signatures discriminate between positive and negative metabolic health indices under fasting conditions

[0331] Motivated by the observed potential of the gut microbiome to predict the fasting and postprandial levels of circulating metabolic markers, the specific taxa and functions driving these associations were next identified. Among three general risk indices of cardiovascular health (ASCVD, liver fat probability, and insulin sensitivity or quantitative insulin-sensitivity check index (QUICKI)), which demonstrated significant although rather modest correlation of predictions (0.2) using the microbiome-wide RF model (FIG. 16A), eight species were found that were significantly correlated with all three (negatively or positively, p<0.05). Seven of these eight were concordantly correlated in the direction of a more healthful metabolic profile (i.e. correlated for greater QUICKI values and lower ASCVD and fatty liver risk), hinting at a global underlying microbial signature of improved metabolic health. These taxa included Flavonifractor plautii and Clostridium innocuum (higher cardiometabolic risk, FIGs. 20A-20C) and Oscillibacter sp 57_20, Haemophilus parainfluenzae, and Eubacterium eligens (lower risk, FIGs. 20A-20C) that had previously been linked with healthy and less-healthy dietary habits.

[0332] Similarly, distinct separations were found between two opposing and clearly defined clusters of species either positively or negatively correlated with fasting cardiometabolic measures (FIG. 20A), including blood pressure, inflammatory markers, lipid concentrations, lipoprotein sizes and fractions, and apolipoproteins (FIGs. 20A, 20B). As per the association with diet, species correlated with positive markers included some taxa generally regarded as healthy (e.g. F. prausnitzii) but also many uncultivated and under-characterized bacteria (7 from the cluster of 18). With the notable exception of three species of Prevotella (P. copri, P. clara, and P. xylaniphila), the positive cluster included many distinct genera, pointing at a large functional richness and diversity. In contrast, the cluster of species negatively correlated with positive markers again included many Clostridium species (5 of the 12 in the cluster) and the recurrent negatively connotated R. gnavus and F. plautii. Large HDL particles (and their lipid compositions, FIGs. 21-23), which have strong inverse associations with cardiometabolic outcomes (Würtz et al., Circulation 131, 774-785, 2015; Ahola-Olli et al., Diabetologia 62, 2298-2309, 2019) as well as with the microbiome (FIG. 16A), were associated with the healthy cluster. Conversely, lipoproteins associated with increased risk of CVD and T2D (VLDL of all sizes; XXL, XL, L, M, S and lipid composition) and atherogenicity (Skeggs et al., J. Lipid Res. 43, 1264-1274, 2002; S-LDL, M-HDL and S-HDL TG) were associated with the less-healthy cluster (FIGs. 21-23).

[0333] Circulating omega-6 and total polyunsaturated fatty acids (PUFA), which reflect dietary intake due to the lack of endogenous production of these fatty acids (Hodson et al., Prog. Lipid Res. 47, 348-380, 2008), were associated with the healthy cluster, for which Firmicutes bacterium CAG95 was the most correlated representative and F. plautii showed the strongest negative correlation (FIG. 20A). Both omega-6 and PUFA have been linked to reduced risk of chronic disease, whether measured from dietary inventories (Li et al., Am. J. Clin. Nutr., doi:10.1093/ajcn/nqz349, 2020) or directly assayed from the circulation (Würtz et al., Circulation 131, 774-785, 2015; Ahola-Olli et al., Diabetologia 62, 2298-2309, 2019; Marklund et al., Circulation 139, 2422-2436, 2019). In contrast, circulating monounsaturated fatty acids (MUFA) in blood were associated with the unhealthy cluster, with an under-characterized Oscillibacter species (sp. 57_20) and Clostridium bolteae responsible for the strongest negative and positive associations respectively. Measures of circulating MUFA but not dietary intake of MUFA (Chowdhury et al., Ann. Intern. Med. 160, 398-406, 2014; Zong et al., BMJ 355, i5796, 2016) have been associated with increased risk of CVD and T2D. Differences in circulating vs. estimated dietary intakes of MUFA may be a function of endogenous MUFA production, as well as the divergent animal and plant dietary sources of MUFA (Wu et al., Nat. Rev. Cardiol. 16, 581-601, 2019; Zong et al., Am. J. Clin. Nutr. 107, 445-453, 2018), complicating their relationship with chronic health outcomes (Hodson et al., Prog. Lipid Res. 47, 348-380, 2008). Taken together with the findings, these results suggest that food sources of MUFA play an important role in the relationship between MUFA and health.

[0334] Both favorable and unfavorable microbial signatures of metabolic health were maintained under postprandial conditions

[0335] Links between postprandial levels of cardiometabolic and inflammatory measures corresponded with the segregation of healthful vs. detrimental taxa observed under fasting conditions (FIG. 20B and FIGs. 21-23). Notably, fasting and postprandial GlycA, which were found to be highly correlated with postprandial TG concentrations, were strongly linked with the microbiome (62 species significantly correlated at 6 hours and 67 at fasting), substantially exceeding IL-6 (5 and 26 significant postprandial and fasting associations, FIG. 20B). F. plautii and R. gnavus were the two species most correlated with increased inflammation both in fasting and postprandial conditions, whereas H. parainfluenzae and Firmicutes bacterium CAG95 were the strongest associations with reduced GlycA levels. VLDL lipoprotein subfractions (markers of adverse cardiometabolic effects) were also consistently associated with the less-healthy cluster both at fasting and postprandially. Postprandial rises, rather than absolute postprandial levels, were frequently uncoupled from the microbial associations with fasting markers; several positive correlations between microbial species and fasting and peak metabolite measures became negative when correlating the same species with the rise from fasting (and vice versa, FIG. 20D). For example, the rise in total LDL cholesterol and size (-D, FIG. 20B) was differentially associated with clusters compared to fasting levels (especially for T. sanguinis, B. animalis, and R. mucilaginosa). S- and XL-HDL total lipid (-L) and cholesterol (-C) levels also paralleled this behavior (FIGs. 21, 22), possibly reflecting postprandial lipoprotein remodeling and reciprocal exchange of TG and cholesterol between these particles and TG-rich lipoproteins (chylomicrons and VLDL; Cohn, J Can. J. Cardiol. 14 Suppl B, 18B-27B, 1998). In contrast, the associations of the microbial species with absolute fasting and postprandial peak levels were fully consistent (FIG. 20D), again reflecting the close relationship between fasting levels and postprandial responses. The same “favorable” vs. “unfavorable” clustering of microbiome features was observed when analyzing microbial pathways and gene families (FIGs. 24 and 25). This supports the segregation of many taxa, even at the species level (and likely more so among strains), by their underlying biochemical activities in the microbiome. The strengths of microbe-blood marker associations measured using Spearman’s correlation were consistent with the estimated microbe relevance by the random forest model (FIG. 26F). Importantly, these associations were confirmed in the PREDICT 1 US validation cohort; there was a total of 62,366 microbe-index correlations for indices present in both cohorts, and for the 292 that were significant both in the UK cohort (q<0.2) and in the US cohort (p<0.05) the concordance in the sign of the correlation reached 90.8% for the associations in fasting conditions and 91.2% postprandially.

[0336] Prevotella copri diversity and Blastocystis spp. presence are markers of improved postprandial glucose responses

[0337] Some ecologically unusual microbes hypothesized to have population-scale health effects solely based on their presence or absence appeared among the microbial signatures. Among them, Prevotella copri is a frequent and highly abundant inhabitant of the gut (Human Microbiome Project Consortium, Nature 486, 207-214, 2012; Arumugam et al., Nature 473, 174-180, 2011), but its beneficial or detrimental role in human health remains controversial (Cani, Gut 67, 1716-1725, 2018; Ley, Nat. Rev. Gastroenterol. Hepatol. 13, 69-70, 2016). Previous reports have yielded conflicting accounts of P. copri in glucose homeostasis, with some studies suggesting health benefits (Kovatcheva-Datchary et al., Cell Metab. 22, 971-982, 2015; De Vadder et al., Cell Metab. 24, 151-157, 2016) and others suggesting deleterious effects (Pedersen et al., Nature 535, 376-381, 2016), possibly due to subspecies diversity (Tett et al., Cell Host Microbe, doi:10.1016/j.chom.2019.08.018, 2019; De Filippis et al., Cell Host Microbe 25, 444-453.e3, 2019). These data largely find P. copri to be associated with beneficial cardiometabolic markers, being weakly negatively correlated with estimated visceral fat (ρ=-0.09, p=0.009, q=0.098), fasting VLDL-D (ρ=-0.07, p=0.06, q=0.21), and fasting GlycA (ρ=-0.12, p=0.0001, q=0.005) among others (Table 3). While almost no habitual diet foods, nutrients, or scores were associated with P. copri, this bacterium showed a very strong correlation with postprandial increases of several circulating metabolic markers when compared with corresponding absolute fasting or postprandial levels. Postprandial rises in glucose (ρ=-0.12, p<0.0002) and polyunsaturated and omega-6 fatty acids (ρ=0.11 and 0.10, respectively, and p<0.001) were among the top-scoring correlations and were more strongly connected with the microbiome than were corresponding fasting and postprandial levels, in sharp contrast with what was observed for the overall microbiome (FIGs. 16A, 16B), suggesting a potentially unique role for P. copri in host metabolism.

[0338] As P. copri has a relatively low prevalence in Western-lifestyle populations but is highly abundant when present (Tett et al., Cell Host Microbe, doi:10.1016/j.chom.2019.08.018, 2019), the presence of one or more of the subtypes of this species was tested (Tett et al., 2019) to determine whether it is associated with markers of improved glucose metabolism. P. copri is present in the form of at least one of its subtypes in 29.8% of the PREDICT 1 individuals, and significant differences were identified in P. copri carriers including lower C-peptide (-9.2%, p=0.002) (FIG. 27D), insulin (-14%, p=0.006), and lower TG levels (-3.2%, p=0.003) (FIG. 27E) compared to individuals without this species. Similarly, postprandial blood glucose spikes after breakfast were significantly less pronounced in individuals with P. copri (-20.4% glucose iAUC at 2h, p=0.002, FIG. 27C), and visceral fat was significantly lower (-12.5%, p=3E-7, FIG. 21k). Although these observations are only associative, and the direct effect of P. copri on these markers of glucose metabolism is unknown, this positive association further supports that the presence of P. copri in the gut microbiome could be beneficial in glucose homeostasis.
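A presence/absence comparison of this kind can be illustrated with the following minimal sketch. The function names are hypothetical, a carrier is assumed to be any individual with a non-zero abundance for at least one subtype, and the percentage difference is assumed to be computed on group means; the source does not state which summary statistic or significance test was used, so none is implemented here.

```python
from statistics import mean


def is_carrier(subtype_abundances):
    """True if at least one subtype of the species is detected (abundance > 0)."""
    return any(a > 0 for a in subtype_abundances)


def percent_difference(marker, carrier_flags):
    """Relative difference (%) in the mean marker level, carriers vs. non-carriers.

    A negative value means the marker is lower on average in carriers.
    """
    carriers = [m for m, c in zip(marker, carrier_flags) if c]
    others = [m for m, c in zip(marker, carrier_flags) if not c]
    return 100.0 * (mean(carriers) - mean(others)) / mean(others)
```

In practice the difference would be accompanied by a statistical test and, given the associative nature of the data, adjusted for likely confounders.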

[0339] Blastocystis spp. is a unicellular eukaryotic parasite increasingly regarded as a commensal member of the gut microbiome rather than a potential pathogen (Clark et al., Adv. Parasitol. 82, 1-32, 2013; Alfellani et al., Acta Trop. 126, 11-18, 2013; Lukes et al., PLoS Pathog. 11, e1005039, 2015). It shares with P. copri a limited prevalence in Western-lifestyle populations (Beghini et al., ISME J. 11, 2848-2863, 2017) coupled with high relative abundance when present, unique among eukaryotic organisms in the gut to date. By assessing microbiome characteristics in presence or absence of Blastocystis spp., evidence was found that Blastocystis-positive individuals (28.1% in the cohort) also have a favorable glucose homeostasis and lower estimated visceral fat (-14.9% glucose iAUC, -21.7% visceral fat, p<0.01, FIGs. 27A and 27C). The latter confirms that Blastocystis spp. is less prevalent in overweight and obese individuals compared to individuals with BMI in the normal range, as previously shown (Beghini et al., ISME J. 11, 2848-2863, 2017) in multiple cohorts (Le Chatelier et al., Nature 500, 541-546, 2013; Nielsen et al., Nat. Biotechnol. 32, 822-828, 2014; Andersen et al., FEMS Microbiol. Ecol. 91, 2015; Qin et al., Nature 464, 59-65, 2010). Interestingly, the simultaneous presence of P. copri and Blastocystis spp. (12.8% of the individuals) appears to be associated with still healthier metabolic function. Visceral fat is 9.4% lower on average (p=0.028, Table 4) for individuals positive for both P. copri and Blastocystis spp. compared to individuals with only one or the other and 22.6% lower (p=3.3E-7) compared with individuals lacking both. Triglycerides and C-peptide were also consistently lower (although not individually significant, Table 4) when both microbes were present.

[0340] A clear microbial signature of health levels consistent across diet, obesity indicators, and cardiometabolic risks

[0341] In the preceding analyses, a consistent set of microbial species was observed that were strongly linked to (1) foods and food indices reflecting different levels of a “healthy” diet, (2) indicators of obesity and of general health, (3) fasting circulating metabolites connected with cardiometabolic risks, and (4) postprandial responses to food. To test the consistency of such a signature, a representative set of “health” indicators was selected from each of the four categories (diet, personal characteristics, fasting and postprandial biomarkers), and each microbial species was ranked based on its correlation coefficient with each indicator. By averaging the ranks of the association (or inverted ranks for “unhealthy” indicators), remarkable agreement among microbes associated with different positive or negative indicators of health was found (FIG. 28, Table 5).

[0342] In particular, Firmicutes CAG 95 is the uncultivated species with the most beneficial score (average rank 7.14) and ranked within the top 5 correlated species for 13 of the 20 indicators. Of the “health”-associated microbial species only R. hominis (23.76) was already convincingly linked with health in case/control disease investigations (Machiels et al., Gut 63, 1275-1283, 2014), even though others such as F. prausnitzii (Sokol et al., Proc. Natl. Acad. Sci. U. S. A. 105, 16731-16736, 2008) and P. copri were highly ranked (average ranks 31.7 and 37.2 respectively, 18th and 21st best ranks) but not in the top 15. The beneficial signature also included several known species such as E. eligens (16.6) and H. parainfluenzae (6.4) without clear roles in health, and additional species without cultivated representatives such as Roseburia CAG 182 (15.5), Oscillibacter sp 57_20 (13.6), Firmicutes bacterium CAG 170 (20.1), Oscillibacter sp PC13 (24.5), Clostridium sp CAG 167 (24.8), and Ruminococcaceae bacterium D5 (24.8). Species that were conversely consistent with indicators of poor overall health (FIG. 28) included the already discussed set of Clostridia (C. spiroforme - 149.7, C. bolteae CAG 59 - 149.9, C. bolteae - 154.8, Clostridium CAG 58 - 157.5, C. symbiosum - 157.4, C. innocuum - 155.1). The two strongest microbial indicators of poor cardiometabolic and diet-related health were the mucolytic microbe R. gnavus (158.8) and F. plautii (169.1), again previously found to be associated with disease conditions (Hall et al., Genome Med. 9, 103, 2017; Azzouz et al., Ann. Rheum. Dis. 78, 947-956, 2019; Ni et al., Gastroenterology 152, S214, 2017; Valles-Colomer et al., Nat Microbiol 4, 623-632, 2019; Gupta et al., mSystems 4, 2019; Jiang et al., Brain Behav. Immun. 48, 186-194, 2015). Overall, this set of 30 species serves as a marker of overall good or poor general health and dietary patterns in non-diseased human hosts.
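The rank-averaging procedure described in paragraph [0341] can be illustrated with the following minimal sketch. It is offered under stated assumptions rather than as the implementation used: the function name `average_health_ranks`, the indicator names, and the toy correlation values are hypothetical, rank 1 is assumed to denote the most favorable association, inverted ranks for “unhealthy” indicators are obtained by flipping the sign of the correlation, and ties are broken by alphabetical species order.

```python
def average_health_ranks(correlations, unhealthy=()):
    """Average each species' rank across health indicators.

    correlations maps indicator -> {species: correlation coefficient}.
    For indicators listed in `unhealthy`, the correlation sign is flipped so
    that rank 1 always denotes the most favorable association.
    """
    species = sorted(next(iter(correlations.values())))
    totals = {s: 0.0 for s in species}
    for indicator, rho_by_species in correlations.items():
        sign = -1.0 if indicator in unhealthy else 1.0
        ordered = sorted(species, key=lambda s: sign * rho_by_species[s], reverse=True)
        for rank, s in enumerate(ordered, start=1):
            totals[s] += rank
    n_indicators = len(correlations)
    return {s: totals[s] / n_indicators for s in species}


# Toy example with hypothetical species and values (not study data).
toy = {
    "diet_quality_score": {"species_A": 0.30, "species_B": 0.00, "species_C": -0.20},
    "visceral_fat": {"species_A": -0.25, "species_B": 0.05, "species_C": 0.20},
}
scores = average_health_ranks(toy, unhealthy={"visceral_fat"})
```

In this toy example species_A receives the lowest (most favorable) average rank and species_C the highest, mirroring how a low average rank marks a health-associated species and a high average rank marks a species associated with poorer health.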

[0343] DISCUSSION

[0344] PREDICT 1 represents the first diet-microbiome clinical intervention study to identify both individual components of the microbiome and an overall gut microbial signature associated with multiple measures of dietary intake and cardiometabolic health. These signatures reproduced across UK and US populations, across multiple previously-published study populations, and for multiple dietary, biometric, and blood markers of health and cardiometabolic risk, including individual food items, nutrients, dietary patterns, adiposity, BMI, circulating lipids, inflammatory markers, blood glucose, and interactions between baseline and postprandial response levels. Notably, microbiome signatures robustly grouped both microbiome and dietary components into health-associated and anti-associated clusters, the latter in agreement with dietary quality and diversity scores (such as the Plant-based Diet Index [PDI] and Healthy Food Diversity [HFD] index) known to be health-associated (Vadiveloo et al., Br. J. Nutr. 112, 1562-1574, 2014; Kim et al., J. Nutr. 148, 624-631, 2018) and often unlinked from macronutrient source (e.g. more vs. less healthy plant- and animal-based foods). The diversity of a healthy diet (measured by the HFD and PDI) was particularly predictable by the microbiome, surpassing other indices such as the Mediterranean diet index that has been independently linked with microbiome composition (Meslier et al., Gut, doi: 10.1136/gutjnl-2019-320438, 2020). The segregation of favorable and unfavorable microbial clusters according to the heterogeneity of the food source (healthy or unhealthy animal or plant), quality (processed vs unprocessed), and dietary patterns highlights the importance of looking beyond nutrients and single foods in diet-microbiome research.
The substantially greater detail and consistency in the results relative to prior diet-microbiome work (Zeevi et al., Cell 163, 1079-1094, 2015; Falony et al., Science 352, 560-564, 2016; Zhernakova et al., Science 352, 565-569, 2016; Thingholm et al., Cell Host Microbe 26, 252-264.e10, 2019; Fu et al., Circ. Res. 117, 817-824, 2015; McDonald et al., mSystems 3, 2018) may be due to the quality of the metagenomic profiling and the large sample size. However, given the limitations of FFQ dietary data (which can be highly scalable but noise-prone; Cade et al., Nutr. Res. Rev. 17, 5-22, 2004), future diet-microbiome studies would benefit further from more detailed weighed food record data complemented with nutritionist/dietitian support.

[0345] Several aspects of the gut microbiome associations and matched signatures across diet, obesity, and metabolic health measures are striking with respect to their potential novel epidemiology and microbial biochemistry. A surprising proportion of diet- or health-associated taxa in these results are represented solely by existing or newly generated metagenomic assemblies (Pasolli et al., Cell 176, 649-662.e20, 2019), in addition to very recently isolated organisms with limited cultured strains. This was true for Lawsonibacter asaccharolyticus, the taxon most strongly associated with individual food items (particularly coffee) and nutrient intake, for which only two recent publications with limited and conflicting microbial physiology and taxonomy exist (Sakamoto et al., Int. J. Syst. Evol. Microbiol. 68, 2074-2081, 2018; Hosny et al., New Microbes New Infect 14, 85-92, 2016). Both of the taxa most abundant in diets rich in healthy plant-based foods were represented only by previous metagenomic assemblies (Nielsen et al., Nat. Biotechnol. 32, 822-828, 2014; Firmicutes CAG 95 and Roseburia CAG 182), as was the strongest microbial association with adiposity (Clostridium CAG 58) and several of the most reproducible microbes associated with (un)healthy blood markers (C. bolteae CAG 59, Clostridium CAG 167). Other microbes found here to have dietary or cardiometabolic associations, such as Prevotella spp. or Blastocystis spp., have been characterized in greater biochemical detail, but their prevalence and population structure in the human microbiome have only recently begun to be appreciated (Tett et al., Cell Host Microbe, doi: 10.1016/j.chom.2019.08.018, 2019; Beghini et al., ISME J. 11, 2848-2863, 2017).
The latter in particular may be only one of many examples of eukaryotic, fungal, or viral members of the gut microbiome not amenable to most current high-throughput experimental or analytical approaches, but with unexpected and potentially key positive roles in dietary metabolism or cardiometabolic health.

[0346] Likewise, these new, highly specific contributions of the gut microbiome to human dietary responses may help to explain some of the heterogeneity and apparent contradictions seen among previous population studies (Sze & Schloss, MBio 7, 2016; Zeevi et al., Cell 163, 1079-1094, 2015; McDonald et al., mSystems 3, 2018; Kurilshikov et al., Circ. Res. 124, 1808-1820, 2019). First, diet-microbiome-blood marker associations were overall strongest with respect to circulating lipid levels (triglycerides, lipoproteins, etc.) relative to glycemic indices (e.g. blood glucose, insulin sensitivity). This may have both biochemical and clinical implications. It is possible that gut microbial metabolism contributes relatively more to circulating lipid levels than to carbohydrate derivatives, either directly or via mediating processes such as gastrointestinal or systemic bile acid signaling (Kurilshikov et al., Circ. Res. 124, 1808-1820, 2019; Ko et al., Nat. Rev. Gastroenterol. Hepatol., doi:10.1038/s41575-019-0250-7, 2020). Alternatively, host metabolism may play a greater role in circulating glucose and insulin levels relative to microbial bioactivity. The lipoprotein features most closely associated with the microbiome (such as L-HDL-L) are also more strongly associated with cardiovascular risk compared with typically measured lipids (e.g. TC, HDL-C, LDL-C), suggesting a closer look may be warranted at their utility as clinical biomarkers or as targets for beneficial gut microbiome manipulation.

[0347] Finally, an important conclusion of these results with respect to overall microbiome epidemiology is the limitation and coarseness of phenotypic associations achievable by using simple diversity or microbiome summary statistics. Even when a variety of significant species-specific dietary and molecular associations in the gut were identified, their effect sizes were often limited, likely reflecting both strain-specific functionality not assessed in these profiles (Pasolli et al., Cell 176, 649-662.e20, 2019; Truong et al., Genome Res. 27, 626-638, 2017; Scholz et al., Nat. Methods 13, 435-438, 2016; Quince et al., Nat. Biotechnol. 35, 833-844, 2017) and ecological signals among multiple interacting microbes as captured by the richer machine learning models (Pasolli et al., PLoS Comput. Biol. 12, e1004977, 2016). Similarly, with respect to host physiology, many postprandial responses relative to individual-specific fasting values (e.g., triglyceride levels, lipoproteins, insulin concentrations) were moderately more associated with the gut microbiome than the pre-existing fasting values themselves. This may speak to the interaction of both host metabolism and microbial metabolism impacting digestive and metabolic pathways, shaping long- and short-term diet-host effects on health and disease (Rowland et al., Eur. J. Nutr. 57, 1-24, 2018). Overall, this is the first study to identify a shared diet-metabolic-health microbial signature, segregating favorable and unfavorable taxa with multiple measures of both dietary intake and cardiometabolic health. The hope is that these initial PREDICT 1 results, targeted clinical and microbial follow-up based on them, and future iterations of the PREDICT study will aid as a resource both in utilization of the gut microbiome as a biomarker for cardiometabolic risk and in strategies for reshaping the microbiome to improve personalized dietary health.

[0348] Table 1 (Part 1). List of foods and their assigned food groups and health classification.

[0349] Table 1 (Part 2). List of Nutrients.

[0350] Table 1 (Part 3). List of Nutrients_%E

[0351] Table 2. Meal descriptions.

[0352] Table 3 (in six parts). Plant-based Diet Index, Healthy Food Diversity index, Food group classifications, animal groups, Alternate Mediterranean score, and Healthy Eating Index (HEI) descriptions.

Table 3 (Part 1). Plant-based Diet Index (PDI).

Table 3 (Part 2). Healthy Food Diversity Index (HFDI).

Table 3 (Part 3). Food Groups Classifications.

Table 3 (Part 4). Animal groups.

Table 3 (Part 5). aMED.

Table 3 (Part 6). HEI.

[0353] Table 4. P-values from the Mann-Whitney U test between presence/absence of Prevotella copri, Blastocystis spp., and P. copri and Blastocystis spp. (Part 1). Effect size measured as the ratio of the medians for P. copri and Blastocystis spp. presence/absence (Part 2).

Table 4 (Part 1). Mann-Whitney U p-values pres|abs.

Table 4 (Part 2). Effect size pres|abs.
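For orientation, the two statistics reported in Table 4 may be computed along the following lines. This is a minimal Python sketch under stated assumptions, not the code used to generate the table: the function names and example values are hypothetical, the two-sided p-value uses the normal approximation with tie correction and no continuity correction (an exact test may be preferable for small groups), and the effect size is assumed to be the median in carriers divided by the median in non-carriers.

```python
import math
from statistics import median


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted(list(x) + list(y))
    rank_of = {}     # value -> mean rank of that value in the pooled sample
    tie_term = 0.0   # sum of (t^3 - t) over groups of t tied values
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1] == pooled[i]:
            j += 1
        t = j - i + 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(rank_of[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        return u, 1.0  # all observations identical: no evidence of a shift
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))


def median_ratio(present, absent):
    """Effect size: median in carriers over median in non-carriers (assumed direction)."""
    return median(present) / median(absent)


# Hypothetical marker values in subjects with and without a given microbe.
marker_present = [5.1, 6.0, 5.8, 6.4, 5.5]
marker_absent = [4.0, 4.2, 3.9, 4.8, 4.5]
u_stat, p_value = mann_whitney_u(marker_present, marker_absent)
print(u_stat, p_value, median_ratio(marker_present, marker_absent))
```

A ratio above 1 indicates a higher median marker value in carriers than in non-carriers, and a ratio below 1 the reverse. The p-value indicates only whether the two distributions differ, which is why the two quantities are reported in separate parts of the table.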

[0354] Table 5. Ranks and average ranks for determining the two sets of positive and negative bacterial species according to their correlations with a balanced set of personal, habitual diet, fasting, and postprandial metadata.

Table 5 (Part 1A). Spearman’s correlation.

Table 5 (Part 1B): Spearman’s correlations

Table 5 (Part 1C): Spearman’s correlations

Table 5 (Part 2A). Ranks.

Table 5 (Part 2B): Ranks

Table 5 (Part 2C): Ranks

[0355] (X) Closing Paragraphs: As will be understood by one of ordinary skill in the art, each embodiment disclosed herein can comprise, consist essentially of or consist of its particular stated element, step, ingredient or component. Thus, the terms “include” or “including” should be interpreted to recite: “comprise, consist of, or consist essentially of.” The transition term “comprise” or “comprises” means includes, but is not limited to, and allows for the inclusion of unspecified elements, steps, ingredients, or components, even in major amounts. The transitional phrase “consisting of” excludes any element, step, ingredient or component not specified. The transition phrase “consisting essentially of” limits the scope of the embodiment to the specified elements, steps, ingredients or components and to those that do not materially affect the embodiment. A material effect, in this context, is an alteration in the correlation between the presence, absence, or abundance of a microbe with a selected biological condition, or an alteration in a microbiome in a subject.

[0356] Unless otherwise indicated, all numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless indicated to the contrary, the numerical parameters set forth in the specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by the present invention. At the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, each numerical parameter should at least be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. When further clarity is required, the term “about” has the meaning reasonably ascribed to it by a person skilled in the art when used in conjunction with a stated numerical value or range, i.e. denoting somewhat more or somewhat less than the stated value or range, to within a range of ±20% of the stated value; ±19% of the stated value; ±18% of the stated value; ±17% of the stated value; ±16% of the stated value; ±15% of the stated value; ±14% of the stated value; ±13% of the stated value; ±12% of the stated value; ±11% of the stated value; ±10% of the stated value; ±9% of the stated value; ±8% of the stated value; ±7% of the stated value; ±6% of the stated value; ±5% of the stated value; ±4% of the stated value; ±3% of the stated value; ±2% of the stated value; or ±1% of the stated value.

[0357] Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements.

[0358] The terms “a,” “an,” “the” and similar referents used in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification shall be construed as indicating any non-claimed element essential to the practice of the invention.

[0359] Groupings of alternative elements or embodiments of the invention disclosed herein are not to be construed as limitations. Each group member may be referred to and claimed individually or in any combination with other members of the group or other elements found herein. It is anticipated that one or more members of a group may be included in, or deleted from, a group for reasons of convenience and/or patentability. When any such inclusion or deletion occurs, the specification is deemed to contain the group as modified thus fulfilling the written description of all Markush groups used in the appended claims.

[0360] Certain embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Of course, variations on these described embodiments will become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.

[0361] Furthermore, numerous references have been made to patents, printed publications, database entries, online resources, journal articles, and other written or otherwise memorialized text throughout this specification (referenced materials herein). Each of the referenced materials is individually incorporated herein by reference in its entirety for its referenced teaching, as of the filing date of this application.

[0362] It is to be understood that the embodiments of the invention disclosed herein are illustrative of the principles of the present invention. Other modifications that may be employed are within the scope of the invention. Thus, by way of example, but not of limitation, alternative configurations of the present invention may be utilized in accordance with the teachings herein. Accordingly, the present invention is not limited to that precisely as shown and described.

[0363] The particulars shown herein are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of various embodiments of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for the fundamental understanding of the invention, the description taken with the drawings and/or examples making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.

[0364] Definitions and explanations used in the present disclosure are meant and intended to be controlling in any future construction unless clearly and unambiguously modified in the example(s) or when application of the meaning renders any construction meaningless or essentially meaningless. In cases where the construction of the term would render it meaningless or essentially meaningless, the definition should be taken from Webster's Dictionary, 3rd Edition or a dictionary known to those of ordinary skill in the art, such as the Oxford Dictionary of Biochemistry and Molecular Biology (Ed. Anthony Smith, Oxford University Press, Oxford, 2004).