Title:
RESPIRATORY INFECTIOUS DISEASE TRIAGE
Document Type and Number:
WIPO Patent Application WO/2021/201675
Kind Code:
A1
Abstract:
The invention provides a method of predicting disease course severity in a subject suffering from or at risk of suffering from an infectious respiratory disease, the method comprising the steps of: a) optionally providing a stratified pool of reference subjects, wherein said reference subjects are stratified according to disease course severity following contact with an infectious respiratory disease agent, such as a respiratory virus; b) optionally performing a method of typing a subject according to the invention on said stratified pool of reference subjects to thereby provide a database of reference signatures stratified according to disease course severity, wherein each of said reference signatures is annotated to the disease course severity observed in each of said reference subjects; c) performing a method of typing a subject according to the invention on a test subject suffering from or at risk of suffering from an infectious disease caused by said infectious respiratory disease agent, to thereby provide a test signature; and d) comparing said test signature with a database of stratified reference signatures as provided in accordance with steps a) and b), and predicting disease course severity in said test subject on the basis of that comparison.

Inventors:
BUDDING ANDRIES EDWARD (NL)
Application Number:
PCT/NL2021/050203
Publication Date:
October 07, 2021
Filing Date:
March 29, 2021
Assignee:
INBIOME B V (NL)
International Classes:
C12Q1/686; C12Q1/689; C12Q1/70
Domestic Patent References:
WO2015170979A12015-11-12
Foreign References:
EP3140424A12017-03-15
CN107637705A2018-01-30
US20160317414A12016-11-03
Other References:
A. E. BUDDING ET AL: "IS-pro: high-throughput molecular fingerprinting of the intestinal microbiota", THE FASEB JOURNAL, vol. 24, no. 11, 19 July 2010 (2010-07-19), pages 4556 - 4564, XP055145525, ISSN: 0892-6638, DOI: 10.1096/fj.10-156190
BUDDING ET AL., FASEB J., vol. 24, no. 11, 2010, pages 4556 - 64
MORA ET AL., MICROBIOLOGY, vol. 149, 2003, pages 807 - 813
WANG ET AL., J CLIN MICROBIOL., vol. 46, no. 11, 2008, pages 3555 - 3563
BARRY, T. ET AL., PCR METHODS APPL., vol. 1, 1991, pages 51 - 56
SAIKI ET AL.: "PCR Protocols", 1990, ACADEMIC PRESS, article "Amplification of Genomic DNA", pages: 13 - 20
WHARAM ET AL., NUCLEIC ACIDS RES., vol. 29, no. 11, 2001, pages E54 - E54
HAFNER ET AL., BIOTECHNIQUES, vol. 30, no. 4, 2001, pages 852 - 6,858,860
LALAM, J THEOR BIOL., vol. 242, no. 4, 2006, pages 947 - 53
HAEGEMAN ET AL., ISME J 2013, vol. 7, 2013, pages 1092 - 101
OKSANEN, J. ET AL.: "vegan: Community Ecology Package", R PACKAGE, 2013
"GenBank", Database accession no. MN985325.1
BUDDING ET AL., FASEB J., vol. 24, no. 11, November 2010 (2010-11-01), pages 4556 - 64
NEEFS ET AL., NUCLEIC ACIDS RESEARCH, vol. 21, no. 13, 1993, pages 3025 - 3049
BAKER ET AL., J. MICROBIOL. METH., vol. 55, 2003, pages 541 - 555
VAN DE PEER ET AL., NUCLEIC ACIDS RESEARCH, vol. 24, no. 17, 1996, pages 3381 - 3391
GURTLER, STANISICH, MICROBIOLOGY, vol. 142, 1996, pages 3 - 16
LUDWIG ET AL., NUCLEIC ACIDS RESEARCH, vol. 32, no. 4, 2004, pages 1363 - 1371
ASHELFORD ET AL., NUCL ACIDS RES, vol. 30, 2002, pages 3481 - 3489
LOY ET AL., NUCL ACIDS RES., vol. 31, 2003, pages 514 - 516
MAIDAK ET AL., NUCL ACIDS RES, vol. 29, 2001, pages 173 - 174
CORMAN ET AL., EURO SURVEILL. 2020, vol. 25, no. 3, 2020, pages 2000045
BUDDING ET AL., FASEB J, vol. 24, 2010, pages 4556 - 64
BUDDING ET AL., J CLIN MICROBIOL, vol. 54, 2016, pages 934 - 43
DE MEIJ ET AL., FASEB J, vol. 30, 2016, pages 1512 - 22
Attorney, Agent or Firm:
WITMANS, H.A. (NL)
Claims:

1. A method of typing a subject, preferably a sample of a subject, for an infectious respiratory disease parameter, comprising the steps of: a) performing on a sample of genomic DNA from the microorganisms in a respiratory tract microbiome obtained from a subject a PCR amplification reaction using at least one set of PCR amplification primers directed to a conserved DNA region comprised in the 16S and 23S rRNA sequence flanking microbial 16S-23S rRNA internal transcribed spacer (ITS) regions for amplification of said ITS region to thereby amplify and provide amplification products of said ITS regions comprised in said sample of genomic DNA; b) analyzing said amplification products based on length differences in said amplification products to thereby provide a test signature of a composition of a population of microorganisms in said sample; c) comparing said test signature with at least one reference signature that is provided by performing steps a) and b) on a reference subject, and typing the subject, preferably a sample of said subject, on the basis of that comparison for said infectious respiratory disease parameter, wherein said disease parameter is selected from risk of infection, risk of developing respiratory disease, presence or absence of the infectious agent and/or respiratory disease, and disease course severity.

2. The method of typing according to claim 1, wherein said at least one set of PCR amplification primers comprises:

- a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of microorganisms belonging to the phylum Firmicutes, and

- a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of microorganisms belonging to the phylum Bacteroidetes.

3. The method according to any one of the preceding claims, wherein said at least one set of PCR amplification primers further comprises:

- a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of microorganisms belonging to the phylum Proteobacteria.

4. The method according to any one of the preceding claims, wherein said set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phyla Firmicutes and Bacteroidetes comprises the use of: a) the forward primer 5’-CTGGATCACCTCCTTTCTAWG-3’ comprising a first fluorescent label, b) the forward primer 5’-CTGGAACACCTCCTTTCTGGA-3’ comprising a second fluorescent label; c) and three unlabeled reverse primers 5’- AGGCATCCACCGTGCGCCCT-3’; 5’-AGGCATTCACCRTGCGCCCT-3’; and 5’-AGGCATCCRCCATGCGCCCT-3’.

5. The method according to any one of the preceding claims, wherein said set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phylum Proteobacteria comprises the use of: a) the forward primer 5’-CCGCCCGTCACACCATGG-3’; and b) at least one of the reverse primers selected from the group consisting of 5'-AATCTCGGTTGATTTCTTTTCCT-3', 5'-AATCTCGGTTGATTTCTTCTCCT-3', 5'-AATCTCTTTTGATTTCTTTTCCTCG-3', 5'-AATCTCATTTGATGTCTTTTCCTCG-3', 5'-AATCTCTTTTGATTTCTTTTCCTTCG-3', 5'-AATCTCTCTTGATTTCTTTTCCTTCG-3', and 5'-AATCTCAATTGATTTCTTTTCCTAAGG-3', wherein at least one of said primers comprises a fluorescent label.

6. The method according to any one of the preceding claims, wherein steps a) and b) of said method comprise the steps of: b1) providing a PCR calibrator system, comprising a set of PCR amplification primers at least one of which primers comprises a label, and a set of at least two PCR calibrators, each PCR calibrator consisting of a DNA fragment comprising a spacer region having a DNA sequence of a given length flanked by upstream and downstream adapter DNA sequences that comprise primer binding sites for binding of said PCR amplification primers wherein said set of PCR amplification primers is for PCR amplifying the spacer region DNA sequence of all PCR calibrators in said set of at least two PCR calibrators, wherein the spacer region DNA sequence comprised in each of said PCR calibrators in said set of at least two PCR calibrators is of a different length, and wherein each PCR calibrator in said set of at least two PCR calibrators is present in equal amount or in a known amount relative to other PCR calibrators in said set; b2) adding said set of at least two PCR calibrators from said PCR calibrator system to said sample of genomic DNA; b3) performing a PCR amplification reaction on said sample of genomic DNA comprising said set of at least two PCR calibrators using said set of PCR amplification primers from said PCR calibrator system as a first set of amplification primers to amplify and provide amplification products of said ITS region(s) comprised in said set of at least two PCR calibrators, and using at least a second set of PCR amplification primers directed to said flanking conserved DNA regions to thereby co-amplify and provide amplification products of said ITS regions comprised in said sample of genomic DNA; b4) providing a standard curve by determining the PCR amplification efficiency of each of said at least two PCR calibrators from said PCR calibrator system in said PCR amplification reaction of step b3) and expressing said PCR amplification efficiency as a function of the length of the DNA sequence of the ITS region; b5) determining the length-specific amplification efficiency for ITS regions of different length comprised in said genomic DNA sample and amplified in step b3) using the standard curve as provided in step b4); b6) determining the abundance of microbial 16S-23S rRNA internal transcribed spacer (ITS) regions of different length using the length-specific amplification efficiencies determined in step b5), and c) analyzing the composition of a population of microorganisms based on the abundances of ITS regions of different length determined in step b6) to thereby provide a test signature of a composition of a population of microorganisms in said sample.

7. The method of claim 6, wherein said standard curve is based on at least five PCR calibrators of different length ranging in length from 50 to 1200 bps.

8. The method of claim 6 or 7, wherein said step b3) of performing a PCR amplification reaction on said sample of genomic DNA using at least a set of PCR amplification primers directed to said flanking conserved DNA regions comprises the use of a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phyla Firmicutes and Bacteroidetes and a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phylum Proteobacteria.

9. The method according to any one of the preceding claims, wherein the test signature is characterized in having an increased or decreased diversity in the phylum Firmicutes, Bacteroidetes and/or Proteobacteria compared to the reference signature as calculated using the Shannon index.

10. The method according to any one of the preceding claims, wherein the test signature is characterized in having an increased or decreased abundance, preferably relative to the total population number, of the phylum Firmicutes, Bacteroidetes and/or Proteobacteria compared to the reference signature.

11. A method of typing a subject, preferably a biological sample of a subject, for an infectious respiratory disease parameter, said method comprising the steps of: a) providing a sample of a respiratory tract microbiome, and optionally isolating nucleic acid sequences comprised in said sample; b) PCR amplifying at least one microbial rRNA internal transcribed spacer (ITS) region from the, optionally isolated, nucleic acid sequences comprised in said sample using a set of broad-taxonomic range amplification primers for amplifying said at least one microbial rRNA ITS, to thereby generate PCR amplicons; c) recording a high resolution melting curve for the PCR amplicons generated in step b) to thereby generate a first test signature, and optionally recording the length of the PCR amplicons generated in step b) by capillary electrophoresis or sequencing; d) optionally, comparing the high resolution melting curve recorded in step c) with a database comprising high resolution melting curves of reference amplicons generated from reference microbial species or strains of known taxonomic identity using the same set of amplification primers, to thereby obtain a first taxonomic identity indicator of a microorganism present in said sample; e) optionally, comparing the length of each PCR amplicon having a distinct length recorded in step c) with a database comprising PCR amplicon lengths of reference amplicons generated from reference microbial species or strains of known taxonomic identity using the same set of amplification primers, to thereby obtain a second taxonomic identity indicator of a microorganism present in said sample; f) optionally, identifying a microorganism present in said sample to the species or strain level if the first and second taxonomic identity indicator match to thereby provide a second test signature; g) comparing said first and/or second test signature with at least one reference signature of a reference subject that is provided by performing steps a)-c) and optionally d)-f) on a reference subject, and typing the subject, preferably a sample of said subject, on the basis of that comparison for said infectious respiratory disease parameter, wherein said disease parameter is selected from risk of (viral) infection, risk of developing (viral) respiratory disease, presence or absence of the infectious agent and/or respiratory disease, and disease course severity.

12. The method according to claim 11, wherein the reference amplicons in said database comprise amplicons generated by an in vivo and/or in silico PCR amplification reaction for amplifying the corresponding rRNA ITS region of a reference microbial species or strain of known taxonomic identity using the same set of broad-taxonomic range amplification primers for amplifying said at least one microbial rRNA ITS region.

13. The method according to claim 11 or 12, wherein said databases in step d) and e) are combined into a single database, preferably wherein said database comprises rRNA ITS sequences and corresponding taxonomic identity data on bacteria.

14. The method according to any one of claims 11-13, wherein said broad-taxonomic range amplification primers are for amplifying a microbial rRNA ITS region of multiple, preferably essentially all, strains or species from a microbial genus, family, order, class, phylum, kingdom and/or domain, preferably essentially all strains or species from a microbial phylum, more preferably essentially all strains or species from a microbial kingdom, most preferably bacteria.

15. The method according to any one of claims 11-14, wherein said broad-taxonomic range amplification primers for amplifying said at least one microbial rRNA ITS region comprise a forward and reverse primer for amplifying a 16S-23S rRNA ITS region, a 23S-5S rRNA ITS region, a microbial 18S-5.8S rRNA ITS region, or a microbial 5.8S-26S/28S rRNA ITS region, preferably a forward and reverse primer for amplifying a 16S-23S rRNA ITS region.

16. The method according to any one of claims 11-15, wherein the biological sample is selected from the group consisting of nasopharyngeal and oropharyngeal (e.g. throat) swab, wash and aspirate of the upper respiratory tract; saliva sample; tracheal and endotracheal aspirate, mucus, sputum and bronchoalveolar lavage of the lower respiratory tract; microbial aerosols from coughs and tidal breathing; and nasopharyngeal and oropharyngeal tissue biopsy including lung biopsy.

17. The method according to any one of claims 11-16, wherein the set of broad-taxonomic range amplification primers is a set of amplification primers for amplifying at least one rRNA ITS region of bacteria of the phylum Bacteroidetes and/or Firmicutes.

18. The method according to any one of claims 11-17, wherein said set of broad-taxonomic range amplification primers comprises each of the amplification primers of SEQ ID NOs: 1 and 3-5, or each of the amplification primers of SEQ ID NOs: 2-5, or each of the amplification primers of SEQ ID NOs: 1-5 as provided in Table 1.

19. The method according to any one of claims 11-18, wherein said set of broad-taxonomic range amplification primers comprises each of the amplification primers of SEQ ID NOs: 6 and 7-13, as provided in Table 1.

20. The method according to any one of claims 11-19, wherein said set of broad-taxonomic range amplification primers is a set of universal bacterial amplification primers, said set preferably comprising each of the amplification primers of SEQ ID NOs: 14-15, as provided in Table 1.

21. The method according to any one of claims 11-20, wherein said step of PCR amplifying comprises qPCR.

22. The method according to any one of claims 11-21, wherein in step c) the length of the PCR amplicons is recorded by capillary electrophoresis or sequencing.

23. The method according to any one of claims 11-22, wherein step c), and optionally also step b), is performed in a miniaturized device, preferably a lab-on-a-chip (LOC) device.

24. The method according to any one of claims 11-23, wherein said database comprising high resolution melting curves and PCR amplicon lengths of reference amplicons generated from reference microbial species or strains of known taxonomic identity, further comprises high resolution melting curves and PCR amplicon lengths of reference amplicons generated from human sequences as controls for aspecific amplicon generation using said set of broad-taxonomic range amplification primers.

25. The method according to any one of claims 11-24, wherein said PCR amplification reaction further comprises the use of a PCR calibrator system, comprising a set of PCR amplification primers at least one of which primers comprises a label, and a set of at least two PCR calibrators, each PCR calibrator consisting of a DNA fragment of a given length flanked by upstream and downstream adapter DNA sequences that comprise primer binding sites for binding of said PCR amplification primers wherein said set of PCR amplification primers is for PCR amplifying the DNA sequence of all PCR calibrators in said set of at least two PCR calibrators, wherein the spacer region DNA sequence comprised in each of said PCR calibrators in said set of at least two PCR calibrators is of a different length, and wherein each PCR calibrator in said set of at least two PCR calibrators is present in equal amount or in a known amount relative to other PCR calibrators in said set; and wherein said step b) of PCR amplifying further comprises PCR amplifying the at least two PCR calibrators using the PCR amplification primers of the PCR calibrator system.

26. The method according to any one of claims 11-25, wherein said set of broad-taxonomic range amplification primers for amplifying at least one microbial rRNA ITS region comprises a labelled forward and/or labelled reverse primer, preferably a labelled forward primer, more preferably a fluorescently labelled forward primer.

27. A method of predicting disease course severity in a subject suffering from or at risk of suffering from an infectious respiratory disease, the method comprising the steps of: a) optionally providing a stratified pool of reference subjects, wherein said reference subjects are stratified according to disease course severity following contact with an infectious agent, such as a respiratory virus; b) optionally performing a method of typing a subject according to any one of claims 1-26 on said stratified pool of reference subjects to thereby provide a database of reference signatures stratified according to disease course severity, wherein each of said reference signatures is annotated to the disease course severity observed in each of said reference subjects; c) performing a method of typing a subject according to any one of claims 1-26 on a test subject suffering from or at risk of suffering from an infectious disease caused by said respiratory virus, to thereby provide a test signature; and d) comparing said test signature with a database of stratified reference signatures as provided in accordance with steps a) and b), and predicting disease course severity in said test subject on the basis of that comparison.

28. The method according to claim 27, wherein the stratification according to disease course severity following contact with an infectious agent, such as a respiratory virus in step a) comprises a stratification of at least one of:

- no disease,

- reduced or increased risk of developing a mild disease course (not requiring hospitalization),

- a severe disease course (requiring hospitalization), and

- a very severe disease course (requiring hospitalization with intensive care, due to ARDS and/or cytokine storm).

29. The method according to claim 27 or 28, wherein step c) of typing a subject is performed before, during and/or after disease symptoms in said subject arise.

30. The method according to any one of the preceding claims, wherein said sample is selected from the group consisting of nasopharyngeal and oropharyngeal (e.g. throat) swab, wash and aspirate of the upper respiratory tract; saliva sample; tracheal and endotracheal aspirate, mucus, sputum and bronchoalveolar lavage of the lower respiratory tract; microbial aerosols from coughs and tidal breathing; and nasopharyngeal and oropharyngeal tissue biopsy including lung biopsy.

31. The method according to any one of the preceding claims, wherein said infectious respiratory disease is a disease caused by infection with influenza virus (flu), infection with Respiratory Syncytial virus (RSV), infection with Enterovirus, infection with Rhinovirus, infection with Adenovirus, infection with Coronavirus, infection with Herpes Simplex virus, infection with Epstein-Barr virus, infection with Varicella zoster virus, or, preferably, infection with Coronavirus, such as SARS-CoV, MERS-CoV or, preferably, SARS-CoV-2 (COVID-19).

32. The method of typing according to any one of claims 1-26 and 30-31, wherein said subject, preferably said sample of said subject, is typed on the basis of the presence, absence or an abundance of one or more bacterial species in said test signature, wherein said one or more bacterial species are selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

33. The method of typing according to any one of claims 1-26 and 30-31, wherein said subject, preferably said sample of said subject, is typed on the basis of a test signature having an increased or decreased abundance of one or more bacterial species as compared to an abundance of said one or more bacterial species in a reference signature of a subject that has been tested positive or negative for an infectious respiratory disease, preferably a viral respiratory disease, more preferably a viral respiratory disease that is caused by SARS-CoV such as SARS-CoV-2 (COVID-19); wherein said one or more bacterial species are selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

34. The method of predicting according to any one of claims 28-31, wherein said disease course severity is predicted on the basis of the presence, absence or an abundance of one or more bacterial species in said test signature, wherein said one or more bacterial species are selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

35. Use of the presence, absence or an abundance, preferably an increased abundance, of one or more bacterial species selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa in a sample of a test subject (preferably a sample as defined in any one of the previous claims) for typing said test subject, preferably said sample of said test subject, for an infectious respiratory disease parameter, preferably a viral respiratory disease parameter, selected from risk of infection, risk of developing respiratory disease, presence or absence of the infectious respiratory disease agent and/or respiratory disease, and disease course severity (preferably as defined in any one of the previous claims).

36. Use according to claim 35, wherein said abundance is an increased abundance as compared to an abundance of said one or more bacterial species in a sample of a subject that has been tested positive for said respiratory virus and/or viral respiratory disease (agent).

37. A probiotic composition comprising one or more bacterial species selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa, preferably selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

38. A method for the treatment or prophylaxis of infectious respiratory disease, preferably viral respiratory disease, in a subject comprising administering to a subject an effective amount of a probiotic composition of claim 37.
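
By way of non-limiting illustration, the length-dependent correction of steps b4) to b6) of claim 6 may be sketched as follows. The claims do not specify how amplification efficiency is measured or which curve is fitted. The sketch therefore assumes, purely hypothetically, that calibrators are added in equal amounts, that the signal of each calibrator relative to the strongest calibrator is a usable proxy for its relative efficiency, and that a straight line adequately describes efficiency as a function of length. All function names and numerical values are invented for the example.

```python
# Hypothetical sketch: names, numbers and the linear model are assumptions,
# not taken from the application.

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def length_corrected_abundance(calibrators, peaks):
    """Correct observed ITS peak signals for length-dependent efficiency.

    calibrators: {length_bp: observed signal} for equimolar calibrators.
    peaks:       {length_bp: observed signal} for ITS fragments in the sample.
    Returns {length_bp: relative abundance}, summing to 1.
    """
    lengths = sorted(calibrators)
    strongest = max(calibrators.values())
    # Step b4 (assumed proxy): relative efficiency = signal / strongest signal.
    rel_eff = [calibrators[length] / strongest for length in lengths]
    slope, intercept = fit_line(lengths, rel_eff)

    corrected = {}
    for length, signal in peaks.items():
        # Step b5: read the length-specific efficiency off the standard curve.
        eff = slope * length + intercept
        if eff <= 0:
            raise ValueError(f"{length} bp lies outside the usable curve range")
        # Step b6: divide the observed signal by the predicted efficiency.
        corrected[length] = signal / eff

    total = sum(corrected.values())
    return {length: value / total for length, value in corrected.items()}


# Five calibrators of different length, as in claim 7 (values invented).
calibrators = {100: 1000.0, 300: 900.0, 500: 800.0, 700: 700.0, 900: 600.0}
peaks = {300: 450.0, 700: 350.0}
abundances = length_corrected_abundance(calibrators, peaks)
print(abundances)
```

In this invented example the 700 bp fragment gives a weaker raw signal than the 300 bp fragment, yet both receive the same corrected abundance, because the fitted curve attributes the difference to less efficient amplification of the longer fragment.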

Description:
Title: Respiratory infectious disease triage.

FIELD OF THE INVENTION

The invention is in the field of medical diagnostics. More specifically, the invention relates to methods of diagnosing, typing, stratifying or classifying subjects for infectious respiratory disease status, preferably viral respiratory disease status. The invention further relates to probiotic compositions for use in methods of treating or preventing an infectious respiratory disease, preferably a viral respiratory disease.

BACKGROUND TO THE INVENTION

Given the current global emergency created by the Coronavirus disease 2019 (COVID-19) pandemic caused by SARS-CoV-2, there is an unprecedented need for innovations related to preventing, diagnosing or treating the disease and the virus. There is inter alia a great demand for diagnostic tools that can provide insight into the workings of the virus and provide front line healthcare workers with the ability to stratify patient susceptibility or infection outcomes.

Currently, the biggest threat to countries is that their health systems will be overwhelmed, and they are seeking ways to allocate their resources most effectively to those who need them the most. This may be accomplished by the stratification of patients and associated prioritization of resources. Further, the economic impact of quarantining mass populations is crippling many economies. The ability to identify populations who are least likely to contract such a virus would permit a stratified quarantine approach across outbreak areas and effective filtering of patients with similar symptoms but who are not infected with the SARS-CoV-2 virus.

Much research is currently underway to provide rapid tests to detect COVID-19, as well as to search for potential vaccines. However, for both COVID-19 and other similar virus outbreaks such as influenza, there is a huge unmet need to provide clinicians with an early insight into which individuals are susceptible to the virus, and what the severity of their infection is likely to be should they become infected.

COVID-19 has a broad clinical spectrum ranging from no symptoms to severe pneumonia, acute respiratory distress syndrome (ARDS) and death. Several clinical risk factors and diagnostic characteristics associated with more severe disease have been described. These include older age, hypertension, obesity, diabetes, neutrophilia and high D-Dimer. Based on current epidemiological studies severe COVID-19 is strikingly rare in children and young adults, but it is not yet known what causes this low risk of severe disease in the younger population.

But even if these mechanisms are discovered in the near future, there is a great need to single out subjects that are at risk, in order to mitigate the impact on hospitals and the economy. Such a triage method is preferably capable of stratifying non-infected or asymptomatic subjects based on expected disease severity should they become infected. The triage method is preferably also capable of stratifying infected and symptomatic patients on the likelihood of experiencing a severe versus a mild course of disease, so as to efficiently deploy the scarce hospital resources. This kind of triage is currently a major challenge for all hospitals in affected areas.

There is also a need for a triage method that is capable of identifying subjects that have a low risk of contracting the virus and becoming infected. This would allow such subjects to be released from quarantine and return to work.

SUMMARY OF THE INVENTION

The inventor has discovered that it is possible to detect an infectious respiratory disease, in particular a viral infectious disease such as COVID-19, on the basis of the respiratory microbiome of a subject. In addition, the inventor has discovered that it is possible to predict the course of disease caused by a respiratory infection, in particular a viral infection, on the basis of the respiratory microbiome of a subject. For instance, for COVID-19, it has now been discovered that it is possible to distinguish between infection types (mild disease course or severe disease course), which allows for the triage and prioritization of patients in times when health care resources are scarce. In addition, the present inventor has discovered throat microbiota profiles that protect against coronavirus infection, and/or that are associated with lower rates of developing COVID-19 and/or that predict lower COVID-19 disease severity.

The present invention provides a COVID-19 triage method. The method is not restricted to COVID-19; it also provides triage for other respiratory viruses. The presently provided triage method stratifies patients based on expected disease severity; for those already infected this tool should discriminate between patients who are likely to have a severe course of disease (and should thus be admitted to the scarce hospital resources) and those with an expected mild course of the disease (who can remain at home and not occupy hospital beds unnecessarily). Thus, the present invention provides a method of predicting severity of the disease course in a subject suffering from or at risk of suffering from an infectious respiratory disease, preferably a respiratory viral disease.
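
By way of non-limiting illustration, predicting disease course severity by comparing a test signature with a database of stratified reference signatures may be sketched as follows. The text above does not prescribe a particular comparison metric or decision rule. The sketch assumes, hypothetically, that each signature is a mapping from ITS fragment length to relative abundance, that Bray-Curtis dissimilarity is used for the comparison, and that the test subject receives the severity label of the single closest reference signature. All names and values are invented for the example.

```python
# Hypothetical sketch: the metric, the nearest-reference rule and all data
# are assumptions chosen for illustration only.

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {fragment_length: abundance} profiles."""
    keys = set(a) | set(b)
    num = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)
    den = sum(a.get(k, 0.0) + b.get(k, 0.0) for k in keys)
    return num / den if den else 0.0


def predict_severity(test_signature, reference_db):
    """Return (severity label, distance) of the closest reference signature.

    reference_db: list of (severity_label, signature) pairs, i.e. reference
    signatures annotated with the disease course observed in each subject.
    """
    if not reference_db:
        raise ValueError("reference database is empty")
    label, signature = min(
        reference_db, key=lambda entry: bray_curtis(test_signature, entry[1])
    )
    return label, bray_curtis(test_signature, signature)


# Invented reference signatures, stratified by observed disease course.
reference_db = [
    ("mild", {210: 0.6, 340: 0.3, 415: 0.1}),
    ("severe", {210: 0.1, 340: 0.2, 520: 0.7}),
]
test_signature = {210: 0.5, 340: 0.4, 415: 0.1}

label, distance = predict_severity(test_signature, reference_db)
print(label, round(distance, 3))
```

A practical classifier would normally rely on many reference subjects per stratum and on a validated decision rule; the single-nearest-reference rule is used here only to keep the comparison step easy to follow.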

The present invention provides a method of identifying subjects having a reduced risk of contracting a respiratory virus and/or developing a respiratory virus-associated disease and/or developing a respiratory virus-associated disease of reduced severity. The present invention permits a stratified quarantine approach across outbreak areas and effective filtering of increased versus decreased disease susceptibility.

The present invention is based on the finding that analytical profiling of the microbiome of the respiratory tract provides predictive diagnostic results in that it predicts disease outcome in patients, and susceptibility in non-patients.

It is an important advantage that the methods provided herein can be applied to any infectious respiratory disease, preferably respiratory viral infection, meaning that it can also be applied to outbreaks of other highly infectious diseases, including influenza, and thus provides a long-term tool.

It is another advantage that the method is rapid, as it takes less than 5 hours from sample to result. As a consequence, when it is performed in parallel with the COVID-19 virus assay, medical staff receive critical data in time to enable stratification of patients and allocation of scarce resources.

It is yet another advantage that the method is cheap, uniform and robust and can easily be integrated into clinical workflows, and can be performed on standard equipment present in most clinical microbiology laboratories.

The present invention now provides in a first aspect a method of typing a subject, preferably a sample of a subject, for an infectious respiratory disease parameter, preferably a viral respiratory disease parameter, comprising the steps of: a) performing on a sample of genomic DNA from the microorganisms in a respiratory tract microbiome obtained from a subject a PCR amplification reaction using at least one set of PCR amplification primers directed to a conserved DNA region comprised in the 16S and 23S rRNA sequence flanking microbial 16S-23S rRNA internal transcribed spacer (ITS) regions for amplification of said ITS region to thereby amplify and provide amplification products of said ITS regions comprised in said sample of genomic DNA; b) analyzing said amplification products based on length differences in said amplification products to thereby provide a test signature of a composition of a population of microorganisms in said sample; c) comparing said test signature with at least one reference signature that is provided by performing steps a) and b) on a reference subject, and typing the subject, preferably a sample of said subject, on the basis of that comparison for said infectious respiratory disease parameter, preferably said viral respiratory disease parameter, wherein said disease parameter is selected from risk of (viral) infection, risk of developing (viral) respiratory infectious disease (risk of development of disease symptoms), presence or absence of the infectious agent and/or respiratory disease, presence or absence of respiratory virus and/or disease agent, and disease course severity (i.e. severity of disease symptoms).

In a preferred embodiment of the above method, said at least one set of PCR amplification primers comprises:

- a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of microorganisms belonging to the phylum Firmicutes, and

- a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of microorganisms belonging to the phylum Bacteroidetes.

In another preferred embodiment of the above method, said at least one set of PCR amplification primers further comprises:

- a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of microorganisms belonging to the phylum Proteobacteria.

In yet another preferred embodiment of the above method, said set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phyla Firmicutes and Bacteroidetes comprises the use of: a) the forward primer 5’-CTGGATCACCTCCTTTCTAWG-3’ comprising a first fluorescent label, b) the forward primer 5’-CTGGAACACCTCCTTTCTGGA-3’ comprising a second fluorescent label; c) and three unlabeled reverse primers 5’-AGGCATCCACCGTGCGCCCT-3’; 5’-AGGCATTCACCRTGCGCCCT-3’; and 5’-AGGCATCCRCCATGCGCCCT-3’.
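Several of the primers listed above contain degenerate positions written in IUPAC nucleotide code (W stands for A or T, R stands for A or G). As a minimal illustrative sketch, not part of the disclosed method, the following shows how such a degenerate primer can be expanded into the concrete sequences it encodes. The function name `expand_degenerate` is a hypothetical helper introduced only for this illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for these primers).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer.

    Hypothetical helper for illustration only.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Labeled forward primer from the text; W = A or T.
forward_primer = "CTGGATCACCTCCTTTCTAWG"
for sequence in expand_degenerate(forward_primer):
    print(sequence)
```

Each degenerate position doubles the number of concrete oligonucleotides, so a primer with a single W or R corresponds to a mixture of two sequences.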

In yet another preferred embodiment of the above method, said set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phylum Proteobacteria comprises the use of: a) the forward primer 5’-CCGCCCGTCACACCATGG-3’, and b) at least one of the reverse primers selected from the group consisting of 5'-AATCTCGGTTGATTTCTTTTCCT-3', 5'-AATCTCGGTTGATTTCTTCTCCT-3', 5'-AATCTCTTTTGATTTCTTTTCCTCG-3', 5'-AATCTCATTTGATGTCTTTTCCTCG-3', 5'-AATCTCTTTTGATTTCTTTTCCTTCG-3', 5'-AATCTCTCTTGATTTCTTTTCCTTCG-3', 5'-AATCTCAATTGATTTCTTTTCCTAAGG-3', wherein at least one of said primers comprises a fluorescent label.

In yet another preferred embodiment of the above method, steps a) and b) of said method comprise the steps of: b1) providing a PCR calibrator system, comprising a set of PCR amplification primers at least one of which primers comprises a label, and a set of at least two PCR calibrators, each PCR calibrator consisting of a DNA fragment comprising a spacer region having a DNA sequence of a given length flanked by upstream and downstream adapter DNA sequences that comprise primer binding sites for binding of said PCR amplification primers wherein said set of PCR amplification primers is for PCR amplifying the spacer region DNA sequence of all PCR calibrators in said set of at least two PCR calibrators, wherein the spacer region DNA sequence comprised in each of said PCR calibrators in said set of at least two PCR calibrators is of a different length, and wherein each PCR calibrator in said set of at least two PCR calibrators is present in equal amount or in a known amount relative to other PCR calibrators in said set; b2) adding said set of at least two PCR calibrators from said PCR calibrator system to said sample of genomic DNA; b3) performing a PCR amplification reaction on said sample of genomic DNA comprising said set of at least two PCR calibrators using said set of PCR amplification primers from said PCR calibrator system as a first set of amplification primers to amplify and provide amplification products of said ITS region(s) comprised in said set of at least two PCR calibrators, and using at least a second set of PCR amplification primers directed to said flanking conserved DNA regions to thereby co-amplify and provide amplification products of said ITS regions comprised in said sample of genomic DNA; b4) providing a standard curve by determining the PCR amplification efficiency of each of said at least two PCR calibrators from said PCR calibrator system in said PCR amplification reaction of step b3) and expressing said PCR amplification efficiency as a function of the length of the DNA sequence of the ITS region; b5) determining the length-specific amplification efficiency for ITS regions of different length comprised in said genomic DNA sample and amplified in step b3) using the standard curve as provided in step b4); b6) determining the abundance of microbial 16S-23S rRNA internal transcribed spacer (ITS) regions of different length using the length-specific amplification efficiencies determined in step b5), and c) analyzing the composition of a population of microorganisms based on the abundances of ITS regions of different length determined in step b6) to thereby provide a test signature of a composition of a population of microorganisms in said sample.
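The calibrator-based correction of the preceding embodiment (standard curve, length-specific efficiency, corrected abundance) can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the text does not specify the functional form of the standard curve, so piecewise-linear interpolation is assumed here, and "efficiency" is approximated as end-point signal per unit of calibrator input. All function names and numeric values are hypothetical.

```python
from bisect import bisect_left


def calibrator_efficiencies(signal_by_length, input_by_length):
    """Efficiency per calibrator, taken here as observed signal per unit input.

    Assumption: end-point signal divided by the known input amount is used
    as a proxy for amplification efficiency.
    """
    return {length: signal_by_length[length] / input_by_length[length]
            for length in signal_by_length}


def efficiency_at(curve, length):
    """Read the standard curve at a given fragment length.

    Assumption: piecewise-linear interpolation between calibrators,
    clamped to the nearest calibrator outside the calibrated range.
    """
    lengths = sorted(curve)
    if length <= lengths[0]:
        return curve[lengths[0]]
    if length >= lengths[-1]:
        return curve[lengths[-1]]
    i = bisect_left(lengths, length)
    lo, hi = lengths[i - 1], lengths[i]
    fraction = (length - lo) / (hi - lo)
    return curve[lo] + fraction * (curve[hi] - curve[lo])


def corrected_abundances(peak_signal_by_length, curve):
    """Divide each ITS peak signal by its length-specific efficiency."""
    return {length: signal / efficiency_at(curve, length)
            for length, signal in peak_signal_by_length.items()}


# Hypothetical calibrators of 100, 500 and 1000 bp, added in equal amounts.
curve = calibrator_efficiencies(
    {100: 1000.0, 500: 800.0, 1000: 400.0},
    {100: 1.0, 500: 1.0, 1000: 1.0},
)
# Hypothetical ITS peaks: raw signals differ, corrected abundances do not.
print(corrected_abundances({300: 450.0, 750: 300.0}, curve))
```

In this hypothetical example the longer fragment gives a weaker raw signal only because it amplifies less efficiently; after correction both fragments are assigned the same abundance.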

In yet another preferred embodiment of the above method, said standard curve is based on at least five PCR calibrators of different length ranging in length from 50 to 1200 bps.

In yet another preferred embodiment of the above method, said step b3) of performing a PCR amplification reaction on said sample of genomic DNA using at least a set of PCR amplification primers directed to said flanking conserved DNA regions comprises the use of a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phyla Firmicutes and Bacteroidetes and a set of PCR amplification primers for amplifying the 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the phylum Proteobacteria.

In yet another preferred embodiment of the above method, the test signature is characterized in having an increased or decreased diversity in the phylum Firmicutes, Bacteroidetes and/or Proteobacteria compared to the reference signature as calculated using the Shannon index.
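For reference, the Shannon index of a profile with relative abundances p_i is H = -sum(p_i * ln p_i). A minimal sketch of how it could be computed per phylum from peak abundances follows; the natural logarithm is assumed (the text does not state the base), and the abundance values are hypothetical.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over non-zero abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one abundance must be positive")
    h = 0.0
    for value in abundances:
        if value > 0:
            p = value / total
            h -= p * math.log(p)
    return h


# Hypothetical peak abundances within one phylum.
test_signature = [40.0, 30.0, 20.0, 10.0]
reference_signature = [90.0, 5.0, 5.0, 0.0]
print(shannon_index(test_signature) > shannon_index(reference_signature))
```

A profile dominated by a single peak yields a low index, whereas evenly distributed peaks yield a high one, with ln(n) as the maximum for n equally abundant peaks.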

In yet another preferred embodiment of the above method, the test signature is characterized in having an increased or decreased abundance, preferably relative to the total population number, of the phylum Firmicutes, Bacteroidetes and/or Proteobacteria compared to the reference signature.
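As an illustration of abundance relative to the total population, the sketch below normalizes per-phylum loads and reports whether each phylum is increased or decreased relative to a reference signature. The tolerance parameter and all numeric values are assumptions introduced for this example only.

```python
def relative_abundance(load_by_phylum):
    """Express each phylum load as a fraction of the total load."""
    total = sum(load_by_phylum.values())
    if total <= 0:
        raise ValueError("total load must be positive")
    return {phylum: load / total for phylum, load in load_by_phylum.items()}


def direction_of_change(test, reference, tolerance=0.05):
    """Label each phylum as increased, decreased or unchanged.

    Assumption: a difference in relative abundance smaller than
    `tolerance` is treated as unchanged.
    """
    test_rel = relative_abundance(test)
    ref_rel = relative_abundance(reference)
    result = {}
    for phylum in ref_rel:
        diff = test_rel.get(phylum, 0.0) - ref_rel[phylum]
        if diff > tolerance:
            result[phylum] = "increased"
        elif diff < -tolerance:
            result[phylum] = "decreased"
        else:
            result[phylum] = "unchanged"
    return result


# Hypothetical loads (arbitrary units).
reference = {"Firmicutes": 50.0, "Bacteroidetes": 30.0, "Proteobacteria": 20.0}
test = {"Firmicutes": 60.0, "Bacteroidetes": 10.0, "Proteobacteria": 30.0}
print(direction_of_change(test, reference))
```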

In another aspect, the present invention provides a method of typing a subject, preferably a biological sample of a subject, for an infectious respiratory disease parameter, preferably a viral respiratory disease parameter, said method comprising the steps of: a) providing a sample of a respiratory tract microbiome, and optionally isolating nucleic acid sequences comprised in said sample; b) PCR amplifying at least one microbial rRNA internal transcribed spacer (ITS) region from the, optionally isolated, nucleic acid sequences comprised in said sample using a set of broad-taxonomic range amplification primers for amplifying said at least one microbial rRNA ITS, to thereby generate PCR amplicons; c) recording a high resolution melting curve for the PCR amplicons generated in step b) to thereby generate a first test signature, and optionally recording the length of the PCR amplicons generated in step b) by capillary electrophoresis or sequencing; d) optionally, comparing the high resolution melting curve recorded in step c) with a database comprising high resolution melting curves of reference amplicons generated from reference microbial species or strains of known taxonomic identity using the same set of amplification primers, to thereby obtain a first taxonomic identity indicator of a microorganism present in said sample; e) optionally, comparing the length of each PCR amplicon having a distinct length recorded in step c) with a database comprising PCR amplicon lengths of reference amplicons generated from reference microbial species or strains of known taxonomic identity using the same set of amplification primers, to thereby obtain a second taxonomic identity indicator of a microorganism present in said sample; f) optionally, identifying a microorganism present in said sample to the species or strain level if the first and second taxonomic identity indicator match to thereby provide a second test signature; g) comparing said first and/or second test signature with at least one reference signature of a reference subject that is provided by performing steps a)-c) and optionally d)-f) on a reference subject, and typing the subject, preferably a sample of said subject, on the basis of that comparison for said infectious respiratory disease parameter, preferably said viral respiratory disease parameter, wherein said disease parameter is selected from risk of (viral) infection, risk of developing (viral) respiratory disease (risk of development of disease symptoms), presence or absence of the infectious agent and/or respiratory disease, presence or absence of respiratory virus and/or respiratory disease agent, and disease course severity (i.e. severity of disease symptoms).
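The logic of steps d) to f), in which a microorganism is identified only when the melting-curve indicator and the amplicon-length indicator agree, can be sketched as below. This is a simplified illustration under explicit assumptions: a full high resolution melting curve is reduced to a single melting temperature, the tolerances are arbitrary, and the database entries are placeholders rather than real reference data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceAmplicon:
    taxon: str
    length_bp: int
    melt_temp_c: float  # simplification: one Tm instead of a full melting curve


# Placeholder reference database; taxa and values are hypothetical.
DATABASE = [
    ReferenceAmplicon("taxon_A", 300, 84.0),
    ReferenceAmplicon("taxon_B", 300, 86.5),
    ReferenceAmplicon("taxon_C", 412, 84.1),
]


def candidates_by_melt(db, melt_temp_c, tolerance_c=0.3):
    """First taxonomic identity indicator: match on melting behaviour."""
    return {r.taxon for r in db if abs(r.melt_temp_c - melt_temp_c) <= tolerance_c}


def candidates_by_length(db, length_bp, tolerance_bp=1):
    """Second taxonomic identity indicator: match on amplicon length."""
    return {r.taxon for r in db if abs(r.length_bp - length_bp) <= tolerance_bp}


def identify(db, length_bp, melt_temp_c):
    """Report only taxa for which both indicators agree."""
    both = candidates_by_melt(db, melt_temp_c) & candidates_by_length(db, length_bp)
    return sorted(both)


print(identify(DATABASE, length_bp=301, melt_temp_c=84.2))
```

In this example the length alone cannot separate taxon_A from taxon_B and the melting temperature alone cannot separate taxon_A from taxon_C, but the combination singles out taxon_A.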

In a preferred embodiment of the above method, the reference amplicons in said database comprise amplicons generated by an in vivo and/or in silico PCR amplification reaction for amplifying the corresponding rRNA ITS region of a reference microbial species or strain of known taxonomic identity using the same set of broad-taxonomic range amplification primers for amplifying said at least one microbial rRNA ITS region.

In another preferred embodiment of the above method, said databases in step d) and e) are combined into a single database, preferably wherein said database comprises rRNA ITS sequences and corresponding taxonomic identity data on bacteria.

In yet another preferred embodiment of the above method, said broad-taxonomic range amplification primers are for amplifying a microbial rRNA ITS region of multiple, preferably essentially all, strains or species from a microbial genus, family, order, class, phylum, kingdom and/or domain, preferably essentially all strains or species from a microbial phylum, more preferably essentially all strains or species from a microbial kingdom, most preferably bacteria.

In yet another preferred embodiment of the above method, said broad-taxonomic range amplification primers for amplifying said at least one microbial rRNA ITS region comprise a forward and reverse primer for amplifying a 16S-23S rRNA ITS region, a 23S-5S rRNA ITS region, a microbial 18S-5.8S rRNA ITS region, or a microbial 5.8S-26S/28S rRNA ITS region, preferably a forward and reverse primer for amplifying a 16S-23S rRNA ITS region.

In yet another preferred embodiment of the above method, the biological sample is selected from the group consisting of nasopharyngeal and oropharyngeal (e.g. throat) swab, wash and aspirate of the upper respiratory tract; saliva sample; tracheal and endotracheal aspirate, mucus, sputum and bronchoalveolar lavage of the lower respiratory tract; microbial aerosols from coughs and tidal breathing; and nasopharyngeal and oropharyngeal tissue biopsy including lung biopsy.

In yet another preferred embodiment of the above method, the set of broad-taxonomic range amplification primers is a set of amplification primers for amplifying at least one rRNA ITS region of bacteria of the phylum Bacteroidetes and/or Firmicutes.

In yet another preferred embodiment of the above method, said set of broad-taxonomic range amplification primers comprises each of the amplification primers of SEQ ID NOs: 1 and 3-5, or each of the amplification primers of SEQ ID NOs: 2-5, or each of the amplification primers of SEQ ID NOs: 1-5 as provided in Table 1.

In yet another preferred embodiment of the above method, said set of broad-taxonomic range amplification primers comprises each of the amplification primers of SEQ ID NOs: 6 and 7-13, as provided in Table 1.

In yet another preferred embodiment of the above method, said set of broad-taxonomic range amplification primers is a set of universal bacterial amplification primers, said set preferably comprising each of the amplification primers of SEQ ID NOs: 14-15, as provided in Table 1.

In yet another preferred embodiment of the above method, said step of PCR amplifying comprises qPCR.

In yet another preferred embodiment of the above method, in step c) the length of the PCR amplicons is recorded by capillary electrophoresis or sequencing.

In yet another preferred embodiment of the above method, step c), and optionally also step b), is performed in a miniaturized device, preferably a lab-on-a-chip (LOC) device.

In yet another preferred embodiment of the above method, said database comprising high resolution melting curves and PCR amplicon lengths of reference amplicons generated from reference microbial species or strains of known taxonomic identity, further comprises high resolution melting curves and PCR amplicon lengths of reference amplicons generated from human sequences as controls for aspecific amplicon generation using said set of broad-taxonomic range amplification primers.

In yet another preferred embodiment of the above method, said PCR amplification reaction further comprises the use of a PCR calibrator system, comprising a set of PCR amplification primers at least one of which primers comprises a label, and a set of at least two PCR calibrators, each PCR calibrator consisting of a DNA fragment of a given length flanked by upstream and downstream adapter DNA sequences that comprise primer binding sites for binding of said PCR amplification primers wherein said set of PCR amplification primers is for PCR amplifying the DNA sequence of all PCR calibrators in said set of at least two PCR calibrators, wherein the spacer region DNA sequence comprised in each of said PCR calibrators in said set of at least two PCR calibrators is of a different length, and wherein each PCR calibrator in said set of at least two PCR calibrators is present in equal amount or in a known amount relative to other PCR calibrators in said set; and wherein said step b) of PCR amplifying further comprises PCR amplifying the at least two PCR calibrators using the PCR amplification primers of the PCR calibrator system.

In yet another preferred embodiment of the above method, said set of broad-taxonomic range amplification primers for amplifying at least one microbial rRNA ITS region comprises a labelled forward and/or labelled reverse primer, preferably a labelled forward primer, more preferably a fluorescently labelled forward primer.

In yet another aspect, the present invention provides a method of predicting disease course severity in a subject suffering from or at risk of suffering from an infectious respiratory disease, preferably a viral respiratory disease, the method comprising the steps of: a) optionally providing a stratified pool of reference subjects, wherein said reference subjects are stratified according to disease course severity following contact with an infectious agent, such as a respiratory virus; b) optionally performing a method of typing a subject in accordance with any one of the two methods of typing a subject as described herein above on said stratified pool of reference subjects to thereby provide a database of reference signatures stratified according to disease course severity, wherein each of said reference signatures is annotated to the disease course severity observed in each of said reference subjects; c) performing a method of typing a subject in accordance with any one of the two methods of typing a subject as described herein above on a test subject suffering from or at risk of suffering from an infectious disease caused by said infectious agent, such as a respiratory virus, to thereby provide a test signature, d) comparing said test signature with a database of stratified reference signatures as provided in accordance with steps a) and b), and predicting disease course severity in said test subject on the basis of that comparison.
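Steps c) and d) of this aspect amount to comparing a test signature against a database of reference signatures annotated with observed disease course severity. The text does not prescribe a particular comparison algorithm; the sketch below assumes, purely for illustration, a nearest-neighbour rule based on cosine similarity, with signatures represented as equal-length abundance vectors. All signatures and labels shown are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length abundance vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_severity(test_signature, reference_db):
    """Return (severity label, similarity) of the most similar reference.

    Assumption: a 1-nearest-neighbour rule; other classifiers could be used.
    """
    best_signature, best_label = max(
        reference_db,
        key=lambda entry: cosine_similarity(test_signature, entry[0]),
    )
    return best_label, cosine_similarity(test_signature, best_signature)


# Hypothetical stratified reference database: (signature, observed severity).
REFERENCE_DB = [
    ([10.0, 0.0, 5.0, 1.0], "mild"),
    ([0.0, 8.0, 1.0, 6.0], "severe"),
]

label, similarity = predict_severity([9.0, 1.0, 4.0, 0.0], REFERENCE_DB)
print(label, round(similarity, 3))
```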

In a preferred embodiment of the above method, the stratification according to disease course severity following contact with an infectious agent, such as a respiratory virus, in step a) comprises a stratification of at least one of:

- no disease,

- reduced or increased risk of developing a mild disease course (not requiring hospitalization),

- a severe disease course (requiring hospitalization), and

- a very severe disease course (requiring hospitalization with intensive care, due to ARDS and/or cytokine storm).

In a preferred embodiment of any of the methods described herein above, step c) of typing a subject is performed before, during and/or after disease symptoms in said subject arise.

In another embodiment of any of the methods described herein above, said sample is selected from the group consisting of nasopharyngeal and oropharyngeal (e.g. throat) swab, wash and aspirate of the upper respiratory tract; saliva sample; tracheal and endotracheal aspirate, mucus, sputum and bronchoalveolar lavage of the lower respiratory tract; microbial aerosols from coughs and tidal breathing; and nasopharyngeal and oropharyngeal tissue biopsy including lung biopsy.

In yet another preferred embodiment of any of the methods described herein above, said viral respiratory disease is influenza, Respiratory Syncytial virus (RSV), Enterovirus, Rhinovirus, Adenovirus, Coronavirus, Herpes Simplex virus, Epstein-Barr virus, Varicella zoster virus or COVID, preferably COVID.

In yet another preferred embodiment of any one of the methods described herein above, said viral respiratory disease is a disease caused by infection with influenza virus (flu), infection with Respiratory Syncytial virus (RSV), infection with Enterovirus, infection with Rhinovirus, infection with Adenovirus, infection with Coronavirus, infection with Herpes Simplex virus, infection with Epstein-Barr virus, infection with Varicella zoster virus, or, preferably, infection with Coronavirus, such as SARS-CoV, MERS-CoV or, preferably, SARS-CoV-2 (COVID-19).

In yet another preferred embodiment of any one of the methods described herein above, said viral respiratory disease is a disease caused by (infection with) an enterovirus, Influenza A H1N1, Human adenovirus, Human bocavirus, Human coronavirus 229E, Human coronavirus HKU1, Human coronavirus NL63, Human coronavirus OC43, Human metapneumovirus A or B, Human parechovirus, Human parainfluenza virus type 1, 2, 3 or 4, Human Respiratory Syncytial virus A or B, Human rhinovirus, Influenza A virus (non-H1N1), Influenza B virus or SARS-CoV-2.

In yet another preferred embodiment of, or in combination with, any one of the methods described herein above, said infectious respiratory disease is caused by Mycoplasma pneumoniae, Streptococcus pneumoniae, Haemophilus influenzae, Moraxella catarrhalis, Chlamydia pneumoniae, Chlamydia psittaci, Escherichia coli, Klebsiella pneumoniae, Klebsiella oxytoca, Enterobacter cloacae, Stenotrophomonas maltophilia, Burkholderia cepacia or Aspergillus fumigatus.

In yet another preferred embodiment of any one of the methods described herein above, said subject, preferably said sample of said subject, is typed on the basis of the presence, absence or abundance of one or more bacterial species in said test signature, wherein said one or more bacterial species are selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

In yet another preferred embodiment of any one of the methods described herein above, said subject, preferably said sample of said subject, is typed on the basis of a test signature having an increased or decreased abundance of one or more bacterial species as compared to an abundance of said one or more bacterial species in a reference signature of a subject that has been tested positive or negative for an infectious respiratory disease, preferably a viral respiratory disease, preferably a viral respiratory disease that is caused by SARS-CoV such as SARS-CoV-2 (COVID-19); wherein said one or more bacterial species are selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

In yet another preferred embodiment of any one of the methods described herein above, said disease course severity is predicted on the basis of the presence, absence or an abundance of one or more bacterial species in said test signature, wherein said one or more bacterial species are selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

In another aspect, the invention provides a use of the presence, absence or an abundance, preferably an increased abundance, of one or more bacterial species selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa in a sample of a test subject (preferably a sample as described in any one of the embodiments described herein) for typing said test subject, preferably said sample of said test subject, for an infectious respiratory disease parameter, preferably a viral respiratory disease parameter selected from risk of (viral) infection, risk of developing (viral) respiratory disease, presence or absence of the infectious agent and/or respiratory disease, such as respiratory virus and/or respiratory disease agent, and disease course severity (preferably as defined in any one of the embodiments described herein).

In a preferred embodiment of said use, said abundance is an increased abundance as compared to an abundance of said one or more bacterial species in a sample of a subject that has been tested positive for said infectious agent, such as a respiratory virus and/or (viral) respiratory disease.

In another aspect, the methods of the invention mentioned above can be performed using methods of microbial quantification other than molecular analysis such as IS-pro, for instance by using conventional cultivation and enumeration methods.

In another aspect, the methods of the invention described above are performed without using a reference such as a reference signature, wherein quantification is instead performed using absolute numbers or absolute abundances of microorganisms in order to type the microbiome.

In yet another aspect, the present invention provides a probiotic composition comprising one or more bacterial species selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.

In yet another aspect, the present invention provides a method for the treatment or prophylaxis of an infectious respiratory disease, preferably a viral respiratory disease in a subject comprising administering to a subject an effective amount of a probiotic composition of the invention as described above.

In preferred embodiments of this method, said viral infectious disease is a disease caused by infection with influenza virus (flu), infection with Respiratory Syncytial virus (RSV), infection with Enterovirus, infection with Rhinovirus, infection with Adenovirus, infection with Herpes Simplex virus, infection with Epstein-Barr virus, infection with Varicella zoster virus or infection with Coronavirus, such as SARS-CoV, MERS-CoV or, preferably, SARS-CoV-2 (COVID-19).

In other preferred embodiments of this method, said viral infectious disease is a disease caused by infection with an enterovirus, Influenza A H1N1, Human adenovirus, Human bocavirus, Human coronavirus 229E, Human coronavirus HKU1, Human coronavirus NL63, Human coronavirus OC43, Human metapneumovirus A or B, Human parechovirus, Human parainfluenza virus type 1, 2, 3 or 4, Human Respiratory Syncytial virus A or B, Human rhinovirus, Influenza A virus (non-H1N1), Influenza B virus or SARS-CoV-2.

In still further preferred embodiments of this method, said infectious respiratory disease is a disease caused by infection with Mycoplasma pneumoniae, Streptococcus pneumoniae, Haemophilus influenzae, Moraxella catarrhalis, Chlamydia pneumoniae, Chlamydia psittaci, Escherichia coli, Klebsiella pneumoniae, Klebsiella oxytoca, Enterobacter cloacae, Stenotrophomonas maltophilia, Burkholderia cepacia or Aspergillus fumigatus.

The invention also provides a probiotic composition as described herein for use as a medicament, preferably for use in a method for the treatment or prophylaxis of an infectious respiratory disease, preferably a viral respiratory disease, in a subject.

DESCRIPTION OF THE DRAWINGS

Figure 1 shows the steps in a predictive diagnostic assay for viral infectious diseases, wherein ultimately a subject is stratified according to viral infectious disease severity status, which includes a status group of “release to isolation”, “continued observation” and “admit to hospital” depending on viral infectious disease severity.

Figure 2 shows the steps in a predictive diagnostic assay stratifying subjects according to infectious disease severity risk groups.

Figure 3 shows average cumulative IS-pro profiles of 20 COVID negative (top profile) and 10 COVID positive patients (bottom profile). The profiles show very clear differences in all phyla (different colors) and also in specific species (represented by peaks).

Figure 4 shows individual profiles of 20 COVID negative (green bar) and 10 COVID positive (red bar) samples. In the individual profiles too, differences are directly apparent.

Figure 5 shows a heatmap of all profiles. Clustering was performed by the Unweighted Pair Group Method with Arithmetic mean (UPGMA) on a cosine correlation similarity matrix. Even with this unsupervised clustering approach, clear COVID positive and COVID negative clusters can be found.
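The clustering approach named for Figure 5 can be illustrated with a small pure-Python sketch of UPGMA on cosine distances. It assumes that the "cosine correlation similarity" of the text corresponds to ordinary cosine similarity, converted to a distance as 1 minus the similarity; the four profiles are hypothetical and the code is not the implementation used to produce the figure.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length abundance profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("profiles must contain at least one non-zero value")
    return 1.0 - dot / norm


def upgma(profiles):
    """Cluster profiles; return merge history as (members_a, members_b, distance)."""
    clusters = {i: (i,) for i in range(len(profiles))}  # cluster id -> member indices
    dist = {
        frozenset((i, j)): cosine_distance(profiles[i], profiles[j])
        for i, j in combinations(clusters, 2)
    }
    merges = []
    next_id = len(profiles)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        a, b = sorted(pair)
        merges.append((clusters[a], clusters[b], dist.pop(pair)))
        size_a, size_b = len(clusters[a]), len(clusters[b])
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            # UPGMA: arithmetic mean over all original members (size-weighted).
            dist[frozenset((next_id, c))] = (
                (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
            )
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Four hypothetical profiles: samples 0 and 1 resemble each other, as do 2 and 3.
profiles = [[10, 1, 0], [9, 2, 0], [0, 1, 10], [1, 0, 9]]
for members_a, members_b, distance in upgma(profiles):
    print(members_a, members_b, round(distance, 4))
```

Because no class labels enter the computation, any separation of groups in the resulting dendrogram reflects the profiles alone, which is the sense in which the clustering is unsupervised.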

Figure 6 shows bacterial loads (abundances) per sample split out in the phylum groups FAFV, Bacteroidetes and Proteobacteria. It is clear that especially Bacteroidetes abundance is significantly decreased in COVID positive patients.

Figure 7 shows data for Example 3. Top: number of patients per age group. Bottom: distribution of SARS-CoV-2 PCR results per age group. Red: SARS-CoV-2 PCR positive, green: SARS-CoV-2 PCR negative.

Figure 8 shows the results of Example 3. A. Cluster analysis of 135 throat microbiota profiles. Each column represents a sample, each row represents a bacterial species (bacteria from the phyla Firmicutes and Actinobacteria in blue, top; bacteria from the phylum Proteobacteria in yellow, bottom). For each sample the SARS-CoV-2 PCR outcome is depicted by a green (negative) or red (positive) marking. Centrally in the dendrogram a cluster can be seen with similar profile signatures (cluster L). Surrounding profiles show a lower degree of similarity. B. SARS-CoV-2 PCR outcomes for the central cluster. 19/77 (25%) are positive (cluster L). C. Of the remaining 58 samples, 27/58 (47%) are SARS-CoV-2 positive (cluster H).

Figure 9 shows the proportion of patients in the high- (H, orange) or low- (L, green) positive clusters stratified by age as determined in Example 3. It can be seen that the occurrence of the low-positive clusters decreases with age. The proportion of positives in the high- and low-COVID clusters for the entire population can be seen in the pie charts below the bar chart. The same proportions for patients aged 80 and older can be seen in the pie charts to the right of the bar chart (red is positive, green is negative).

Figure 10 shows the Shannon diversity of pharyngeal microbiota (PM) samples stratified by age groups in Example 3. Diversity is shown for all phyla taken together (left top) or for FAFV, Proteobacteria and Bacteroidetes separately (clockwise from top right). A decrease in diversity can be seen in the oldest age groups for all phyla. Age groups and sample counts are given in the table below the figures.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

The term “microbiome”, as used herein, refers to a population of microorganisms from a particular environment, including the environment of the body or a part of the body, as well as the population of microorganisms inhabiting soils, plants and waterbodies. The term is interchangeably used to address the population of microorganisms itself (sometimes referred to as the microbiota), as well as the collective genomes of the microorganisms that reside in the particular environment.

The term “environment”, as used herein, refers to all surrounding circumstances, conditions, or influences to which a population of microorganisms is exposed. The term is intended to include reference to any subject of study, hence including environments in a subject, such as a human subject, but particularly refers to environments such as soil, a waterbody or a plant.

The term “disease or condition in a subject” may include reference to disease or condition in an environment, but preferably refers to disease or condition in a human or animal subject.

The terms “IS-pro” and “IS-profiling”, as used interchangeably herein, are used in the context of a specific method of analyzing the composition of a microbiome based on taxonomic variation in the DNA sequence of the microbial 16S-23S rRNA internal transcribed spacer (ITS) regions in the genomic DNA of the microorganisms in said microbiome. The technique is described in detail in Budding et al. 2010 (FASEB J., 24(11):4556-64), which publication is incorporated in its entirety by reference herein, and is also described in detail herein below.

The term “intergenic spacer region”, as used herein, refers to a genomic sequence located between two genes.

The terms “16S-23S rRNA internal transcribed spacer (ITS) region” and “16S-23S intergenic spacer (IS) region”, as used herein, refer to a segment of non-functional DNA situated between structural ribosomal RNA (rRNA) genes on a common precursor transcript. In literature, this region is also synonymously called 16S-23S rDNA intergenic spacer region (IS region) (Budding et al., 2010, cited above), 16S-23S rRNA intergenic spacer region (Mora et al. 2003. Microbiology 149: 807-813), 16S-23S rRNA gene internal transcribed spacer (ITS) region (Wang et al. 2008. J Clin Microbiol. 46(11): 3555-3563), 16S/23S ribosomal spacer region (Barry, T. et al. 1991. PCR Methods Appl. 1:51-56). In the genome of Escherichia coli CFT073 (Genbank accession AE014075.1), the ITS region separating the 16S and 23S rRNA genes in one of the rrn operons is indicated by nucleotides numbered 236727-237160, comprising 433 bases. In many microbial species the 16S-23S ITS region contains coding sequences for tRNA genes. Multiple rRNA operons (rrn) may be present within the genome of a microorganism, sometimes as many as 15, which often display intragenomic heterogeneity in ITS type. The spacer regions between the 16S and 23S genes in the prokaryotic rRNA genetic loci show a significant level of length and sequence polymorphism across both genus and species lines. Pairs of priming sequences can be selected for the amplification of these polymorphic regions from highly conserved sequences in the 16S and 23S genes occurring adjacent to these polymorphic regions.

The term “16S rRNA gene”, as used herein, refers to a DNA sequence or sequences encoding the 16S rRNA molecule.

The term “23S rRNA gene”, as used herein, refers to a DNA sequence or sequences encoding the 23S rRNA molecule. The term “polymorphic DNA target region”, as used herein, refers to a DNA region varying in length and/or sequence in different taxonomic groups of microorganisms and that serves as a target for PCR amplification.

The term “conserved region”, as used herein, refers to a segment of nucleotide sequence of a gene or amino acid sequence of a protein that is significantly similar between various different nucleotide sequences of a gene. This term is interchangeably used with the term “conserved sequence”. The term “conserved DNA region”, in particular refers to a DNA region that (i) comprises multiple nucleotides (preferably between 15-30 nucleotides), (ii) flanks a polymorphic DNA target region, (iii) shares a high degree of homology among genomes of microorganisms in a taxonomic group of microorganisms, thus differentiating between organisms of certain taxonomic groups, and (iv) is able to serve as a primer binding site for PCR amplification primers. In the context of this invention, a DNA region is defined as conserved when said sequence exhibits a sequence homology or nucleotide sequence similarity of at least 60%, preferably 70%, more preferably 80%, even more preferably 90% between different microorganisms belonging to a single taxonomic group, wherein said sequence similarity is calculated over the entire length of the nucleic acid sequence(s).
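By way of non-limiting illustration, the similarity criterion above may be sketched as follows. The function names `percent_identity` and `is_conserved` are hypothetical, the two sequences are assumed to have been aligned to equal length beforehand, and identity is counted over the full aligned length, which is only one possible reading of “calculated over the entire length”.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: both sequences were aligned beforehand and have equal length;
    gap characters ('-') never count as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def is_conserved(seq_a: str, seq_b: str, threshold: float = 60.0) -> bool:
    """True when similarity meets the threshold (60% is the minimum named above)."""
    return percent_identity(seq_a, seq_b) >= threshold
```

The stricter preferred thresholds (70%, 80%, 90%) can be applied by passing a different `threshold` value.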

The term “sample”, as used herein, refers to any sample suitable for analyzing or typing according to the methods of the present invention. A sample may be collected from an organism (e.g., human or other mammal, plant). The biological sample can be in any form, including without limitation a solid material such as a tissue, cells, a cell pellet, a cell extract, a biopsy or an organ, or a biological fluid such as urine, blood, stool, saliva, amniotic fluid, exudate from a region of infection or inflammation, a mouth wash containing buccal cells, cerebrospinal fluid or synovial fluid. Preferably, the sample of genomic DNA is selected from the group consisting of nasopharyngeal and oropharyngeal (e.g. throat) swab, wash and aspirate of the upper respiratory tract; saliva sample; tracheal and endotracheal aspirate, mucus, sputum and bronchoalveolar lavage of the lower respiratory tract; microbial aerosols from coughs and tidal breathing; and nasopharyngeal and oropharyngeal tissue biopsy including lung biopsy.

The term “amplification”, as used herein, includes methods for copying a target nucleic acid, thereby increasing the number of copies of a selected nucleic acid sequence. Amplification may be exponential or linear.

A target nucleic acid may be either DNA or RNA. The sequences amplified in this manner form an “amplification product”, “amplimer” or “amplicon”, which terms are used interchangeably herein. While the exemplary methods described hereinafter relate to amplification using the polymerase chain reaction (PCR), numerous other methods are known in the art for amplification of nucleic acids (e.g., isothermal methods, rolling circle methods, etc.). The skilled artisan will understand that these other methods may be used either in place of, or together with, PCR methods. See, Saiki, “Amplification of Genomic DNA” in PCR Protocols, Innis et al., Eds., Academic Press, San Diego, Calif. 1990, pp. 13-20; Wharam et al., Nucleic Acids Res., 29(11):E54-E54, 2001; Hafner et al., Biotechniques, 30(4):852-56, 858, 860, 2001; Zhong et al., Biotechniques, 30(4):852-6, 858, 860, 2001.

The terms “amplification product”, and “amplicon”, as used interchangeably herein, refer to a nucleic acid fragment that is the product of a nucleic acid amplification or replication event, such as for instance formed in the polymerase chain reaction (PCR).

The term “template”, as used herein, refers to the nucleic acid from which the target sequence is amplified in a nucleic acid amplification reaction. The term “amplifiable template”, as used herein, refers to a template that, when amplified, results in a single amplicon. Amplifiable templates comprise primer binding sites for hybridization of amplification primers.

The term “abundance”, as used herein, includes reference to the presence of a micro-organism such as a certain bacterial species in a sample in an absolute or relative quantity, or to the bacterial load of certain bacterial species in a sample or microbiome. If a micro-organism such as a bacterial species is present in a sample, it has a certain absolute or relative abundance in said sample. Preferably, where reference is made to an increased or decreased abundance of a micro-organism such as a certain bacterial species in a sample, the increase or decrease is at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9% or at least 10% or at least 20% as compared to the abundance in a reference sample or as compared to the total quantity of bacterial species in said test sample. An abundance of bacteria in a diagnostic, prophylactic or therapeutic microbiome or probiotic composition may reflect the absolute abundance of that microorganism in the population as a percentage of the total population.

The term "primer" as used herein refers to a single-stranded oligonucleotide capable of acting as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., buffer, salt, temperature, and pH) in the presence of nucleotides and an agent for nucleic acid polymerization (e.g., a DNA-dependent or RNA-dependent polymerase). Generally, the sequence of the primer is substantially complementary to a nucleic acid strand to be copied, or at least comprises a region of complementarity sufficient for annealing to occur and extension in the 5' to 3' direction therefrom. The primer may be a DNA primer, RNA primer, or a chimeric DNA/RNA primer. Primers are preferably synthetic oligonucleotide sequences of about 12-100 nucleotides in length; preferably, about 30-60 nucleotides in length. The term "primer" may refer to more than one primer, particularly in the case where there is some ambiguity in the information regarding one or both ends of the target region to be amplified. If a "conserved" region shows significant levels of polymorphism in a population, mixtures of primers can be prepared that will amplify such sequences, or the primers can be designed to amplify even mismatched sequences. A primer can be labeled, if desired, by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive labels, fluorescent labels, electron-dense reagents, enzymes (as commonly used in ELISAs), biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. Preferred labels for use in this invention comprise fluorescent labels, preferably selected from FAM, TET, HEX, NED, LIZ, Atto dyes, PET, VIC, Yakima yellow, Cy5, Cy5.5, Cy3, Cy3.5, Cy7, Alexa dyes, Texas red, Tamra, ROX, JOE, FITC, and TRITC.

The term “primer set”, as used herein, refers to the primer pair consisting of at least one forward primer and at least one reverse primer used in a PCR amplification reaction.

The term “primer binding site”, as used herein, refers to a specific region of the DNA fragment or segment that, as a result of its DNA sequence, is receptive of binding a PCR amplification primer having a complementary DNA sequence through DNA hybridization. A primer binding site preferably ranges in size of between 15-30 nucleotides. The primer binding site comprised in the upstream adapter DNA sequence preferably differs in DNA sequence from the primer binding site comprised in the downstream adapter DNA sequence. Primer binding sites in the adapter DNA sequence preferably differ in DNA sequence from the primer binding site comprised in the conserved regions of the 16S and 23S rRNA genes used for amplifying the ITS regions from the genomic DNA of the microbiome investigated. The term “co-amplified”, as used herein, refers to the simultaneous amplification of different nucleic acid fragments in a single amplification reaction.

The term “target nucleic acid” as used herein refers to the nucleic acid that is intended to be amplified in a nucleic acid amplification reaction, and in particular to the part of the template nucleic acid positioned between and including the primer binding sites.

The term "amplification efficiency", as used herein, refers to the amount of amplification product produced in an amplification reaction from a given initial number of target sequences in a given number of amplification cycles. Thus, the amplification efficiencies of two reactions which differ only in the length of the target sequences are compared by quantitatively measuring the amount of product formed in each reaction. The amplification efficiency is a measure of the efficacy of amplicon formation in a PCR reaction and can be calculated by determining the output amplicon copy number (or product) over the input template copy number. Determination of PCR amplification efficiency is well known to one of skill in the art and is for instance explained in detail in such publications as Lalam, 2006, J Theor Biol. 242(4):947-53.
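As a non-limiting sketch, the ratio of output amplicon copy number to input template copy number may be computed as shown below. The function names are hypothetical. The per-cycle efficiency uses the commonly applied model N = N0 × (1 + E)^n, which is an assumption added here for illustration and is not stated in the definition above.

```python
def fold_amplification(input_copies: float, output_copies: float) -> float:
    """Output amplicon copy number over input template copy number."""
    if input_copies <= 0:
        raise ValueError("input_copies must be positive")
    return output_copies / input_copies


def per_cycle_efficiency(input_copies: float, output_copies: float, cycles: int) -> float:
    """Per-cycle efficiency E under the assumed model N = N0 * (1 + E) ** cycles.

    E = 1.0 corresponds to perfect doubling in every cycle.
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return fold_amplification(input_copies, output_copies) ** (1.0 / cycles) - 1.0
```

Two reactions that differ only in target length can then be compared by evaluating either function on each reaction with the same number of cycles.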

The term “standard curve”, as used herein, refers to an equation or function that describes the measured relationship between the length of a DNA template and the amount of amplification product produced from this template in a PCR amplification reaction in a given number of amplification cycles.

The term “PCR calibrator”, as used herein, refers to an amplifiable DNA fragment or segment serving, inter alia, as a reference template for determining the length-dependent amplification efficiency in PCR amplification reactions as described herein, in particular for determining the efficiency of the amplification reaction in amplifying sample templates of different length. At least one PCR calibrator is included in each PCR reaction. PCR calibrators consisting of rrn operons or parts thereof of existing microorganisms, wherein the intergenic region between the 16S and 23S rRNA genes is flanked by its native 16S and 23S rRNA gene sequences or at least by conserved regions thereof, are not part of this invention.
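One way in which calibrators of known length could be used to derive a standard curve is sketched below, purely as an illustration. The functional form is an assumption: the sketch fits a straight line to the logarithm of product yield against template length, whereas the definitions above leave the form of the curve open. The function names and the choice of a reference length are likewise hypothetical.

```python
import math


def fit_log_linear(lengths, yields):
    """Least-squares fit of ln(yield) = slope * length + intercept.

    lengths: calibrator template lengths (nucleotides)
    yields:  measured amount of product for each calibrator (positive values)
    """
    logs = [math.log(y) for y in yields]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def corrected_signal(signal, length, slope, reference_length):
    """Rescale a sample signal to what the fitted curve predicts at reference_length."""
    return signal * math.exp(slope * (reference_length - length))
```

Under this assumed model, a long amplicon that amplifies less efficiently than a short one is scaled up so that signals from templates of different length become comparable.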

The term “flanked”, as used herein, refers to having a defined DNA sequence (i.e. an adapter DNA sequence) contiguous with both ends of a given DNA sequence or segment (i.e. an ITS region DNA sequence).

The terms "upstream" and "downstream", as used herein, refer to a position of a genetic element on a polynucleotide sequence in relation to another genetic element. A first genetic element is upstream to a second genetic element when located in the 5' direction of the second element. A first genetic element is downstream to a second genetic element when located in the 3' direction of the second element.

The term “plasmid”, as used herein, refers to a circular, double-stranded unit of DNA that replicates within a cell independently of the chromosomal DNA.

The term “replicon”, as used herein, refers to a DNA molecule or RNA molecule, or a region of DNA or RNA, that replicates from a single origin of replication. Preferably, the replicon is a plasmid.

The term “microorganism”, as used herein, refers to any unicellular microorganism including bacteria, archaea, protists, fungi, virus, and algae, preferably bacteria. The term “microbial” indicates pertaining to, or characteristic of a microorganism.

The term “intestinal flora”, as used herein, refers to the population of microorganisms inhabiting the gastrointestinal tract.

The term "genomic DNA" as used herein refers to any DNA comprising a sequence that is normally present in the genome of a prokaryotic or eukaryotic cell or a virus. The term refers in particular to the full complement of DNA contained in the genome of a cell or organism comprising the full collective gene set of a cell. In obtaining a sample of genomic DNA from a microbiome for quantitative population analysis, preferably the genomic DNA of essentially all cells in the population is isolated and such a genomic DNA sample is also referred to as total DNA. Total genomic DNA extraction procedures from diverse microbiomes are known in the art and commercial genomic DNA isolation kits for this purpose can be obtained from various manufacturers.

The term “population”, as used herein, refers to a plurality of individual organisms. In the context of this invention, the term refers in particular to a collection of organisms of diverse taxonomic affiliation, in particular bacteria.

The term “diversity”, as used herein, refers to the extent to which different taxonomic groups of microorganisms are present in a population of microorganisms. In order to quantify and compare microbial taxonomic diversity, i.e. “within-sample diversity”, diversity calculation using the Shannon index is preferred (Haegeman et al. 2013. ISME J 2013;7:1092-101). Diversity analysis, including “between sample diversity” analysis may be performed using R software vegan package (Oksanen, J. et al. vegan: Community Ecology Package. R package version 2.0-7 (2013)). Sample diversity may be assessed for instance at the level of an individual phylum, such as on the level of the phylum Firmicutes, Bacteroidetes, or Proteobacteria. Alternatively, the sample diversity may be assessed overall. High diversity is equivalent with a high taxonomic variation, meaning that many different bacterial taxons (e.g. different bacterial species) are represented in the population, while a low diversity is equivalent with a low taxonomic variation, meaning that a population is characterized by relatively few bacterial taxons. In intestinal (fecal) flora, a Shannon index above 3 for the overall microbiota is considered to represent a high bacterial population diversity. The term “taxonomic variation” refers to the diversity in groups of microorganisms when grouped into species, genera, families, orders, classes and/or phyla in accordance with a biological classification scheme.
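For illustration only, the Shannon index H = −Σ p_i ln(p_i) referred to above may be computed from per-taxon abundances as sketched below. The function name is hypothetical, and the natural logarithm is assumed; the text above does not state the base.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero abundance.

    abundances: non-negative counts or intensities, one per taxon.
    Assumption: natural logarithm.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    h = 0.0
    for value in abundances:
        if value > 0:
            p = value / total
            h -= p * math.log(p)
    return h
```

A sample with four equally abundant taxa gives H = ln(4), whereas a sample dominated by a single taxon gives a value close to zero, which matches the high-diversity and low-diversity cases described above.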

The term “typing”, as used herein, refers to classifying a test signature as either corresponding to the signature of a first condition or corresponding to the signature of a second condition in a classification scheme, such as a scheme classifying healthy and diseased subjects.

The term “signature”, as used herein, refers to a profile of amplified nucleic acid fragments representing the diversity in 16S-23S ITS regions from microorganisms in a genomic DNA sample from a sample population, wherein different ITS regions represent different taxonomic groups of microorganisms, preferably said profile representing the prevalence of the various taxonomic groups of microorganisms in the sample population.

The terms “desirable signature” and “undesirable signature”, as used herein, refer in general to a beneficial signature and an unfavorable signature, these signatures representing an elected biological profile such as (i) non-diseased versus diseased, (ii) sterile versus non-sterile, (iii) non-infected vs. infected, (iv) advantageous for plant growth due to the presence of a specific micro-organism vs. disadvantageous due to the absence of said specific micro-organism.

The term “reference signature”, as used herein, refers to a signature of a control subject.

The term “phylum”, as used herein, refers to a taxonomic rank below kingdom and above class.

The term “nucleic acid sequence”, or “nucleotide sequence”, which terms can be used interchangeably herein, refers to the base sequence of a DNA or RNA molecule in single or double stranded form, particularly a DNA sequence encoding an ITS region of the ribosomal RNA gene. An "isolated nucleic acid sequence" refers to a nucleic acid sequence which is no longer in the natural environment from which it was isolated. The term inter alia refers to a nucleic acid molecule that has been separated from at least about 50%, 75%, 90%, or more of proteins, lipids, carbohydrates, or other materials with which it is naturally associated, e.g. in a microbial host cell.

The term “microbial”, as used herein, refers to a subject as originating from a microorganism, or microbe, which generally refers to an organism that is microscopic, which means too small to be seen by the unaided human eye.

The term “target microbial nucleic acid sequence”, as used herein, refers to the nucleic acid fragment targeted for replication (or amplification) and subsequent detection and that is diagnostic of a particular microorganism whose presence is to be determined.

The term “polymerase chain reaction (PCR)” as used herein refers to the well-known in vitro technique to produce numerous copies of a specific segment of target DNA from a template DNA - i.e., the DNA that contains the target region to be copied. During the reaction a mixture containing the template DNA, primers, dNTPs, and a heat-stable DNA polymerase is heated to 90-95°C to denature the strands of the template DNA. The solution is cooled to a temperature that allows the primers (single-stranded DNA molecules of about 18 to 30 nucleotides long) to anneal to their complementary sequence on the template DNA and provide the 3'-OH required for DNA synthesis. Subsequently, the DNA polymerase synthesizes a new DNA strand complementary to the template by extending the primer, usually at a temperature of about 72 °C. The thermal cycling scheme of denaturing/primer annealing/primer extension is repeated numerous times with the target DNA synthesized during the previous cycles serving as a template DNA for each subsequent cycle. The result is a doubling of the target DNA present with each cycle, and exponential accumulation of target DNA sequences over the course of 20-40 cycles. A heating block with an automatic thermal cycler is used for precise temperature control. A preferred method for use in the present invention is qPCR amplification (also known as real-time PCR), wherein typically the amplification of a targeted DNA molecule is monitored during the PCR (i.e., in real time), using non-specific fluorescent dyes that intercalate with any double-stranded DNA, or sequence-specific DNA probes consisting of oligonucleotides that are labelled with a fluorescent reporter for the detection of PCR products in real-time.

The term “isolating”, as used herein in the context of isolating nucleic acid sequences from a biological sample, refers to an in vitro process wherein nucleic acids, preferably genomic DNA, are extracted from a sample of interest. The process may generally involve, but is not limited to, lysis of (cells in) a biological sample using a guanidine-detergent lysing-solution that permits selective precipitation of DNA from a (cell) lysate, and precipitation of the genomic DNA from the lysate with ethanol. Following an ethanol wash, precipitated DNA may be solubilized in either water or 8 mM NaOH and used as template in a PCR reaction. Genomic DNA samples for analysis with diagnostic purpose may be obtained by using generally known techniques for DNA isolation. The total genomic DNA may be purified by using, for instance, a combination of physical and chemical methods. Very suitably commercially available systems for DNA isolation may be used, such as the NucliSENS® easyMAG® nucleic acid extraction system (bioMerieux, Marcy l'Etoile, France) or the MagNA Pure 96 System (Roche Diagnostics GmbH, Mannheim, Germany).

The term “PCR mixture”, as used herein, refers to the small volume of biochemical reactants in aqueous liquid for performing the PCR reaction comprising the (genomic) template DNA comprising the target DNA sequence(s), a set of at least two oligonucleotide primers that hybridize to opposite strands of the (genomic) template DNA and flank the region comprising the target DNA sequence(s) to be amplified, a thermo-stable DNA polymerase, the four deoxyribonucleoside triphosphates (dNTPs), and Mg2+ ions.

The term “amplification primers”, as used herein, refers to the oligonucleotide primers that hybridize to opposite strands of the (genomic) template DNA and flank the region comprising the target DNA sequence(s) to be amplified.

The term “PCR amplicon”, as used herein, refers to the PCR product or amplified target DNA, generally comprising the target DNA sequence(s) with flanking primer sequences.

The term “high resolution melting curve (hrMC) analysis”, as used herein, refers to a post-PCR analysis method used to identify variations in nucleic acid sequences. The method is based on detecting small differences in PCR melting (dissociation) curves. The temperature-dependent dissociation between two DNA-strands can be measured using a DNA-intercalating fluorophore such as SYBR green, EvaGreen or a "saturation dye" (a dye that does not inhibit PCR even if used at concentrations that give maximum fluorescence (saturation)) like LCGreen® I, LCGreen Plus or Cyto9, in conjunction with real-time PCR instrumentation that has precise temperature ramp control and advanced (fluorescence) data capture capabilities. Data are analyzed and manipulated using software designed specifically for hrMC analysis.

The term “high resolution melting curve (hrMC)”, as used herein, refers to the dissociation curve describing the temperature-dependent dissociation between two DNA-strands as measured using a DNA-intercalating fluorophore. The high resolution melting curve may refer to the graph of the negative first derivative of the melting-curve which makes it easier to pin-point the temperature of dissociation (defined as 50% dissociation), by virtue of the peaks thus formed.
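The negative first derivative mentioned above can be approximated from sampled fluorescence readings by finite differences, as in the following non-limiting sketch. The function names are hypothetical, and dedicated hrMC software will typically apply smoothing and normalisation that are omitted here.

```python
def negative_derivative(temperatures, fluorescence):
    """Return (midpoint temperatures, -dF/dT) using simple finite differences.

    temperatures: strictly increasing temperature readings
    fluorescence: fluorescence measured at each temperature
    """
    if len(temperatures) != len(fluorescence) or len(temperatures) < 2:
        raise ValueError("need two or more paired readings")
    midpoints, slopes = [], []
    for i in range(len(temperatures) - 1):
        dt = temperatures[i + 1] - temperatures[i]
        midpoints.append((temperatures[i] + temperatures[i + 1]) / 2)
        slopes.append(-(fluorescence[i + 1] - fluorescence[i]) / dt)
    return midpoints, slopes


def melting_temperature(temperatures, fluorescence):
    """Temperature at which -dF/dT peaks, i.e. where fluorescence drops fastest."""
    midpoints, slopes = negative_derivative(temperatures, fluorescence)
    return midpoints[slopes.index(max(slopes))]
```

For a single amplicon the peak of this derivative curve marks the dissociation temperature; amplicons with a different sequence would be expected to peak at a different temperature.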

The term “electrophoretic separation and amplicon length analysis”, as used herein, refers to the technique whereby mixtures of charged molecules, preferably nucleic acids, in particular PCR amplified DNA fragments, loaded on a gel matrix are caused to migrate from the negative electrode (cathode) toward the positive electrode (anode), on the basis of size, charge, and structure, through the gel when said gel is placed in an electrical field, whereby shorter nucleic acid fragments migrate more rapidly than longer ones, resulting in separation based on size. Electrophoretic separation and amplicon length analysis is preferably performed by capillary electrophoresis, whereby the DNA is detected either by UV absorption or by fluorescent labeling. In the presence of appropriate standards, fragments can be accurately sized (i.e. amplicon length can be determined) based on relative electrophoretic mobility, for instance using an ABI Prism 3500 Genetic Analyzer (Applied Biosystems) or similar analysers.

“Electrophoretic separation and amplicon length analysis”, as defined herein, may also be performed by DNA sequencing of the amplicons. DNA sequencing, wherein the specific nucleotide sequence of the amplicon is determined, may provide accurate information on the length of the amplicon.
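Sizing of a fragment against standards of known length, as described for electrophoretic separation above, may be illustrated by the following sketch. It uses plain linear interpolation between the two neighbouring standards, which is a simplification chosen for clarity and not necessarily the sizing algorithm used by any particular analyser. The function name and the ladder values in the usage note are hypothetical.

```python
from bisect import bisect_left


def size_fragment(migration_time, ladder):
    """Estimate fragment length from migration time by linear interpolation.

    ladder: list of (migration_time, length_in_nucleotides) tuples for the
            size standards, sorted by increasing migration time.
    """
    times = [t for t, _ in ladder]
    if not times[0] <= migration_time <= times[-1]:
        raise ValueError("migration time lies outside the range of the standards")
    i = bisect_left(times, migration_time)
    if times[i] == migration_time:
        return float(ladder[i][1])
    (t0, len0), (t1, len1) = ladder[i - 1], ladder[i]
    return len0 + (len1 - len0) * (migration_time - t0) / (t1 - t0)
```

With a hypothetical ladder of `[(10.0, 100), (20.0, 200), (30.0, 400)]`, a peak at migration time 25.0 is sized halfway between the 200 and 400 nucleotide standards.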

The term “negative control reaction”, as used herein, refers to a post-PCR mixture comprising no PCR amplicon(s) as a result of the deliberate absence of target nucleic acid sequences or template DNA in the pre-PCR mixture.

The term “primer dimer (PD)”, as used herein, refers to a potential by-product in PCR, consisting of primer molecules that have annealed (hybridized) to each other because of strings of complementary bases in the primers. As a result, the DNA polymerase amplifies the PD, leading to competition for PCR reagents, thus potentially inhibiting amplification of the DNA sequence targeted for PCR amplification. In quantitative PCR, PDs may interfere with accurate quantification.

The term “non-specific PCR amplicon”, as used herein, refers to a potential by-product in PCR, consisting of amplified DNA that is not target DNA, usually resulting from non-specific annealing (hybridization) of the primer molecules to other nucleic acid sequences in the template DNA, such as human DNA. A non-specific PCR amplicon results in a hrMC that is different from the PCR amplicon generated from a target microbial DNA sequence.

The term “human PCR amplicon”, as used herein, refers to a non-specific PCR amplicon whereby primer molecules have annealed to human nucleic acid sequences in the template DNA instead of microbial DNA sequences. A human PCR amplicon results in a hrMC that is different from the PCR amplicon generated from the (microbial) target DNA.

The term "quantification cycle" or "Cq" as used herein includes reference to a measurement taken in a real time PCR assay or qPCR assay, whereby a positive reaction is detected by accumulation of a signal, such as a fluorescent signal. The Cq (quantification cycle) can be defined as the number of cycles required for the signal to cross the threshold (i.e. exceeds background level). Cq levels are inversely proportional to the amount of target nucleic acid in the sample (i.e. the lower the Cq level the greater the amount of target nucleic acid in the sample).
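The threshold-crossing definition of Cq given above may be sketched as follows, by way of illustration only. The function name is hypothetical, the threshold is assumed to be supplied by the user, and a fractional cycle is obtained by linear interpolation between the two readings that bracket the threshold, which is one of several approaches used in practice.

```python
def quantification_cycle(fluorescence, threshold):
    """Return the (fractional) cycle at which the signal first reaches the threshold.

    fluorescence: list where fluorescence[i] is the signal measured after cycle i + 1
    threshold:    signal level regarded as exceeding background
    Returns None when the threshold is never reached (no positive reaction).
    """
    for i, signal in enumerate(fluorescence):
        if signal >= threshold:
            if i == 0:
                return 1.0
            previous = fluorescence[i - 1]
            # previous < threshold <= signal, so the denominator is positive
            return i + (threshold - previous) / (signal - previous)
    return None
```

A sample containing more target nucleic acid reaches the threshold in fewer cycles and therefore yields a lower Cq, in line with the inverse relationship stated above.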

The present invention relates to methods of diagnosing and treating infectious disease of the respiratory tract, and such diseases are referred to herein interchangeably by the terms “respiratory disease”, “respiratory infectious disease” and “respiratory infection”, including, but not limited to “pneumonia”. The term “disease” refers to any harmful deviation from the normal structural or functional state of an organ or organism, generally associated with certain signs and symptoms. The terms “respiratory” and “respiratory tract” refer to the airways and the lungs including nose, pharynx, larynx, trachea (windpipe), bronchi (branching from the trachea, and leading to smaller bronchioles), and lungs, and in particular the (mucosal) linings of these organs. Although the infectious agent may mostly be encountered in the lining of these organs, the infection or disease may manifest throughout the entire body of a subject.

The term “virus”, as used herein, includes reference to a small infectious agent that replicates only inside the living cells of an organism. Viruses can infect all types of life forms, from animals and plants to microorganisms, including bacteria and archaea.

The term “infectious disease”, as used herein, includes reference to diseases that are caused by micro-organisms such as bacteria or viruses.

The term “coronavirus”, as used herein, includes reference to a family of viruses also referred to as Coronaviridae or Coronavirinae, which belong to the order of Nidovirales. Preferably, the coronavirus is a virus of the subfamily of Coronaviridae, which inter alia comprises the genera of Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus. Preferably, the coronavirus is a severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), preferably a SARS coronavirus 2 (SARS-CoV-2). Such a virus is the causative virus for coronavirus disease (COVID), preferably coronavirus disease 2019 (COVID-19), respectively. Testing for positive cases of SARS-CoV-2 can be based on detection of virus RNA sequences by NAAT such as real-time reverse-transcription polymerase chain reaction (rRT-PCR) with confirmation by nucleic acid sequencing when necessary. The viral genes targeted so far include the N, E, S and RdRP genes.

The coronavirus may also be a Middle East respiratory syndrome coronavirus (MERS-CoV), which is the causative virus for Middle East respiratory syndrome. SARS-CoV-1 is a known coronavirus closely related to SARS-CoV-2.

The term “SARS-CoV-2” as used herein refers to severe acute respiratory syndrome coronavirus 2, a member of the β coronavirus family identified as the source of a pneumonia outbreak in Wuhan, China, in late 2019 (COVID-19). The term includes the often used reference strain 2019-nCoV/USA-WA1/2020 (GenBank accession no. MN985325.1), and variants thereof, such as, but not limited to, the United Kingdom (UK) variant 20I/501Y.V1 or B.1.1.7 and lineages thereof, the South Africa variant 20H/501Y.V2 or B.1.351 and lineages thereof, and the Brazil variant 20J/501Y.V3 or P.1 and lineages thereof.

The term “COVID”, as used herein, includes reference to an infectious disease caused by severe acute respiratory syndrome coronavirus (SARS-CoV). Preferably, the SARS-CoV is SARS-CoV-2, and the COVID is COVID-19. Common COVID symptoms include fever, cough, and shortness of breath. Muscle pain, sputum production and sore throat are less common. While the majority of cases result in mild symptoms, some progress to severe pneumonia and multi-organ failure. The infection is typically spread from one person to another via respiratory droplets produced during coughing. It may also be spread from touching contaminated surfaces and then touching one's face.

The term “influenza virus”, as used herein, refers to a virus that belongs to the family of Orthomyxoviridae, which itself belongs to the order of Articulavirales. Preferably, the influenza virus is a virus that belongs to the genera of Influenzavirus A, Influenzavirus B, Influenzavirus C or Influenzavirus D, preferably Influenzavirus A, Influenzavirus B, Influenzavirus C. The four genera of influenza virus can be identified in a biological sample by antigenic differences in their nucleoprotein and matrix protein. The term “influenza”, as used herein, refers to an infectious disease caused by the influenza virus, and is commonly known as the flu. Symptoms can be mild to severe. The most common symptoms include high fever, runny nose, sore throat, muscle and joint pain, headache, coughing, and feeling tired. These symptoms typically begin two days after exposure to the virus and most last less than a week.

The term “human orthopneumovirus” can be used interchangeably with the term “human respiratory syncytial virus (HRSV)”, and includes reference to a syncytial virus that causes respiratory tract infections. It is a major cause of lower respiratory tract infections. It belongs to the genus of Orthopneumovirus.

The term “subject”, as used herein, can be used interchangeably with the term “patient”, and includes reference to a mammal, preferably a human. Preferably, a subject is at least 30 or at least 40 years old. More preferably, the subject is at least 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64 or at least 65 years old. Even more preferably, a subject is at least 66, 67, 68, 69 or at least 70 or at least 80 years old. In addition to the aforementioned embodiments, or alternatively, the subject is an elderly human, preferably a frail elderly human. The term “healthy subject” refers to a control subject not suffering from a disease of interest, although such a control subject may suffer from a different or related disease.

IS-pro

The IS-pro method is previously described in Budding et al., 2010 (FASEB J. 2010 Nov;24(11):4556-64. doi: 10.1096/fj.10-156190. Epub 2010 Jul 19). IS-pro involves bacterial species differentiation by the length of the 16S-23S rDNA interspace region with taxonomic classification by phylum-specific fluorescent labelling of PCR primers.
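The principle of combining amplicon length with the fluorescent label of the primer can be illustrated with the following minimal sketch. It is not the IS-pro software: the function name, the matching tolerance, the reference table and all lengths and taxon names in it are hypothetical placeholders, and the assignment of a given dye to a given phylum is left open.

```python
def assign_peaks(peaks, reference, tolerance_nt=1):
    """Match detected peaks to reference entries by dye label and amplicon length.

    peaks:        list of (dye_label, length_nt, peak_height) tuples
    reference:    dict mapping (dye_label, length_nt) to a taxon name
    tolerance_nt: maximum allowed length difference for a match (assumed value)
    Returns a list of (dye_label, length_nt, peak_height, taxon_or_None).
    """
    calls = []
    for dye, length, height in peaks:
        taxon = None
        for (ref_dye, ref_length), ref_taxon in reference.items():
            if ref_dye == dye and abs(ref_length - length) <= tolerance_nt:
                taxon = ref_taxon
                break
        calls.append((dye, length, height, taxon))
    return calls


# Hypothetical reference table; lengths and names are placeholders only.
EXAMPLE_REFERENCE = {
    ("FAM", 412): "species_A",
    ("FAM", 530): "species_B",
    ("HEX", 455): "species_C",
}
```

Peaks that match no reference entry are returned with `None`, so that unknown fragments remain part of the profile and are not silently dropped.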

Amplifying the 16S-23S rRNA intergenic spacer region from microorganisms is well known in the art. In order to achieve this, the sequences of conserved DNA regions comprised in the 16S and 23S rRNA gene sequences flanking the intergenic region in the genomic DNA of the microorganism are used as primer binding sites for amplification of the polymorphic DNA region on which the taxonomic diversity analysis of IS-pro is based.

Prokaryotic microorganisms, including bacteria and archaea, comprise in their genome one or more copies of the rrn operon comprising the genes for the 5S, 16S and 23S ribosomal RNAs. In most prokaryotes the ribosomal genes in the operon are in the order 16S-23S-5S and are co-transcribed in a single polycistronic RNA that is processed to provide the RNA species present in the mature ribosome. The rRNA 16S and 23S genes have acquired paramount relevance for the study of bacterial evolution and phylogeny, and the presence of variable and conserved regions in both genes is well documented (Neefs et al., 1993. Nucleic Acids Research 21(13): 3025-3049; Baker et al. 2003. J. Microbiol. Meth. 55: 541-555; Van de Peer et al., 1996. Nucleic Acids Research 24(17): 3381-3391; Gurtler and Stanisich, 1996. Microbiology 142: 3-16). The spacer between the 16S and 23S genes contains regions with secondary structures and sometimes tRNA genes. The variation found among relatively close taxa is known to be very high for the spacers of the rRNA operons. The extreme divergence in size and sequence of the spacers among different groups of prokaryotes, together with their location between highly conserved rRNA genes, makes them ideally suited as taxonomic markers. The 16S-23S rRNA intergenic spacer region is amplified using primers directed to conserved regions in the ribosomal gene sequences. More preferably, the conserved DNA regions are those located nearest to the 3’-end of the 16S rRNA gene and nearest to the 5’-end of the 23S rRNA gene.

Amplification of ribosomal sequences is described in detail in Devereux and Wilkinson 2004 (Molecular Microbial Ecology Manual, Second Edition 3.01: 509-522, 2004). The pivotal point for the purpose of microbiome analysis using phylum-specific probes is the primer design. Primers can be designed that selectively amplify rRNA genes of phylogenetically defined groups. Selection of primers can often be guided by comparison of sequences in a database. Many selective rRNA gene primers have been described in the literature. PRIMROSE (Ashelford et al., 2002. Nucl Acids Res 30: 3481-3489) is a program that uses sequences from the Ribosomal Database Project to identify and determine the phylogenetic range of oligonucleotides that may be used as rRNA probes or primers. ProbeBase (Loy et al., 2003. Nucl Acids Res 31: 514-516) is a database of published rRNA probes with information on target site and specificity. The Ribosomal Database Project II (Maidak et al., 2001. Nucl Acids Res 29: 173-174) and the ARB package (Ludwig et al., 2004. Nucleic Acids Research 32(4): 1363-1371) provide software for in silico evaluation of intended specificities of ribosomal primers, including those for 16S and 23S rRNAs, against known rRNA sequences.

A step in the IS-Pro microbial identification process focuses on a piece of DNA that occurs in all bacteria: the ribosomal DNA. Within the ribosomal DNA a part has been identified whose length is specific to the bacterial species: the 16S-23S Interspace (IS) region. The length of this piece of DNA typically varies between 200 and 1200 base pairs, which means that around 1000 different lengths are possible. We usually find multiple copies of this IS region per bacterium, each with a different length. Combinations of these IS regions provide unique signatures, which enable differentiation of around 1 million different bacterial species. To ensure we understand which DNA fragments originate from which bacteria, DNA from each bacterial group (phylum) can be amplified with amplification primers with a specific fluorescent label. With this we obtain two sources of information from each bacterium: length and color. This combination allows us to accurately identify all bacterial species, even when there are several species present in a sample. The technical process of IS-Pro is a PCR amplification followed by fragment analysis by capillary electrophoresis. These results undergo automated online analysis of digital profiles. The high level of standardisation of the entire process, coupled with a multi-purpose internal control as described herein but also in WO 2015/170979, the contents of which are herewith incorporated by reference, guarantees highly accurate results required for clinical diagnostics. Importantly, obtained profiles are highly reproducible between laboratories at different sites.

This technique does not require cultivation of bacteria, so even uncultivable or antibiotic-treated bacteria can be easily identified. The technique is not selective so, crucially, unknown species can be detected. It is capable of analyzing many bacterial species simultaneously and provides highly reproducible and clinically relevant insights into the composition and disturbances of complex bacterial communities.

IS-Pro for use in viral infectious disease impact prediction

The recently emerged COVID-19, which is an exemplary viral infectious disease of the invention, has a broad clinical spectrum ranging from no symptoms to severe pneumonia, acute respiratory distress syndrome (ARDS) and death. Several clinical risk factors and diagnostic characteristics associated with more severe disease have been described. These include older age, hypertension, obesity, diabetes, neutrophilia and high D-Dimer. Based on current epidemiological studies severe COVID-19 is strikingly rare in children and young adults, but it is not yet known what causes this low risk of severe disease in the younger population.

To mitigate the impact on hospitals and the economy, a rapid viral infectious disease, preferably COVID-19, triage tool is developed based on throat microbiota or microbiome composition that can stratify patients based on expected disease severity; for those already infected this tool would be able to discriminate between patients who are likely to have a severe course of disease and should thus be admitted to the scarce hospital resources and those with an expected mild course of the disease, who can remain at home and not occupy hospital beds unnecessarily. This kind of triage is currently a major challenge for all hospitals in affected areas.

Therefore, preferably, a method of the invention is a method for stratifying a subject for viral infectious disease severity, preferably stratifying or classifying or subtyping a subject in a mild disease course group and a severe disease course group. Such a method could also include a stratification group that is negative for a viral infectious disease, more specifically negative for the causative virus of said disease. Therefore, effectively, a method of the invention can also be a method for diagnosing or typing a subject for the presence or absence of the infectious agent and/or respiratory disease, or presence or absence of a viral infectious disease or virus. Preferably, a method of the invention is a method for stratifying a subject for viral infectious disease severity, preferably stratifying or classifying or subtyping a subject in a group I (positive / low risk / no quarantine needed), group II (positive, continued observation needed but not in hospital / medium risk / mild disease course) and group III (positive, continued observation needed in hospital / high risk / quarantine / severe disease course) (see also Figure 2). Methods of the invention can be performed before, during and/or after symptoms in a subject arise.

Another aspect of the invention is identifying subjects with a protective throat microbiome who have a (very) low susceptibility to a viral infection such as a coronavirus infection. This latter group, together with subjects who have recovered from the disease, could be released from quarantine and return to work to help the economy to recover (see also Figure 2).

Another option for microbiome analysis as described herein, instead of capillary (gel) electrophoresis such as IS-Pro, would be next-generation sequencing (NGS). Therefore, in methods of the invention that are directed to microbiome analysis by capillary gel electrophoresis, such as with IS-Pro, said steps can be replaced by a step of performing next-generation sequencing methods. Preferably, said step of DNA sequencing is performed by next-generation sequencing methods that are capable of determining a sequence length of amplicons having a length in base pairs in the range of 100-1200 bps, which range generally resembles the overall variation in length of the 16S-23S ITS DNA sequence between different microbial species. Suitable next-generation sequencing methods to determine amplicon length include nanopore-based approaches (e.g. devices manufactured by Oxford Nanopore) or single molecule real-time sequencing approaches (e.g. devices manufactured by Pacific Biosciences). Species composition based on 16S-23S ITS DNA sequence can also be obtained by NGS-based amplicon sequencing, such as is possible by using for example the Illumina platform, or the Ion Torrent platform (ThermoFisher Scientific). This may obviate the step of length analysis of the amplicons, yet many different amplicons can be identified, also providing for a microbiome signature as used in methods of this invention.

When a step of determining PCR amplicon length is performed in a method of the invention, an amplicon length profile can be generated. Such a generated amplicon length profile can be compared to one or more reference or control amplicon length profiles of a corresponding 16S-23S rRNA internal transcribed spacer (ITS) region of a known strain or species of microbe. Preferably, said reference amplicon length profile of a corresponding 16S-23S rRNA internal transcribed spacer (ITS) region of a known strain or species of microbe is comprised in a library or database of reference amplicon length profiles. Such a library or database can be established by (i) in vitro generating PCR amplicons of a target 16S-23S rRNA internal transcribed spacer (ITS) region of genomic DNA of known individual strains and/or species of microbes, (ii) performing a fragment length analysis of said amplicons, and (iii) storing said reference fragment length profiles in a library or database of reference fragment length profiles. Alternatively, a database or library of reference fragment length profiles can be generated by in silico prediction of a fragment length profile of PCR amplicons of a target 16S-23S rRNA internal transcribed spacer (ITS) region of genomic DNA of individual strains and/or species of microbes.
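By way of illustration only, the comparison of a test amplicon length profile with a library of reference profiles may be sketched as follows. The function names, the species names, the fragment lengths and the matching tolerance of 1 bp are all hypothetical and are not taken from the IS-pro software; the sketch merely assumes that a reference matches when each of its fragment lengths is found in the test profile.

```python
from typing import Dict, FrozenSet, Iterable, List


def build_library(measured: Dict[str, Iterable[int]]) -> Dict[str, FrozenSet[int]]:
    """Store reference fragment length profiles (lengths in bp) per known strain or species."""
    return {name: frozenset(lengths) for name, lengths in measured.items()}


def matching_references(
    test_profile: Iterable[int],
    library: Dict[str, FrozenSet[int]],
    tolerance_bp: int = 1,  # assumed sizing tolerance, not a value from the source
) -> List[str]:
    """Return references whose every fragment length occurs in the test profile."""
    test = list(test_profile)
    hits = []
    for name, reference in library.items():
        if all(any(abs(t - r) <= tolerance_bp for t in test) for r in reference):
            hits.append(name)
    return sorted(hits)


# Invented lengths for three hypothetical reference species.
library = build_library({
    "species_A": [312, 455],
    "species_B": [498],
    "species_C": [620, 701, 845],
})

# Hypothetical test sample: species_C lacks its 845 bp fragment and is not called.
test_profile = [312, 456, 620, 701]
print(matching_references(test_profile, library))
```

A library generated by in silico prediction could be passed to the same comparison, since only the stored fragment lengths are used.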

It is within the metes and bounds of routine experimentation of the skilled person to establish (pre-determined) reference or control amplicon length profiles suitable for comparison with the generated amplicon length profile of the test sample. For instance, a method of the invention may include the step of analyzing amplification products or amplicons (in a post-PCR step) based on length differences in said amplification products to thereby provide an amplicon length profile of the composition of a population of microorganisms in a microbiome or a sample as disclosed herein; and comparing said fragment length profile with at least one reference fragment length profile of a known microorganism.

The invention also provides a method of typing or diagnosing a subject, preferably a sample of a subject, for the presence or absence of a respiratory infectious disease agent, preferably a respiratory virus, comprising the steps of: a) performing on a sample of genomic DNA from the microorganisms in a respiratory tract microbiome obtained from a subject a PCR amplification reaction using at least one set of PCR amplification primers directed to a conserved DNA region comprised in the 16S and 23S rRNA sequence flanking microbial 16S-23S rRNA internal transcribed spacer (ITS) regions for amplification of said ITS region to thereby amplify and provide amplification products of said ITS regions comprised in said sample of genomic DNA; b) analyzing said amplification products based on length differences in said amplification products to thereby provide a test signature of a composition of a population of microorganisms in said sample; c) comparing said test signature with at least one reference signature provided by performing steps a) and b) on a reference subject suffering from an infectious respiratory disease, such as a viral infectious disease, or not suffering from an infectious respiratory disease, such as a viral infectious disease, and typing the subject, preferably a sample of said subject, on the basis of that comparison for the presence or absence of an infectious disease agent, preferably a respiratory virus.

Preferably, in methods of the invention, the sample is a sample selected from the group consisting of nasopharyngeal and oropharyngeal (e.g. throat) swab, wash and aspirate of the upper respiratory tract; saliva sample; tracheal and endotracheal aspirate, mucus, sputum and bronchoalveolar lavage of the lower respiratory tract; microbial aerosols from coughs and tidal breathing; and nasopharyngeal and oropharyngeal tissue biopsy including lung biopsy.

Preferably, in methods of the invention, said viral infectious disease is a disease caused by infection with influenza virus (flu), infection with Respiratory Syncytial virus (RSV), infection with Enterovirus, infection with Rhinovirus, infection with Adenovirus, infection with Herpes Simplex virus, infection with Epstein-Barr virus, infection with Varicella zoster virus or infection with Coronavirus, such as SARS-CoV, MERS-CoV or, preferably, SARS-CoV-2 (COVID-19).

Probiotic compositions and use thereof

The present invention further provides probiotic compositions, comprising living bacteria. These compositions are for administration to a subject as a prophylactic or therapeutic treatment of viral respiratory disease.

A probiotic composition of the invention preferably comprises one or more bacterial species selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa. In preferred embodiments, the probiotic composition of the invention comprises at least 5% of H. parainfluenzae, at least 3.5% of L. buccalis, at least 2% of R. mucilaginosa, at least 10% of S. bovis and/or at least 10% of S. mitis. Preferably, the probiotic composition comprises at least 2, 3, or 4, more preferably 5 of the above bacterial species. Preferably, S. mitis is less than 10% of the composition, most preferably S. mitis is absent.

A probiotic composition of the invention may comprise the living bacteria in an amount in the range 1x10^4 cfu/ml to 1x10^9 cfu/ml.

A probiotic composition of the invention may take the form of tablets, pills, capsules, semisolids, powders, sustained release formulations, solutions, suspensions, elixirs, aerosols, or any other appropriate compositions. A probiotic composition of the invention may be in the form of a dry powder. A dry powder formulation can be made, for example, by combining dry lactose having a particle size between about 1 µm and 100 µm with micronized particles of a liquid composition of the probiotic mixtures of bacteria as disclosed herein and dry blending.

A liquid probiotic composition of the invention may comprise a sugar, a sugar alcohol or a mixture thereof. The sugar is suitably selected from the group consisting of sucrose, glucose, fructose and a mixture thereof. The sugar alcohol may be mannitol, sorbitol, xylitol or a mixture thereof. The amount of sugar and/or sugar alcohol may be between 1-90% w/v based on the volume of the composition. A preferred mixture of sugar and sugar alcohol may comprise a ratio of sugar to sugar alcohol in the range of 0.5 to 5.

A probiotic composition of the invention preferably comprises one or more excipients selected from the group consisting of hydrophilic vehicles, solubilizer, pH modifier, buffer, viscosity modifier, preservatives, and stabilizers.

A probiotic composition of the invention may be administered in the form of a nasal spray or inhalation formulation. When administered as a nasal spray or inhalation formulation the composition may be administered by devices including, but not limited to, an intranasal spray device, an atomizer, a nebulizer, a metered dose inhaler (MDI), a dry powder inhaler (DPI), a pressurized dose inhaler, an insufflator, an intranasal inhaler, a nasal spray bottle, a unit dose container, a pump, a dropper, a squeeze bottle, an aerosolized metered dose pump and a mist sprayer. Each nasal administration may contain from a few micrograms (mcg) up to milligrams (mg) of the probiotic composition delivered in a volume typically between 25 and 100 microliters. A typical nasal dose may consist of one or two sprays of 0.1 ml each per nostril, one to three times a day. Each inhalation dose of an inhalation formulation may contain from a few micrograms (mcg) up to milligrams (mg) of the probiotic composition delivered in a volume typically between 50 and 500 microliters or micrograms. A typical inhalation dose may consist of one or two inhalations, one to three times a day.

In preferred embodiments, the probiotic composition is a probiotic nasal spray.

The present invention further provides a method for the treatment or prophylaxis of viral respiratory disease in a subject, said method comprising administering to said subject an effective amount of a probiotic composition of the present invention. Prior to the administration of the probiotic composition, the subject may be treated by systemic or, preferably, local administration of an antibiotic to remove or modify the respiratory tract microbiome. Such removal or modification may benefit establishment of the probiotic microbiome in the respiratory tract. Preferably, in methods of the invention, said viral infectious disease is a disease caused by infection with influenza virus (flu), infection with Respiratory Syncytial virus (RSV), infection with Enterovirus, infection with Rhinovirus, infection with Adenovirus, infection with Herpes Simplex virus, infection with Epstein-Barr virus, infection with Varicella zoster virus, or infection with Coronavirus, such as SARS-CoV, MERS-CoV or, preferably, SARS-CoV-2 (COVID-19). For the purpose of clarity and a concise description, features are described herein as part of the same or separate embodiments; however, it will be appreciated that the disclosure includes embodiments having combinations of all or some of the features described.

Table 1

Preferably, in any diagnostic, prophylactic or therapeutic method of the invention as described herein, one or more of the indicators in the above Table 1 are used to type or treat a test subject for a viral respiratory disease parameter as described herein or is used to predict disease course severity as described herein. As is evident from the above, in methods of the invention, a reference signature can be omitted, or the step of comparing can be omitted, since it is possible to type or predict solely on the basis of the relative or absolute abundance of bacterial species within test samples.

The content of the documents referred to herein is incorporated by reference.

EXAMPLES

Example 1. Stratifying patients according to risk of disease severity

A cohort study of patients presenting at the general practitioner and in the emergency department of the Amsterdam UMC and the Admiraal de Ruijter hospital in Goes (both located in the Netherlands) is performed. To control for generalizability of disease-associated microbiota signatures, outpatients from the Hong Kong Gleneagles hospital are also included. 1400 patients presenting as potential COVID-19 or influenza patients are included, as well as 200 patients without respiratory symptoms in whom a nose or throat swab is taken for other reasons. Informed consent is asked for at presentation to define respiratory microbiome and to collect clinical and diagnostic parameters.

Respiratory microbiome is defined for all swabs taken from the respiratory tract of included patients. After the swab is used for clinical diagnostics, it is directly frozen in -80 degrees Celsius for possible follow-up analyses.

Clinical and diagnostic parameters regarding the diagnosis of SARS-COV-2 and other respiratory viruses, clinical history including hypertension, diabetes, obesity, lung disease; use of medication including antihypertensive medication, immunosuppressive therapy and (recent) antibiotic use are collected.

At day 30 after study inclusion, clinical outcomes are assessed through chart review, i.e. survival, clinical cure, days in ICU, days on respiratory support, hospitalization days, suspicion of secondary bacterial infection including culture results and need for antibiotics. For outpatients, all patients are contacted by telephone interview with a standardized questionnaire.

The association between the respiratory microbiome profile and coronavirus or influenza virus infection, 30-day outcome (mortality, hospitalization, recovery) and length of ICU stay is determined. For these analyses, a multivariate model including microbiome profile and relevant covariates is built on 30-day mortality and length of ICU stay. Differences between microbiome profiles across age groups are analyzed. For microbiota or microbiome classification, the classifier of choice is Adaptive Group-Regularized Logistic Ridge Regression (AGRR). This classifier has several advantages. First, it enables estimation and predictor selection when the number of features (i.e., bacterial features) exceeds the number of observations. Hence, in contrast to standard classifiers, it can deal with high-dimensional data. Second, it allows for the structural use of co-data in order to improve predictive performance. Co-data refers to additional information on the measured variables. The AGRR was developed by the Statistics for Omics group of the dept of Epidemiology and Biostatistics of the Amsterdam UMC.
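The role of the ridge penalty when features outnumber observations may be illustrated with a minimal sketch. The sketch below is not AGRR: it is plain logistic regression with a single, fixed L2 (ridge) penalty fitted by gradient descent, without group-adaptive penalties or co-data. The abundance values, the class labels, the penalty `lam`, the learning rate and the number of steps are all invented for illustration.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lam: float = 0.05,   # assumed ridge penalty strength
    lr: float = 0.5,     # assumed learning rate
    steps: int = 2000,
) -> Tuple[List[float], float]:
    """Minimise mean log-loss + (lam / 2) * ||w||^2; the intercept is not penalised."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Predicted probability of the class coded 1 (here: severe course)."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Four hypothetical subjects, five hypothetical bacterial features (relative abundances).
X = [
    [0.60, 0.20, 0.10, 0.05, 0.05],  # mild course
    [0.55, 0.25, 0.05, 0.10, 0.05],  # mild course
    [0.10, 0.10, 0.40, 0.30, 0.10],  # severe course
    [0.05, 0.15, 0.35, 0.35, 0.10],  # severe course
]
y = [0, 0, 1, 1]

w, b = fit_ridge_logistic(X, y)
print([round(predict_proba(w, b, x), 2) for x in X])
```

Because the penalty keeps the weights finite, a solution exists even though five features are fitted on four observations; an unpenalised fit on such separable data would not converge.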

Thereafter, subjects can be diagnosed for a viral infectious disease as described herein on the basis of a comparison between their respiratory microbiome test profile or signature and a reference microbiome test profile signature of a subject having a viral infectious disease or not having a viral infectious disease; or stratified in disease severity risk groups on the basis of a comparison between their respiratory microbiome test profile or signature and a reference microbiome test profile signature of a subject belonging to one of said disease severity risk groups.

Example 2. Diagnosis of COVID positive and negative patients using Molecular Methods (IS-Pro)

Samples from 30 patients that had reported to the Admiraal de Ruijter hospital in Goes, the Netherlands were collected by means of eSWAB (COPAN). Twenty patients tested negative for SARS-CoV-2 with the standard qPCR assay and ten tested positive. Remaining eSWAB fluid was stored at 4 degrees Celsius and sent cooled to the inBiome laboratory in Amsterdam. Here, DNA was extracted with the Chemagen machine (Perkin Elmer) and the IS-pro procedure was performed on all samples.

Results were analysed by inBiome’s proprietary software suite and visualized using Spotfire (TIBCO).

Resulting profiles (Figures 2-6) showed clear differences between COVID positive and negative patients. Differences could be seen across all phyla, Firmicutes, Actinobacteria, Fusobacteria and Verrucomicrobia (FAFV), Bacteroidetes and Proteobacteria. Even with unsupervised clustering, clear disease-specific clusters could be found. Further stratification within sample groups was possible.

Some of the Coronavirus negative samples clearly cluster with the coronavirus positive signatures, indicating the possible false-negative outcome of the current coronavirus tests, which are based on detection of the virus.

Example 3. Cluster analysis of COVID positive and negative patients, extended study from Example 2

One of the most striking aspects of the novel Corona Virus Disease 2019 (COVID-19) is the highly variable course of the disease, with the risk for a severe outcome increasing with age. As there is evidence for a role of the pharyngeal microbiota (PM) in susceptibility to other respiratory viral diseases, and their severity, the aim of this study was to identify specific PM associated with the presence or absence of SARS-CoV-2. Although intestinal microbiota composition is known to vary with age, less is known about how the composition of the PM varies across ages. Gaining insight into this may provide a link to the age-dependent severity of COVID-19.

For this study, throat swabs from 46 SARS-CoV-2 PCR positive and 89 SARS-CoV-2 PCR negative patients, presented at the Admiraal de Ruyter Hospital (Goes, The Netherlands) in the second half of March 2020, were used for microbial profiling. DNA was isolated and subjected to IS-pro, a rapid standardized microbiota analysis technique that differentiates bacterial species by length polymorphisms of the 16S-23S rDNA region combined with phylum-specific sequence polymorphisms of the 16S rDNA.

A distinct homogeneous microbial cluster was found that contained a low percentage of SARS-CoV-2 PCR positive samples (25%). Remaining samples showed a less uniform microbiota and were almost twice as likely to be positive (25% vs 47%, p=0.008). The homogeneous cluster with low SARS-CoV-2 positivity became linearly less common with increasing age.

We found evidence for an association of the composition of the pharyngeal microbiota with SARS-CoV-2 infection. The observed age-dependency of the PM profile occurrence may explain the enhanced susceptibility of the elderly to COVID-19.
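The reported difference in positivity (25% vs 47%) can be checked with a simple 2x2 test, sketched below under stated assumptions. The source reports only percentages, the totals (46 positive and 89 negative samples) and, in the Results, a central cluster of 77 samples; the four cell counts used here are back-calculated from those figures and are therefore an assumption. The choice of an uncorrected Pearson chi-square test is also an assumption, since the test used for this comparison is not named in the text.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Assumed counts, back-calculated from the reported percentages:
# central cluster: 19 positive, 58 negative (19/77 is about 25%)
# remaining samples: 27 positive, 31 negative (27/58 is about 47%)
chi2, p_value = chi_square_2x2(19, 58, 27, 31)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With these assumed counts the p-value lies close to the reported value of 0.008, which suggests that the back-calculated table is at least consistent with the reported result.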

Materials and methods

Sample collection

From March 17 to April 1, 2020, we included patients with clinical suspicion of SARS-CoV-2 infection who presented at the Admiraal de Ruijter hospital (ADRZ) in Goes, the Netherlands. Samples were collected for routine diagnostics for SARS-CoV-2. All positive samples that came in for routine SARS-CoV-2 diagnostics were selected. Approximately twice as many negative control samples were selected at random from the same period. All control samples were taken from patients presenting with clinical symptoms resembling COVID-19 who tested negative by PCR.

Ten additional throat swabs from patients diagnosed with COVID-19 were collected at the Onze Lieve Vrouwe Gasthuis (OLVG) in Amsterdam. Two of these patients were negative in PCR, but diagnosis was made based on clinical presentation and CT scan with typical ground-glass lesions. The study was considered not to need ethical approval by the Brabant Medical Ethics Committee, Tilburg, the Netherlands (NW2020-30), as the study was retrospectively performed on residual sample material.

The only patient characteristics we recorded were age in years and gender.

Swabs, DNA isolation and SARS-CoV-2 PCR

Throat swabs were taken with flocked swabs (ESwab model 480C, COPAN Italia, Italy) and directly transported to the laboratory in ESwab medium. For SARS-CoV-2 PCR, DNA was isolated with the Magnapure 96 large volume kit according to the manufacturer’s instructions. SARS-CoV-2 detection was performed by RT qPCR as described by Corman et al., 2020 (Euro Surveill. 2020;25(3):pii=2000045. https://doi.org/10.2807/1560-7917.ES.2020.25.3.2000045) on a Lightcycler 480 machine (Roche, Switzerland). Residual ESwab medium was stored at 4°C until further processing. For microbiota analysis, DNA was isolated from 100 µl of ESwab medium using the CMG-1033-S Chemagic Viral DNA/RNA kit on a Chemagic automated nucleic acid purification device (PerkinElmer). Elution volume was 100 µl.

Microbiota analysis

Microbiota analysis was performed on residual isolated DNA with the IS-pro Microbiota kit (inBiome, Amsterdam, the Netherlands). In short, the IS-pro assay differentiates bacteria by species-specific length polymorphisms of the 16S-23S rDNA interspace region combined with phylum-specific sequence polymorphisms of the 16S rDNA. Details of the technique have been described previously, as have comparisons to culture and Next-Generation sequencing approaches (Budding et al., 2010, FASEB J 24: 4556-64; Budding et al., 2016, J Clin Microbiol 54: 934-43; De Meij et al., 2016, FASEB J 30: 1512-22). Briefly, two multiplex PCR reactions are performed. The first reaction targets the phyla Firmicutes, Actinobacteria, Fusobacteria, Verrucomicrobia (amplified with the same labelled forward primer and referred to as the FAFV group) and Bacteroidetes (with a differently labelled forward primer). The second reaction targets Proteobacteria and has an internal amplification control that is used for quality control of the PCR process and downstream software analyses. PCR products were analyzed by high-resolution capillary electrophoresis (ABI3500, ThermoFisher).

Resulting microbial profiles were generated with the IS-pro software suite and are presented as color-labeled peak profiles, where colors represent phylum groups (FAFV (Firmicutes, Actinobacteria, Fusobacteria, Verrucomicrobia), Bacteroidetes and Proteobacteria) and peak height reflects the relative abundance of species. Downstream data analyses, visualizations, and comparisons of resulting data were done with the Spotfire software package (Tibco, Palo Alto, CA, USA). Within-sample microbial diversity was calculated using the Shannon diversity index.
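For reference, the Shannon diversity index of a profile is H = -Σ p_i ln(p_i), where p_i is the relative abundance of species i. A minimal sketch is given below; it assumes that peak heights are used directly as abundances and that the natural logarithm is used, neither of which is specified in the text, and the example peak heights are invented.

```python
import math
from typing import Iterable


def shannon_index(peak_heights: Iterable[float]) -> float:
    """Shannon diversity H = -sum(p * ln p) over peaks with non-zero height."""
    heights = [h for h in peak_heights if h > 0]
    total = sum(heights)
    if total == 0:
        return 0.0
    return -sum((h / total) * math.log(h / total) for h in heights)


# Hypothetical profiles: an even community and one dominated by a single species.
even_profile = [100, 100, 100, 100]
dominated_profile = [970, 10, 10, 10]
print(shannon_index(even_profile), shannon_index(dominated_profile))
```

An even profile of four species gives H = ln(4), the maximum for four species, whereas a profile dominated by one species gives a value much closer to zero.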

Species calling in IS-pro data

Species identification of IS-pro data was performed by comparison of profiles to a respiratory-tract specific curated database (inBiome, the Netherlands). Briefly, 16S-23S interspace region (IS) fragments show species-specific length polymorphisms. Generally, lengths fall between 200 and 1200 nucleotides (nc) and commonly vary between different operons within the same genome. Therefore, most bacterial genomes harbour combinations of multiple IS fragment lengths. These combinations of different fragment lengths result in a very high discriminatory potential based on length measurement alone (combinations of two fragments within the 1000 unique lengths between 200 and 1200 nc result in almost 1,000,000 possible unique signatures). For species identification in samples with mixed bacterial species, fragments should be correctly combined. To simplify this, the IS-pro method employs phylum-specific forward primers with different fluorescent labels. Only fragments with the same labels may be combined. By using this approach in combination with niche-specific translation libraries, species identification in complex microbiota samples becomes feasible. For most species in the respiratory tract, identification could be done to species level, with the exception of viridans group Streptococci, which could be identified to group level (S. mitis group, S. bovis group).
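The arithmetic behind the number of two-fragment signatures, and the rule that only fragments carrying the same label may be combined, can be sketched as follows. The count depends on how pairs are counted: the figure of about 1,000,000 corresponds to ordered pairs of lengths, whereas counting unordered pairs of two distinct lengths gives roughly half that number. The peak list and the helper name `group_by_label` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# About 1000 distinct fragment lengths between 200 and 1200 nucleotides.
n_lengths = 1200 - 200

ordered_pairs = n_lengths ** 2             # (length_1, length_2), order matters
unordered_pairs = math.comb(n_lengths, 2)  # two distinct lengths, order ignored
print(ordered_pairs, unordered_pairs)


def group_by_label(peaks: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group fragment lengths by fluorescent label; only same-label fragments are combined."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for label, length in peaks:
        groups[label].append(length)
    return {label: sorted(lengths) for label, lengths in groups.items()}


# Hypothetical peaks from a mixed sample: (label, fragment length).
peaks = [("FAFV", 455), ("Proteobacteria", 610), ("FAFV", 312)]
grouped = group_by_label(peaks)
print(grouped)
```

Either way of counting gives several hundred thousand signatures from two fragments alone, before the label is taken into account as a further discriminator.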

Statistics

Parametric quantitative variables are presented as means ± standard deviation (SD) and tested for significance with a Student’s t test or analysis of variance (ANOVA). Non-parametric quantitative variables are presented as median and interquartile ranges (IQR), and tested for significance with a Mann-Whitney U or Kruskal-Wallis test. A p value of <0.05 was considered statistically significant.

Results

Samples

135 throat swab samples were collected at the ADRZ, 46 from SARS-CoV-2 PCR-positive patients and 89 from SARS-CoV-2 negative patients. Age distribution and SARS-CoV-2 status of this group is shown in figure 7. Ten patients were included from the OLVG in Amsterdam. Two of these patients were negative in PCR, but diagnosis was made based on clinical presentation and CT scan with typical ground-glass lesions.

Cluster analysis

An unsupervised cluster analysis was made of all microbiota profiles of a total of 135 patient samples of the ADRZ hospital. Clustering was done by the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) on a similarity matrix based on cosine similarity of bacterial profiles. The clearest clustering was seen when only the phyla Firmicutes, Actinobacteria and Proteobacteria were considered (Figure 8). This clustering showed a clear central cluster of 77 samples with a similar microbiota composition. This composition was mainly characterized by high abundance of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa. While various Prevotella species were also commonly present in different combinations and abundances, clusters based on these species became too small to identify significant associations. Interestingly, the central cluster showed a significantly lower rate of SARS-CoV-2 positivity than samples falling outside this cluster (25% vs 47%, p=0.008). While further subclustering was possible, group sizes became too small to perform meaningful analyses. For convenience we will refer to the two groups as the low positive and the high positive cluster from here on.
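The clustering step can be sketched in a few lines. The code below implements UPGMA (average linkage) on cosine distance, i.e. one minus cosine similarity, and stops when a requested number of clusters remains. The stopping rule, the sample names and the abundance values are assumptions made for illustration; the source does not state how the dendrogram was cut, and the analysis itself was carried out in other software.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two abundance profiles (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def upgma(profiles: Dict[str, Sequence[float]], n_clusters: int) -> List[List[str]]:
    """Merge the two clusters with the smallest mean pairwise cosine distance until n_clusters remain."""
    clusters: List[List[str]] = [[name] for name in profiles]

    def mean_distance(c1: List[str], c2: List[str]) -> float:
        # UPGMA: unweighted mean over all pairs of original samples.
        total = sum(
            1.0 - cosine_similarity(profiles[a], profiles[b]) for a in c1 for b in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: mean_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical relative abundances of four species in four samples.
profiles = {
    "s1": [0.50, 0.30, 0.20, 0.00],
    "s2": [0.45, 0.35, 0.20, 0.00],
    "s3": [0.00, 0.10, 0.20, 0.70],
    "s4": [0.05, 0.05, 0.20, 0.70],
}
clusters = upgma(profiles, n_clusters=2)
print(clusters)
```

Cosine similarity compares the shape of two profiles rather than their absolute peak heights, so two samples with the same species proportions but a different total signal are treated as identical.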

To evaluate whether the microbiota clustering was hospital-specific, we additionally performed microbiota analysis on samples from 10 COVID-19 positive patients from an Amsterdam-based hospital, geographically separated by 180 km from the ADRZ. All these samples fell within the previously established clusters and showed no hospital-specific clustering. Eight of these ten samples fell in the high positive cluster (cluster H) and two in the low positive cluster (cluster L). Interestingly, two of the samples falling in the high positive cluster came from patients whose diagnosis had been made based on clinical symptoms and chest CT and who were negative for SARS-CoV-2 by PCR.

Age distribution

Because one of the hallmark features of COVID-19 is its age-dependency, we investigated the distribution of the high and low positive clusters over different age groups (figure 9). This analysis showed an age-dependent distribution of the two clusters. The low positive cluster (cluster L) was most common in the low age groups and showed an almost linear decline in occurrence with increasing age. For the lowest age groups, numbers were too small to get an accurate estimation.

It occurred to us that the different percentage of SARS-CoV-2 positives in the two clusters could be a result of their different distribution over younger and older people. To investigate this possibility, we compared the positivity rates of the two clusters in the elderly (80 years and older) to the positivity rates for the entire study group. This analysis showed that the rate of positivity in the two different clusters was the same for the old age group as it was for the study group. This finding suggests that older age does not explain the different positivity ratios in the microbiota clusters but rather that different microbiota clusters may explain age-dependent differences in susceptibility to SARS-CoV-2.
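
The comparison of positivity rates between the two clusters, for the whole study group and within an age stratum, can be sketched as follows. This is an illustrative sketch only. The record layout `(cluster, age, pcr_positive)` and both function names are hypothetical. The Pearson chi-square test without continuity correction is an assumption, because the source does not name the test behind the reported p value.

```python
import math


def positivity_by_cluster(records, min_age=0):
    """Count (positives, total) per cluster for subjects aged >= min_age.

    `records` is a hypothetical list of (cluster, age, pcr_positive) tuples.
    """
    counts = {}
    for cluster, age, positive in records:
        if age >= min_age:
            pos, total = counts.get(cluster, (0, 0))
            counts[cluster] = (pos + int(positive), total + 1)
    return counts


def chi_square_2x2(pos_a, n_a, pos_b, n_b):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) comparing two proportions. Returns (chi2, p value)."""
    a, b = pos_a, n_a - pos_a
    c, d = pos_b, n_b - pos_b
    n = n_a + n_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # an empty row or column: the test is undefined
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Whole study group versus the elderly stratum (80 years and older):
# overall = positivity_by_cluster(records)
# elderly = positivity_by_cluster(records, min_age=80)
```

If the counts are back-calculated from the reported figures (19 positives among the 77 samples in the central cluster and 27 among the remaining 58, which is an assumption and not stated in the source), `chi_square_2x2(19, 77, 27, 58)` gives a p value of approximately 0.008.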

Finally, to confirm an age-related effect on PM composition, we assessed diversity of microbiota profiles, which is a very general parameter of microbiota composition. This analysis also indicated age-related differences with a clear decrease in diversity of all measured bacterial phyla for samples from the oldest patient groups (figure 10).
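
A per-phylum diversity measure of the kind referred to above can be sketched as follows. The source does not state which diversity index was used, so the Shannon index here is an assumption, as are the function names and the representation of a profile as a mapping from phylum name to a list of peak abundances.

```python
import math


def shannon_diversity(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over non-zero abundances."""
    total = sum(abundances)
    if total <= 0:
        return 0.0
    return -sum(
        (a / total) * math.log(a / total) for a in abundances if a > 0
    )


def diversity_per_phylum(profile):
    """Compute the Shannon index separately for each phylum.

    `profile` is a hypothetical dict such as
    {"Firmicutes": [120, 30, 5], "Proteobacteria": [60, 60]}.
    """
    return {
        phylum: shannon_diversity(peaks) for phylum, peaks in profile.items()
    }
```

A profile dominated by a single species yields an index near zero, whereas evenly distributed abundances over many species yield a higher value. Averaging these values within each age group is how an age-related decline in diversity would become visible.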

In this study on microbiota profiling in 46 COVID-19 patients and 89 SARS-CoV-2 negative subjects presenting with similar symptoms, a common pharyngeal microbiota profile was found that was associated with lower occurrence of SARS-CoV-2. Strikingly, this protective profile showed an age-dependent distribution, occurring commonly in the young and decreasing in occurrence with increasing age in a seemingly linear fashion. This finding offers new insights into the propensity of SARS-CoV-2 to preferentially infect the elderly and offers a basis for new intervention strategies.

There are important advantages to the technique as described herein for quick translation to clinical routine compared to alternative approaches (e.g. 16S sequencing). The IS-pro technique is fast and only needs standard laboratory equipment. The technique has also been highly standardized and the inclusion of an internal amplification control greatly improves accuracy. These combined features ensure that findings can be directly translated to clinical application, which is especially relevant in the rapidly developing COVID-19 pandemic.

In conclusion, in this study we found an association between pharyngeal microbiota, COVID-19 infection, and age. A specific microbiota signature was found that was associated with lower occurrence of COVID-19. The occurrence of this profile decreased approximately linearly with age, which is in line with the higher rate of infection and more severe outcomes of SARS-CoV-2 in the elderly. Our findings show that microbiome-based diagnostics for COVID-19 are a valuable tool. Applications of this tool include stratification of patients into those with a mild or severe expected course of disease, which provides an objective triage tool in hospitals, and stratification of non-infected people into those with a high or low risk of contracting the virus, which provides an informed quarantine strategy. Finally, based on the strong correlations between certain components (bacterial species) of the microbiota and SARS-CoV-2 susceptibility, the present inventor proposes a microbiota-based therapy or prophylaxis by administering to subjects, preferably subjects at risk of suffering from an infectious respiratory disease, preferably a respiratory viral disease, a probiotic composition comprising one or more bacterial species selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus mitis group, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa, preferably selected from the group consisting of Haemophilus parainfluenzae, Neisseria cinerea, Streptococcus bovis group, Leptotrichia buccalis and Rothia mucilaginosa.