Title:
IN VITRO METHOD FOR PREDICTING A RISK OF DEVELOPING PNEUMONIA IN A SUBJECT
Document Type and Number:
WIPO Patent Application WO/2017/121769
Kind Code:
A1
Abstract:
The invention relates to the field of in vitro diagnosis or prognosis. The invention provides in vitro methods for predicting the risk of developing pneumonia in a subject. More specifically, the method comprises determining the presence of one or more PNEUMONIA Risk Allele(s) in a sample from said subject, wherein the presence of said one or more PNEUMONIA Risk Allele(s) indicates whether the subject is at increased or decreased risk of developing pneumonia.

Inventors:
BLEIN SOPHIE (FR)
MIRA JEAN-PAUL (FR)
PACHOT ALEXANDRE (FR)
Application Number:
PCT/EP2017/050505
Publication Date:
July 20, 2017
Filing Date:
January 11, 2017
Assignee:
BIOMERIEUX SA (FR)
International Classes:
C12Q1/68
Domestic Patent References:
WO2005085273A1 2005-09-15
WO2002029100A1 2002-04-11
WO2014121180A1 2014-08-07
WO2014127290A2 2014-08-21
Other References:
DATABASE Geneseq [online] 18 October 2007 (2007-10-18), "Human single nucleotide polymorphism (SNP) probe SEQ ID NO:171185.", XP002755568, retrieved from EBI accession no. GSN:AGF98693 Database accession no. AGF98693
DATABASE EMBL [online] 19 August 2009 (2009-08-19), "Sequence 15914 from Patent WO2005056837.", XP002755569, retrieved from EBI accession no. EM_PAT:HB575442 Database accession no. HB575442
FERNANDO A. RIVERA-CHÁVEZ ET AL: "A TREM-1 Polymorphism A/T within the Exon 2 Is Associated with Pneumonia in Burn-Injured Patients", ISRN INFLAMMATION, vol. 15, no. 1, 1 January 2013 (2013-01-01), pages 45 - 6, XP055258914, DOI: 10.1016/S1072-7515(00)00785-7
Attorney, Agent or Firm:
CABINET PLASSERAUD (FR)
Claims:
CLAIMS

1. An in vitro method for assessing the risk of developing pneumonia in a subject, said method comprising determining the presence of one or more PNEUMONIA Risk Allele(s) in a sample from said subject, wherein the presence of said one or more PNEUMONIA Risk Allele(s) indicates whether the subject is at increased or decreased risk of developing pneumonia, and wherein said PNEUMONIA Risk Allele is an allele selected at one or more of the following single nucleotide polymorphisms (SNPs):

or its related SNP allele(s) in high linkage disequilibrium, i.e. having a squared correlation coefficient r² greater than 0.8 as measured in at least one of the reference populations or superpopulations from which the subject originates.

2. The method of Claim 1, wherein related SNP allele(s) in high linkage disequilibrium are determined with respect to the reference population or superpopulation to which the subject is the closest from a genetic perspective.

3. The method according to Claim 1, wherein said PNEUMONIA Risk Allele is selected among one or more of the following SNP allele(s):

SNP#     Chromosome   Position on genome (hg38 / GRCh38)   Allele
SNP1     2            118942520                            C
SNP2     2            151416989                            A
SNP3     5            94570535                             C
SNP4     10           5218849                              A
SNP5     10           109907944                            T
SNP6     11           9631369                              T
SNP7     11           55318817                             C
SNP8     14           50504728                             C
SNP9     20           17558523                             G
SNP10    20           23921958                             G

4. The method of Claim 1, 2 or 3, comprising determining the presence of one or more of the following SNP allele(s):

or its related SNP allele(s) in high linkage disequilibrium, wherein detecting said one or more SNP allele(s) is associated with an increased risk of developing pneumonia in said subject.

5. The method of Claim 4, wherein detecting at least SNP5 allele or its related SNP allele(s) in high linkage disequilibrium is associated with an increased risk of developing pneumonia in said subject.

6. The method of Claim 1, 2 or 3, comprising determining the presence of one or more of the following SNP allele(s):

SNP#     Chromosome   Position on genome (hg38 / GRCh38)   Allele
SNP1     2            118942520                            C
SNP2     2            151416989                            A
SNP3     5            94570535                             C
SNP6     11           9631369                              T
SNP9     20           17558523                             G
SNP10    20           23921958                             G

or their related SNP allele(s) in high linkage disequilibrium, wherein detecting said one or more SNP allele(s) is associated with a decreased risk of developing pneumonia in said subject.

7. The method of Claim 6, wherein detecting at least SNP1 allele or its related SNP allele(s) in high linkage disequilibrium is associated with a decreased risk of developing pneumonia in said subject.

8. The method of any one of Claims 1 to 7, wherein said pneumonia is Pseudomonas aeruginosa pneumonia.

9. The method of any one of Claims 1 to 8, wherein said subject is a patient who is in need of receiving mechanical ventilation during at least 48 hours.

10. The method of any one of Claims 1 to 9, wherein said risk is a risk of developing serious pneumonia, for example serious ventilator-associated pneumonia.

11. The method according to any one of Claims 1 to 10, wherein presence of a PNEUMONIA Risk Allele is determined by a SNP detection method selected from the group consisting of sequencing methods including Sanger sequencing, next generation sequencing, pyrosequencing, sequencing by ligation; PCR-based methods including PCR, real-time PCR, quantitative PCR and high-resolution melting analysis; and/or any other SNP genotyping techniques such as amplification refractory mutation system (ARMS), restriction fragment length polymorphism (RFLP) analysis, denaturing gradient gel electrophoresis (DGGE), single-strand conformation polymorphism (SSCP), and allele discrimination methods including allele-specific hybridization, molecular beacons, allele-specific single base primer extension, Flap endonuclease discrimination, 5' nuclease, oligonucleotide ligation and micro-array analysis of genomic DNA.

12. The method according to any one of Claims 1 to 11, further comprising extracting nucleic acids from the sample, wherein the presence of a PNEUMONIA Risk Allele is determined in the extracted nucleic acids.

13. The method according to any one of Claims 1 to 12, further comprising determining a prognosis of the subject based on the presence of one or more of the PNEUMONIA Risk Allele(s).

14. A kit for performing the method of any one of Claims 1-13, comprising means for detecting one or more of SNP(s) as defined in the table of Claim 1 or 2 or its related SNP allele(s) in high linkage disequilibrium.

15. The kit of Claim 14, wherein said means for detecting one or more of the SNP(s) include at least a probe that binds specifically to a nucleic acid comprising one SNP allele of the SNPs 1-10 as defined in a Table of Claim 1 or 2 or its related SNP allele(s) in high linkage disequilibrium, and/or primers that are capable of amplifying a nucleic acid comprising one SNP allele of the SNPs 1-10 as defined in the Table of Claim 1 or 2 or its related SNP allele(s) in high linkage disequilibrium.

16. The kit of Claim 14 or 15, wherein said nucleic acid comprising one SNP allele of the SNPs defined in a Table of Claim 1 or 2 or its related SNPs in high linkage disequilibrium is any one of SEQ ID NOs: 1-10 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides; or, any one of SEQ ID NOs: 11-92 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides.

Description:
IN VITRO METHOD FOR PREDICTING A RISK OF DEVELOPING PNEUMONIA IN A SUBJECT

FIELD OF THE INVENTION

The invention relates to the field of in vitro diagnosis or prognosis. The invention provides in vitro methods for predicting the risk of developing pneumonia in a subject. More specifically, the method comprises determining the presence of one or more PNEUMONIA Risk Allele(s) in a sample from said subject, wherein the presence of said one or more PNEUMONIA Risk Allele(s) indicates whether the subject is at increased or decreased risk of developing pneumonia.

BACKGROUND

Patients in the intensive care unit (ICU) are very fragile. They are likely to develop secondary infections such as nosocomial infections. Pneumonia is the second most common nosocomial infection, affecting 27% of critically ill patients. Eighty-six percent of nosocomial pneumonias are associated with mechanical ventilation and are termed ventilator-associated pneumonia (VAP). The mortality attributable to VAP has been reported to range between 0 and 50%. Higher mortality rates have been seen in VAP caused by Pseudomonas aeruginosa, Acinetobacter spp. and Stenotrophomonas maltophilia (Koenig and Truwit, 2006, Clinical Microbiology Reviews, Vol. 19, No. 4, p. 637-657). Because of this, these patients need to be intensively and closely monitored. Such surveillance requires the presence of medical staff and equipment and, in case of secondary infection, a longer stay in the ICU, leading to a lack of resource availability for other patients and increased costs.

Being able to stratify patients by their risk of developing nosocomial infections, and in particular pneumonias, would make it possible to anticipate their medical needs and thus to save lives, while freeing resources for other patients. Therefore, there is a need for a simple and sensitive method for assessing the risk of developing pneumonia in a subject.

SUMMARY

A first object of the invention relates to an in vitro method for assessing the risk of developing pneumonia in a subject, said method comprising determining the presence of one or more PNEUMONIA Risk Allele(s) in a sample from said subject, wherein the presence of said one or more PNEUMONIA Risk Allele(s) indicates whether the subject is at increased or decreased risk of developing pneumonia, and wherein said PNEUMONIA Risk Allele is an allele selected at one or more of the single nucleotide polymorphisms (SNPs) of the following Table 1:

or its related SNP allele(s) in high linkage disequilibrium.

In some embodiments, the linkage disequilibrium of a related SNP allele is determined within all populations or superpopulations from which the subject originates.

In other embodiments, the linkage disequilibrium of a related SNP allele is determined within a reference population or superpopulation to which the subject is the closest from a genetic perspective.

In some embodiments, the linkage disequilibrium of a related SNP allele is determined within one of the following reference superpopulations to which the subject is the closest from a genetic perspective:

- Americas (AMR);

- African (AFR);

- South Asian (SAS);

- East Asian (EAS); and

- European (EUR).

In other embodiments, the linkage disequilibrium of a related SNP allele is determined within one of the following reference populations to which the subject is the closest from a genetic perspective:

- Esan in Nigeria, Esan (ESN);

- Gambian in Western Division, Mandinka, Gambian (GWD);

- Luhya in Webuye, Kenya, Luhya (LWK);

- Mende in Sierra Leone, Mende (MSL);

- Yoruba in Ibadan, Nigeria, Yoruba (YRI);

- African Caribbean in Barbados, Barbadian (ACB);

- People with African Ancestry in Southwest USA, African-American SW (ASW);

- Colombians in Medellin, Colombia, Colombian (CLM);

- People with Mexican Ancestry in Los Angeles, CA, USA, Mexican-American (MXL);

- Peruvians in Lima, Peru, Peruvian (PEL);

- Puerto Ricans in Puerto Rico, Puerto Rican (PUR);

- Chinese Dai in Xishuangbanna, China, Dai Chinese (CDX);

- Han Chinese in Beijing, China, Han Chinese (CHB);

- Southern Han Chinese, Southern Han Chinese (CHS);

- Japanese in Tokyo, Japan, Japanese (JPT);

- Kinh in Ho Chi Minh City, Vietnam, Kinh Vietnamese (KHV);

- Utah residents (CEPH) with Northern and Western European ancestry (CEU);

- British in England and Scotland, British (GBR);

- Finnish in Finland, Finnish (FIN);

- Iberian Populations in Spain, Spanish (IBS);

- Toscani in Italia, Tuscan (TSI);

- Bengali in Bangladesh, Bengali (BEB);

- Gujarati Indians in Houston, TX, USA, Gujarati (GIH);

- Indian Telugu in the UK, Telugu (ITU);

- Punjabi in Lahore, Pakistan, Punjabi (PJL); and

- Sri Lankan Tamil in the UK, Tamil (STU).

In some embodiments, the subject is from one of the following superpopulations and the linkage disequilibrium is determined within the superpopulation of the subject:

- Americas (AMR);

- African (AFR);

- South Asian (SAS);

- East Asian (EAS); and

- European (EUR).

In some embodiments, the subject is from one of the following populations and the linkage disequilibrium is determined within the population of the subject:

- Esan in Nigeria, Esan (ESN);

- Gambian in Western Division, Mandinka, Gambian (GWD);

- Luhya in Webuye, Kenya, Luhya (LWK);

- Mende in Sierra Leone, Mende (MSL);

- Yoruba in Ibadan, Nigeria, Yoruba (YRI);

- African Caribbean in Barbados, Barbadian (ACB);

- People with African Ancestry in Southwest USA, African-American SW (ASW);

- Colombians in Medellin, Colombia, Colombian (CLM);

- People with Mexican Ancestry in Los Angeles, CA, USA, Mexican-American (MXL);

- Peruvians in Lima, Peru, Peruvian (PEL);

- Puerto Ricans in Puerto Rico, Puerto Rican (PUR);

- Chinese Dai in Xishuangbanna, China, Dai Chinese (CDX);

- Han Chinese in Beijing, China, Han Chinese (CHB);

- Southern Han Chinese, Southern Han Chinese (CHS);

- Japanese in Tokyo, Japan, Japanese (JPT);

- Kinh in Ho Chi Minh City, Vietnam, Kinh Vietnamese (KHV);

- Utah residents (CEPH) with Northern and Western European ancestry (CEU);

- British in England and Scotland, British (GBR);

- Finnish in Finland, Finnish (FIN);

- Iberian Populations in Spain, Spanish (IBS);

- Toscani in Italia, Tuscan (TSI);

- Bengali in Bangladesh, Bengali (BEB);

- Gujarati Indians in Houston, TX, USA, Gujarati (GIH);

- Indian Telugu in the UK, Telugu (ITU);

- Punjabi in Lahore, Pakistan, Punjabi (PJL); and

- Sri Lankan Tamil in the UK, Tamil (STU).

In specific embodiments, the subject is the closest to the European superpopulation and the linkage disequilibrium is determined within the European superpopulation.

In other specific embodiments, the subject is a European and the linkage disequilibrium is determined within one of the following reference populations to which the subject is the closest from a genetic perspective:

- Utah residents (CEPH) with Northern and Western European ancestry (CEU);

- British in England and Scotland, British (GBR);

- Finnish in Finland, Finnish (FIN);

- Iberian Populations in Spain, Spanish (IBS); and

- Toscani in Italia, Tuscan (TSI).

In some embodiments, said PNEUMONIA Risk Allele is selected among one or more of the SNP allele(s) of the following Table 2:

or its related SNP allele(s) in high linkage disequilibrium.

In related embodiments, the method comprises determining the presence of one or more SNP allele(s) of the following Table 3:

or its related SNP allele(s) in high linkage disequilibrium, wherein detecting said one or more SNP allele(s) is associated with an increased risk of developing pneumonia in said subject.

In one related specific embodiment, detecting at least SNP5 allele or its related SNP allele(s) in high linkage disequilibrium is associated with an increased risk of developing pneumonia in said subject.

In other related embodiments, the method comprises determining the presence of one or more SNP allele(s) of the following Table 4:

SNP#     Chromosome   Position on genome (hg38 / GRCh38)   Allele
SNP1     2            118942520                            C
SNP2     2            151416989                            A
SNP3     5            94570535                             C
SNP6     11           9631369                              T
SNP9     20           17558523                             G
SNP10    20           23921958                             G

or its related SNP allele(s) in high linkage disequilibrium, wherein detecting said one or more SNP allele(s) is associated with a decreased risk of developing pneumonia in said subject.

In one related specific embodiment, detecting at least SNP1 allele or its related SNP allele(s) in high linkage disequilibrium is associated with a decreased risk of developing pneumonia in said subject.
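Purely as an illustration, the Table 4 lookup can be sketched in Python. The chromosomes, positions and alleles are copied from Table 4; the function name, the input format (a mapping from chromosome and position to the set of alleles observed in the sample) and the example genotype calls are assumptions made for this sketch and are not part of the disclosed method.

```python
# Table 4 alleles (hg38 / GRCh38), copied from the text above.
# Each entry: SNP label -> (chromosome, position, allele).
TABLE_4 = {
    "SNP1": ("2", 118942520, "C"),
    "SNP2": ("2", 151416989, "A"),
    "SNP3": ("5", 94570535, "C"),
    "SNP6": ("11", 9631369, "T"),
    "SNP9": ("20", 17558523, "G"),
    "SNP10": ("20", 23921958, "G"),
}


def detected_table4_alleles(genotypes):
    """Return the Table 4 SNP labels whose listed allele is present.

    `genotypes` is a hypothetical input format: a dict mapping
    (chromosome, position) to the set of alleles observed in the sample.
    Positions missing from the dict are treated as not genotyped.
    """
    found = []
    for label, (chrom, pos, allele) in TABLE_4.items():
        if allele in genotypes.get((chrom, pos), set()):
            found.append(label)
    return found


# Made-up genotype calls for one sample (illustrative only).
sample = {
    ("2", 118942520): {"C", "T"},  # heterozygous, carries the SNP1 allele
    ("11", 9631369): {"C"},        # does not carry the SNP6 allele
    ("20", 23921958): {"G"},       # homozygous for the SNP10 allele
}
print(detected_table4_alleles(sample))
```

The sketch only reports which listed alleles are present; how their presence translates into a risk estimate is not specified here and is left to the method as described.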

In some embodiments, said pneumonia is Pseudomonas aeruginosa pneumonia.

In some other embodiments, said subject is a patient who is in need of receiving mechanical ventilation during at least 48 hours.

In some embodiments, said risk is a risk of developing serious pneumonia, and for example, serious ventilator-associated pneumonia.

In each of these embodiments, the presence of a PNEUMONIA Risk Allele in a subject may be detected by any SNP detection method selected from the group consisting of sequencing methods including without limitation Sanger sequencing, next generation sequencing, pyrosequencing, sequencing by ligation; PCR-based methods including without limitation PCR, real-time PCR, quantitative PCR; any SNP genotyping techniques such as amplification refractory mutation system (ARMS), restriction fragment length polymorphism (RFLP) analysis, denaturing gradient gel electrophoresis (DGGE), single-strand conformation polymorphism (SSCP); and/or any allele discrimination methods including allele-specific hybridization, molecular beacons, allele-specific single base primer extension, Flap endonuclease discrimination, 5' nuclease, oligonucleotide ligation and micro-array analysis of genomic DNA.

In some embodiments, the method further comprises extracting nucleic acids from the sample, wherein the presence of a PNEUMONIA Risk Allele is determined in the extracted nucleic acids.

In some embodiments, the method further comprises determining a prognosis of the subject based on the presence of one or more of the PNEUMONIA Risk Allele(s).

In some embodiments, the method further comprises adapting health care management for said subject based on the presence of one or more of the PNEUMONIA Risk Allele(s).

In another embodiment, the method further comprises administering an adapted or preventive treatment of pneumonia to the subject based on the presence of one or more of the PNEUMONIA Risk Allele(s).

Also provided herein are kits for performing the method as defined above. In one embodiment, said kit comprises:

(i) means for detecting one or more of the SNP(s) as defined in Table 1 or 2 or its related SNP allele(s) in high linkage disequilibrium;

(ii) optionally, instructions for use of the kit.

In some embodiments of the kit, said means for detecting one or more of the SNP(s) include at least a probe that binds specifically to a nucleic acid comprising one SNP allele of the SNPs 1-10 as defined in Table 1 or 2 or its related SNP allele(s) in high linkage disequilibrium, and/or primers that are capable of amplifying a nucleic acid comprising one SNP allele of the SNPs 1-10 as defined in Table 1 or 2 or its related SNP allele(s) in high linkage disequilibrium.

In specific embodiments of the kit, said nucleic acid comprising one SNP allele of the SNPs 1-10 defined in Table 1 or 2 or its related SNP allele(s) in high linkage disequilibrium is either

(i) any one of SEQ ID NOs: 1-10 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides; or,

(ii) any one of SEQ ID NOs: 11-92 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides.

DETAILED DESCRIPTION

A first object of the invention relates to an in vitro method for assessing the risk of developing pneumonia in a subject, said method comprising determining the presence of one or more PNEUMONIA Risk Allele(s) in a sample from said subject, wherein said one or more PNEUMONIA Risk Allele(s) indicates whether the subject is at increased or decreased risk of developing pneumonia.

The terms "patient" and "subject", which are used herein interchangeably, refer to a human being, including for example a man or a woman who has or is suspected to have pneumonia, or a subject at risk of developing pneumonia. In specific embodiments, a subject at risk of developing pneumonia is a patient who is in need of receiving mechanical ventilation during at least 48 hours. In other embodiments, a subject at risk of developing pneumonia is a patient in need of surgical intervention, more specifically, in need of digestive, thoracic, cardiac or neurosurgery. In other embodiments, a subject at risk of developing pneumonia is a transplanted patient or a patient in need of organ transplantation. In still other embodiments, a subject at risk of developing pneumonia is a patient suffering from chronic inflammatory disorders. In still other embodiments, a subject at risk of developing pneumonia is a patient who has had or is in need of having an immunosuppressant and/or anti-inflammatory treatment. In still other embodiments, a subject at risk of developing pneumonia is a patient who is immunocompromised.

Accordingly, in certain embodiments, the method further comprises determining a prognosis of the subject based on the presence of one or more of the PNEUMONIA Risk Allele(s).

As used herein, the term "pneumonia" refers to all pneumonia, and more preferably to pneumonia which occurs more than 48 hours after patients have been intubated and received mechanical ventilation (referred to hereafter as "ventilator-associated pneumonia" or "VAP"). These include, without limitation, pneumonia caused by Pseudomonas spp., and especially Pseudomonas aeruginosa, Acinetobacter spp., Enterobacter spp., Staphylococcus aureus, Streptococcus pneumoniae, Escherichia coli, Klebsiella spp., Haemophilus influenzae, Moraxella catarrhalis, and Stenotrophomonas maltophilia (see also Intensive Care Med. 2015, 41(1) 34-48). In a preferred embodiment, pneumonia is Pseudomonas aeruginosa ventilator-associated pneumonia.

As used herein, the term "prognosis" refers to a relative probability that a certain future outcome may occur in a patient. For example, in the context of the present invention, prognosis can refer to the likely severity of the pneumonia (e.g., severity of symptoms, rate of functional decline, survival, etc.). The terms are not intended to be absolute, as will be appreciated by any one of skill in the field of medical diagnostics.

In a preferred embodiment, a "poor prognosis" in the context of the present invention means that a patient is at higher risk of developing serious pneumonia, in particular serious VAP. As used herein, the term "serious pneumonia" refers to pneumonia, for example VAP, with one or more of the following characteristics:

- at least one recurrence of pneumonia, for example, at least one recurrence of VAP during the same intensive care unit (ICU) stay,

- at least one severe pulmonary complication among: acute respiratory distress syndrome and septic shock.

As used herein, the wording "assessing the risk of developing pneumonia" means assigning an increased or decreased probability of having/developing pneumonia and/or assigning an increased or decreased probability of having a poor prognosis for a subject having developed the disease, as compared to the average risk in a population. Of course, an increased probability does not mean that the subject will develop the disease or will have a poor prognosis for the disease. The method also may not give a precise probability for such risk but may give a relative risk assessment as compared to the average risk in a given population.

Providing a biological sample

The method of the invention is an in vitro method which can be carried out on any appropriate biological sample obtained from a subject.

As used herein, the term "biological sample" refers to a sample that contains nucleic acid materials reflecting the genomic information of cells, tissue or organs of the subject.

For example, said biological sample may be obtained from urine, blood including without limitation peripheral blood or plasma or serum, stool, sputum, saliva, bronchoalveolar fluid, endotracheal aspirates, wounds, cerebrospinal fluid, lymph node, exudate and more generally any human biopsy tissue or body fluids, tissues or materials. The method may further comprise the step of extracting nucleic acids from the biological sample, wherein the presence of a PNEUMONIA Risk Allele is detected in the extracted nucleic acids.

It is well known that the genomic DNA of an individual can easily be purified from a blood or saliva sample. Therefore, in a preferred embodiment, said biological sample is a blood sample or a derivative thereof such as plasma or serum.

The PNEUMONIA Risk Alleles

As used herein, a "PNEUMONIA Risk Allele" refers to a single nucleotide polymorphism (SNP) allele that is associated with an increased or decreased risk of developing the disease. The inventors have indeed identified that certain single nucleotide polymorphism (SNP) alleles are associated with an increased or decreased risk of developing pneumonia in a subject, and more particularly an increased or decreased risk of developing ventilator-associated pneumonia.

As used herein, a "single nucleotide polymorphism (SNP)" is a single base (nucleotide) polymorphism in a DNA sequence among individuals in a population. Typically in the literature, a single nucleotide polymorphism (SNP) may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions between genes. SNPs within a coding sequence will not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code. A SNP in which both forms lead to the same polypeptide sequence is termed "synonymous" (sometimes called a silent mutation); if a different polypeptide sequence is produced, the SNP is "nonsynonymous". A nonsynonymous change may either be "missense" or "nonsense", where a missense change results in a different amino acid, while a nonsense change results in a premature stop codon.

The exact sequence of a SNP can be determined from the database of SNPs available at the NCBI website (Entrez SNP, dbSNP build 128). The "position" of the nucleotide of interest gives the location of the SNP in a specified version of the genome, referring to the nucleotide position from the p-terminus of the chromosome in the human genome, see the NCBI SNP website (dbSNP), available on the internet. The version of the genome specified in the context of the present invention is hg38, also known as Genome Reference Consortium Human GRCh38.

As used herein, an allele refers to a particular form of a genetic locus, distinguished from other forms by its particular nucleotide sequence, or one of the alternative polymorphisms found at a polymorphic site (for example a SNP).

The PNEUMONIA Risk Alleles for use in the methods and kits of the invention include any SNP alleles as described in the following Table 5:

SNP#     Chromosome   Position on genome (hg38 / GRCh38)   rsID        PNEUMONIA risk allele   SEQ ID NO
SNP1     2            118942520                            rs2077344   C                       1
SNP2     2            151416989                            rs6735771   A                       2
SNP3     5            94570535                             -           C                       3
SNP4     10           5218849                              rs2398157   A                       4
SNP5     10           109907944                            rs686155    T                       5
SNP6     11           9631369                              rs6483579   T                       6
SNP7     11           55318817                             rs1216159   C                       7
SNP8     14           50504728                             rs7146431   C                       8
SNP9     20           17558523                             rs6044883   G                       9
SNP10    20           23921958                             rs1797041   G                       10

or their related SNP allele(s) in high linkage disequilibrium.

The rsID in the above table refers to the SNP as described for example in dbSNP. The corresponding SEQ ID NOs in the above table describe the SNP alleles (PNEUMONIA Risk Alleles) with their flanking 5' and 3' sequences.

As used herein, linkage disequilibrium refers to the co-occurrence of two genetic loci (e.g. SNP alleles) at a frequency greater than expected for independent loci based on the allele frequencies. Linkage disequilibrium (LD) typically occurs when two loci are located close together on the same chromosome. When alleles of two genetic loci are in high LD, the allele observed at one locus is predictive of the allele found at the other locus. The linkage disequilibrium measure r² (the squared correlation coefficient) can be used to evaluate how SNPs are related on a haplotype block.

In specific embodiments, SNP alleles in high linkage disequilibrium have an r² value with a specific (tag) SNP allele of greater than or equal to 0.8, greater than or equal to 0.85, greater than or equal to 0.9, or greater than or equal to 0.95, as measured in at least one of the reference populations or superpopulations from which the subject originates or to which the subject is the closest from a genetic perspective.
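For readers unfamiliar with the measure, the squared correlation coefficient between two biallelic loci is conventionally computed as r² = D² / (pA (1 - pA) pB (1 - pB)), where D = pAB - pA pB, pA and pB are the frequencies of one allele at each locus and pAB is the frequency of the haplotype carrying both. The Python sketch below illustrates this standard formula only; the function name and the frequencies are made up for the example and are not taken from the application.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation coefficient r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D measures the departure from independence of the two loci.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Made-up frequencies in a hypothetical reference population.
r2 = ld_r_squared(p_ab=0.28, p_a=0.30, p_b=0.30)
print(round(r2, 3), r2 >= 0.8)
```

With these illustrative frequencies the pair would meet the 0.8 threshold mentioned above; in practice haplotype frequencies would be estimated from reference panel data for the relevant population.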

As used herein, a reference superpopulation is selected from the following superpopulations as defined by The 1000 Genomes Project Consortium (Nature 1 October 2015, 526, 68-74, A Global reference for human genetic variation):

- African ancestry (AFR);

- Americas (AMR);

- East Asian ancestry (EAS);

- South Asian ancestry (SAS); and

- European ancestry (EUR).

For example, in a specific embodiment, if the risk to be assessed according to the disclosed method is for a subject who is closest, from a genetic perspective, to the European ancestry (EUR) superpopulation, the linkage disequilibrium r² value may be measured in the European ancestry population as the reference superpopulation.

As used herein, a reference population is selected from one of the populations as defined by The 1000 Genomes Project Consortium (Nature 1 October 2015, 526, 68-74, A Global reference for human genetic variation).

In specific embodiments, the reference population for use in determining the high linkage disequilibrium is selected from one of the following reference populations as specifically defined by The 1000 Genomes Project Consortium (Nature 1 October 2015, 526, 68-74, A Global reference for human genetic variation), in particular in the supplementary information Table 1:

African ancestry

- Esan in Nigeria, Esan (ESN);

- Gambian in Western Division, Mandinka, Gambian (GWD);

- Luhya in Webuye, Kenya, Luhya (LWK);

- Mende in Sierra Leone, Mende (MSL);

- Yoruba in Ibadan, Nigeria, Yoruba (YRI);

- African Caribbean in Barbados, Barbadian (ACB); and

- People with African Ancestry in Southwest USA, African-American SW (ASW).

Americas

- Colombians in Medellin, Colombia, Colombian (CLM);

- People with Mexican Ancestry in Los Angeles, CA, USA, Mexican-American (MXL);

- Peruvians in Lima, Peru, Peruvian (PEL); and

- Puerto Ricans in Puerto Rico, Puerto Rican (PUR).

East Asian ancestry

- Chinese Dai in Xishuangbanna, China, Dai Chinese (CDX);

- Han Chinese in Beijing, China, Han Chinese (CHB);

- Southern Han Chinese, Southern Han Chinese (CHS);

- Japanese in Tokyo, Japan, Japanese (JPT); and

- Kinh in Ho Chi Minh City, Vietnam, Kinh Vietnamese (KHV).

European ancestry

- Utah residents (CEPH) with Northern and Western European ancestry (CEU);

- British in England and Scotland, British (GBR);

- Finnish in Finland, Finnish (FIN);

- Iberian Populations in Spain, Spanish (IBS); and

- Toscani in Italia, Tuscan (TSI).

South Asian ancestry

- Bengali in Bangladesh, Bengali (BEB);

- Gujarati Indians in Houston, TX, USA, Gujarati (GIH);

- Indian Telugu in the UK, Telugu (ITU);

- Punjabi in Lahore, Pakistan, Punjabi (PJL); and

- Sri Lankan Tamil in the UK, Tamil (STU).

In specific embodiments, the subject may be of mixed genetic background with one predominant genetic background. In such a situation, a reference population or superpopulation to which the subject is the closest from a genetic perspective may be determined, and SNPs in high linkage disequilibrium in that population may be used in the method of the present disclosure.

In other embodiments, the subject is of mixed but equal genetic background, for example 50% from a reference population and 50% from another population. In such a situation, the linkage disequilibrium may be determined within all the populations from which the subject originates.

In order to determine the reference population from which a subject originates or to which the subject is the closest from a genetic perspective, many commercially available kits can be used. As an example, one can use the AncestryDNA kit (as commercialized by Ancestry).

According to other specific embodiments, the subject is within one of the following superpopulations and the linkage disequilibrium is determined within the superpopulation of the subject:

- Americas (AMR);

- African (AFR);

- South Asian (SAS);

- East Asian (EAS); and

- European (EUR).

According to other specific embodiments, the subject is from one of the following population and the linkage disequilibrium is determined within the population of the subject:

- Esan in Nigeria, Esan (ESN);

- Gambian in Western Division, Mandinka, Gambian (GWD);

- Luhya in Webuye, Kenya, Luhya (LWK);

- Mende in Sierra Leone, Mende (MSL);

- Yoruba in Ibadan, Nigeria, Yoruba (YRI);

- African Caribbean in Barbados, Barbadian (ACB);

- People with African Ancestry in Southwest USA, African-American SW (ASW);

- Colombians in Medellin, Colombia, Colombian (CLM);

- People with Mexican Ancestry in Los Angeles, CA, USA, Mexican-American (MXL);

- Peruvians in Lima, Peru, Peruvian (PEL);

- Puerto Ricans in Puerto Rico, Puerto Rican (PUR);

- Chinese Dai in Xishuangbanna, China, Dai Chinese (CDX);

- Han Chinese in Beijing, China, Han Chinese (CHB);

- Southern Han Chinese, Southern Han Chinese (CHS);

- Japanese in Tokyo, Japan, Japanese (JPT);

- Kinh in Ho Chi Minh City, Vietnam, Kinh Vietnamese (KHV);

- Utah residents (CEPH) with Northern and Western European ancestry (CEU);

- British in England and Scotland, British (GBR);

- Finnish in Finland, Finnish (FIN);

- Iberian Populations in Spain, Spanish (IBS);

- Toscani in Italia, Tuscan (TSI);

- Bengali in Bangladesh, Bengali (BEB);

- Gujarati Indians in Houston, TX, USA, Gujarati (GIH);

- Indian Telugu in the UK, Telugu (ITU);

- Punjabi in Lahore, Pakistan, Punjabi (PJL); and

- Sri Lankan Tamil in the UK, Tamil (STU).

SNP allele(s) for use in the methods and kits of the invention, in high LD (squared correlation coefficient r² greater than 0.8) with some of the SNP alleles of the above Table 5, are described in the Examples below for the European superpopulation. Such SNP alleles, with their flanking 5' and 3' sequences, are also given in SEQ ID NOs: 11-92.
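The r² criterion can be illustrated with a short calculation. The Python sketch below computes the squared correlation coefficient between two biallelic SNPs from phased haplotype counts, using the standard definition r² = D² / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA·pB. The function names and the haplotype counts used as input are illustrative assumptions; they are not data from the Examples.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Return r^2 between two biallelic SNPs from phased haplotype counts.

    n_AB counts haplotypes carrying allele A at the first SNP and allele B
    at the second SNP, and so on for the three other combinations.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second SNP
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    d = p_AB - p_A * p_B          # linkage disequilibrium coefficient D
    return d * d / denom


def in_high_ld(r2, threshold=0.8):
    """Apply the 'r^2 greater than 0.8' criterion used in the text."""
    return r2 > threshold
```

With hypothetical counts of 45, 5, 5 and 45 haplotypes, `ld_r_squared` gives 0.64, which falls below the 0.8 threshold, whereas two perfectly correlated SNPs (50, 0, 0, 50) give 1.0.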

SNP alleles associated to increased risk of developing pneumonia

The inventors have identified specific SNP alleles that are associated to an increased risk of developing pneumonia, in particular, an increased risk of developing ventilator-associated pneumonia, including an increased risk of developing Pseudomonas aeruginosa VAP.

Therefore, in one embodiment, the method comprises determining the presence of one or more of the SNP allele(s) described in the following Table 3:

SNP#    Chromosome    Position on genome hg38 / GRCh38    Allele

SNP4    10            5218849                             A

SNP5    10            109907944                           T

SNP7    11            55318817                            C

SNP8    14            50504728                            C

or its related SNP allele(s) in high linkage disequilibrium, wherein detecting said one or more SNP allele(s) is associated to an increased risk of developing pneumonia in said subject, for example, an increased risk of developing ventilator-associated pneumonia.

A preferred SNP allele associated to increased risk of developing pneumonia is SNP5 allele or its related SNP allele(s) in high linkage disequilibrium. SNP5 is SNP rs686155 (SEQ ID NO:5), located at position 109907944 in chromosome 10, and is a SNP located in an intron of XPNPEP1 gene (130bp upstream of exon 3).

SNP alleles associated to decreased risk of developing pneumonia

The inventors have also identified specific SNP alleles that are associated to a decreased risk of developing pneumonia, in particular, a decreased risk of developing ventilator-associated pneumonia, including a decreased risk of developing Pseudomonas aeruginosa VAP.

Therefore, in one embodiment, the method comprises determining the presence of one or more of SNP allele(s) described in the following Table 4:

or their related SNP allele(s) in high linkage disequilibrium, wherein detecting said one or more SNP allele(s) is associated to a decreased risk of developing pneumonia, more particularly, ventilator-associated pneumonia, in said subject.

A preferred SNP allele associated to decreased risk of developing pneumonia is SNP1 allele or its related SNP allele(s) in high linkage disequilibrium. SNP1 is SNP rs2077344 (SEQ ID NO:1), located at position 118942520 in chromosome 1, and is a SNP located in an intron of MARCO gene (100 bp downstream of exon 1).

Determining the presence or absence of a PNEUMONIA Risk Allele

The presence of the above-mentioned PNEUMONIA Risk Allele may be determined by any appropriate SNP detection methods in the biological sample.

Preferably, the method of the invention includes determining the presence of a PNEUMONIA Risk Allele by detecting a SNP allele at the nucleic acid level, for example, from genomic DNA extracted from a subject.

In one specific embodiment, the presence of the PNEUMONIA Risk Allele is determined on either one or both chromosomes, wherein the identification of the PNEUMONIA Risk Allele (e.g. one or more specific SNPs as described above) on at least one chromosome indicates that the subject is at increased or decreased risk for developing pneumonia, for example at increased or decreased risk for developing serious pneumonia, such as serious VAP.

In related embodiments, the method thus includes determining at least one SNP genotype (e.g. one or more of the SNPs described herein, in particular one or more of SNPs 1-10 described in the above Table 5 or their related SNPs in high linkage disequilibrium), such that the presence of the at least one PNEUMONIA Risk Allele determines genetic predisposition to pneumonia (increased or decreased risk of developing pneumonia) in the subject.

In some embodiments, the method includes determining whether the subject is homozygous for at least one PNEUMONIA Risk Allele, heterozygous for at least one PNEUMONIA Risk Allele, and/or does not have such PNEUMONIA Risk Allele at a given locus. In some embodiments, the presence in the genome of a subject of two or more SNPs of interest is determined. For example, two or more SNPs described in the above Table 3 or their related SNP allele(s) in high linkage disequilibrium may be used to determine whether a subject is at increased risk of developing pneumonia. Alternatively, two or more of the SNPs described in the above Table 4 or their related SNP allele(s) in high linkage disequilibrium may be used to determine whether a subject is at decreased risk of developing pneumonia.
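The zygosity determination described above can be sketched as a simple lookup. The Python snippet below takes the two alleles observed at a SNP and reports whether the subject is homozygous, heterozygous or a non-carrier for the risk allele. The allele assignments are copied from Table 3; the function names, the data layout and any example genotypes are hypothetical, and the observed alleles are assumed to be reported on the same strand as the table.

```python
# Risk-increasing alleles as listed in Table 3 (SNP identifier -> allele).
TABLE3_RISK_ALLELES = {"SNP4": "A", "SNP5": "T", "SNP7": "C", "SNP8": "C"}


def zygosity(snp_id, genotype, risk_alleles=TABLE3_RISK_ALLELES):
    """Classify a two-allele genotype relative to the risk allele of snp_id."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype needs exactly two alleles")
    risk = risk_alleles[snp_id]
    copies = sum(1 for allele in genotype if allele.upper() == risk)
    return {0: "non-carrier", 1: "heterozygous", 2: "homozygous"}[copies]


def carried_risk_snps(genotypes, risk_alleles=TABLE3_RISK_ALLELES):
    """Return the SNP identifiers for which at least one risk allele is present."""
    return sorted(
        snp
        for snp, genotype in genotypes.items()
        if snp in risk_alleles
        and zygosity(snp, genotype, risk_alleles) != "non-carrier"
    )
```

For instance, a hypothetical subject with genotypes `{"SNP5": "CT", "SNP7": "TT", "SNP8": "CC"}` would be reported as carrying the risk allele at SNP5 (heterozygous) and SNP8 (homozygous).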

Methods for detecting the presence of a PNEUMONIA Risk Allele (for example one or more of the SNPs described in Table 5 or their related SNP allele(s) in high linkage disequilibrium) in a sample are not limited to any particular way of detecting the presence or absence of a SNP and can employ any suitable method to detect a SNP in a genome of a subject, which numerous methods are well known in the art.

Such methods include, without limitation, sequencing, PCR-based methods and/or other SNP genotyping techniques. In one embodiment, determining the presence of one or more PNEUMONIA Risk Allele(s) comprises the step of sequencing a genomic DNA fragment including a SNP of interest (i.e. potentially corresponding to a PNEUMONIA Risk Allele) and analysing the sequence for determining the presence or the absence of a PNEUMONIA Risk Allele. Sequencing methods include, without limitation, Sanger sequencing, next generation sequencing, pyrosequencing and sequencing by ligation. Prior to sequencing, amplification of said genomic fragment including the SNP of interest may be carried out from the genomic DNA of the biological sample.

In another embodiment, determining the presence of one or more PNEUMONIA Risk Allele(s) comprises the step of amplifying a nucleic acid fragment susceptible of containing one of the PNEUMONIA Risk Allele(s) using PCR-based methods. PCR-based methods useful for SNP detection include, without limitation, real-time PCR, quantitative PCR, high-resolution melting analysis and amplification refractory mutation system PCR (ARMS-PCR). The amplified nucleic acid fragment may be a fragment of at least 20, 30, 40, 50, 100, 200, 300 or at least 500 consecutive nucleotides, for example comprised between 20 and 1000 consecutive nucleotides, preferably between 30 and 200 consecutive nucleotides. The person skilled in the art will adapt the size of the fragments according to the method used for determining the presence of the SNPs.

Other available methods for SNP genotyping may be used in the methods of the invention. The traditional gel-based approach uses standard molecular techniques, such as amplification refractory mutation system (ARMS), restriction digests and various forms of gel electrophoresis (e.g., RFLP), denaturing gradient gel electrophoresis (DGGE) and single-strand conformation polymorphism (SSCP). High throughput methods include allele discrimination methods (Allele-Specific Hybridization, Molecular Beacons, Allele-Specific Single-Base Primer Extension, 5' nuclease), High-throughput assay chemistry (Flap endonuclease discrimination, Oligonucleotide ligation), microarray analysis of genomic DNA, pyrosequencing and light cycler.

In a specific embodiment, the dynamic allele-specific hybridization (DASH) method is used to detect the presence of a PNEUMONIA Risk Allele. DASH genotyping takes advantage of the differences in DNA melting temperature that result from the instability of mismatched base pairs. The process can be largely automated and encompasses a few simple principles.

Typically, the target genomic segment is amplified and separated from non-target sequence, e.g., through use of a biotinylated primer and chromatography. A probe that is specific for the particular allele is added to the amplification product. The probe can be designed to hybridize specifically to the PNEUMONIA Risk Allele or alternative SNP alleles. The probe can be either labeled with or added in the presence of a molecule that fluoresces when bound to double-stranded DNA. The signal intensity is then measured as temperature is increased until the Tm can be determined. A non-matching sequence (either PNEUMONIA Risk Allele or alternative SNP alleles, depending on probe design) will result in a lower than expected Tm.

Molecular beacons can also be used to detect the SNPs in the methods of the invention. This method makes use of a specifically engineered single-stranded oligonucleotide probe. The oligonucleotide is designed such that there are complementary regions at each end and a probe sequence located in between. This design allows the probe to take on a hairpin, or stem-loop, structure in its natural, isolated state. Attached to one end of the probe is a fluorophore and to the other end a fluorescence quencher. Because of the stem-loop structure of the probe, the fluorophore is in close proximity to the quencher, thus preventing the molecule from emitting any fluorescence. The molecule is also engineered such that only the probe sequence is complementary to the genomic DNA that will be used in the assay.

If the probe sequence of the molecular beacon encounters its target genomic DNA during the assay, it will anneal and hybridize. Because of the length of the probe sequence, the hairpin segment of the probe will be denatured in favor of forming a longer, more stable probe-target hybrid. This conformational change permits the fluorophore and quencher to be free of their tight proximity due to the hairpin association, allowing the molecule to fluoresce.

If on the other hand, the probe sequence encounters a target sequence with as little as one non-complementary nucleotide, the molecular beacon will preferentially stay in its natural hairpin state and no fluorescence will be observed, as the fluorophore remains quenched.

If one molecular beacon is designed to match a PNEUMONIA Risk Allele and another to match the alternative allele, the two can be used to identify the genotype of an individual. If only the first probe's fluorophore wavelength is detected during the assay, then the individual is homozygous for the PNEUMONIA Risk Allele. If only the second probe's wavelength is detected, then the individual is homozygous for the alternative allele. Finally, if both wavelengths are detected, then both molecular beacons must be hybridizing to their complements and thus the individual must contain both alleles and be heterozygous.
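The two-beacon read-out reduces to a small decision rule, sketched below in Python. The function name and the boolean inputs are hypothetical: the sketch assumes that each fluorophore channel has already been thresholded upstream into "detected" or "not detected", and it adds a "no call" outcome, not discussed in the text, for the case where neither signal is seen.

```python
def call_genotype(risk_signal_detected, alt_signal_detected):
    """Infer a genotype from the two molecular-beacon fluorescence channels.

    risk_signal_detected: True if the beacon matching the risk allele fluoresces.
    alt_signal_detected:  True if the beacon matching the alternative allele fluoresces.
    """
    if risk_signal_detected and alt_signal_detected:
        return "heterozygous"
    if risk_signal_detected:
        return "homozygous for the risk allele"
    if alt_signal_detected:
        return "homozygous for the alternative allele"
    # Neither beacon hybridized: failed assay or unexpected sequence (assumption).
    return "no call"
```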

A microarray can also be used to detect a SNP of interest. Hundreds of thousands of probes can be arrayed on a small chip, allowing for many SNPs to be interrogated simultaneously. Because SNP alleles only differ in one nucleotide and because it is difficult to achieve optimal hybridization conditions for all probes on the array, the target DNA has the potential to hybridize to mismatched probes. This is addressed somewhat by using several redundant probes to interrogate each SNP. Probes are designed to have the SNP site in several different locations as well as containing mismatches to the SNP allele. By comparing the differential amount of hybridization of the target DNA to each of these redundant probes, it is possible to determine specific homozygous and heterozygous alleles.

PCR- and amplification-based methods can be used alternatively. For example, tetra-primer amplification refractory mutation system PCR, or ARMS-PCR, employs two pairs of primers to amplify two alleles in one PCR reaction. The primers are designed such that the two primer pairs overlap at a SNP location but each matches perfectly only one of the possible alleles. As a result, if a given allele is present in the PCR reaction, the primer pair specific to that allele will produce a product, whereas the pair specific to the alternative allele will not. The two primer pairs are also designed such that their PCR products are of significantly different lengths, allowing for easily distinguishable bands by gel electrophoresis or melt temperature analysis.

Primer extension is another possible method for detecting a SNP of interest. Primer extension first involves the hybridization of a probe to the bases immediately upstream of the SNP nucleotide followed by a 'mini-sequencing' reaction, in which DNA polymerase extends the hybridized primer by adding a base that is complementary to the SNP nucleotide. The incorporated base that is detected determines the presence or absence of the SNP allele.
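The read-out of such a mini-sequencing reaction can be sketched in a few lines of Python. Because the polymerase adds the base complementary to the SNP nucleotide on the template strand, the template allele is the Watson-Crick complement of the incorporated base. The function names are hypothetical, and the sketch assumes that the allele of interest is expressed on the template strand to which the primer anneals.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def template_allele(incorporated_base):
    """Return the template-strand SNP nucleotide implied by the incorporated base."""
    return COMPLEMENT[incorporated_base.upper()]


def alleles_detected(incorporated_bases):
    """Map every incorporated base observed in a sample to a template-strand allele."""
    return {template_allele(base) for base in incorporated_bases}


def carries_allele(incorporated_bases, allele_of_interest):
    """True if the allele of interest is among the alleles implied by the read-out."""
    return allele_of_interest.upper() in alleles_detected(incorporated_bases)
```

A sample in which both A and G are incorporated therefore carries T and C on the template strand, i.e. it is heterozygous at that position.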

The TaqMan 5' nuclease genotyping method may also be used. In this method, oligonucleotide PCR primers are designed that flank the mutation in question and allow PCR amplification of the region. A third oligonucleotide probe is then designed to hybridize to the region containing the base subject to change between different alleles of the gene. This probe is labelled with fluorescent dyes at both the 5' and 3' ends. These dyes are chosen such that, while in this proximity to each other, the fluorescence of one of them is quenched by the other and cannot be detected. Extension by Taq DNA polymerase from the PCR primer positioned 5' on the template relative to the probe leads to the cleavage of the dye attached to the 5' end of the annealed probe through the 5' nuclease activity of the Taq DNA polymerase. This removes the quenching effect, allowing detection of the fluorescence from the dye at the 3' end of the probe. The discrimination between different DNA sequences arises from the fact that, if the hybridization of the probe to the template molecule is not complete (there is a mismatch of some form), the cleavage of the dye does not take place. Thus, only if the nucleotide sequence of the oligonucleotide probe is completely complementary to the template molecule to which it is bound will quenching be removed. A reaction mix can contain two different probe sequences, each designed against a different allele that might be present, thus allowing the detection of both alleles in one reaction.

Thus, in a number of the above methods, primers and probes which span one or more fragments that comprise the putative location of a PNEUMONIA Risk Allele (e.g. a specific SNP as described in the above paragraphs) may be used to detect said PNEUMONIA Risk Allele.

As used herein, the terms "probe" and "primer" refer to one or more nucleic acid fragments whose specific hybridization to a sample can be detected. A probe or primer can be of any length depending on the particular technique it will be used for. Such probes or primers which may be used in the methods of the invention may typically be short nucleic acid molecules, for instance DNA oligonucleotides of 10 nucleotides or more in length, which can be annealed to the complementary target nucleic acid molecule by nucleic acid hybridization to form a hybrid between the primer or probe and the target nucleic acid strand. The probes or primers can be unlabelled or labelled so that their binding to a target sequence can be detected (e.g. with a FRET donor or acceptor label).

A primer can be extended along the target nucleic acid molecule by a polymerase enzyme. Therefore, primers can be used to amplify the target nucleic acid molecule, such as fragments including any of the SNPs 1-10 described in the above Tables 1-5, and/or their related SNPs in high linkage disequilibrium, including any of the SNPs in SEQ ID NOs: 11-92. The specificity of a probe or a primer increases with its length. Thus, for example, a probe or primer that includes 30 consecutive nucleotides will anneal to a target sequence with a higher specificity than a corresponding primer of only 15 nucleotides. Thus, to obtain greater specificity, probes and primers can be selected that include at least 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70 or more consecutive nucleotides.

In particular examples, a primer is at least 15 nucleotides in length, such as at least 15 contiguous nucleotides complementary to a target nucleic acid molecule. Particular lengths of primers that can be used to practice the methods of the present disclosure include primers having at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, at least 30, at least 31, at least 32, at least 33, at least 34, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, or more contiguous nucleotides complementary to the target nucleic acid molecule to be amplified, such as a primer of 15-70 nucleotides, 15-60 nucleotides, 15-50 nucleotides, or 15-30 nucleotides.

An "upstream" or "forward" primer is a primer 5' to a reference point on a nucleic acid sequence. A "downstream" or "reverse" primer is a primer 3' to a reference point on a nucleic acid sequence. In general, at least one forward and one reverse primer are included in an amplification reaction.

Nucleic acid probes and primers for use in the methods of the invention can be readily prepared based on the nucleic acid sequence flanking the SNPs of interest, for example the genomic sequence including any of SEQ ID NOs: 1-92. PCR primer pairs can be derived from a known sequence by using computer programs intended for that purpose such as Primer3 (v. 0.4.0, Whitehead Institute for Biomedical Research, Steve Rozen and Helen Skaletsky).

In other embodiments, the SNPs of interest are detected by specific hybridization of nucleic acid probes, such as oligonucleotide probes to genomic DNA or RNA transcripts or corresponding cDNA, potentially containing the PNEUMONIA Risk Alleles, for example containing any one of the SNPs 1-10 or their related SNP allele(s) in high linkage disequilibrium as described in the above paragraphs, including any of the SNPs in SEQ ID NO 11-92. Such probes can also be immobilized on a solid surface (such as nitrocellulose, glass, quartz, fused silica slide) as in an array or microarray or DNA chip. One of skill in the art will recognize that the precise sequence of particular probes and primers can be modified from the target sequence to a certain degree to produce probes that are "substantially identical" or "substantially complementary" to a target sequence, while retaining the ability to specifically bind to (i.e. hybridize specifically to) the same targets from which they are derived.
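How far a modified probe has drifted from its target can be expressed as the percentage of Watson-Crick pairs formed over the aligned region. The Python sketch below computes that percentage for a probe and a target of equal length, both written 5' to 3' and assumed to anneal in antiparallel orientation without gaps. The function name and the equal-length, gap-free alignment are simplifying assumptions made for illustration only.

```python
_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def percent_complementarity(probe, target):
    """Percentage of Watson-Crick pairs between probe and target.

    Both sequences are given 5'->3'. The target is reversed so that
    position i of the probe faces its antiparallel pairing partner.
    """
    probe, target = probe.upper(), target.upper()
    if not probe or len(probe) != len(target):
        raise ValueError("probe and target must be non-empty and of equal length")
    paired = sum(
        1
        for p, t in zip(probe, reversed(target))
        if _COMPLEMENT.get(p) == t
    )
    return 100.0 * paired / len(probe)
```

For example, the four-base probe `AAAA` against the target `TTTA` pairs at three of four positions, giving 75.0.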

In the context of the present invention, the terms "capable of hybridizing to" and "binds specifically to", which are used interchangeably, refer to a polynucleotide sequence that forms Watson-Crick bonds with a complementary sequence. One of skill will understand that the percent complementarity need not be 100% for hybridization or specific binding to occur, depending on the length of the polynucleotides, length of the complementary region and stringency of the conditions. For example, a primer or probe is at least 60%, 70%, 80%, 90%, 95%, 99% or 100% complementary over the stretch of the complementary region.

Assessing a risk in a subject and health care management

The invention further relates to patient stratification methods based on the risk assessment provided by the above-described methods. In particular, a patient identified at increased risk of developing pneumonia may have adapted health care management in order to reduce the risk of developing pneumonia and for example, in order to reduce the risk of developing serious pneumonia, such as VAP.

Accordingly, the invention further includes a method comprising (i) identifying whether a patient is at increased risk of developing pneumonia according to the above defined methods, and,

(ii) adapting health care management of said patient identified at step (i) to reduce the risk of developing pneumonia.

The patient stratification method is particularly suitable for patients who are already known to be at risk of developing pneumonia based on their clinical profile. In particular, a subject at risk of developing pneumonia may be a patient in need of surgical intervention, more specifically, in need of digestive, thoracic, cardiac or neuro surgery. In other embodiments, a subject at risk of developing pneumonia is a transplanted patient or a patient in need of organ transplantation. In still other embodiments, a subject at risk of developing pneumonia is a patient suffering from chronic inflammatory disorders. In still other embodiments, a subject at risk of developing pneumonia is a patient who has had or is in need of having an immunosuppressant and/or anti-inflammatory treatment. In still other embodiments, a subject at risk of developing pneumonia is a patient who is immunocompromised. Such patients already known to be at risk of developing pneumonia will be assayed for the presence of one or more PNEUMONIA Risk Alleles in their genome, and an adapted health care management will be decided by the clinician accordingly.

In particular, for patients with an increased risk of developing pneumonia based on the presence or absence of a PNEUMONIA Risk Allele, such adapted health care management may include: daily interruption of sedation, daily weaning trials, head of the bed elevation, oral care, usage of appropriate endotracheal tubes, subglottic secretion drainage via continuous or intermittent suction, new devices to remove biofilm from the inside of the endotracheal tube, saline instillation prior to suction, and early tracheostomy. In a specific embodiment, the method further comprises administering an adapted or preventive treatment of pneumonia to the subject, based on the presence or absence of one or more of PNEUMONIA Risk Allele(s).

For example, patients with an increased risk of developing pneumonia based on the presence or absence of a PNEUMONIA Risk Allele may be treated with immunostimulating treatment, and/or prophylactic antibiotic treatment in order to reduce the risk of developing pneumonia, for example in order to reduce the risk of developing serious pneumonia such as serious VAP.

Accordingly, the invention further includes a method comprising (i) identifying whether a patient is at increased risk of developing pneumonia according to the above defined methods, and,

(ii) treating said patient identified at step (i) with a suitable treatment for treating or preventing pneumonia.

As used herein, the term "treating" or "treatment" refers to measures, wherein the object is to prevent or slow down (lessen) the targeted pathologic condition or disorder or slow down or relieve one or more of the symptoms of the disorder.

Examples of suitable immunostimulating treatment for preventing pneumonia include, without limitation, treatment with GM-CSF, IL7, IFNγ or anti-PD1.

Examples of suitable prophylactic antibiotic treatment for preventing pneumonia are described in particular in Annales Francaises d'Anesthesie et de Reanimation 30 (2011) 168-190.

Kits for performing the method

Kits may be prepared for performing the above described methods. In particular, one object of the invention relates to a kit for assessing the risk of developing pneumonia in a subject (e.g. from a biological sample, more particularly a blood sample), according to the above described method, said kit comprising:

(i) means for detecting one or more of the PNEUMONIA Risk Allele(s) as defined in the above Table 1 or 2 or its related SNP allele(s) in high linkage disequilibrium, in particular related SNP allele(s) in high linkage disequilibrium as measured in a European superpopulation;

(ii) optionally, instructions for use of the kit.

In specific embodiments, said means for detecting one or more of the PNEUMONIA Risk Allele(s) include at least a probe that binds specifically to a nucleic acid comprising one SNP allele of the SNPs 1-10 as defined in Table 1 or 2 or its related SNP allele(s) in high linkage disequilibrium, and/or primers that are capable of specifically amplifying a nucleic acid comprising one of the SNPs 1-10 as defined in Table 1 or 2 or its related SNP allele(s) in high linkage disequilibrium. Said means for detecting a PNEUMONIA Risk Allele may therefore comprise specific primers or probes as described above.

In particular, the kit of the invention may comprise probes that bind specifically to any one of SEQ ID NOs: 1-10 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides; or, to any one of SEQ ID NOs: 11-92 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides.

In other embodiments, the kit of the invention may comprise primers that are capable of amplifying (i) a genomic DNA of the subject comprising any one of SEQ ID NOs: 1-10 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides; or, (ii) a genomic DNA of the subject comprising any one of SEQ ID NOs: 11-92 or their fragments including the SNP allele with corresponding 5' and 3' flanking regions of at least 5 nucleotides.

In particular, the kit can include one or more isolated primer or primer pairs for amplifying a target nucleic acid containing a region comprising a SNP associated to increased or decreased risk of developing pneumonia as described above. For example, the kit can include primers for amplifying a haplotype including one, two, three, four, five SNPs among SNPs 1-10 as described above or their related SNP allele(s) in high linkage disequilibrium, wherein the amplified sequence includes the SNP associated with pneumonia.

The kit can further include one or more of a buffer solution, a conjugating solution for developing the signal of interest, or a detection reagent for detecting the signal of interest, each in separate packaging, such as a container.

In another example, the kit includes a plurality of size-associated marker target nucleic acid sequences for hybridization with a detection array. The target nucleic acid sequences can include oligonucleotides such as DNA, RNA, and peptide-nucleic acid, or can include PCR fragments. The kit can also include instructions in a tangible form, such as written instructions or in a computer-readable format.

In one preferred embodiment, said kit comprises

(i) means for detecting one or more of SNPs 1-10 described in Table 5 above; and,

(ii) optionally, instructions for use of the kit.

In another specific embodiment, said kit according to the invention further comprises means for detecting other relevant genetic information, for example (i) means for detecting other genetic variants, such as SNPs or gene mutations known to be associated to an increased risk of developing respiratory disorders, including without limitation pneumonia, tuberculosis, chronic obstructive pulmonary disorders, and/or (ii) means for detecting genetic variants (e.g. SNPs or gene mutations) known to be associated to resistance to antibiotics, and/or (iii) means for detecting the presence of one or more pathogens including for example Pseudomonas spp., and especially Pseudomonas aeruginosa, Acinetobacter spp., Enterobacter spp., Staphylococcus aureus, Streptococcus pneumoniae, Escherichia coli, Klebsiella spp., Haemophilus influenzae, Moraxella catarrhalis, and Stenotrophomonas maltophilia pneumonia.

The invention will now be further illustrated by the following figures and examples. However, these examples and figures should not be interpreted in any way as limiting the scope of the present invention.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 is a representation of cases and controls on the first 2 axes of the Principal Component Analysis. The white squares represent the case patients and the black dots represent the controls.

Figure 2 is the Manhattan plot of corrected p-values. The line indicates the significance threshold after correction.

EXAMPLES

Example 1: Identification of SNPs associated to increased or decreased risk of developing ventilator-associated pneumonia

Materials and Methods

Individuals and samples

This study includes 99 individuals, 49 cases and 50 controls, who are described in Table 7. The patients had previously participated in the Pneumagene 1 study, which included 3,200 ICU patients. They had to be on invasive mechanical ventilation for more than 2 days and to have no immunodepression and no risk factor for Pseudomonas aeruginosa (Pa) ventilator-associated pneumonia (VAP), such as COPD, Pa colonization, or mucoviscidosis.

For the Pa VAP group (cases), inclusion criteria included occurrence of at least 1 confirmed VAP due to Pa during the same ICU stay, plus another sign of aggravation, i.e. confirmed recurrences due to Pa, or a severe pulmonary complication among Acute Respiratory Distress Syndrome or septic shock. Cases were negative for germs other than Pa. The control group included patients who had no VAP due to any germ during mechanical ventilation (MV). Cases and controls were matched on values of age, SAPS2, duration of MV, reason for ICU admission, potential risk factors for VAP (coma, ARDS, head trauma) and outcome.

Table 7: patient description

Variable                        Cases (n=49)    Controls (n=50)     Total (n=99)        P-value

Characteristics

Gender, male                    25 (51%)        26 (51%)            51 (52%)            1*

Age °                           55 [47, 64]     53 [46.3, 58.8]     54 [46.5, 60.5]     0.

SAPS II °                       56 [43, 66]     59 [49, 70.75]      56 [45.5, 68]       0.10 †

Duration of intubation, days    22 [17, 34]     27 [20, 38]         26 [18, 35.5]       0.08 †

Survivors                       27 (55%)        26 (52%)            53 (54%)            0.91*

Mac Cabe score

Non fatal                       41 (84%)        46 (92.0%)          87 (88%)

Ultimately fatal                8 (16%)         4 (8.0%)            12 (12%)            0.34*

Rapidly fatal                   0 (0.0%)        0 (0.0%)            0 (0.0%)

Type of admission

Medical                         37 (76%)        42 (86%)            79 (80%)

Elective surgery                3 (6%)          0 (0%)              3 (3%)              0.25 "

Emergency surgery               9 (18%)         8 (16%)             17 (17%)

Thoracic trauma                 5 (10%)         2 (4%)              7 (7%)              0.29 "

Brain trauma                    6 (12%)         2 (4%)              8 (8%)              0.16 "

ARDS                            20 (40%)        13 (26%)            33 (33%)            0.18*

Coma                            27 (55%)        34 (68%)            61 (61%)            0.27*

MOF                             21 (43%)        21 (42%)            42 (42%)            1.00*

Aspiration                      15 (30%)        11 (22%)            26 (26%)            0.45*

* median [Q1-Q3]

† Wilcoxon test of continuous variables

Chi-squared test

Fisher exact test

ARDS: Acute respiratory distress syndrome

SAPS II: Simplified Acute Physiology Score II

MOF: Multiple Organ Failure

Ethics Statements

The protocol was approved by the Comite Consultatif de Protection des Personnes dans la Recherche Biomedicale of Hopital Saint-Louis, Paris, France, in May 1999. All patients or their relatives gave written informed consent before enrollment.

DNA extraction, Exome Capture and Sequencing

Exome regions were targeted and captured using the Exome Capture kit Agilent SureSelect All Exon 50Mb. Extracted DNA was sequenced on an Illumina HiSeq 2000 (paired-end sequencing, 2x100 bp). Exome Capture and sequencing were performed by KnowMe® company.

Data analysis

For each individual, reads were aligned against the human genome reference version 38 (hg38, GRCh38) with samtools [2]. Variant calling was performed using BCFtools [2], and the output was a VCF file. Variants supported by fewer than 15 good-quality reads and variants with a quality not higher than 10 were removed. Insertions and deletions were excluded from downstream analyses. Individual VCF files were then merged into a single VCF file containing all variants identified in the 99 individuals included in our study. Due to lack of power in the downstream association analyses, variants observed with a frequency below 5% were removed. Because of their specificity, and in the absence of any suspected bias in association toward either gender, variants located on chromosomes X and Y were removed. Multiallelic SNPs (more than 2 distinct alleles observed at least once in our study) were removed. Variants located in repeated regions were excluded (from Ensembl v81, using the Ensembl Perl API). Remaining variants were annotated with the Ensembl Variant Effect Predictor tool (VEP [3]). Annotation includes: for protein-coding variants, gene, codon change, amino acid change, and functional consequence predicted by SIFT [4] and PolyPhen [5]; for all variants, rsID if available and a description of the genomic context (UTR region, intronic region, intergenic region, regulatory region...).
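The filtering steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Variant` record, its field names and the `passes_filters` function are hypothetical, the per-individual and cohort-level filters are collapsed into a single check for brevity, and the repeated-region filter (which requires Ensembl annotations) is omitted.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Variant:
    """Hypothetical, simplified variant record (not a real VCF parser)."""
    chrom: str
    pos: int
    ref: str
    alts: Tuple[str, ...]  # alternate alleles observed in the cohort
    depth: int             # good-quality reads supporting the variant
    qual: float            # variant quality score
    maf: float             # minor allele frequency in the cohort


def passes_filters(v, min_depth=15, min_qual=10.0, min_maf=0.05):
    """Return True if the variant survives the filters listed in the text."""
    if v.depth < min_depth:      # fewer than 15 supporting reads
        return False
    if v.qual <= min_qual:       # quality not higher than 10
        return False
    if len(v.ref) != 1 or any(len(a) != 1 for a in v.alts):
        return False             # insertion or deletion
    if len(v.alts) != 1:         # more than 2 distinct alleles: multiallelic
        return False
    if v.chrom in {"X", "Y"}:    # sex chromosomes are excluded
        return False
    if v.maf < min_maf:          # frequency below 5%
        return False
    return True


# Illustrative records with made-up coordinates and values.
variants = [
    Variant("1", 1000, "A", ("G",), 30, 50.0, 0.20),
    Variant("1", 2000, "A", ("AT",), 30, 50.0, 0.20),  # indel
    Variant("X", 3000, "C", ("T",), 30, 50.0, 0.20),   # chromosome X
]
kept = [v for v in variants if passes_filters(v)]
```

In this illustrative input only the first record is kept; the thresholds are exposed as keyword arguments so that each cut-off named in the text can be varied independently.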

PCA was performed with EIGENSTRAT smartpca [6] on individual genotype data to check for genetic background homogeneity between cases and controls. Association analyses were performed with PLINK [7], using a Fisher exact test comparing the distribution of alleles in cases and controls. The Benjamini and Hochberg FDR multiple testing correction [8, 9] was applied. Corrected p-values were considered significant if lower than 0.05. A Manhattan plot was used to assess the presence of significantly associated regions at genome-wide scale. When the difference between the minor allele frequency in controls and the frequency reported in public databases (1000 Genomes) was > 0.5, the variant was excluded. Variants located outside the target of the exome capture (exons flanked by 200 bp intronic regions) were not analyzed further. Variants located in genome assembly exceptions were also excluded.

Results

11,538,291 distinct variants (10,629,910 SNPs, 908,381 indels) were initially called among the 99 individual samples included in our study. After the filtering steps, 131,472 SNPs remained and were submitted to downstream association analyses. The representation of individuals on the first two eigenvectors of the PCA performed on genotypes is displayed in Figure 1. No specific cluster grouping cases or controls can be seen: there is no heterogeneity in the genetic background of cases and controls. Figure 2 shows the Manhattan plot of association results after p-value correction. After association testing and multiple testing correction, 25 SNPs remain significantly associated or inversely associated with VAP risk.
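The association testing and multiple-testing correction behind these 25 SNPs were run with PLINK; the sketch below is only a standalone illustration of the two statistical steps named in the Data analysis section (two-sided Fisher exact test on allele counts, then Benjamini-Hochberg adjustment). The function names and all allele counts are hypothetical and are not study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls. Columns: minor / major allele counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x minor alleles among cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical allele counts: (minor in cases, major in cases,
#                              minor in controls, major in controls)
tables = [(20, 78, 0, 100), (30, 68, 28, 72), (10, 88, 25, 75)]
raw = [fisher_exact_two_sided(*t) for t in tables]
corrected = benjamini_hochberg(raw)
significant = [t for t, p in zip(tables, corrected) if p < 0.05]
```

The final line applies the same decision rule as the text: a variant is retained when its corrected p-value is lower than 0.05.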

Among these 25 polymorphisms, 15 were excluded for the following reasons: 2 are located in alternate assembly regions of the genome, 1 is located outside the theoretical exome target region, and 12 show a difference between the minor allele frequency in the study and in public reference databases higher than 50%, and are likely sequencing artifacts. The remaining 10 variants are considered reliable. Association results, statistics and functional annotation for these polymorphisms are displayed in Tables 8a, 8b and 9.

Table 8a: Information on reliable significant SNPs

Ref : Allele at this locus in genome reference

A: Major allele in study

a: Minor allele in study

Na: Counts of minor allele a

Table 8b: Information on reliable significant SNPs

MAF: Minor allele frequency

OR : Odds-Ratio

95% CI: 95% Confidence Interval

*: If the allele is not observed in one or the other group, the OR and associated 95% CI cannot be estimated.
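As an illustration of the quantities defined in this legend, the sketch below computes an odds ratio and a 95% confidence interval from allele counts. The source does not state which interval method was used; the log-based (Woolf) approximation here is an assumption, and the function name and counts are hypothetical. The function returns `None` when a cell is zero, mirroring the note that the OR cannot be estimated when the allele is absent from one group.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for the table [[a, b], [c, d]].

    a, b: minor / major allele counts in cases
    c, d: minor / major allele counts in controls
    Uses the Woolf (log) approximation, assumed here for illustration.
    """
    if 0 in (a, b, c, d):
        return None  # OR and CI cannot be estimated
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from Table 8b.
result = odds_ratio_ci(10, 20, 5, 40)
not_estimable = odds_ratio_ci(20, 78, 0, 100)
```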

Table 9: functional annotation of significant reliable SNPs

SNP# | Chr | Pos | rsID | Gene | Functional annotation
SNP1 | 2 | 118942520 | rs2077344 | MARCO | Intronic, 100 bp downstream of exon 1
SNP2 | 2 | 151416989 | rs6735771 | RIF1 | Intronic, 90 bp downstream of exon 6
SNP3 | 5 | 94570535 | - | RP11-461G12 | In a processed pseudogene
SNP4 | 10 | 5218849 | rs2398157 | AKR1C4 | 3' UTR, 100 bp downstream of exon 7
SNP5 | 10 | 109907944 | rs686155 | XPNPEP1 | Intronic, 130 bp upstream of exon 3
SNP6 | 11 | 9631369 | rs6483579 | RP11-16F15 | 10 bp upstream from pseudogene
SNP7 | 11 | 55318817 | rs1216159 | OR4A11P | In an unprocessed pseudogene
SNP8 | 14 | 50504728 | rs7146431 | MAP4K5 | Intronic, 70 bp downstream of exon 3
SNP9 | 20 | 17558523 | rs6044883 | BFSP1 | Intronic, 140 bp downstream of exon 1
SNP10 | 20 | 23921958 | rs1797041 | CSTP1 | In an unprocessed pseudogene, in first pseudo-exon

Discussion

In this study, we explored the genetic susceptibility to repeated Pseudomonas aeruginosa Ventilator-Acquired Pneumopathies by analyzing the frequency of constitutional individual DNA polymorphisms in a case-control study. Genotypes were assessed from exome DNA capture and next-generation sequencing. Association analyses were performed to identify loci associated or inversely associated with this susceptibility. After several filtering steps performed to eliminate false positives, 10 SNPs remain statistically significantly associated (rs2398157, rs686155, rs1216159, rs7146431) or inversely associated (rs2077344, rs6735771, 5:94570535, rs6483579, rs6044883, rs1797041) with the condition of interest (raw p-values ~10^-6; corrected p-values between 0.00153 and 0.0481).

Among these 10 SNPs, 6 are located in protein-coding genes. None of them leads to an amino acid change, since all lie in non-coding regions (intronic or, for rs2398157, in the 3' UTR). However, they are all located within 140 bp of an exon. Exon-flanking regions are known to be potential regulatory regions, as they may contain binding sites for regulatory factors, such as splicing factors. The 4 other SNPs are located close to or in a pseudogene (RP11-16F15 and RP11-461G12, respectively) or in unprocessed pseudogenes (OR4A11P, CSTP1).

The alternative allele of rs686155 (XPNPEP1), rs1216159 (OR4A11P) and rs7146431 (MAP4K5) is observed only in cases, with a frequency of around 20%, and almost never in controls.

1. Chastre, J., Wolff, M., Fagon, J. et al. Comparison of 8 vs 15 days of antibiotic therapy for ventilator-associated pneumonia in adults: A randomized trial. JAMA 290, 2588-2598 (2003).

2. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079 (2009).

3. McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069-2070 (2010).

4. Kumar, P., Henikoff, S. & Ng, P. C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073-1081 (2009).

5. Predicting Functional Effect of Human Missense Mutations Using PolyPhen-2. Available at <http://www.ncbi.nlm.nih.gov.gate2.inist.fr/pmc/articles/PMC4480630/>

6. Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904-909 (2006).

7. Purcell, S. et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am. J. Hum. Genet. 81, 559-575 (2007).

8. Benjamini, Y. & Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B Methodol. 57, 289-300 (1995).

9. Huang, Y., Kong, X., Zhen, Z. & Liu, J. The Comparison of Multiple Testing Corrections Methods in Genome-Wide Association Studies. Adv. Psychol. Sci. 21, 1874-1882 (2013).

Example 2: Identification of SNPs in high linkage disequilibrium with SNPs of interest identified in the VAP-Exome study

Methods

This search was performed using a web application named rAggr (url: raggr.usc.edu/). rAggr is a web-based software program for finding markers (SNPs and indels) that are in linkage disequilibrium (LD) with a set of queried markers, using the 1000 Genomes Project and HapMap genotype databases. rAggr uses an expectation-maximization algorithm adapted from the Haploview software (Barrett et al, Bioinformatics. 2005 Jan 15;21(2):263-5) to calculate pairwise r² and D'. All calculations are done "on the fly" by the web server. The software was developed at the University of Southern California by Christopher K. Edlund, David V. Conti and David J. Van Den Berg. For more information on the developers and other software, visit the USC Norris Comprehensive Cancer Center Bioinformatics Core website. Copyright (c) 2015 Christopher K Edlund, David V Conti, David J Van Den Berg.

Parameters

This search was performed on the reference data from the 1000 Genomes, phase 3, Oct 2014.

The targeted population is CEU (Utah residents with Northern and Western European ancestry) + FIN (Finnish in Finland) + GBR (British in England and Scotland) + IBS (Iberian populations in Spain) + TSI (Toscani in Italy), which corresponds to individuals of European ancestry, drawn from these specific European populations. The query was performed using the rsIDs of the SNPs of interest.

SNPs were considered if they were observed with a Minor Allele Frequency (MAF) higher than 0.001 in the reference population.

SNPs were considered to be in high linkage disequilibrium with a SNP of interest if located less than 500 kb from it, and if the pairwise r² was between 0.8 and 1.
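To make the r² and D' criteria concrete, the following sketch computes both measures for a pair of biallelic markers and then applies the distance and r² thresholds stated above. It assumes that phased haplotype frequencies are already known, whereas rAggr estimates them with an expectation-maximization algorithm; the function names and the example frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (r2, D') for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first marker  (0 < p_a < 1)
    p_b:  frequency of allele B at the second marker (0 < p_b < 1)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r2 = min(1.0, d * d / (p_a * (1 - p_a) * p_b * (1 - p_b)))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


def is_high_ld(r2, distance_bp, min_r2=0.8, max_distance_bp=500_000):
    """Apply the thresholds from the text: within 500 kb and r2 in [0.8, 1]."""
    return distance_bp < max_distance_bp and min_r2 <= r2 <= 1.0


# Hypothetical frequencies: the two alleles always travel together.
r2, d_prime = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.3)
selected = is_high_ld(r2, distance_bp=120_000)
```

Note that D' can equal 1 while r² is low when the two markers have very different allele frequencies, which is one reason a threshold on r² is the stricter criterion.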

Results

Only 9 of the 10 SNPs of interest have an rsID, so only these 9 SNPs were investigated. Of these 9 SNPs, 5 have a MAF lower than 0.001 in the reference population and were not analyzed further.

The 4 remaining SNPs were queried for SNPs in high linkage disequilibrium. Results are presented in Table 10 below.

SNP of interest | SNP in high LD | SEQ ID NO
SNP1 (rs2077344) | rs11694929 | 11
 | rs11694102 | 12
 | rs1438838 | 13
 | rs1438839 | 14
 | rs1898703 | 15
 | rs4849735 | 16
 | rs55770063 | 17
 | rs4848527 | 18
 | rs1114724 | 19
 | rs2119110 | 20
 | rs10180048 | 21
 | rs1318645 | 22
SNP2 (rs6735771) | rs12616688 | 23
 | rs113653954 | 24
 | rs12616740 | 25
 | rs17193268 | 26
 | rs79256286 | 27
 | rs10490521 | 28
 | rs17806204 | 29
 | rs12618981 | 30
 | rs10206635 | 31
 | rs76604029 | 32
 | rs11675385 | 33
 | rs5024580 | 34
 | rs12994320 | 35
 | rs2342911 | 36
 | rs4335940 | 37
 | rs2432945 | 38
 | rs80167808 | 39
 | rs2444256 | 40
 | rs4467261 | 41
 | rs10173699 | 42
 | rs2432956 | 43
 | rs2432957 | 44
 | rs72860233 | 45
 | rs10175134 | 46
 | rs2432942 | 47
 | rs2432943 | 48
 | rs2432946 | 49
 | rs2444267 | 50
 | rs2444264 | 51
 | rs2045025 | 52
 | rs4664438 | 53
 | rs2432953 | 54
 | rs2432954 | 55
 | rs13555 | 56
 | rs1047957 | 57
 | rs4500942 | 58
 | rs4592848 | 59
 | rs6718372 | 60
 | rs6718653 | 61
 | rs1061305 | 62
 | rs7587301 | 63
 | rs2288193 | 64
 | rs10497081 | 65
 | rs10497082 | 66
 | rs17270233 | 67
 | rs6731486 | 68
 | rs6724796 | 69
 | rs13393021 | 70
 | rs13393122 | 71
 | rs140171536 | 72
 | rs4461258 | 73
 | rs3771901 | 74
 | rs10167358 | 75
SNP6 (rs6483579) | rs6483580 | 76
SNP10 (rs1797041) | rs1741631 | 77
 | rs1741633 | 78
 | rs1797040 | 79
 | rs1797039 | 80
 | rs1797038 | 81
 | rs1797034 | 82
 | rs1627002 | 83
 | rs1797033 | 84
 | rs1741628 | 85
 | rs1797031 | 86
 | rs1797030 | 87
 | rs3004162 | 88
 | rs2038583 | 89
 | rs2984253 | 90
 | rs2208660 | 91
 | rs3004169 | 92