Title:
METHOD AND SYSTEM FOR DETECTION OF AN ORGANISM
Document Type and Number:
WIPO Patent Application WO/2013/163210
Kind Code:
A1
Abstract:
The invention provides, inter alia, systems, compositions, kits and methods for detecting an organism, such as a microbe, microorganism, pathogen, or organism associated with Hospital Associated Infections (HAIs). The systems, compositions, kits and methods can comprise one or more probes for detecting a strain with high sensitivity, high specificity, or both. The systems, compositions, kits and methods can also be used to detect the strain within a short time frame.

Inventors:
ROLFE PHILIP ALEXANDER (US)
CLARKE IV THOMAS (US)
MAHONEY SARAH (US)
MUTAMBA JAMES (US)
LEONARD JACK THACHER (US)
GRUSZKA SARAH (US)
Application Number:
PCT/US2013/037833
Publication Date:
October 31, 2013
Filing Date:
April 23, 2013
Assignee:
ROLFE PHILIP ALEXANDER (US)
CLARKE IV THOMAS (US)
MAHONEY SARAH (US)
MUTAMBA JAMES (US)
LEONARD JACK THACHER (US)
GRUSZKA SARAH (US)
International Classes:
A61P31/04; C12N15/11; A61P31/10; C12N1/14; C12N1/20; C12Q1/04; C12Q1/68; G01N33/48; G06F19/10; C12R1/01; C12R1/725
Domestic Patent References:
WO2007083852A1 2007-07-26
Foreign References:
EP1570074A1 2005-09-07
EP2159285A2 2010-03-03
US6821770B1 2004-11-23
Other References:
ARTUR SABAT ET AL.: "New Method for Typing Staphylococcus aureus Strains: Multiple-Locus Variable-Number Tandem Repeat Analysis of Polymorphism and Genetic Relationships of Clinical Isolates.", JOURNAL OF CLINICAL MICROBIOLOGY, vol. 41, no. 4, April 2003 (2003-04-01), pages 1801 - 1804
KAMEL A. ABD-ELSALAM.: "Bioinformatic tools and guideline for PCR primer design.", AFRICAN JOURNAL OF BIOTECHNOLOGY, vol. 2, no. 5, May 2003 (2003-05-01), pages 91 - 95
LAURA BRINAS ET AL.: "Detection of CMY-2, CTX-M-14, and SHV-12 β-Lactamases in Escherichia coli Fecal-Sample Isolates from Healthy Chickens.", ANTIMICROBIAL AGENTS AND CHEMOTHERAPY, vol. 47, no. 6, June 2003 (2003-06-01), pages 2056 - 2058
Attorney, Agent or Firm:
TANNOCH-MAGIN, Vivien J. et al. (BROOK SMITH & REYNOLDS, P.C.,530 Virginia Rd, P.O. Box 913, Concord MA, US)
Claims:
CLAIMS

What is claimed is:

1. A probe set for detecting pathogenic organisms or strains in a sample, comprising at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, or more oligonucleic acid molecules that, when implemented in an assay, detect and distinguish at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more different strains, variants, or subtypes of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, or more pathogenic organisms selected from virus, bacterium, fungi, and combinations thereof, wherein each oligonucleic acid molecule in the set comprises a first sequence that specifically hybridizes to a target sequence adjacent to a region of interest in at least one of the pathogenic organisms.

2. The probe set of claim 1, wherein the set comprises oligonucleic acid molecules further comprising a second sequence that specifically hybridizes to a second target sequence adjacent to the region of interest, wherein the oligonucleic acid molecules are capable of circularizing capture of the region of interest, and further wherein the first and second target sequences are separated by at least one nucleotide.

3. The probe set of claim 1, wherein the set of oligonucleic acid molecules comprises pairs of oligonucleic acid molecules suitable for geometric amplification of the region of interest by polymerase chain reaction.

4. The probe set of any one of the preceding claims, wherein the at least 3 pathogenic organisms include any three or more of Staphylococcus aureus, Staphylococcus epidermidis, Staphylococcus saprophyticus, Acinetobacter baumannii, Clostridium difficile, Escherichia coli, Enterobacter (aerogenes, cloacae, asburiae, and combinations thereof), Enterococcus (faecium and/or faecalis), Klebsiella pneumoniae, Proteus mirabilis, Candida albicans, and Pseudomonas aeruginosa; or subtypes or strains thereof.

5. The probe set of any one of the preceding claims, wherein the probe set can further distinguish between common strains or subtypes of the organisms.

6. The probe set of any one of the preceding claims, wherein the probe set detects and distinguishes among the organisms responsible for more than 90% of the hospital acquired infections at a site.

7. The probe set of claim 6 wherein the site is a surgical site, urinary catheter, ventilator, intravenous needle, respiratory tract, or any combination thereof.

8. The probe set of any one of the preceding claims, wherein the probe set comprises: a) oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 15, 16, or all 17, of the regions of interest provided in column 1 of Table 3, or substantially similar sequences;

b) oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 15, 20, 30, 50, 100, or all 134, of the regions of interest provided in column 1 of Table 5, or substantially similar sequences;

c) oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, or all 13, of the regions of interest provided in column 1 of Table 7, or substantially similar sequences;

d) oligonucleic acid molecules capable of amplifying, geometrically by polymerase chain reaction, or circularizing capture of, 1, 2, 3, 4, 5, 10, 20, 40, 60, 80, or all 85, of the regions of interest provided in column 1 of Table 9, or substantially similar sequences;

e) oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 20, 25, or all 29 of the regions of interest provided in column 1 of Table 11, or substantially similar sequences;

f) oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 15, or all 20, of the regions of interest provided in column 1 of Table 13, or substantially similar sequences; or

g) a combination of 1, 2, 3, 4, 5, or all 6 of a), b), c), d), e) and f).

9. The probe set of Claim 8, wherein the substantially similar sequences are 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 99.5, or 100% identical to the sequence of the regions of interest indicated by the probe name in column 1 of Table 3, 5, 7, 9, 11, or 13; or

alternatively, or additionally, wherein the substantially similar sequences have endpoints within 100, 90, 80, 70, 60, 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 nucleotides upstream or downstream of either of the endpoints of the regions of interest in column 1 of Table 3, 5, 7, 9, 11, or 13.

10. The probe set of any one of the preceding claims, wherein the probe set comprises: a) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 15, 20, 25, 30, or all 34 of the sequences, or reverse complements thereof, provided in the second or third column of Table 4;

b) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 50, 100, 150, 200, 250, or all 268 of the sequences, or reverse complements thereof, provided in the second or third column of Table 6;

c) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 15, 20, 25, or all 26 of the sequences, or reverse complements thereof, provided in the second or third column of Table 8;

d) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 50, 100, 150, or all 170 of the sequences, or reverse complements thereof, provided in the second or third column of Table 10;

e) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 30, 40, 50, or all 56 of the sequences, or reverse complements thereof, provided in the second or third column of Table 12;

f) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 30, or all 40 of the sequences, or reverse complements thereof, provided in the second or third column of Table 14; or

g) a combination of 1, 2, 3, 4, 5, or all 6 of a), b), c), d), e), and f).

11. The probe set of any one of Claims 1-10, wherein the probe set detects resistance genes of any one of the CARB, CMY, CTX-M, GES, IMP, KPC, NDM, ampC, OXA, PER, SHV, VEB, VIM, ermA, vanA, canB, mecA, mexA family of genes, or any combination of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or all 18 of these families of genes.

12. A probe set comprising: a) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 15, 20, 25, 30, or all 34 of the sequences, or reverse complements thereof, provided in the second or third column of Table 4;

b) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 50, 100, 150, 200, 250, or all 268 of the sequences, or reverse complements thereof, provided in the second or third column of Table 6;

c) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 15, 20, 25, or all 26 of the sequences, or reverse complements thereof, provided in the second or third column of Table 8;

d) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 50, 100, 150, or all 170 of the sequences, or reverse complements thereof, provided in the second or third column of Table 10;

e) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 30, 40, 50, or all 56 of the sequences, or reverse complements thereof, provided in the second or third column of Table 12; and

f) oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 30, or all 40 of the sequences, or reverse complements thereof, provided in the second or third column of Table 14.

13. A probe set comprising oligonucleic acid molecules comprising 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99, or 100% of the sequences provided in the second column of Table 1.

14. A probe set comprising oligonucleic acid molecules comprising the sequences, or reverse complements thereof, provided in the second or third column of Table 4.

15. A probe set comprising oligonucleic acid molecules comprising the sequences, or reverse complements thereof, provided in the second or third column of Table 6.

16. A probe set comprising oligonucleic acid molecules comprising the sequences, or reverse complements thereof, provided in the second or third column of Table 8.

17. A probe set comprising oligonucleic acid molecules comprising the sequences, or reverse complements thereof, provided in the second or third column of Table 10.

18. A probe set comprising oligonucleic acid molecules comprising the sequences, or reverse complements thereof, provided in the second or third column of Table 12.

19. A probe set comprising oligonucleic acid molecules comprising sequences, or reverse complements thereof, provided in the second or third column of Table 14.

20. A method of detecting one or more organisms, comprising contacting a sample with the probe set of any one of the preceding claims to capture one or more regions of interest of the one or more organisms, wherein capturing a region of interest for the one or more organisms indicates the presence of the one or more organisms in the sample.

21. The method of Claim 20, wherein the one or more organisms comprise a pathogen.

22. The method of Claim 20 or 21, wherein the sample is a nucleic acid sample isolated from a biological sample obtained from a human subject.

23. The method of Claim 22, wherein the biological sample is obtained from a surgical site, catheter, ventilator, intravenous needle, respiratory tract catheter, medical device, blood, blood culture, urine, stool, fomite, wound, sputum, pure bacterial culture, mixed bacterial culture, bacterial colony, or any combination thereof.

24. The method of Claim 22 or 23, further comprising obtaining a genotype for the human subject.

25. The method of Claim 20 or 21, wherein the sample is a water sample, a food sample, or an environmental sample.

26. The method of any one of Claims 20-24, wherein the one or more captured regions of interest are sequenced to identify the one or more organisms.

27. The method of any one of Claims 20-26, wherein the one or more regions of interest are captured without culturing of the sample.

28. The method of any one of Claims 20-25, wherein a portion of a plurality of the selected target sequences are sequenced simultaneously and then mapped to a database of reference sequences to determine the identities of the organisms or genes present in the sample.

29. The method of any one of Claims 20-27, which is performed in a single tube.

30. The method of any one of Claims 20-28, wherein the capture reaction is performed in less than three hours.

31. The method of any one of Claims 20-29, wherein massive parallel sequencing is performed to sequence 50,000 to 900 hundred million reads from amplified DNA clones.

32. The method of Claim 30, wherein the reads are between about 50-2000 nucleotides in length.

33. The method of any one of Claims 20-32, wherein the method achieves a discrimination index superior to MLVA, VNTR, or spa typing.

34. The method of any one of Claims 20-33, comprising simultaneously detecting both viruses and fungi.

35. The method of any one of Claims 20-34, wherein the one or more regions of interest are predicted, in a single bacterial reference, to differ by >1 SNP from >2 other reference genomes, thereby enabling discrimination of this one genome from >2 others for the same species.

36. The method of any one of Claims 20-34, wherein one or more pathogens are detected.

37. The method of Claim 35, further comprising providing a therapeutically effective amount of a suitable antimicrobial treatment to the subject to treat the one or more pathogens detected, wherein the treatment is selected on the basis of the one or more pathogens detected.

38. A method of treating a subject determined to have a pathogenic infection by the method of any one of Claims 20-37, comprising administering a suitable antimicrobial treatment to the subject.

39. A system comprising a non-transient computer readable medium containing instructions that, when executed by a processor, cause the processor to perform steps comprising: comparing one or more captured regions of interest captured by the method of any one of Claims 20-38 to a reference database to identify the one or more organisms present in the sample; and, optionally

displaying an identity of the one or more organisms present in the sample and/or a therapeutic recommendation based on the results of the comparison.

40. The system of Claim 39, wherein the nucleic acid sequence of the captured regions of interest are compared to the reference database, which is a reference database comprising sequence information.

41. The system of Claim 39 or 40, wherein the reference database is specific to a healthcare institution or system of institutions, optionally including all of the drug resistance genes detected at an institution or system of institutions during a specified time window, and optionally allowing comparison between different time windows.

42. The system of any one of Claims 39-41, further comprising the step of updating the reference database to include the results of the method of any one of Claims 20-38.

43. The system of any one of Claims 39-42, further comprising minimal inhibitory concentration information for one or more pathogenic organisms, such as those listed in Table 2.

44. The system of any one of Claims 39-43, further comprising the step of displaying previously observed outcomes for subtypes of pathogens to be displayed alongside related treatment recommendations for a subject, such as the outcomes for different antibiotic choices, and concentrations for a particular substrain as identified by the method of any one of Claims 20-38.

45. The system of Claim 39 or 40, wherein the steps further comprise generating a phylogenetic relationship between the sample and a reference database of samples.

Description:
METHOD AND SYSTEM FOR DETECTION OF AN ORGANISM

RELATED APPLICATION

[0001] This application claims the benefit of U.S. Provisional Application No. 61/637,185, filed April 23, 2012. The entire teachings of the above application are incorporated herein by reference.

BACKGROUND

[0002] Detecting, identifying, and phenotyping pathogens found in healthcare settings is critical both for diagnostic and surveillance purposes. Traditional bacterial and fungal diagnostic procedures rely on culture techniques that produce a genus or species level identification after 24-48 hours. Such tests are ordered for patients demonstrating symptoms indicative of an infection. While culture has been the standard diagnostic method for over one hundred years, its slow turnaround time means that a physician must prescribe antibiotics before knowing the identity of the organism or its drug resistances.

[0003] More recently, rapid techniques such as qPCR and mass spectrometry have allowed sub-24 hour turnaround times and enabled surveillance applications. For example, many hospitals in the United States test every patient for MRSA on admission to determine an appropriate caution level (e.g., quarantine) for patients who are at a high risk for spreading an infection to other patients. qPCR offers quick results but minimal information: a typical test only detects the presence of one or a few sequences from one organism. Testing for additional organisms or the presence of drug resistance or virulence genes adds substantially to the cost of the test.

[0004] A test that offers sub-24 hour turnaround time while identifying a large number of organisms would offer many benefits in a healthcare setting including broad-range surveillance and faster prescriptions of the most appropriate antibiotic. The present application discloses compositions, kits, and methods that can be used to detect any or several of a large set of organisms present in a sample as well as a number of families of drug resistance genes.

SUMMARY

[0005] Provided herein are compositions, kits, and methods for identifying an organism. The organism can be a microbe, microorganism, or pathogen, such as a virus, bacterium, or fungus. In one embodiment, an organism is distinguished from another organism. In another embodiment, a strain, variant or subtype of the organism is distinguished from another strain, variant, or subtype of the same organism. For example, a strain, variant or subtype of a virus can be distinguished from another strain, variant or subtype of the same virus.

[0006] In some aspects, a probe set for identifying pathogenic organisms or strains in a sample is provided, the probe set comprising a plurality of probes that, when implemented in an assay, allows for detecting and distinguishing at least 5 different strains, variants, or subtypes of at least 3 pathogenic organisms, wherein each probe in said plurality comprises a first sequence that hybridizes to a 5' end of a target sequence of said pathogen, a 3' end of said pathogen, or to said target sequence.

[0007] In some embodiments, pathogen strains or organisms comprise a virus, bacterium, or fungus. In some embodiments, the at least 3 pathogenic organisms include Staphylococcus aureus, Staphylococcus epidermidis, Staphylococcus saprophyticus, Acinetobacter baumannii, Clostridium difficile, Escherichia coli, Enterobacter (aerogenes, cloacae, asburiae), Enterococcus (faecium, faecalis), Klebsiella pneumoniae, Proteus mirabilis, Candida albicans, and Pseudomonas aeruginosa; or subtypes or strains thereof.

[0008] In some embodiments, the probe set can not only detect and distinguish between the at least 3 organisms but can also distinguish between common strains or subtypes of the organisms. In some embodiments, the probe set detects and distinguishes among the organisms responsible for more than 90% of the hospital acquired infections at some site.

[0009] In one aspect, a probe set for identifying the presence of drug resistance genes in the organisms in a sample is provided, the probe set comprising a plurality of probes that, when implemented in an assay, allows for detecting and distinguishing at least 3 classes of resistance genes, wherein each probe in said plurality comprises a first sequence that hybridizes to a 5' end of a target sequence of said pathogen, a 3' end of said pathogen, or to said target sequence.

[0010] In one aspect, a kit containing any probe set described herein and the reagents and protocol to capture the target sequences of the organisms present in the input sample is provided.

[0011] In some aspects, a kit for the simultaneous detection of pathogens including three or more of the organisms listed in Table 2 is provided. In some embodiments, the kit is for research use. In some embodiments, the kit is a diagnostic kit. In some aspects, a kit for the simultaneous detection of antibiotic resistance genes including three or more of the genes listed in Table 3 is provided. In some embodiments, the kits described herein can be used to prepare DNA for massively parallel sequencing. In some embodiments, the kits described herein can provide molecular barcodes for the labeling of individual samples. In some embodiments, the kits described herein can include at least 10 of the probe sequences listed in Table 1.

[0012] In some embodiments, the kits described herein can be used to circularize single-stranded DNA probes by: (i) hybridization to a complementary target DNA sequence, (ii) extension across a gap by DNA polymerase, and (iii) ligation of the extended probe to form a single stranded, covalently closed circular DNA molecule.
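The three steps above (hybridization, gap extension, ligation) can be illustrated with a string-level toy model. This is only a sketch under simplifying assumptions, not the disclosed assay: the flanking sequences are written as they read on the target strand (a real probe arm would be the reverse complement), matching is exact, and the function name, the `max_gap` parameter, and the example sequences are all hypothetical.

```python
from typing import Optional


def capture_region(target: str, left_flank: str, right_flank: str,
                   max_gap: int = 500) -> Optional[str]:
    """Toy model of circularizing capture on a single target strand.

    Returns the gap sequence that a polymerase would copy between the two
    hybridization sites, or None if capture is not predicted.
    """
    # (i) hybridization: both flanks must be found, in order, on the target
    start = target.find(left_flank)
    if start == -1:
        return None
    gap_start = start + len(left_flank)
    gap_end = target.find(right_flank, gap_start)
    if gap_end == -1:
        return None
    # (ii) extension: the gap between the flanks is what gets copied;
    # require a non-empty gap no longer than the assumed max_gap
    gap = target[gap_start:gap_end]
    if not gap or len(gap) > max_gap:
        return None
    # (iii) ligation closes the circle; here we simply report the gap
    return gap


if __name__ == "__main__":
    # Invented example target: flank, 6-nt region of interest, flank
    toy_target = "GGGGACGTACTTTGGACCATGCAAAA"
    print(capture_region(toy_target, "ACGTAC", "CCATGC"))
```

A target that lacks either flank returns `None`, which corresponds to a probe that is not predicted to capture sequence from that organism.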

[0013] In one aspect, a composition comprises a probe set for identifying pathogenic organisms or strains in a sample comprising a plurality of probes that, when implemented in an assay, allows for detecting and distinguishing three or more of the organisms listed in Table 2, wherein each probe in said plurality comprises a first sequence that hybridizes to a 5' end of a target sequence of said pathogen, a 3' end of said pathogen, or to said target sequence. In one embodiment, the plurality of probes, when implemented into an assay, allows for the substantially simultaneous detection and distinguishing of three or more of the antibiotic resistance genes listed in Table 3.

[0014] In one aspect, a composition comprises a probe set for identifying antibiotic resistance genes of pathogenic organisms or strains in a sample comprising a plurality of probes that, when implemented in an assay, allows for detecting and distinguishing three or more of the antibiotic resistance genes listed in Table 3, wherein each probe in said plurality comprises a first sequence that hybridizes to a 5' end of a target sequence of said pathogen, a 3' end of said pathogen, or to said target sequence.

[0015] In one aspect, a composition comprises a probe set for identifying pathogenic organisms or strains in a sample comprising a plurality of probes that, when implemented in an assay, allows for detecting and distinguishing three or more organisms that cause Hospital Associated Infections (HAIs) at some site, wherein each probe in said plurality comprises a first sequence that hybridizes to a 5' end of a target sequence of said pathogen, a 3' end of said pathogen, or to said target sequence. In some embodiments, the three or more organisms that cause HAIs at some site comprise organisms responsible for more than 90% of the hospital acquired infections at some site. In some embodiments, the three or more organisms that cause HAIs at some site comprise organisms responsible for more than 60% of the hospital acquired infections at some site. In some embodiments, the three or more organisms that cause HAIs at some site comprise organisms responsible for more than 30% of the hospital acquired infections at some site. In some embodiments, the site is a surgical site, catheter, ventilator, intravenous needle, respiratory tract catheter, medical device, blood, blood culture, urine, stool, fomite, wound, sputum, pure bacterial culture, mixed bacterial culture, bacterial colony, or any combination thereof.

[0016] In some embodiments, a probe set is operable to detect CARB, CMY, CTX-M, GES, IMP, KPC, NDM, ampC, OXA, PER, SHV, VEB, VIM, ermA, vanA, canB, mecA, or mexA family or classes of genes, or any combination thereof. In some embodiments, some of the genomic regions chosen as target sequences are known to be highly conserved such that each genus or species tends to contain a single version of the region, thus allowing genus or species identification. In some embodiments, some of the genomic regions chosen as target sequences are known to be highly variable such that each strain or substrain will contain a different version of the region, thus enabling strain or substrain identification and differentiation. In some embodiments, some portion of a plurality of the selected target sequences are sequenced simultaneously and then mapped to a database of reference sequences to determine the most likely identities of the organisms or genes present in the sample. In some embodiments, some portion of a plurality of the selected target sequences are sequenced simultaneously and then assembled into one or more consensus sequences. When sequencing information is gathered from the probes for antibiotic resistance genes, for plasmids, and for an organism, a distinguishing fingerprint can be derived for the pathogen, and can serve as means to identify the source and extent of an outbreak.
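The mapping step described above, in which sequenced target regions are compared with a database of reference sequences, can be sketched as follows. This is a deliberately minimal illustration that assumes exact substring matching and invented toy sequences; a real pipeline would use an alignment tool and tolerate mismatches. The function and variable names are hypothetical. Reads that match more than one reference behave like the conserved regions described above (they support only a coarser identification), while reads unique to one reference behave like the variable regions that separate strains.

```python
from collections import Counter
from typing import Dict, List, Tuple


def identify(reads: List[str],
             references: Dict[str, str]) -> Tuple[List[Tuple[str, int]], int]:
    """Count reads that match exactly one reference sequence.

    Returns (ranked list of (reference name, unique read count),
             number of reads that matched several references).
    """
    unique_hits: Counter = Counter()
    ambiguous = 0
    for read in reads:
        matches = [name for name, seq in references.items() if read in seq]
        if len(matches) == 1:
            unique_hits[matches[0]] += 1   # strain-informative read
        elif len(matches) > 1:
            ambiguous += 1                 # shared (conserved) sequence
    return unique_hits.most_common(), ambiguous


if __name__ == "__main__":
    # Invented references that differ at a single position
    refs = {"strain_A": "ACGTTGCAAGGCTA", "strain_B": "ACGTTGCATGGCTA"}
    toy_reads = ["GCAAGG", "ACGTTG", "CAAGGC"]
    ranked, shared = identify(toy_reads, refs)
    print(ranked, shared)
```

Ranking by unique read count is one simple way to report the "most likely" identity; it is a design choice of this sketch, not something the text specifies.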

[0017] In one aspect, a kit comprising one or more reagents, wherein the reagents comprise a probe set according to claims 1-11, reagents for obtaining a sample, reagents for extracting nucleotides from a sample, enzymes, reagents for amplifying a region of interest, reagents for purifying nucleotides, reagents for purifying captured regions of interest, buffers, sequencing reagents, or any combination thereof, wherein the reagents allow for the capture of target sequences of three or more pathogens listed in Table 2 is provided.

[0018] In one aspect, a kit comprising one or more reagents, wherein the reagents comprise a probe set according to claims 1-11, reagents for obtaining a sample, reagents for extracting nucleotides from a sample, enzymes, reagents for amplifying a region of interest, reagents for purifying nucleotides, reagents for purifying captured regions of interest, buffers, sequencing reagents, or any combination thereof, wherein the reagents allow for the capture of target sequences of three or more antibiotic resistance genes listed in Table 3 is provided.

[0019] In one aspect, a kit comprising one or more reagents, wherein the reagents comprise a probe set according to claims 1-11, reagents for obtaining a sample, reagents for extracting nucleotides from a sample, enzymes, reagents for amplifying a region of interest, reagents for purifying nucleotides, reagents for purifying captured regions of interest, buffers, sequencing reagents, a protocol, or any combination thereof, wherein the reagents allow for the capture of target sequences of three or more pathogens listed in Table 2 and capture of target sequences of three or more antibiotic resistance genes listed in Table 3 is provided.

[0020] In some embodiments, the reagents allow the capture reaction to be performed in a single tube. In some embodiments, the reagents allow the capture reaction to be performed in less than three hours. In some embodiments, the reagents allow the capture reaction to be performed in less than two hours. In some embodiments, the detection of the three or more pathogens occurs substantially simultaneously.

[0021] In some embodiments, the plurality of probes comprises at least 3 of the probe sequences listed in Table 1. In some embodiments, each probe comprises the first sequence that hybridizes to a 5' end of said target sequence and a second sequence that hybridizes to a 3' end of said target sequence. In some embodiments, the probe set can distinguish between strains or subtypes of the organisms. In some embodiments, the detection of the three or more antibiotic resistance genes occurs substantially simultaneously. In some embodiments, the detection of the three or more pathogens and the three or more antibiotic resistance genes occurs substantially simultaneously.

[0022] In some embodiments, a kit allows for preparation of DNA for massively parallel sequencing. In some embodiments, a kit further comprises molecular barcodes for the labeling of individual samples.
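As an illustration of how per-sample molecular barcodes are typically used after pooled sequencing, the sketch below sorts reads back to their samples. It assumes, purely for illustration, that each read starts with a fixed-length barcode that must match exactly; the barcode layout, the barcode sequences, and all names here are hypothetical and are not taken from the kits described.

```python
from typing import Dict, List, Tuple


def demultiplex(reads: List[str], barcodes: Dict[str, str],
                barcode_len: int) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign each read to a sample by its leading barcode.

    barcodes maps barcode sequence -> sample name.
    Returns (reads per sample with the barcode trimmed off, unassigned reads).
    """
    by_sample: Dict[str, List[str]] = {name: [] for name in barcodes.values()}
    unassigned: List[str] = []
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag)
        if sample is None:
            unassigned.append(read)       # unknown or corrupted barcode
        else:
            by_sample[sample].append(insert)
    return by_sample, unassigned


if __name__ == "__main__":
    # Invented 4-nt barcodes and reads
    toy_barcodes = {"AACC": "sample_1", "GGTT": "sample_2"}
    toy_reads = ["AACCTTTGGA", "GGTTACGTAC", "CCCCACGTAC"]
    print(demultiplex(toy_reads, toy_barcodes, barcode_len=4))
```

Keeping unmatched reads in a separate list, rather than discarding them, makes it easy to check how many reads were lost to barcode errors.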

[0023] In some embodiments, the probe set of a kit comprises at least 10 of the probe sequences listed in Table 1. In some embodiments, the probe set of a kit comprises at least 20 of the probe sequences listed in Table 1.

[0024] In some embodiments, kit reagents can be used to circularize single-stranded DNA probes by: (i) hybridization to a complementary target DNA sequence, (ii) extension across a gap by a DNA polymerase, and (iii) ligation of the extended probe to form a single stranded, covalently closed circular DNA molecule.

[0025] In one aspect, a method of identifying an organism or pathogenic strain, variant or subtype comprising: a) contacting a sample with a plurality of probes listed in Table 1, wherein said plurality of probes detects and distinguishes at least 3 different organisms or pathogenic strains listed in Table 2, or variants or subtypes thereof; b) hybridizing a 5' end of a target sequence of said organisms or pathogenic strains, or variants or subtypes thereof, a 3' end of said target sequence, or said target sequence with a probe of said plurality; c) sequencing said target sequence; and d) identifying from said sequencing said organisms or pathogenic strains, or variants or subtypes thereof is provided.

[0026] In one embodiment, the method is performed in less than 12 hours. In one embodiment, the identifying is performed in less than 3 hours. In one embodiment, the identifying is performed in less than 2 hours. In one embodiment, the identifying is with at least 99% specificity or sensitivity.
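For reference, sensitivity and specificity figures such as the 99% threshold above are conventionally computed from counts of true and false calls against a reference method. The sketch below uses the standard definitions; the counts in the example are invented and do not come from this application.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive samples that the assay calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative samples that the assay calls negative."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Invented validation counts for illustration only
    print(sensitivity(true_pos=99, false_neg=1))
    print(specificity(true_neg=198, false_pos=2))
```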

[0027] In one aspect, a method of stratifying a host into a therapeutic group comprising: a) contacting a sample from said host with a plurality of probes listed in Table 1, wherein each probe specifically distinguishes different non-host organisms or pathogenic strains listed in Table 2, or variants or subtypes thereof; b) hybridizing a 5' end of a target sequence of a non-host organism or pathogen, a 3' end of said target sequence, or said target sequence with a probe of said plurality; c) sequencing said target sequence; d) determining an identity of said non-host organism or pathogenic strain, or variant or subtype thereof, from said sequencing; and e) stratifying said host into a therapeutic group based on said identity is provided. In one embodiment, the method further comprises determining the genotype of the host from the sample.

[0028] In some embodiments, an additional non-host organism is identified. In some embodiments, an additional strain, variant or subtype of said organism or pathogen is identified. In some embodiments, the therapeutic group differs from a therapeutic group in which only one of the non-host organisms is identified. In some embodiments, the therapeutic group differs from a therapeutic group in which only one of said strains, variants, or subtypes of said pathogen is identified.

INCORPORATION BY REFERENCE

[0029] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

[0030] The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

[0031] FIG. 1 depicts an exemplary kit configuration, indicating the position of samples and barcoding reagents within the supplied materials within the kit.

[0032] FIG. 2 provides a matrix depiction of a subset of a probeset for discrimination of genus and species amongst many genomes of various organisms. Each column on the x-axis indicates a single probe capture region, and each row indicates a reference database genome within the genus or species labeled. Dark boxes indicate that a probe is not predicted to provide sequence for this organism, whereas white boxes indicate that this probe is predicted to bind and provide sequence enabling the detection of this organism.

[0033] FIGS. 3A-3B depict exemplary plots of data that can be used to quantify target organisms including (FIG. 3A) Acinetobacter and (FIG. 3B) S. saprophyticus. In each case, genomic DNA isolated from a culture of each organism was quantified and a dilution series of 4 orders of magnitude aliquoted. Each aliquot was sequenced in triplicate and the total sequencing reads per aliquot divided by the number of internal control reads to produce a normalized quantitation of the DNA present in the sample. The plotted results indicate a highly linear and quantitative relationship between sequenced reads detected and input DNA.
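The normalization described for FIGS. 3A-3B (total target reads divided by internal control reads) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the kit's software: the function names are hypothetical, and the dilution-series values are invented solely to show the arithmetic.

```python
import math


def normalized_count(target_reads, control_reads):
    """Target reads divided by internal control reads for one aliquot."""
    if control_reads <= 0:
        raise ValueError("internal control reads must be positive")
    return target_reads / control_reads


def loglog_slope(input_amounts, normalized):
    """Least-squares slope of log10(normalized) versus log10(input).

    A slope near 1 corresponds to a proportional relationship between
    input DNA and normalized reads across the dilution series.
    """
    xs = [math.log10(v) for v in input_amounts]
    ys = [math.log10(v) for v in normalized]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical dilution series (invented numbers, not data from the figures).
inputs = [10, 100, 1000, 10000]
target = [20, 200, 2000, 20000]
control = [2000, 2000, 2000, 2000]
norm = [normalized_count(t, c) for t, c in zip(target, control)]
slope = loglog_slope(inputs, norm)
```

With these invented values the normalized counts scale in proportion to input, so the fitted log-log slope is 1 up to floating-point error.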

[0034] FIG. 4 depicts a graph showing that the kits described herein can resolve mixed samples containing multiple organisms. In each case, genomic DNA isolated from a culture of each organism was quantified and aliquoted into a sample at even copy numbers, with the sample matrix indicating mixes of up to 5 distinct genomes within each sample. Each sample was sequenced in duplicate and the total sequencing reads for each individual genome per sample divided by the number of internal control reads to produce a normalized relative quantitation of each of the genomic DNA species present in the sample. The graphed results indicate accurate detection of multiple species within a mixed DNA sample.

[0035] FIG. 5 depicts a plot demonstrating strong correlation (R² = 0.98) between the log normalized counts obtained via PGM vs log normalized counts obtained via qPCR. Genomic DNA from an organism was quantified and aliquoted as a dilution series over ~5 orders of magnitude. Each sample was sequenced in triplicate and the total sequencing reads for the genome per sample divided by the number of internal control reads to produce a normalized relative quantitation of the genomic DNA species present in each aliquot. qPCR was also performed in triplicate on each sample using a genome specific primer pair, and the qPCR relative copy number then plotted against the sequencing data (PGM normalized count). The results demonstrate a linear agreement with quantitation by qPCR over >4 orders of magnitude.
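A correlation of the kind reported for FIG. 5 can be computed as the squared Pearson coefficient of the log10-transformed values. The sketch below assumes that definition of R² (the specification does not state how the value was derived); the function names are hypothetical.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return (sxy * sxy) / (sxx * syy)


def log_r_squared(sequencing_counts, qpcr_counts):
    """R^2 between log10 sequencing counts and log10 qPCR copy numbers."""
    log_seq = [math.log10(v) for v in sequencing_counts]
    log_qpcr = [math.log10(v) for v in qpcr_counts]
    return r_squared(log_seq, log_qpcr)
```

Working in log space keeps the highest-concentration aliquots from dominating the fit across a multi-order dilution series.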

[0036] FIG. 6 depicts a plot of the ratio of viral (HIV) reads to GFP against the initial template concentration in the reaction. cDNA from HIV was quantified and aliquoted as a dilution series over ~4 orders of magnitude. Samples were prepared in the presence of 1000 genome equivalents of human DNA isolated from cultured HEK-293 cells, or in the absence of background competing DNA. Each sample was sequenced and the total sequencing reads for the genome per sample divided by the number of GFP internal control reads to produce a normalized relative quantitation of HIV cDNA present in each aliquot. No significant difference was observed in the number of sequencing reads per sample in the presence or absence of competing background human DNA.

[0037] FIG. 7 depicts a plot comparing the detection of cDNA from 2 HIV strains (CN009 and CN006) obtained via PGM vs. MiSeq. The plot is shown as the adjusted GFP read count against the CN009 template count. In each case, cDNA from HIV CN009 was quantified and aliquoted as a dilution series over ~5 orders of magnitude. Into each CN009 aliquot, 3000 genome equivalents of CN006 genome were also added. Each sample was sequenced in duplicate and the total sequencing reads for each individual genome per sample divided by the number of internal control reads to produce a normalized relative quantitation of each of the genomic DNA species present in the sample. The plots indicate a consistent level of CN006 detection per sample, and a linear detection of CN009 over >4 orders of magnitude. This also demonstrates the detection of two species at minor variant frequencies as low as 1%.

[0038] FIG. 8 depicts a plot of sequencing counts per probe within a probeset, for replicate sample B1 against replicate sample B2. The plot demonstrates a highly linear and reproducible probeset internal performance.

[0039] FIGS. 9A-9B depict plots of the ratio of minor:major pathogen PGM reads against the percent ratio of minor:major pathogen in the reaction for (FIG. 9A) minor pathogen detected to 1% major pathogen (S. epidermidis and E. coli) and (FIG. 9B) minor pathogen detected to 10% major pathogen (S. saprophyticus and A. baumannii). In each case, genomic DNA isolated from a culture of two organisms was quantified and aliquoted into a sample at a ratio of 10:1, 1:1, 1:10 and 1:100. Each sample was sequenced in triplicate and the total sequencing reads for each individual genome per sample divided by the number of internal control reads to produce a normalized relative quantitation of each of the genomic DNA species present in the sample. The graphed results indicate accurate detection of multiple species within a mixed DNA sample down to a minor variant level of at least 1%.
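The minor:major read ratio plotted in FIGS. 9A-9B follows directly from the per-genome normalized counts. A minimal sketch, with a hypothetical function name and invented read counts:

```python
def minor_to_major_percent(minor_reads, major_reads, control_reads=1):
    """Minor:major ratio of normalized read counts, expressed as a percent.

    Both genomes are normalized by the same internal control count, so the
    control term cancels in the ratio; it is kept here only to mirror the
    normalization step described in the text.
    """
    if control_reads <= 0 or major_reads <= 0:
        raise ValueError("control and major read counts must be positive")
    minor_norm = minor_reads / control_reads
    major_norm = major_reads / control_reads
    return 100.0 * minor_norm / major_norm


# Invented example: 50 minor-pathogen reads against 5000 major-pathogen reads.
observed = minor_to_major_percent(50, 5000, control_reads=250)
```

In this invented example the observed ratio is about 1%, the level that a 1:100 input mix would be expected to yield if detection were proportional.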

[0040] FIG. 10A describes a list of assay components within the HAI BioDetection kit.

[0041] FIG. 10B illustrates the layout of customer samples and control samples of two 8 well strips that the BioDetection assay is performed within.

[0042] FIG. 10C indicates validated performance specifications and criteria for the HAI BioDetection kit.

[0043] FIG. 11A describes the multilevel structure of the HAI BioDetection probeset.

[0044] FIG. 11B provides a matrix depiction of two probes within the BioDetection probeset to illustrate discrimination of species and strain amongst many genomes of Staphylococcus. Each column on the x-axis indicates a SNP detected by either probe 1 or probe 2 capture region, and each row indicates a reference database genome within the genus or species labeled. Black boxes indicate that a probe is not predicted to provide sequence for this organism, whereas shaded boxes indicate that this probe is predicted to bind and provide sequence enabling the detection of this organism.

[0045] FIG. 11C describes the levels of multiplexity achieved by the HAI BioDetection kit by assaying many sequence variants within a sample, compared to the single nucleotide discriminatory ability of a PCR primer.

[0046] FIG. 12 illustrates the workflow from input sample, either purified genomic DNA from culture or, e.g., DNA enriched from a swab of a patient wound site. The BioDetection kit workflow is illustrated in elapsed time (from t=12.00) and the workflow timing for each individual step is broken out to the left of the workflow. A barcoding primer set allows 16 to 96 samples to be sequenced simultaneously in one run on a sequencing platform (in this illustration an Ion Torrent PGM, but alternatively an Illumina MiSeq or HiSeq platform). The interpretation box illustrates a computational software and graphical display of simplified data output.

[0047] FIG. 13 illustrates a graphical display of summarized sequencing results from the BioDetection kit. The graphical display is subdivided into genus, species and strain level detection results, resistance gene information if resistance loci are detected, the readcount for samples and internal controls, and also any potential warnings due to poor sample performance. A color-coded similarity score (green=similar, yellow=moderate similarity, red=little similarity) and a similarity score absolute value are calculated for the sequence similarity detected by the kit and compared to the next most related organism at genus, species and strain level using a reference database of published genomic sequences that also contains previous genome sequences detected by the BioDetection kit. In the illustrated example, the sample has been demonstrated to contain both Enterococcus faecalis as the primary species and Escherichia coli as a minor species present within the sample. The samples are 65.4% and 74% homologous to the nearest neighbor strains described.

[0048] FIG. 14 illustrates a schematic comparison of the turnaround time and workflow steps to generate substrain level resolution and drug resistance typing of bacterial samples using traditional microbiology, a combination of PCR and/or mass spectroscopy, whole genome sequencing (WGS), or the BioDetection kit. A clear advantage is illustrated with the BioDetection kit in terms of fewer workflow steps and faster achievement of substrain resolution and drug resistance typing compared to these alternative methods.

[0049] FIG. 15A describes a collection of 38 MRSA samples that were subtyped using both the HAI BioDetection Kit and spa locus VNTR typing (using PCR and Sanger sequencing). Sequence regions captured using the BioDetection kit were used to construct a phylogenetic tree, and each sample was annotated with the spa-typing result for the same sample. The tree demonstrates the discrimination of samples with the same spa-type into multiple unique isolates using the BioDetection kit. Further, the clustering generated by the BioDetection kit largely groups according to spa-type, as would be predicted for more closely related samples.

[0050] FIG. 15B describes the number of sequence variants detected amongst 38 samples sequenced with the BioDetection kit, or typed by spa type, or 8 representative samples encapsulating the broad phylogenetic tree structure and then Sanger sequenced using the MLST subtyping amplicons, or the 16S ribosomal sequencing amplicon. The total number of MRSA samples uniquely discriminated by each approach is also described.

[0051] FIG. 15C describes 4 bacterial cohorts sequenced using the BioDetection kit. The table indicates the number of samples per cohort, and the number or % of the samples that were discriminated into unique isolates. The data demonstrates that the HAI kit is capable of near unique discrimination of bacterial isolates within many large cohorts.

[0052] FIG. 16A is an in silico model predicting the superiority of the present invention over VNTR approaches. Thirty-three genome sequences were extracted from reference databases for which whole genome sequence assemblies were available. An in silico analysis extracted regions of these genomes that are assayed using MLVA, MLST and spa subtyping methods. The discriminatory index (defined as the % of total genomes discriminated into unique isolates) of each technique was calculated based upon the assayed regions, and compared with the regions assayed by the BioDetection kit. The BioDetection kit discriminated all samples into unique groups, and demonstrated a higher discriminatory index than other assays.
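The discriminatory index defined above (the % of total genomes discriminated into unique isolates) can be illustrated with a short Python sketch. It assumes one reading of that definition, namely that a genome counts as discriminated when no other genome shares its sequence profile over the assayed regions. The function name and the toy profiles are hypothetical.

```python
from collections import Counter


def discriminatory_index(profiles):
    """Percent of genomes whose assayed-region profile is shared by no other genome.

    `profiles` maps a genome identifier to a hashable profile, for example
    the concatenated sequence of the regions a typing method assays.
    """
    if not profiles:
        raise ValueError("at least one genome is required")
    counts = Counter(profiles.values())
    unique = sum(1 for profile in profiles.values() if counts[profile] == 1)
    return 100.0 * unique / len(profiles)


# Toy example: two genomes are indistinguishable under a coarse method.
coarse = {"g1": "AAC", "g2": "AAC", "g3": "AGC", "g4": "TGC"}
fine = {"g1": "AACT", "g2": "AACG", "g3": "AGCT", "g4": "TGCT"}
```

Here `discriminatory_index(coarse)` is 50.0 and `discriminatory_index(fine)` is 100.0, mirroring how a method that assays more informative regions separates more genomes.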

[0053] FIG. 16B is a tanglegram demonstrating better phylogeny reproduction by the BioDetection kit. The sequences extracted by the in silico analysis of Figure x2 were used to construct phylogenetic trees to describe the relationships between samples. The whole genome sequence (WGS) was used as a reference tree, and the BioDetection and MLVA constructed trees compared using a tanglegram figure. Red and pink lines represent regions of the tree that are significantly different between methods, whereas parallel grey lines illustrate relationships that are described equivalently by both methods. The figure demonstrates significant discordance between the MLVA and WGS trees, but largely comparable phylogenetic relationships described by the WGS and BioDetection trees. This data indicates that the evolutionary relationships described by the BioDetection kit are a more accurate appraisal of the whole genome relationships and evolutionary distance between samples than MLVA approaches. Similar significant discordances were observed between WGS and spa-typing and MLST trees compared to MLVA.

[0054] FIG. 17 demonstrates detection from DNA isolated from stool, urine and sputum. Sputum, stool and urine samples derived from human individuals were spiked with genomic DNA isolated from cultured bacteria. Samples were extracted using standard DNA extraction methods, such that each sample contained some amount of gDNA plus any additional complex biomolecules that are carried through the extraction from these sample types (heme, complex polysaccharides and potential enzyme inhibitors). Each isolated DNA sample was assayed using the BioDetection kit and results are tabulated. This data demonstrates accurate species and strain detection from DNA isolated from sputum, urine and stool samples, plus identification of resistance genes present within each sample.

[0055] FIG. 18 is a summary of 707 bacterial samples sequenced and identified using the BioDetection kit. The table demonstrates the capability of the kit to detect species not listed and validated within the performance specifications, due to the broad detection and discriminatory ability of the selected sequence capture regions.

[0056] FIG. 19 shows detection of VRE from rectal swabs. A collection of n=24 positive and negative rectal swab samples screened by microbial culture were collected after primary screening at a hospital laboratory. Bound DNA was released into a PBS wash solution by incubation for 1 hr at 37 degrees centigrade, isolated using a gDNA extraction kit and then assayed using the BioDetection kit. The resulting data was tabulated to indicate the detection of organisms and drug resistance genes (plus read counts) from each sample, and comparison to the clinical surveillance by culture. This data demonstrates accurate species, strain and resistance gene detection from rectal swabs. In particular the data illustrates detection of multiple Enterococcus strains, plus vancomycin resistance, and additional co-present species on the rectal swabs, such as E. coli and K. pneumoniae. Further, sample PGCA963 demonstrates low level E. faecium and E. coli detection in a culture negative sample.

[0057] FIG. 20 describes a summary of drug resistance loci detected over n=707 clinical isolates and clinical specimens sequenced using the HAI kit. Read count for each marker exceeds a minimum of 10 reads and often incorporates detection of multiple sequences within the gene, providing high confidence detection. This table demonstrates a range of drug resistance markers confirmed for detection by the BioDetection kit.

[0058] FIG. 21 demonstrates a comparison of 2 samples sequenced using the BioDetection kit from clinical Klebsiella isolates. The sequence represents the captured sequence of a single probe within the BioDetection probeset. The pairwise sequence comparison illustrates mismatches at a single probe locus (of multiple discriminating probes) and indicates that even a single locus commonly contains multiple SNPs of high confidence discrimination between closely related species.

[0059] FIG. 22A shows high confidence SNP calling by readcount vs WGS. A cohort of 20 MRSA samples were sequenced using the BioDetection kit on an Ion Torrent PGM, and a Nextera™ whole genome sequencing approach on a MiSeq. Samples were sequenced on an Ion 316 chip (~3.2M reads), and a single MiSeq run (~15M reads). Reads were aligned to a reference genome and coverage compared between sequencing approaches. The plots describe the genomic coordinates (x-axis) and the log10 sequencing read depth at each nucleotide (y-axis) for 3 individual probes. The BioDetection kit generates considerably higher readcounts (10-100 fold) at discriminatory regions between samples, enabling higher confidence SNP calling for this targeted sequencing vs the low read depth of whole genome sequencing. This also supports accurate detections for each of the SNPs by independent library constructions using different sequencing technologies.

[0060] FIG. 22B shows genomic coordinates. Two sequence alignments compare the consensus read sequence at 2 regions captured by both the HAI BioDetection kit and Nextera™ whole genome sequencing and reference genome alignment. For samples TC14, TC5 and TC4, the sequences show agreement for detection of an indel within sample TC14, and two SNPs within TC14 relative to TC4 and TC5.

DETAILED DESCRIPTION

[0061] Approximately one out of every twenty hospitalized patients will contract a nosocomial infection, more commonly known as a hospital-acquired infection (HAI). More than 70 percent of the bacteria that cause HAIs can be resistant to at least one of the antibiotics most commonly used to treat them. Early detection can be important for controlling the spread of hospital-acquired infections. After culturing for growth and isolation of pathogens, clinical microbiology laboratories may rely on observable phenotype and simple biochemical assays to determine the bacterial type and antibiotic sensitivity. Determining the most effective antibiotic treatment for the infected patient, not the causal agent of the infection, is usually the prerogative of the physician. The resolution of conventional microbiological assays may be insufficient to determine the precise genotype underlying antibiotic resistance. Consequently, the same organism can infect multiple patients, and the spread of infection can go unnoticed for long periods.

[0062] Urinary tract infection (UTI) is the most common hospital-acquired infection. UTIs account for about 40 percent of hospital-acquired infections, and an estimated 80 percent of UTIs are associated with urinary catheters. Pneumonia is the second most common HAI. In critically ill patients, ventilator-associated pneumonia (VAP) is the most common nosocomial infection. VAP can double the risk of death, significantly increase intensive care unit (ICU) length of stay, and can add to each affected patient's hospital costs.

[0063] A key problem for microbiology labs is the turnaround time from receiving a microbial sample to determining key actionable information for patient care, such as antibiotic drug resistance within the sample, or strain identification for comparison to known high-risk strains. Existing technologies such as PCR or mass spectroscopy have allowed the turnaround time to be improved relative to classical methods for some actionable information, such as species identification, or presence of a select few drug resistance genes, but there are few practical approaches to assaying the large number of drug resistance genes or key species that must be identified to confidently predict patient treatment.

[0064] DNA microarray offers broad detection ability for genomic loci, but is complicated by slow sample preparation and by false positive and false negative sample results due to the hybridization based approach. Targeted DNA sequencing using the BioDetection kit allows greater breadth of target detection, and higher resolution and higher accuracy discrimination due to the single base accuracy of DNA sequencing.

[0065] A second competing approach to targeted sequencing is whole genome sequencing. This approach has several disadvantages relative to the targeted sequencing approach provided by the invention. First, whole genome libraries contain many uninformative regions that are identical between the majority of isolates in a species, and thus provide no information to discriminate. These uninformative reads mean that many more WGS reads are required per sample to capture informative regions, and prevent higher numbers of samples from being multiplexed into a single sequencing channel to amortize sequencing costs. Second, WGS libraries contain a representative fraction of any DNA present within a sample. As such, primary samples containing human tissue, or many uninteresting bacteria from the perspective of patient health, will consist mainly of unwanted human or commensal bacterial reads. Efficient detection of important bacteria and drug resistance genes within a sample requires a more efficient targeted approach. Third, library preparation times are slower and more laborious using WGS approaches, and the data analysis time is significantly longer than that of a targeted sequencing approach in which only key informative regions are analyzed. This faster analysis reduces turnaround time and costs, and allows simplified data representations for easier understanding for clinical scientists unfamiliar with next generation sequencing data.

[0066] Provided herein are compositions, methods, systems and kits for detecting an organism, such as a pathogen, such as a pathogen that causes HAIs, as well as methods for using the system to identify and detect the organism. The system can comprise a probe or plurality of probes. Also provided herein are compositions, methods, systems and kits for detecting an organism, such as a pathogen, such as a pathogen that causes HAIs, and detecting and identifying antibiotic resistance genes, which, in some embodiments, can be performed simultaneously.

Probes

[0067] In some embodiments, the invention provides panels of probes and methods of using them, where the panels include circularizing capture probes, such as molecular inversion probes. Basic design principles for circularizing probes, such as simple molecular inversion probes (MIPs) as well as related capture probes are known in the art and described in, for example: Nilsson et al., Science, 265:2085-88 (1994); Hardenbol et al., Genome Res., 15:269-75 (2005); Akhras et al., PLOS One, 9:e915 (2007); Porreca et al., Nature Methods, 4:931-36 (2007); Deng et al., Nat. Biotechnol., 27(4):353-60 (2009); U.S. Patent Nos. 7,700,323 and 6,858,412; and International Publications WO 2011/156795, WO/1999/049079 and WO/1995/022623, all of which are incorporated by reference in their entirety.

[0068] A system for detection of an organism, such as identifying a strain, variant or subtype of a pathogen, can comprise a mixture or probe set comprising a plurality of probes. The target organism for a particular probe may be any organism, such as a viral, bacterial, fungal, archaeal, or eukaryotic organism, including single cellular and multicellular eukaryotes. In particular embodiments, a target organism is a pathogen. In some embodiments, target organisms include organisms associated with or that cause HAIs, such as those organisms provided in Table 2.

[0069] In some embodiments, each single-stranded capture probe can hybridize to two complementary regions on a target DNA with a gap region in between. An enzyme, such as DNA polymerase, can be used to fill in the gap using the target as template, and stop adding nucleotides when it reaches the phosphorylated 5'-terminus of the hybridized probe. An enzyme, such as a thermostable ligase, can be used to covalently close the extended probe to form a circular molecule. Exonucleases can be used to digest away residual probe molecules. The filled-in, circularized probe can be resistant to exonuclease digestion, and can serve as template for preparation of the sequencing library by known methods, such as PCR. Sample-associated barcodes can be added and can enable multiple barcoded samples to be blended and analyzed together, such as on a DNA sequencer.
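The gap-fill capture described in this paragraph can be modeled in silico as locating two arm binding sites on a target and returning the sequence between them. The following Python sketch is a deliberately simplified illustration: it assumes exact string matches stand in for hybridization, and it takes both arm sites already written in the orientation of the target strand (strand handling and mismatch tolerance are omitted). The function and parameter names are hypothetical.

```python
def capture_region(target, upstream_site, downstream_site, min_gap=2):
    """Return the gap between two arm binding sites on `target`, or None.

    `upstream_site` and `downstream_site` are the target-strand sequences
    bound by the two probe arms. The gap between them is what a polymerase
    would copy before ligation closes the circle. `min_gap` reflects a
    region of interest of at least 2 nucleotides.
    """
    start = target.find(upstream_site)
    if start == -1:
        return None  # first arm has no binding site
    gap_start = start + len(upstream_site)
    gap_end = target.find(downstream_site, gap_start)
    if gap_end == -1:
        return None  # second arm has no site downstream of the first
    gap = target[gap_start:gap_end]
    return gap if len(gap) >= min_gap else None


# Invented target sequence for illustration only.
example_target = "TTTACGTGGGCCCAAATTTCAGT"
captured = capture_region(example_target, "ACGT", "AAAT")
```

In this toy case the captured region is `GGGCCC`; a probe whose arm sites are absent from the target returns `None`, the in silico analogue of a probe that never circularizes and is removed by exonuclease.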

[0070] A probe can refer to a sequence that hybridizes to another sequence. The probe can be a linear, unbranched polynucleic acid. The probe can comprise two homologous probe sequences separated by a backbone sequence, where the first homologous probe sequence is at a first terminus of the nucleic acid and the second homologous probe sequence is at the second terminus of the nucleic acid, and where the probe is capable of circularizing capture of a region of interest of at least 2 nucleotides. Circularizing capture can refer to a probe becoming circularized by incorporating the sequence complementary to a region of interest.

[0071] In a preferred embodiment, the probes contain two arms, joined by a backbone, that hybridize to a target sequence. A polymerase molecule can extend the 3' end of the probe by copying a target region into a probe molecule. A ligase molecule can circularize a probe molecule by joining the 3' end of the copied target to the 5' end of the original probe molecule.

[0072] In one embodiment, probe arms can hybridize to the target nucleic acid molecule, surrounding the capture region; a polymerase extension can fill in the gap between the arms and a ligase can create a circular molecule out of the extended probe. After an exonuclease digestion removes the original template molecules, primers can be used to amplify the captured probes. The primers can contain a 3' end homologous to the backbone (forward primer) and its reverse complement (reverse primer). The 5' end of the primer may contain a sequencing adapter for a particular next generation sequencing platform and may also contain a barcode sequence between the 5' and 3' segments such that multiple samples, each amplified with primers containing a sample-specific barcode, can be multiplexed into a single sequencing run. As the two probe arms are linked by a backbone, on-target binding is energetically favorable, even when many (hundreds, thousands, or tens of thousands) of probes are present in a single reaction (compare to PCR, in which one primer of a pair may hybridize and extend at an off-target locus). As with PCR, each MIP can capture a well-defined region of the target sequence (compare to hybridization capture methods, which yield a variety of molecules centered around the target).
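Sample-specific barcodes allow pooled reads to be assigned back to their samples after sequencing. The sketch below shows one simple way this could be done in Python; it assumes the barcode occupies a fixed number of bases at the start of each read and must match exactly, which is an illustrative simplification (real platforms differ in barcode position and error handling). The barcode sequences and sample names are invented.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Group reads by sample using a fixed-length leading barcode.

    Reads whose barcode is not in `barcode_to_sample` are collected under
    the key "unassigned". The barcode is trimmed from each stored read.
    """
    bins = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Invented barcodes and reads for illustration only.
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = ["ACGTTTGA", "TGCAGGCC", "NNNNAAAA", "ACGTCCCC"]
by_sample = demultiplex(pooled, barcodes, barcode_len=4)
```

Each sample's bin then holds only its own captured sequences, so per-sample read counts and consensus sequences can be computed independently.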

[0073] In a preferred embodiment, a backbone of a probe molecule contains the same sequence in all probes. A backbone can contain two primer binding sites that allow amplification of probe arms and a captured target sequence. In a preferred embodiment, the primers used may contain a barcode to allow multiple samples to be separated after simultaneous sequencing. In a preferred embodiment, the primers also contain 5' ends that are adapters for a next-generation sequencing platform such as the Ion Torrent PGM, Illumina MiSeq, Illumina HiSeq, Nanopore, etc. (FIG. 1).

[0074] The probe set can include a large number of probes, e.g., 10, 20, 30, 40, 50, 100, 200, 400, 500, 1000, 2000, 3000, 4000, 5000, 10000, 20000, 40000, 80000, or more. The probe set can include one or more probes directed to a large number of different target organisms, e.g., at least 10, 20, 40, 60, 80, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1250, 1500, 1750, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more different target organisms. In some embodiments, a mixture including one or more probes to a plurality of target organisms contains only one probe to a target organism. In other embodiments, the mixture contains more than one probe to a target organism, e.g., about 2, 3, 4, 5, 6, 7, 8, 9, or 10 probes for a target organism. In certain embodiments, such as embodiments designed for use with patient test samples, the mixture further includes probes with homologous probe sequences that specifically hybridize to the host genome for applications such as host genotyping. In some embodiments, the mixtures of the invention further comprise sample internal calibration standards.

[0075] In one embodiment, the plurality of probes can detect at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 1750, or 2000 different organisms or pathogens. In another embodiment, the plurality of probes can detect at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1250, 1500, 1750, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more different strains, variants or sub-types of a pathogen or different strains or sub-types of different pathogens. In one embodiment, the probe set detects at least 2 different bacterial or fungal strains. In another embodiment, the probe set identifies at least 50 different organisms, such as 50 different pathogens, or 50 different strains or subtypes of a pathogen, such as Staphylococcus aureus.

[0076] In another embodiment, the probe set can comprise probes capable of detecting a single molecule of a pathogen, thereby detecting, distinguishing or identifying the pathogen.

[0077] Each probe in the probe set can comprise the same or different backbone size, sequence, chemistries, configuration of barcodes and sequences, specific sequences for probe enrichment, target sites for probe cleavage, hybridization arm physical and chemical properties, probe identification regions, low structure optimized design, or any combination thereof. A probe may be selected to screen key loci for pathogenicity and/or drug susceptibility, and a genetic fingerprint or genotype for each sub-strain that contains key phenotypic information is generated.

[0078] In another embodiment, the probe comprises a first sequence that hybridizes to a 5' end of a target sequence and a second sequence that hybridizes to a 3' end of a target sequence, wherein the target sequence can be used to identify, detect, or distinguish an organism, such as a pathogen. In some embodiments, the probes in the mixture each comprise a first and second homologous probe sequence, separated by a backbone sequence, that specifically hybridize to a first and second sequence (such as sequences 3' and/or 5' to a target sequence, respectively) in the genome of at least one target organism. In some embodiments the first and second homologous probe sequences are not complementary to the target sequence, but ligate to the 5' and 3' termini of a target nucleic acid, e.g. a microRNA, and possess appropriate chemical groups for compatibility with a nucleic acid-ligating enzyme, such as phosphorylated or adenylated 5' termini, and free 3' hydroxyl groups. The probe can be capable of circularizing capture of a region of interest.

[0079] In some embodiments, the homologous probe sequences or the sequences of the probe that hybridize or are homologous to the 3' and/or 5' region of a target sequence specifically hybridize to target sequences in the genome of their respective target organism, but do not specifically hybridize to any sequence in the genome of a predetermined set of sequenced organisms (the exclusion set). In embodiments related to probes that do not hybridize directly to the capture target, the 'homologous probe sequences' are designed specifically to not substantially hybridize to any sequence within a defined set of genomes, i.e., an exclusion set. In the case of biological samples from a subject, the exclusion set includes the host's genome. In particular embodiments, the exclusion set also includes a plurality of viral, eukaryotic, prokaryotic, and archaeal genomes. In more particular embodiments, the plurality of viral, eukaryotic, prokaryotic, and archaeal genomes in the exclusion set may comprise sequenced genomes from commensal, non-virulent, or non-pathogenic organisms. In still more particular embodiments, the exclusion set for all probes in a mixture share a common subset of sequenced genomes comprising, for example, a host genome and commensal, non-virulent, or non-pathogenic organisms. In general, the exclusion set varies between probes in the mixture so that each probe in the mixture does not specifically hybridize with the target sequence of any other probe in the mixture.
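The exclusion-set idea can be illustrated with a simple in silico filter that keeps a candidate arm sequence only if it is found in the target genome and in none of the exclusion genomes. The Python sketch below uses exact substring matching on either strand as a stand-in for specific hybridization, which is an assumption made for illustration; a practical design pipeline would use alignment with mismatch tolerance. All names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def occurs_in(arm, genome):
    """True if the arm or its reverse complement appears in the genome."""
    return arm in genome or reverse_complement(arm) in genome


def passes_exclusion(arm, target_genome, exclusion_genomes):
    """Keep an arm found in the target genome and in no exclusion genome."""
    if not occurs_in(arm, target_genome):
        return False
    return not any(occurs_in(arm, genome) for genome in exclusion_genomes)


# Invented sequences for illustration only.
target_genome = "GGACGTTGCC"
host_like = "TTTTTTTTTT"
commensal_like = "CCCAACGTCC"  # contains the reverse complement of ACGTTG
```

With these toy sequences, the arm `ACGTTG` passes when the exclusion set holds only `host_like`, and is rejected once `commensal_like` is added because its reverse complement occurs there.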

[0080] In some embodiments, the sequences 3' and/or 5' to a target sequence are separated by a region of interest (e.g., the target sequence) of at least two nucleotides. In particular embodiments, they are separated by at least 5, 6, 7, 8, 9, 10, 12, 14, 18, 20, 25, 30, 50, 75, 100, 150, 200, 300, 400, 600, 1200, 1500, 2500, or more nucleotides. In some embodiments, the first and second target sequences are separated by no more than 5, 6, 7, 8, 9, 10, 12, 14, 18, 20, 25, 30, 50, 75, 100, 150, 200, 300, 400, 600, 1200, 1500, or 2500 nucleotides.

[0081] In some embodiments, probes can be designed to capture conserved regions, and upon DNA sequencing, can reveal polymorphisms and genetic aberrations that allow for the resolution of known or novel variants or closely related strains of organisms. In some embodiments two or more probes can be used for one or more or every organism wished to be tested for, which can permit discrimination of closely related organisms, even when a sample comprises more than one organism.

[0082] In one aspect, the probes in the probe set each comprise homologous probe sequences which are substantially free of secondary structure, do not contain long strings of a single nucleotide (e.g., they have fewer than 7, 6, 5, 4, 3, or 2 consecutive identical bases), are at least about 8 bases in length (e.g., 8, 10, 12, 14, 16, 18, 20, 22, 24, 25, 26, 27, 28, 30, or 32 bases), and have a Tm in the range of 50-72°C (e.g., about 53, 54, 55, 56, 57, 58, 59, 60, 61, or 62°C). In some embodiments the first and second homologous probe sequences are about the same length and have the same Tm. In other embodiments, the length and Tm of the first and second homologous probe sequences differ. The homologous probe sequences in each probe may also be selected to occur below a certain threshold number of times in the target organism's genome (e.g., fewer than 20, 10, 5, 4, 3, or 2 times).
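
By way of non-limiting illustration, the arm criteria recited above (minimum length, a bound on consecutive identical bases, and a Tm window) can be sketched as a simple filter. The function names, the default thresholds, and the GC-content Tm approximation below are assumptions made for illustration only; the disclosure does not specify how Tm is computed, and the secondary-structure criterion is not modeled here.

```python
def longest_run(seq: str) -> int:
    """Return the length of the longest stretch of consecutive identical bases."""
    best = run = 0
    prev = ""
    for base in seq.upper():
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


def approx_tm(seq: str) -> float:
    """Rough Tm estimate in degrees C from GC content.

    Assumption: the basic formula Tm = 64.9 + 41 * (G + C - 16.4) / N is used
    purely as a stand-in; a real design would use a nearest-neighbor model.
    """
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(s)


def arm_passes(seq: str, min_len: int = 8, max_run: int = 4,
               tm_range: tuple = (50.0, 72.0)) -> bool:
    """Apply the length, homopolymer-run, and Tm-window checks to one probe arm."""
    if len(seq) < min_len:
        return False
    if longest_run(seq) > max_run:
        return False
    low, high = tm_range
    return low <= approx_tm(seq) <= high
```

A caller would typically apply `arm_passes` to both arms of a candidate probe and keep the probe only if both arms pass.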

[0083] The backbone sequence of the probes may include a detectable moiety and a primer-binding sequence. In some embodiments, the backbone sequence of the probes comprises a second primer. In particular embodiments, the detectable moiety is a barcode. In certain embodiments the backbone further comprises a cleavage site, such as a restriction endonuclease recognition sequence. In certain embodiments, the backbone contains non-Watson-Crick nucleotides, including, for example, abasic furan moieties, and the like.

[0084] In another aspect, the invention provides a kit comprising one or more sets of probes, such as one or more sets of probes from the probes provided in Table 1. In one embodiment, a kit comprises one or more reagents for obtaining a sample (e.g., swabs), reagents for extracting DNA, enzymes (such as polymerase and/or ligase to capture a region of interest), reagents for amplifying the region of interest, reagents for purifying the DNA or amplified or captured regions of interest (e.g., purification cartridge), buffers, sequencing reagents, or any combination thereof. In one embodiment, the kit may be a low throughput kit, such as a kit for a small number of samples. For example, a kit may be a low throughput kit, such as a kit for 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 18, 20, 24, 28, 32, 36, 40, 42, 48, or between 8-48 samples. In another embodiment, the kit may be a high-throughput kit, such as a kit for a large number of samples. For example, a kit may be a high-throughput kit, such as a kit for 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 1750, 2000, or more samples. For example, a kit may be a high-throughput kit, such as a kit for between 50-96, 50-384, 50-1536, 96-384, 96-1536, or 384-1536 samples. In some embodiments, a kit as described herein can comprise enough reagents to prepare one or more specimens for sequencing. For example, a kit as described herein can comprise enough reagents to prepare 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 384, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 1536, 1750, 2000 or more specimens for sequencing.

Method of Using Probe

[0085] Also provided herein is a method of using one or more probes disclosed herein, such as one or more probe sets, for detecting, identifying, or distinguishing one or more organisms. The method can comprise identifying an organism with a plurality of probes that can detect at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 1750, or 2000 different pathogens. In another embodiment, the plurality of probes can detect at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1250, 1500, 1750, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more different strains, variants or sub-types of a pathogen or different strains or sub-types of different pathogens.

[0086] The method can comprise detecting or distinguishing different organisms, different pathogens, different strains, variants or sub-types of a pathogen or different strains, variants or sub-types of different pathogens, with at least 70% sensitivity, specificity, or both, such as with at least 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, or 89% sensitivity, specificity, or both, such as with at least 90% sensitivity, specificity, or both. Each probe may detect or distinguish different organisms, different pathogens, different strains or sub-types of a pathogen or different strains or sub-types of different pathogens with at least 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100% sensitivity, specificity, or both, in an assay. Alternatively, a combination of probes may be used for detecting or distinguishing different organisms, different pathogens, different strains, variants or sub-types of a pathogen or different strains, variants or sub-types of different pathogens, with at least 70% sensitivity, specificity, or both, such as with at least 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, or 89% sensitivity, specificity, or both, such as with at least 90% sensitivity, specificity, or both. Furthermore, the confidence level for determining the specificity, sensitivity, or both, may be at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% confidence.
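
For reference, sensitivity and specificity as used above are conventionally computed from the counts of true and false calls in a validation set. The following minimal sketch uses the standard definitions; the function and parameter names are illustrative and do not appear in the disclosure.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive samples that the assay calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative samples that the assay calls negative."""
    return true_neg / (true_neg + false_pos)
```

Under these definitions, an assay that detects 90 of 100 known-positive samples has 90% sensitivity, and one that correctly reports 95 of 100 known-negative samples has 95% specificity.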

[0087] In one embodiment, a method for detecting the presence of one or more target organisms comprises contacting a sample suspected of containing at least one target organism with any of the probe sets disclosed herein, capturing a region of interest of the at least one target organism (e.g., by polymerization and/or ligation) to form a circularized probe, and detecting the captured region of interest, thereby detecting the presence of the one or more target organisms.

[0088] In certain embodiments, the captured region of interest may be amplified to form a plurality of amplicons (e.g. , by PCR). In some embodiments the sample is treated with nucleases to remove the linear nucleic acids after probe-circularizing capture of the region of interest. In some embodiments, the circularized probe is linearized, e.g. , by nuclease treatment. In other embodiments the circularized probe molecule is sequenced directly by any means known in the art, without amplification. In certain embodiments, the circularized probe is contacted by an oligonucleotide that primes polymerase-mediated extension of the molecules to generate sequences complementary to that of the circularized probe, including from at least one to as many as 1 million or more concatemerized copies of the original circular probe.

[0089] In particular embodiments, the circularized probe molecule is enriched from the reaction solution by means of a secondary-capture oligonucleotide capture probe. A secondary-capture oligonucleotide capture probe may comprise a moiety designed to be captured, such as a biotin molecule, and a nucleic acid sequence designed to hybridize to at least 6 nucleotides of the circularized probe. The nucleic acid sequence designed to hybridize to at least 6 nucleotides of the circularized probe may include 1, 2, 4, 8, 16, 32 or more nucleotides of the polymerase-extended capture product.

[0090] In certain embodiments, the probe and/or captured region of interest is sequenced by any means known in the art, such as polymerase-dependent sequencing (including dideoxy sequencing, pyrosequencing, and sequencing by synthesis) or ligase-based sequencing (e.g., polony sequencing). The sequencing can be by Sanger sequencing or massive parallel sequencing, such as "next generation" (Next-gen) sequencing, second generation sequencing, or third generation sequencing. For example, sequencing can be by second generation or third generation sequencing methods, such as using commercial platforms such as Illumina, 454 (Roche), SOLiD, Ion Torrent PGM (Life Technologies), PacBio, Oxford, Life Technologies QDot, Nanopore, or any other available sequencing platform. Massive parallel sequencing can allow for the simultaneous sequencing of one million to several hundred million reads, for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 48, 50, 60, 70, 80, 90, 96, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, or 900 million reads, from amplified DNA clones. The reads can read any number of bases, such as 50-400 bases.

[0091] An internal nucleotide control, such as DNA at a known concentration, can be used with the methods and samples described herein. In one embodiment, an internal nucleotide control can serve as an internal calibrator, such as for determining copy number. In some embodiments, a sequencing read that aligns to the calibrator can also serve as a positive control for the performance of the assay, such as in the context of every sample.

[0092] In one aspect, the probes, methods, and kits described herein can be used to test for the presence of one or more organisms, such as those in Table 2. In one embodiment, the probes, methods, and kits described herein can be used to test for the presence of one or more antibiotic resistance genes, such as those in Table 3. In a preferred embodiment, the probes, methods, and kits described herein can be used to test for the presence of one or more organisms, such as those in Table 2, and test for the presence of one or more antibiotic resistance genes, such as those in Table 3, in parallel, such as in one sample tube, in the same sample, simultaneously, or any combination thereof. In some embodiments, in a single reaction tube, a kit can be used to test for the two or more microbes most commonly associated with hospital-acquired infections, and simultaneously test for the presence of two or more antibiotic resistance genes. For example, a kit can be used to test for the 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more microbes most commonly associated with hospital-acquired infections, and simultaneously test for the presence of 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more antibiotic resistance genes. For example, in a single reaction tube, a kit can be used to test for the 12 microbes most commonly associated with hospital-acquired infections, and simultaneously test for the presence of 18 antibiotic resistance genes.

[0093] In one embodiment, one or more organisms can be identified from a sample, such as a sample from a host, where the organism being identified is a pathogen. In one embodiment, the sample is a biological sample, such as from a mammal, such as a human. In another embodiment, a genotype of the host is identified or detected from the sample or another sample from the host. The identification of one or more organisms (such as one or more pathogens, such as different pathogens or subtypes or strains of pathogens) can be used to select one or more therapeutics or treatments for the host. In another embodiment, the identification of one or more organisms (such as one or more pathogens, such as different pathogens or subtypes or strains of pathogens) can be used to stratify the host into a therapeutic group, such as for a particular drug treatment or clinical trial. In one embodiment, HPV strain identification can be used to stratify a host into a cancer therapeutic group or to select a cancer treatment.

[0094] In yet another embodiment, the identification of one or more organisms (such as one or more pathogens, such as different pathogens or subtypes or strains of pathogens) and the genotype of a host can be used to select one or more therapeutics or treatments for the host. In another embodiment, the identification of one or more organisms (such as one or more pathogens, such as different pathogens or subtypes or strains of pathogens) and the genotype of the host can be used to stratify the host into a therapeutic group, such as for a particular drug treatment or clinical trial.

[0095] Also provided herein is a method for identifying an organism, such as a genetic signature of an organism, or a subtype or strain of a pathogen, in a short time frame or with a fast turnaround time. In another embodiment, a genotype of an individual or host can also be identified within the short time frame. For example, the identification of a pathogen in a sample or the genotype of a host can be completed in less than 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 hours. In one embodiment, the steps from contacting the sample with one or more probes to identifying the organism by sequencing can be performed in less than 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 hours. In yet another embodiment, the steps from contacting the sample with the probe to identifying the organism (such as one or more pathogens) by sequencing, and transmitting the results to a health care professional (such as a clinician or physician), can be performed in less than 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 hours. In yet another embodiment, the steps from contacting the sample with the probe to identifying the organism (such as one or more pathogens) by sequencing, transmitting the results to a health care professional (such as a clinician or physician), and selecting a therapeutic can be performed in less than 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 hours.

[0096] Also provided herein is a method for simultaneous quantification and identification of an organism, such as identifying one or more subtypes or substrains of a pathogen. Multiplexing is also provided herein, wherein multiple pathogens, or substrains or subtypes of pathogens, can be detected simultaneously or in a single reaction tube.

[0097] In one embodiment, conversion of sequence data to a quantitative report can be performed by using selected validated parameters. Any software known in the art can be used for any of the methods disclosed herein.

[0098] In some embodiments, an organism identified and/or quantified using the methods described herein can be the cause of an infection in a subject, such as a nosocomial infection (also known as a hospital-acquired infection (HAI)), which is an infection whose development is favored by a hospital environment. In some embodiments, an infection can be one acquired by a patient during a hospital visit or one developing among hospital staff. Such infections can include, for example, fungal and bacterial infections and can be aggravated by a reduced resistance of individual patients. Organisms responsible for HAIs can survive for a long time on surfaces in the hospital and can enter or be transmitted to the body through wounds, catheters, and ventilators. In some embodiments, the route of transmission can be contact transmission (direct or indirect), droplet transmission, airborne transmission, common vehicle transmission, vector borne transmission, or any combination thereof.

[0099] People in hospitals can already be in a poor state of health, impairing their defense against bacteria. Advanced age or premature birth along with immunodeficiency, due to, for example, drugs, illness, or irradiation, present a general risk. Other diseases can present specific risks, for example, chronic obstructive pulmonary disease can increase chances of respiratory tract infection. Invasive devices, for example, intubation tubes, catheters, surgical drains, and tracheostomy tubes can bypass the body's natural lines of defense against pathogens and can provide an easy route for infection. Patients already colonized on admission can be put at greater risk when they undergo a procedure, such as an invasive procedure. A patient's treatment itself can leave the patient vulnerable to infection, for example, immunosuppression and antacid treatment can undermine the body's defenses, while antimicrobial and recurrent blood transfusions can also be risk factors.

[00100] Non-limiting examples of HAIs include Ventilator associated pneumonia (VAP), Staphylococcus aureus, Methicillin resistant Staphylococcus aureus (MRSA), Candida albicans, Pseudomonas aeruginosa, Acinetobacter baumannii, Stenotrophomonas maltophilia, Clostridium difficile, Tuberculosis, Urinary tract infection, Hospital-acquired pneumonia (HAP), Gastroenteritis, Vancomycin-resistant Enterococcus (VRE), and Legionnaires' disease. In some embodiments, HAIs can be caused by one or more of the organisms provided in Table 2.

[00101] Nucleotides, such as DNA and RNA, can be isolated from any suitable sample and detected using the probes described herein. Non-limiting examples of sample sources include catheters, medical devices, blood, blood cultures, urine, stool, fomites, wounds, sputum, pure bacterial cultures, mixed bacterial cultures, and bacterial colonies.

[00102] In some embodiments, the probe sets described herein can be used to detect and distinguish among the organisms responsible for more than 10% of the hospital acquired infections at a site. For example, the probe sets described herein can be used to detect and distinguish among the organisms responsible for more than 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 100% of the hospital acquired infections at a site. In some embodiments, a site can be a surgical site, wound, tract, urinary catheter, ventilator, intravenous needle, syringe, respiratory tract, invasive device, intubation tube, catheter, surgical drain, tracheostomy tube, saline flush syringe, vial, bag, tube or any combination thereof.

Method of Generating Probe

[00103] A further aspect of the invention provides methods of making the mixtures of probes provided by the invention. The methods comprise providing a set of reference genomes and an exclusion set of genomes. The sequence of the reference genomes can be partitioned (in silico) into n-mer strings of about 18-50 nucleotides. The partitioned n-mer strings can be screened to eliminate redundant sequences, sequences with secondary structure, repetitive sequences (e.g., strings with more than 4 consecutive identical nucleotides), and sequences with a Tm outside of a predetermined range (e.g., outside of 50-72°C). The screened n-mers can be further screened to identify homologous probe sequences by eliminating n-mers that specifically hybridize to a sequence in a genome in the exclusion set of genomes (e.g., if a pairwise alignment contains 19 of 20 matches in an n-mer, such as a 25-mer) or that occur in the genome of the target organism more than a specified number of times. The screening may also remove n-mers that are present in more than or less than a specified number of the reference genomes. The screening may also remove n-mers that will not interact favorably with enzymes to be used with the probe sequences. For example, a particular polymerase may work with higher efficiency if the last 3' base of the probe is a G or C. Similarly, a particular ligase may work more efficiently on certain bases at the ligation junction. For example, Ampligase (Epicentre) will ligate a gap between AG and GT at least 10 times more efficiently than a gap between TC and CC.
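
A minimal sketch of the partition-and-screen procedure described above is given below, for illustration only. It assumes ungapped comparison as a stand-in for "specifically hybridize" (using the 19-of-20 example threshold), uses a brute-force scan suitable only for toy inputs, and omits the Tm, secondary-structure, and enzyme-preference filters. All function names and defaults are hypothetical; a practical implementation would use an indexed aligner.

```python
from collections import Counter


def partition(genome: str, n: int = 25):
    """Yield (offset, n-mer) for every n-mer string in the genome."""
    g = genome.upper()
    for i in range(len(g) - n + 1):
        yield i, g[i:i + n]


def near_match(nmer: str, genome: str, window: int = 20,
               min_matches: int = 19) -> bool:
    """True if any `window`-long slice of the n-mer aligns (ungapped) to the
    genome with at least `min_matches` identical positions."""
    g = genome.upper()
    for a in range(len(nmer) - window + 1):
        sub = nmer[a:a + window]
        for b in range(len(g) - window + 1):
            matches = sum(x == y for x, y in zip(sub, g[b:b + window]))
            if matches >= min_matches:
                return True
    return False


def screen(reference: str, exclusion_genomes, n: int = 25,
           max_occurrences: int = 1, window: int = 20,
           min_matches: int = 19):
    """Keep n-mers that occur at most `max_occurrences` times in the reference
    and have no near match in any exclusion-set genome."""
    counts = Counter(nmer for _, nmer in partition(reference, n))
    kept = []
    for nmer, count in counts.items():
        if count > max_occurrences:
            continue  # redundant within the target genome
        if any(near_match(nmer, ex, window, min_matches)
               for ex in exclusion_genomes):
            continue  # would cross-hybridize with the exclusion set
        kept.append(nmer)
    return kept
```

Whether a candidate that passes this screen is retained would still depend on the remaining filters recited above, which are applied separately.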

[00104] In particular embodiments, a homologous probe sequence may occur only once in the genome of the target organism. For target organisms with a single-stranded genome, the homologous probe sequence may occur only once in the complement of the genome of the target organism. In one embodiment, where a sequenced variant of the target organism is available (e.g. , the same species, genus, or serovar), the homologous probe sequences can be filtered so as to specifically hybridize to the genome of the additional sequenced variant(s) resulting in a probe that groups related organisms. In an alternate embodiment, the homologous probe sequences can be filtered so as to not specifically hybridize to the genome of the sequenced variant (e.g., the sequenced variant is part of the exclusion set), resulting in a probe that discriminates between related organisms. These filter processes can be iterated for each target organism to be detected by the particular mixture. In some embodiments, the candidate homologous probe sequences can be screened to eliminate those that will specifically hybridize with other probes in the mixture.

[00105] Probe selection can be based on a database of different pathogens, strains of a pathogen, or both, such as a database comprising more than 10 different pathogens, strains of a pathogen, or both. For example, probe selection can be based on a database comprising more than 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 1750, 2000, or more different pathogens, strains of a pathogen, or both. In some embodiments, probe selection can be based on a database of different pathogens, strains of a pathogen, or both, that are known to cause HAIs, such as a database comprising more than 10 different pathogens, strains of a pathogen, or both, that are known to cause HAIs. For example, probe selection can be based on a database comprising more than 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 1750, 2000, or more different pathogens, strains of a pathogen, or both, that are known to cause HAIs, and optionally with additional strains or sub-types of other pathogens. In one embodiment, probes for organisms associated with HAIs are selected by partitioning all available genomes of organisms associated with HAIs into one or more subsets based on sequence similarity. For each subset, candidate probe sets are generated that capture all strains. A filter can then be applied for specificity against human/microbial/viral/fungal genomes.

[00106] Some clinical tests based on the methods disclosed herein rely on the ability to determine or approximate the number of input template molecules (genomes) in a sample. A two-step method can be used to calculate the number of template molecules in a sample from the sequencing read counts. 1) Each sample sequenced can have a known quantity of a control sequence added to it. One embodiment employs GFP as the control sequence. It is contemplated to use several control sequences added in different quantities. The first step in analyzing sequencing reads can be to normalize the counts based on the number of reads that came from the control sequence. This normalization accounts for the fact that more material from sample A than from sample B may have been put into the sequencing reaction. 2) Since different MIPs (or primer pairs or hybridization capture probes) might work with different efficiencies, the second step of the quantification process can be to normalize between probes. In one embodiment, this normalization relies on experiments in which fixed amounts of different templates were sequenced and might reveal, e.g., that a probe against one strain or organism produces 2 circularized MIPs per template but a probe against another strain or organism produces 3. Thus, the count for the first probe might be multiplied by 33.3 and the count for the second probe divided by 50 to produce comparable load counts for the two strains.
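
The two-step normalization described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the per-probe efficiencies are treated as known constants from a prior calibration experiment, the control is identified by a hypothetical key "GFP", and the sample names and read counts are invented for illustration. The specific scaling factors recited in the paragraph above are not reproduced here.

```python
def normalize_to_control(read_counts: dict, control: str = "GFP") -> dict:
    """Step 1: express each probe's read count relative to the spiked-in control."""
    control_reads = read_counts[control]
    if control_reads == 0:
        raise ValueError("no control reads; sample cannot be normalized")
    return {name: count / control_reads
            for name, count in read_counts.items() if name != control}


def correct_for_probe_efficiency(normalized: dict, efficiency: dict) -> dict:
    """Step 2: divide by each probe's relative capture efficiency so that
    loads from different probes become comparable."""
    return {name: value / efficiency[name]
            for name, value in normalized.items()}


# Hypothetical example: two probes with relative efficiencies 2 and 3.
counts = {"GFP": 1000, "strain_A": 200, "strain_B": 300}
relative = normalize_to_control(counts)
loads = correct_for_probe_efficiency(relative, {"strain_A": 2.0, "strain_B": 3.0})
```

In this invented example the two strains end up with approximately equal loads, because the higher raw count for the second strain is attributable to its more efficient probe.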

[00107] Some embodiments use a mixed quantity of GFP as the control sequence and a variable quantity of one or more organisms or strains. Some samples may contain only GFP and template DNA while others also include a human background. After the sequencing reads are separated by sample, the method can calculate the ratio of reads, such as viral (HPV-18, HIV-CNO06, and HIV-CNO09) reads, to GFP reads and plot that ratio against the number of template molecules in the reaction. Those plots indicate generally excellent agreement between the viral/GFP ratio and the input template quantity.

[00108] Compared to other assays, high throughput sequencing offers a relatively unique ability to detect and genotype the pathogen DNA and the human DNA in a sample from a single reaction. In current clinical practice, genotyping the pathogen and human may require multiple tests, potentially doubling (or more) the expense compared to simply detecting a pathogen. The methods disclosed herein enable simultaneous genotyping with minimal added cost and often no added labor. Other selection/enrichment technologies would also enable these tests.

[00109] The methods disclosed herein provide for simultaneously detecting or genotyping multiple pathogens.

[00110] For example, the methods provide for detecting coinfection of HIV and HCV, and for simultaneously genotyping/quantifying HIV while testing for diseases common in immunocompromised patients. Doctors typically only test for diseases like Candida, CMV, etc. upon presentation of some other symptom. However, if the tests can be added at minimal cost, this might be a unique market and feature for Pathogenica's product. A further example is HPV and other STIs. There is an interest in testing for HPV and other STIs, primarily chlamydia and gonorrhea, to simplify screening, especially in patient populations with limited access to doctors. There is also an interest in testing for these diseases as additional risk factors for cervical cancer.

Probe Panel

[00111] Table 1 lists the probe arm sequences in one embodiment of the present invention designed to detect a variety of pathogenic organisms, such as those provided in Table 2, from a sample. Non-limiting examples include Staphylococcus aureus, Staphylococcus epidermidis, Staphylococcus saprophyticus, Acinetobacter baumannii, Clostridium difficile, Escherichia coli, Enterobacter (aerogenes, cloacae, asburiae), Enterococcus (faecium, faecalis), Klebsiella pneumoniae, Proteus mirabilis, Candida albicans, and Pseudomonas aeruginosa. The probe set can also be used to detect many common drug resistance genes, including, but not limited to, CARB, CMY, CTX-M, GES, IMP, KPC, NDM, Other ampC, OXA, PER, SHV, VEB, VIM, ermA, vanA, vanB, mecA, and mexA.

[00112] Tables 1 and 3-14 provide regions of interest (leftmost columns), using the format of descriptor (e.g., organism or gene, if applicable)_reference accession number (if applicable)_first nucleotide of capture region_last nucleotide of capture region. For example, the probe "acinetobacter_NC_010611_627997_628164" is directed to Acinetobacter, and is predicted to be capable of capturing nucleotides corresponding to nucleotides 627997 to 628164 of the reference sequence NC_010611. Reference accession sequences can be obtained from, for example, the NCBI Entrez portal. Tables 3, 5, 7, 9, 11, and 13 provide the regions of interest and corresponding annotated genes within that region. Tables 4, 6, 8, 10, 12, and 14, in turn, provide particular exemplary oligonucleic acid sequences (provided as pairs that can be used in a MIP or adapted for use as conventional PCR primers) predicted to capture the region of interest listed in the first column of the table. "Binding region 1" in Tables 4, 6, 8, 10, 12, and 14 corresponds to the 5', or ligation arm, of a MIP probe and "Binding region 2" corresponds to the 3', or extension arm, of a MIP probe. In some embodiments, substantially similar sequences to the regions of interest provided in Tables 1 and 3-14 can be used. In some embodiments, the substantially similar sequences are 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 99.5, or 100% identical to the sequence of the regions of interest. In other embodiments, the substantially similar sequences have endpoints within 100, 90, 80, 70, 60, 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 nucleotides upstream or downstream of either of the endpoints of the regions of interest. In still other embodiments, the substantially similar sequences are 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 99.5, or 100% identical to the sequence of the regions of interest and have endpoints within 100, 90, 80, 70, 60, 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 nucleotides upstream or downstream of either of the endpoints of the regions of interest. In still more particular embodiments, the particular exemplified endpoints and binding regions are used, e.g., as pairs of binding regions in either a single MIP capture probe, or as pairs of conventional PCR primers, e.g., using the reverse complement of the ligation arm.
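
The naming format described above (descriptor, optional reference accession, first and last nucleotide of the capture region, joined by underscores) can be parsed mechanically, as in the following illustrative sketch. The `Region` type, the `parse_region` function, and the accession pattern (two capital letters, an underscore, then digits, as in NC_010611) are assumptions made for illustration; names that do not end in such an accession are returned with no accession. The span computation assumes the coordinates are inclusive.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    descriptor: str
    accession: Optional[str]
    start: int
    end: int


# Assumed accession shape, e.g. "NC_010611"; not defined by the disclosure.
_PREFIX = re.compile(r"^(?:(?P<desc>.+)_)?(?P<acc>[A-Z]{2}_\d+)$")


def parse_region(name: str) -> Region:
    """Split a region-of-interest name into descriptor, accession, start, end."""
    prefix, start, end = name.rsplit("_", 2)
    m = _PREFIX.match(prefix)
    if m:
        return Region(m.group("desc") or "", m.group("acc"), int(start), int(end))
    return Region(prefix, None, int(start), int(end))


region = parse_region("acinetobacter_NC_010611_627997_628164")
span = region.end - region.start + 1  # inclusive coordinates assumed
```

For the example name given in the text, this yields the descriptor "acinetobacter", the accession "NC_010611", and the endpoints 627997 and 628164.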

[00113] Subsets of the regions of interest or particular exemplary binding regions in Tables 1 and 3-14 can be used concordant with the present invention, e.g., 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 96, 97, 98, 99, or 100% of the regions of interest or binding regions in the tables, e.g.:

[00114] oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 15, 16, or all 17, of the regions of interest provided in column 1 of Table 3, or substantially similar sequences;

[00115] oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 15, 20, 30, 50, 100, or all 134, of the regions of interest provided in column 1 of Table 5, or substantially similar sequences, such as:

[00116] oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, or all 13, of the regions of interest provided in column 1 of Table 7, or substantially similar sequences;

[00117] oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 20, 40, 60, 80, or all 85, of the regions of interest provided in column 1 of Table 9, or substantially similar sequences;

[00118] oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 20, 25, or all 29 of the regions of interest provided in column 1 of Table 11, or substantially similar sequences;

[00119] oligonucleic acid molecules capable of i) amplifying, geometrically by polymerase chain reaction or ii) circularizing capture of, 1, 2, 3, 4, 5, 10, 15, or all 20, of the regions of interest provided in column 1 of Table 13, or substantially similar sequences;

[00120] oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 15, 20, 25, 30, or all 34 of the sequences, or reverse complements thereof, provided in the second or third column of Table 4;

[00121] oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 50, 100, 150, 200, 250, or all 268 of the sequences, or reverse complements thereof, provided in the second or third column of Table 6;

[00122] oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 15, 20, 25, or all 26 of the sequences, or reverse complements thereof, provided in the second or third column of Table 8;

[00123] oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 50, 100, 150, or all 170 of the sequences, or reverse complements thereof, provided in the second or third column of Table 10;

[00124] oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 30, 40, 50, or all 56 of the sequences, or reverse complements thereof, provided in the second or third column of Table 12;

[00125] oligonucleic acid molecules comprising 1, 2, 4, 6, 8, 10, 20, 30, or all 40 of the sequences, or reverse complements thereof, provided in the second or third column of Table 14, as well as any combinations of the foregoing.

[00126] Table 1 provides particular probes assembled as molecular inversion probes (MIPs) capable of circularizing capture of the indicated region of interest in the leftmost column. These exemplary probes share a common backbone sequence of GTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTC, except for the peGFP_Nl_730_925 probe, which uses the backbone GTTGGAGGCTCATCGTTCCTATATTCCTGACTCCTCATTGATGATTACAGATGTTATGCTCGCAGGTC. Alternative backbone sequences can readily be used. Conventional PCR primer pairs can be adapted from these MIP probes by omitting the intervening backbone sequence and providing the reverse complement of the ligation arm (5') probe. Tables 4, 6, 8, 10, 12, and 14 provide subsets of the probes in Table 1 where the individual arms are provided in the second and third columns, respectively. Tables 4, 6, 8, 10, 12, and 14 collectively provide the same probe arms that are present in Table 1.
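The primer adaptation described in paragraph [00126] can be sketched in Python as follows. This is a minimal illustration, not the applicants' procedure: the function names are hypothetical, the portion 5' of the backbone is taken to be the ligation arm and the portion 3' of it the extension arm, and using the extension arm unchanged as the second primer is an assumption (the text states explicitly only that the ligation arm is reverse complemented). The example probe is the Table 1 entry for plasmids_NC_010660_187035_187205.

```python
from __future__ import annotations

# Common backbone quoted in paragraph [00126].
COMMON_BACKBONE = (
    "GTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTC"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_mip(probe: str, backbone: str = COMMON_BACKBONE) -> tuple[str, str]:
    """Split a probe into (5' ligation arm, 3' extension arm).

    Hypothetical helper: assumes the backbone occurs once, between the arms.
    """
    # Drop the 5' phosphate annotation and any stray whitespace.
    seq = "".join(probe.replace("/5Phos/", "").split()).upper()
    ligation_arm, found, extension_arm = seq.partition(backbone)
    if not found:
        raise ValueError("backbone not found in probe")
    return ligation_arm, extension_arm


def pcr_primers_from_mip(
    probe: str, backbone: str = COMMON_BACKBONE
) -> tuple[str, str]:
    """Omit the backbone and reverse complement the ligation arm.

    Assumption: the extension arm is used as-is for the second primer.
    """
    ligation_arm, extension_arm = split_mip(probe, backbone)
    return reverse_complement(ligation_arm), extension_arm


# Table 1 entry plasmids_NC_010660_187035_187205, written as arm + backbone + arm.
EXAMPLE_PROBE = (
    "/5Phos/GCTGTCACCGTCCAGACGCTGTTGGC"
    + COMMON_BACKBONE
    + "TCCGTGCCTTCAAGCGCG"
)

if __name__ == "__main__":
    primer_a, primer_b = pcr_primers_from_mip(EXAMPLE_PROBE)
    print("primer from ligation arm: ", primer_a)
    print("primer from extension arm:", primer_b)
```

For the peGFP probe, the alternative backbone quoted in the same paragraph would be passed as the `backbone` argument instead of the default.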

Table 1: A particular embodiment of the probe sets provided by the invention

Name Sequence

peGFP_Nl_730_925 /5Phos/GTGGTATGGCTGATTATGATCTAGAGTGTTGGAGGCTCATCGTTCCTATATTCCTGACTCCTCATTGATGATTACAGATGTTATGCTCGCAGGTCGAGTTTGGACAAACCACAACTAGAA

plasmids_NC_010660_187035_187205 /5Phos/GCTGTCACCGTCCAGACGCTGTTGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCCGTGCCTTCAAGCGCG

plasmids_NC_014232_5501_5677 /5Phos/GACTCCGCAGAATACGGCACCGTGCGCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCGTACAGGCCAGTCAGC

plasmids_NC_011980_58308_58487 /5Phos/GCAGTCGGTAACCTCGCGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCGCTATCTCTGCTCTCACTGC

plasmids_NC_011838_178818_178996 /5Phos/GCTGTCCTGGCTGCAAGCCTGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCGAACTGCTGATGGACGT

plasmids_FN554767_13017_13190 /5Phos/GACAGCAGACTCACCGGCTGGTTCCGCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCAAGATGCTGCTGGCCACACTG

plasmids_NC_013655_115365_115542 /5Phos/GACAGAACAAGTTCCGCTCCGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCACGGATACGCCGCGCAT

plasmids_NC_013950_90185_90338 /5Phos/GAGGACCGAAGGAGCTAACCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCCGCATACACTATTCTC

plasmids_NC_015599_37281_37455 /5Phos/GCTGTAATGCAAGTAGCGTATGCGCTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAACAGCAAGGCCGCCAATGCCTGACG

plasmids_NC_013951_69899_70067 /5Phos/GAACGTCTGGCGCTGGTCGCCTGCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCACAGGTGCTGACGTGGT

plasmids_NC_007351_37979_38146 /5Phos/CGCATATGCTGAATGATTATCTCGTTGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCTTGCTCAATGAGGTTATTCA

plasmids_FN822749_1846_2009 /5Phos/GACGACAGATGCAGGTTGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCATCGCCGATGCTCATC

plasmids_NC_004851_143949_144109 /5Phos/CGCCTGCTCCAGTGCATCCAGCACGAATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATGCTCTCCGCCATCGCGTTGTCA

plasmids_NC_010558_156799_156957 /5Phos/AGTGCGTTCACCGAATACGTGCGCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCAGGTTATGCCGCTCAATTC

plasmids_NC_007635_38395_38566 /5Phos/AATCCAGGTCCTGACCGTTCTGTCCGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACCTCCGTTGAGCTGATGGA

plasmids_NC_009787_17946_18116 /5Phos/GAGGTGGCCAACACCATGTGTGACCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGACGCCGGTATATCGGTATCGAGCTGCT

plasmids_NC_012547_53585_53752 /5Phos/CGCATATGCTGAATGATTATCTCGTTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGGTGATCTTGCTCAATGAGGTTATTC

plasmids_NC_006671_56259_56438 /5Phos/GAAGTGCCGGACTTCTGCAGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCACGGCCTGATGGAGGCCGC

plasmids_NC_014385_53151_53310 /5Phos/GCTAATCGCATAACAGCTACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATCACGTAACTTATTGATGATATT

plasmids_FN649418_57169_57339 /5Phos/GCTGCGGTATTCCACGGTCGGCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCAGGAACGCTGCCTGTGGTC

plasmids_NC_005011_8620_8785 /5Phos/GAATCAATTATCTTCTTCATTATTGATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTGCGGCTCAACTCAAGCA

plasmids_NC_014843_98413_98578 /5Phos/GTCACACGTCACGCAGTCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCATTCATGGCGCTGATGGC

plasmids_NC_008490_5165_5334 /5Phos/GTGTTACTCGGTAGAATGCTCGCAAGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTAGATGACATATCATGTAAGTT

plasmids_NC_015963_147516_147686 /5Phos/CGGAACTGCCTGCTCGTATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAACGATATAGTCCGTTAT

plasmids_NC_007365_100545_100708 /5Phos/GCTCTCCGACTCCTGGTACGTCAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCGCGCATTAATGAAGCAC

plasmids_NC_009838_104163_104332 /5Phos/GATGTTGCGATTACTTCGCCAACTATTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCTGTAATTATGACGACGCCG

plasmids_NC_013452_4052_4209 /5Phos/CTCATTCCAGAAGCAACTTCTTCTTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGATAGCCATGGCTACAAGAATA

plasmids_NC_010409_39768_39935 /5Phos/GCAATACCAGGAAGGAAGTCTTACTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTCATTGGAGAACAGATGATTGATGT

plasmids_NC_014233_50337_50492 /5Phos/GTATCGCCACAATAACTGCCGGAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAACGATATAGTCCGTTATG

plasmids_NC_013950_91008_91174 /5Phos/GCTGTGGCACAGGCTGAACGCCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGTGATGTCATTCTGGTTAAGA

plasmids_NC_002698_168967_169123 /5Phos/ACATAATCTGAATCTGAGACAACATCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGCACTCTGGCCACACTGG

plasmids_NC_013362_56651_56805 /5Phos/GTGAAGCGCATCCGGTCACCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATGGCATAGGCCAGGTCAATAT

plasmids_NC_014208_52313_52469 /5Phos/GGTTCTGGACCAGTTGCGTGAGCGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGTAACATCGTTGCTGCTCCAT

betalactamase_AB372224_738_905 /5Phos/CGCTGGATTTCACGCCATAGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGTCGCTACCGTTGATGATT

betalactamase_EF685371_398_548 /5Phos/CGTATAGGTGGCTAAGTGCAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTAACTCATTCCTGAGGGTTTC

betalactamase_DQ149247_231_371 /5Phos/GTACATACTCGATCGAAGCACGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCGGAATAGCGGAAGCTTTC

betalactamase_AY750911_244_414 /5Phos/AAGGTCGAAGCAGGTACATACTCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGACATGAGCTCAAGTCCAAT

betalactamase_DQ519087_417_575 /5Phos/GAAGCTTTCATAGCGTCGCCTAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTAGCTAGCTTGTAAGCAAATTG

betalactamase_AM231719_379_537 /5Phos/GAAGCTTTCATGGCATCGCCTAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGCTAGCTTGTAAGCAAACTG

betalactamase_Y14156_663_819 /5Phos/CGCTACCGGTAGTATTGCCCTTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGAATATCCCGACGGCTTTC

betalactamase_JN227085_763_931 /5Phos/ATCGCCACGTTATCGCTGTACTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTTACCCAGCGTCAGATTCC

betalactamase_EU259884_1030_1170 /5Phos/CAAGTACTGTTCCTGTACGTCAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCGCCAGTAACTGGTCTATTC

betalactamase_HQ913565_578_730 /5Phos/CAACGTCTGCGCCATCGCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCAATATCATTGGTGGTGC

betalactamase_AY524988_385_552 /5Phos/GCCGCCCGAAGGACATCAACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCAGACGGGACGTACACAAC

CARB_AF030945_646_795 /5Phos/CGTGCTGGCTATTGCCTTAGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTAATACTCCTAGCACCAAATC

CARB_U14749_1227_1390 /5Phos/CATTAGGAGTTGTCGTATCCCTCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAATACTCCGAGCACCAAATC

CARB_AF313471_2731_2906 /5Phos/AAATTGCAGTTCGCGCTTAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTCCATAGCGTTAAGGTTTC

CMY_DQ463751_613_790 /5Phos/GCGCCAAACAGACCAATGCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGATTTCACGCCATAGGCTC

CMY_EF685371_397_552 /5Phos/GTATAGGTGGCTAAGTGCAGCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCGTAACTCATTCCTGAGGG

CMY_EU515251_583_733 /5Phos/GTCATCGCCTCTTCGTAGCTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCCATATCGATAACGCTGG

CMY_X92508_126_301 /5Phos/AGTATCTTACCTGAAATTCCCTCACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCTCTCGTCATAAGTCGAATG

CMY_AB061794_343_489 /5Phos/CATCACGAAGCCCGCCACAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCCCTTGAGCGGAAGTATC

CMY_JN714478_1882_2055 /5Phos/ACCAATACGCCAGTAGCGAGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCAACGTAGCTGCCAAATC

CMY_X91840_1872_2046 /5Phos/CAATCAGTGTGTTTGATTTGCACCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTACCCGGAATAGCCTGCTC

CTXM_EF219134_13713_13858 /5Phos/CGGATAACGCCACGGGATGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACCGGGTCAAAGAATTCCTC

CTXM_HQ398215_802_947 /5Phos/GCGGCGTGGTGGTGTCTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCTGCCGGTCTTATCAC

CTXM_AM982522_639_788 /5Phos/GCCACGTCACCAGCTGCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGGCTGGGTGAAGTAAGTC

GES_HM173356_1163_1321 /5Phos/GCTCGTAGCGTCGCGTCTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTGACCGACAGAGGCAAC

GES_AF156486_1754_1905 /5Phos/CAGCAGGTCCGCCAATTTCTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGTGGACGTCAGTGCGC

GES_HQ874631_571_748 /5Phos/CCATAGAGGACTTTAGCCACAGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTACACCGCTACAGCGTAAT

GES_FJ820124_1174_1338 /5Phos/CATATGCAGAGTGAGCGGTCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCAATTCTTTCAAAGACCAGC

IMG_DQ361087_489_645 /5Phos/CCATTAACTTCTTCAAACGATGTATGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACCCGTGCTGTCGCTAT

IMG_JN848782_301_475 /5Phos/GTGCTGTCGCTATGGAAATGTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAACCAAACCACTAGGTTATCTT

IMG_EF192154_182_328 /5Phos/GTCAGTGTTTACAAGAACCACCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATGCATACGTGGGAATAGATT

IMG_AY033653_1343_1500 /5Phos/CGGAAGTATCCGCGCGCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTCGATCACGGCACGATC

IMG_AF318077_871_1047 /5Phos/CGAACCAGCTTGGTTCCCAAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCACTGCGTGTTCGCTC

IMG_AF318077_515_657 /5Phos/GATGCTGTACTTTGTGATGCCTAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCTTGGCAAGTACTGTTC

KPC_HM066995_226_375 /5Phos/GCAAGAAAGCCCTTGAATGAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCGTTATCACTGTATTGCAC

KPC_GQ140348_624_799 /5Phos/AATCAACAAACTGCTGCCGCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCTGTACTTGTCATCCTTGT

KPC_EU729727_683_840 /5Phos/CCAGTCTGCCGGCACCGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCGAGCGCGAGTCTAGC

KPC_FJ234412_691_839 /5Phos/CCGACTGCCCAGTCTGCCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGAGCGCGAGTCTAGCC

NDM_JN104597_64_211 /5Phos/GTAAATAGATGATCTTAATTTGGTTCACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTGCTGGCCAATCGTCG

NDM_FN396876_2744_2885 /5Phos/CACAGCCTGACTTTCGCCGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCAAGCAGGAGATCAACCTGC

NDM_FN396876_2958_3117 /5Phos/GGTGGTCGATACCGCCTGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTGAAATCCGCCCGACG

NDM_JN104597_314_465 /5Phos/CATGTCGAGATAGGAAGTGTGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGATGCGCGTGAGTCAC

NDM_FN396876_2382_2548 /5Phos/CAATCTGCCATCGCGCGATTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGGCAATCTCGGTGATGC

OXA_EF650035_239_388 /5Phos/CGAAGCAGGTACATACTCGGTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGAGCTAAATCTTGATAAACTT

OXA_EU019535_389_537 /5Phos/TAGAATAGCGGAAGCTTTCATGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGCTAGCTTGTAAGCAAACTG

OXA_EF650035_423_594 /5Phos/CAAGTCCAATACGACGAGCTAAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAATAGCATGGATTGCACTTC

OXA_DQ309276_232_380 /5Phos/GGTACATACTCGGTCGAAGCACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAATCTTGATAAACTGAAATAGCG

OXA_DQ445683_232_380 /5Phos/GGTACATACTCGGTCGATGCACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCTTGATAAACCGGAATAGCG

OXA_X75562_201_366 /5Phos/GTAATTGAACTAGCTAATGCCGTACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTATGACACCAGTTTCTAGGC

OXA_M55547_995_1154 /5Phos/CAAGTACTGTTCCTGTACGTCAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCCCAGTTGTGATGCATTC

OXA_AY445080_313_469 /5Phos/TCTCTTTCCCATTGTTTCATGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGCGGAAATTCTAAGCTGAC

PER_Z21957_217_371 /5Phos/GTAGGTTATGCAGTTATTAGGTTCAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGACTCAGCCGAGTCAAGC

PER_HQ713678_6002_6167 /5Phos/GCAGTACCAACATAGCTAAATGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAAATAACAAATCACAGGCCAC

PER_GQ396303_667_844 /5Phos/GGTCCTGTGGTGGTTTCCACCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCGATAATGGCTTCATTGG

PER_X93314_954_1122 /5Phos/TAACCGCTGTGGTCCTGTGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGCGCAATAATAGCTTCATTG

PER_HQ713678_4517_4674 /5Phos/GGAAGCGTTGCTTGCCATAGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAACCGAAGCACCATGTAATT

PER_HQ713678_5074_5219 /5Phos/GTTCGGTGCAAAGACGCCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCGCAGACTTCAATATCAATATT

PER_GQ396303_254_399 /5Phos/CACCTGATGCAGAACCAGCATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGGCCACGTTATCACTGTG

SHV_AY661885_656_806 /5Phos/CAGCTGCCGTTGCGAACGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCAGATAAATCACCACAATC

SHV_AF535128_587_761 /5Phos/GCTCAGACGCTGGCTGGTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCGCAGATAAATCACCACG

SHV_U92041_406_579 /5Phos/GCCAGTAGCAGATTGGCGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAACGGGCGCTCAGACG

SHV_AY288915_617_764 /5Phos/CCACTGCAGCAGATGCCGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTATCCCGCAGATAAATCACC

SHV_HQ637576_88_245 /5Phos/TTAATTTGCTTAAGCGGCTGCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCAGCTGTTCGTCACCG

SHV_AF535128_188_362 /5Phos/GGGAAAGCGTTCATCGGCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCGCTCATGGTAATGGCG

SHV_X98102_763_913 /5Phos/TCTTATCGGCGATAAACCAGCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGTTGCCAGTGCTCGAT

TEM_X64523_2037_2191 /5Phos/CAGTCCCTCGATATTCAGATCAGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTAACAATTTCGCAACCGTC

TEM_J01749_2068_2239 /5Phos/CAGCTGCGGTAAAGCTCATCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATAGTTAAGCCAGTATACACTC

TEM_GQ149347_3605_3747 /5Phos/GTCGGAAAGTTGACCAGACATTAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATACTAGGAGAAGTTAATAAATACG

TEM_U36911_4374_4551 /5Phos/CATTCTCTCGCTTTAATTTATTAACCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCGACCTTCTGGACATTATC

TEM_AF091113_1529_1699 /5Phos/GTAACAACTTTCATGCTCTCCTAAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGGTAACTGATGCCGTATTT

TEM_GU371926_11801_11944 /5Phos/GTGAAGTGAATGGTCAGTATGTTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGTGCGCAGGAGATTAGC

TEM_J01749_766_908 /5Phos/CCTGTCCTACGAGTTGCATGATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATAATGGCCTGCTTCTCGC

TEM_J01749_1634_1783 /5Phos/CGTTTCCAGACTTTACGAAACACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGTTGTGAGGGTAAACAAC

TEM_U36911_7596_7762 /5Phos/CGTTGCTTACGCAACCAAATATCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGATCTTGCTCAATGAGGTTA

TEM_U36911_6901_7069 /5Phos/CATCATGTTCATATTTATCAGAGCTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTAGATTTCATAAAGTCTAACACAC

TEM_GU371926_33909_34082 /5Phos/GTTTCCACATGGTGAACGGTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAAACCTGTCACTCTGAATGTT

VEB_EU259884_6947_7094 /5Phos/CAAATACTAAATTATACAGTATCAGAGAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATGCAAAGCGTTATGAAATTTC

VEB_EF136375_596_738 /5Phos/GTTCTTATTATTATAAGTATCTATTAACAGTTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATTAGTGGCTGCTGCAAT

VEB_EF420108_234_380 /5Phos/CATCGGGAAATGGAAGTCGTTATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTCAATCGTCAAAGTTGTTC

VEB_AF010416_89_230 /5Phos/CGTGGTTTGTGCTGAGCAAAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCAAAGTTAAGTTGTCAGTTTGAG

VIM_AY524988_385_552 /5Phos/GCCGCCCGAAGGACATCAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGACGGGACGTACACAAC

VIM_Y18050_3464_3614 /5Phos/GCAACTCATCACCATCACGGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGATGCGTACGTTGCCAC

VIM_AY635904_58_203 /5Phos/GCGACAGCCATGACAGACGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGACAATGAGACCATTGGAC

VIM_HM750249_275_454 /5Phos/AAACGACTGCGTTGCGATATGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTCCGAAGGACATCAACGC

VIM_AJ536835_313_481 /5Phos/ATGCGACCAAACGCCATCGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCGTCATGGAAGTGCGTA

VIM_EU118148_131_300 /5Phos/GAACAGGCTTATGTCAACTGGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATAACATCAAACATCGACCC

VIM_DQ143913_921_1063 /5Phos/ CGAACCGAACAGGCTTATGTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTAACGCGCTTGCTGCTT

VIM_EU118148_2821_2961 /5Phos/GCTGTAATTATGACGACGCCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTCGGTGAGATTCAGAATGC

VIM_EU118148_1060_1229 /5Phos/CATCATAGACGCGGTCAAATAGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTCATCACCATCACGGAC

van_DQ018710.1_6481_6652 /5Phos/GTGTATGTCAGCGATTTGTCCATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGTCATATTGTCTTGCCGATT

van_DQ018710.1_6764_6926 /5Phos/GTCCACCTCGCCAACAATCAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATATCAACACGGGAAAGACCT

van_AY926880.1_3640_3785 /5Phos/GCGTGATTATCACGTTCGGCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTTGCAGATTTAACCGACAC

van_FJ545640.1_517_690 /5Phos/GGCTCGACTTCCTGATGAATACGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGAAACCGGGCAGAGTATT

van_AE017171.1_34715_34859 /5Phos/CAACGATGTATGTCAACGATTTGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATTGCGTAGTCCAATTCGTC

van_NC_008821.1_11898_12045 /5Phos/CAGGCTGTTTCGGGCTGTGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGGTTATTAATAAAGATGATAGGC

van_FJ349556.1_5601_5765 /5Phos/GGCTCGGCTTCCTGATGAATACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGGCATGGTATTGACTTCATT

mecA_AY820253.1_1431_1608 /5Phos/TAATTCAAGTGCAACTCTCGCAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTTATTCTCTAATGCGCTATATATT

mecA_AY952298.1_130_302 /5Phos/GGATAGTTACGACTTTCTGCTTCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGTATTGCTATTATCGTCAACG

mecA_AM048806.2_1574_1720 /5Phos/CAGTATTTCACCTTGTCCGTAACCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTTACGACTTGTTGCATGC

mecA_EF692630.1_239_405 /5Phos/AATGTTTATATCTTTAACGCCTAAACTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATGCTTTGGTCTTTCTGCAT

mex_AF092566.1_371_520 /5Phos/CTGGCCCTTGAGGTCGCGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGGTCTTCACCTCGACAC

mex_AF092566.1_50_193 /5Phos/GACGTAGATCGGGTCGAGCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGGAAACCTCGGAGAATT

mex_CP000438.1_487178_487357 /5Phos/GGCGTACTGCTGCTTGCTCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGACGTCGACGTAGATCG

mex_NZ_AAQW01000001.1_461304_461466 /5Phos/CCTGTTCCTGGGTCGAAGCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTTCGGTCACCGCGGA

erm_NC_002745.2_871803_871973 /5Phos/GTCAGGCTAAATATAGCTATCTTATCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCAGTTACTGCTATAGAAATTGAT

erm_NC_002745.2_871666_871841 /5Phos/CATCCTAAGCCAAGTGTAGACTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAAGATATATGGTAATATTCCTTATAAC

erm_EU047809.1_79_229 /5Phos/GTTTATAAGTGGGTAAACCGTGAATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAAACGAGCTTTAGGTTTGC

acinetobacter_NC_010611_627997_628164 /5Phos/GCAGCACTTGACCGCCATGAGTGACCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATCGCACCAACAACAATAATCG

acinetobacter_NC_010611_2417580_2417755 /5Phos/GTGATCACTGATGCACCAGATGAAGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCTTGATATTCAAGTCTATGACG

acinetobacter_CP002522_11753_11931 /5Phos/GATATTATTGATCATGGTGCCAAGCCAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCAATATGAAGCTGACGACGCG

acinetobacter_NC_011586_3908329_3908508 /5Phos/GCTGAGCGTGAAGGTTCATGGATTATTAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGTAAGGCTTACGGTCTCAT

acinetobacter_NC_010611_145181_145340 /5Phos/GCATCTTGTGCAGCCTGAATAGCAGCGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACCACGTTGAATATCACCTTCGGCAT

acinetobacter_NC_010611_3854494_3854662 /5Phos/ AGTCCATAATTGCTTGAGTGTAGTCATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCTTCGCACTGAATAATAAGAACAT

acinetobacter_NC_010400_56216_56383 /5Phos/GCTTGCTGGTTCTGCACGTAGCTTACTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAAGATGAACAGGCTACTGCAA

acinetobacter_NC_010611_1454960_1455136 /5Phos/GCAGCGCTGTGCAAGTTCAATGTATTCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTCGTGCGAGTATTCCTTAAGTGT

acinetobacter_NC_009085_255964_256143 /5Phos/GTATAACACTCGGCCAGCGCCAAGGTTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTCACACATCGCCACAATATGAT

clostridium_NC_013974_3097606_3097772 /5Phos/ACCATGCAGATACAATGAACCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGATGATAAGACACATCCAATTC

clostridium_FN665653_103469_103631 /5Phos/CATCAACAGCTTCTTGAAGCATTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTCCAACAACTATAACAGAACGTC

clostridium_NC_013974_117188_117346 /5Phos/AACATATCACCTGATATTCTAGTATCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATTCCATTATATTCAACAGGATTGTGA

clostridium_NC_013316_3012882_3013047 /5Phos/GCTGTTGCTTGCGGATACTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGTATATGTAGCTCAAGTTGC

clostridium_FN668375_1212250_1212413 /5Phos/AAGAGCTAATGCAGCTATTGCACTTATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATACACTTCAGCTATAAGACCAT

clostridium_NC_013315_3754484_3754640 /5Phos/AACAAGAGCAGAAGTTACAGACGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTATAATGGTGGCTAGAGGTGA

clostridium_FN665654_3239860_3240039 /5Phos/ACTCGTGAAGACCATGCAGATACAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAATACTTACAATGCCTGAGGA

clostridium_FN668941_3228320_3228491 /5Phos/ACCATGCAGATACAATGAACCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCTGAGGATGATAAGACACATC

clostridium_NC_013974_1962664_1962825 /5Phos/GCATCTGCTGCTTCTATTGCTCCTACTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACATGAACTGATATTAGTTCTCCAA

clostridium_NC_003366_2769687_2769851 /5Phos/GCACAAGCTGGAGATAACATCGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTAGAGGACGTATTCACAATCACT

clostridium_FN665653_127741_127918 /5Phos/CTCTATCAGCTTCTACTGCTTCTTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCATCTCATCCACAGTTAATATATC

clostridium_NC_013316_2259929_2260107 /5Phos/AGATGAGATTCATACTATCGTTGGAGCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGCAGAGAGAATAGTAAGAGGAGA

clostridium_NC_009089_94774_94937 /5Phos/CATCAACAGCTTCTTGAAGCATTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTCCAACAACTATAACAGAACG

clostridium_NC_013315_2044225_2044389 /5Phos/GTCAGCAATACGCCACCAAGCTCCTATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTGGTGGATATCCTGTTACC

clostridium_NC_013315_2299408_2299586 /5Phos/GCGCAATAGAGTTGTATAAGAGTGCTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGCATTAATTATAGATTATAATGTATAA

clostridium_FN668941_3244255_3244408 /5Phos/GGCATAATAGGATGGATAGATGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTAATCCAACTTCTACTGCTAT

clostridium_NC_013316_3610909_3611065 /5Phos/GTACATTCACATATAGACCATCTTAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACATAGGTGCAGGTAGAATAGTATA

clostridium_FN665653_1104859_1105031 /5Phos/CCATACCAGTATCTTGGCATATTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATAATGAATAACAGCAGGTGTATTA

clostridium_NC_003366_2753681_2753838 /5Phos/AGATGAAGCACAAGCTGGAGATAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGGACGTATTCACAATCACTG

clostridium_FN665653_710906_711080 /5Phos/ATAATCATTCACCTCCATCATTCATAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTGAATATGGTTCGTCTCA

clostridium_NC_009089_3706562_3706720 /5Phos/GTACATTCACATATAGACCATCTTAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACATAGGTGCAGGTAGAATAGT

clostridium_NC_013316_1372812_1372968 /5Phos/ACTCCACCAGGATGTTGTCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTAGGACCGTCGTGTCCAAG

clostridium_FN665652_676696_676895 /5Phos/GCAATATCAATGGTATCGAAGGCACTATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTATTGAAGGTACTATTAGCGATATGC

clostridium_NC_013316_2641651_2641808 /5Phos/GTGCCGGTCTCGGTTACTCAATGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGATTATTATAATGCAGCTAGAAG

clostridium_FN668375_3595870_3596026 /5Phos/GTACATTCACATATAGACCATCTTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACATAGGTGCAGGTAGAATAGTA

clostridium_FN668941_1105700_1105868 /5Phos/AGTTCCTTCATATGACTCAGTTGATTGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTATATCTTCAATTATACATTCCTGC

clostridium_NC_013974_2505182_2505359 /5Phos/CAGCAGTTGTTGCTAGAGGTATGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCATCACCAGGTGCAGCAAGT

clostridium_NC_013315_1077126_1077298 /5Phos/GCAATTCTCTGTTGTTGTCCTCCACTCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGTAAGAGCCTCTTCTTGGTCATGA

clostridium_NC_009089_2182303_2182482 /5Phos/CTATTCCTGATAATAAGTGTGTCCTCATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGGCATCATCTAACAATTCTTCT

clostridium_FN665652_1909777_1909942 /5Phos/GTAATTCCAATTACTTCTAGCTCTGGTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTACCATCTTCTCCATGTGTAT

clostridium_NC_013316_3300896_3301062 /5Phos/CCATGCAGATACAATGAACCAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGATGATAAGACACATCCAATTCC

clostridium_NC_013316_871338_871499 /5Phos/CCTTCTGCCATTGTAGAACAAGCTCCATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCTGTAACTGTCCACTGAGC

clostridium_NC_013316_3608873_3609047 /5Phos/CAATCATGATAGAATTAGATGGAACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGCAATAGTTCCATCAGGAGCAC

clostridium_FN665654_3717059_3717221 /5Phos/AGTGGTGAAGGTGTTCAACAAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTGAAGCTGGATATGTTGGAG

clostridium_NC_013315_2010489_2010657 /5Phos/CGCCTCTTCAGAAGCGGATATCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCCAGACTTCCGCCACAACCT

clostridium_NC_013315_3236301_3236474 /5Phos/GGCATAATAGGATGGATAGATGAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCAGCAGTTGTACCTACAACTAA

clostridium_NC_013315_1095924_1096090 /5Phos/AGTTCCTTCATATGACTCAGTTGATTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTATATCTTCAATTATACATTCCTGCG

enterobacter_NC_014121_4735453_4735632 /5Phos/GCATGGTAGTTCGCCAGCCGCTGGAACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACAGCAACCGCAAGTTCTTGACAT

enterobacter_NC_015663_1014187_1014345 /5Phos/AATATCATGGTCGTGTCCAGGCACTGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTCTGGTAGCTGCTTCTACTGTA

enterobacter_FP929040_3448334_3448513 /5Phos/AACTTACAACTACGCGCACTTGAATCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAGTGTTGTATGATAGTCTCGGT

enterobacter_NC_009436_4051820_4051985 /5Phos/GCAAGTTGAGGAGATGCTGGCATGATTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACATGGCTCTGGAAGATGTGCTGATC

enterococcus_FP929058_1738439_1738606 /5Phos/GCGATAATTGTAATGATTCGTGGTGTTAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCGTTGTCAATCCAGTTAGTAGACT

enterococcus_CP002621_1819224_1819388 /5Phos/ACTGTGGCAGTCTATGTTCCAATTGTAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTTATCGACATAATCCTGATAATC

enterococcus_FP929058_904007_904173 /5Phos/GCGTCGCTTCTTGCGCTCGCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAATGTATTCATACCGTCAAGT

enterococcus_FP929058_551757_551920 /5Phos/GCCTTCACAACTACGTTGGAAGGTCTTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTAACAGTCCTGCCGACTAC

enterococcus_NC_004668_1122345_1122507 /5Phos/GCCTTCACAACTACGTTGGAAGGTCTTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTAACAGTCCTGCCGACTACT

klebsiella_NC_009648_2885456_2885620 /5Phos/GCCGCTGAGCGGCGGCAAGCCGATGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAATGGCAGGCCAAGCTGAAGGCG

klebsiella_NC_009648_3899012_3899182 /5Phos/GCCAAGCGGCATTCTGGCGCCAGTGGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCAGACCGGAGTGGACAACGTCGAGGCG

klebsiella_NC_009648_4980596_4980757 /5Phos/GCCGTATATCATCGGCAATAACCGCACGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCATGATGGTCAACAAGGTGC

klebsiella_NC_009648_3266359_3266519 /5Phos/ACGAGCCGAGATAGGTCTGCAGCGTACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTACTGATATTCACCATACTGCCG

klebsiella_NC_012731_2557467_2557634 /5Phos/GCAATATCTTCACCGGCAGCCACCGCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGTATATGGCACGCCAATCGC

klebsiella_NC_012731_4857136_4857315 /5Phos/AATAACCTTAACGTCGCCAACACGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTCGGTGAACACCTCCTGGCACG

proteus_NC_010554_547938_548117 /5Phos/GCGGAACTGCTTGGCGTAGTAAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATGTAGTGCCGTAGACCTTCACCA

pseudomonas_NC_008463_658500_658676 /5Phos/GCGAGACCGGCGGCACCATCGTCTCCAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTCTGCCTGATGGACGTCTCCGGCTCG

pseudomonas_NC_008463_753931_754099 /5Phos/GCGGTTCACCTGTTCGCCTTCGAACACGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCGCAGCATCTGACGCAGGATGGTCTCG

pseudomonas_NC_009656_6431649_6431828 /5Phos/ACTCCATCGCCATCAAGGACATGGCCGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCGACGTGTTCCGCATCTTCGACGCG

pseudomonas NC 0 / 5 Pho s / GCCTGATGCACTACAGCGCCTGGGTTGGAGGCTCATCGTTCCTATATTCC 08463_560357_560 ACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTACCACATGGTCGATCTCGA 534 CGACTGC

pseudomonas NC 0 / 5Phos / GCGCATCCAGGACGGCGAGTACGGTTGGAGGCTCATCGTTCCTATATTCC 10322_5224859_52 ACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTTCGAGTGCCTGCACGAGC 25023 TGAA

pseudomonas NC 0 / 5 Pho s / GCTGGAGAACGTCAAGGTGGTGATCATCGTTGGAGGCTCATCGTTCCTAT 08463_4839746_48 ATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACCGATAACGACGAC 39924 CGCATCAA

staph_FN433596_2844085_2844263  /5Phos/ACGATTGGAGAAGGCAGTGTGATTGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGACAGATTACAATTGGCG

staph_NC_009632_1198350_1198529  /5Phos/GCCGCAATACCGATATTCCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCATTGTCCACCAGCTGAACCG

staph_FN433596_2521244_2521419  /5Phos/GTGAAGGTCGTGCTCCTATCGGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGATCTGGTGAAGTTCGTATGAT

staph_NC_009487_430842_431017  /5Phos/GCTGGTACTTGTACTTATATCGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCAGAAGATGATATCGTTACGTCAT

staph_NC_009782_2086681_2086849  /5Phos/GCGCATATTGCATTAATGGCTATAGATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCCAGCAGGTTATACACTCG

staph_NC_009782_58256_58423  /5Phos/GCAATTCTTACCACAGCACGAAGAACAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCTAGATGAAGATAATGAAGTCG

staph_NC_013450_991049_991222  /5Phos/GCATCTTCATACAATACTTCTAGCTTACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCACAATACCAGTTGTATTACG

staph_NC_013450_1360842_1361008  /5Phos/GCTTCAGCGCCATTACCGCCACCAGCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTCTTGATATATTCTTGTAAGCG

staph_AM990992_2526026_2526192  /5Phos/GTTCACACAACGCGCCGACTAGAATCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCACGATATCCAAGATAATGATTGGCTA

staph_NC_010079_361284_361447  /5Phos/GCGCACCTACAATCGCCATTACTACACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTCATTATCGACTGTTACATCGACTGA

staph_NC_007795_2085723_2085901  /5Phos/AGCGCACATGTGACAGCGTGTAGGTTAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTGCCTTAGATTGTTCAGAACAAT

staph_NC_009641_23125_23297  /5Phos/CGAATGGATATGTACCATGGTCGATATCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTCTCTAATATGATGTCCAT

staph_FN433596_2144570_2144734  /5Phos/ACTACAACAGCAACCGCATTACAATGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGTGCTAAGAGGTCATCGGA

staph_NC_009782_54857_55020  /5Phos/AGCTTCAGATAAGTACCTATCTGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGAAGAATAGTTATTCTTGATAATGTAT

staph_AM990992_1656616_1656789  /5Phos/CGTATTGCTCGAATACATGATAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACAATGTATCAAGGCCAGCT

staph_NC_007793_44227_44395  /5Phos/GCGACCAGTTGTTATCGACCGTGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCAGAACGATACGGTGCTGTATA

staph_NC_009641_1102949_1103116  /5Phos/CAATTACATTGTCTGTTGCGTAGATACCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTGTGGCTAATGTGCCAGTT

staph_NC_009641_1137731_1137898  /5Phos/GCACCACTCTATAGCAGTAGCGTATTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACAGCCAATGTCACCTAAGTCAACA

staph_FN433596_2715713_2715871  /5Phos/ACAGTCCGAATAAGATACGACTATTCGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGTTGTAACGTATATGAATAGTTGA

staph_NC_009782_606652_606825  /5Phos/AGATGCAATAACAGGTCGAATATTAATTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCCATAGTGAGAGTAGTGAA

staph_FN433596_657625_657803  /5Phos/AGATGCAATAACAGGTCGAATATTAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACACATACGGCCATAGTGAGAG

species_NC_004741_4338803_4338982  /5Phos/GAACATAACGCGACGTTCCAGCTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCTTCAGAGGTGTTGTAGTCG

species_NC_009648_4535521_4535683  /5Phos/GCGCTGGCGCAGTATCGTGAACTGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACCAACGTAATCTCTATTACCG

species_NC_010410_3677607_3677782  /5Phos/GCTGTAATGCAAGTAGCGTATGCGCTCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAAGGCCGCCAATGCCTGACG

species_CP001844_589057_589217  /5Phos/GCCTGTAGCAACAGTACCACGACCAGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCACCACGTAATAATGCACCAA

species_CP002110_2761329_2761492  /5Phos/ACTACGCTGAAGCTGGTGACAACATTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTTGAGGACGTATTCTCAATC

species_NC_010473_3546640_3546818  /5Phos/GCTGGTACTTACGTTCAGATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGGTGAACGCCGTTACATCC

species_CP001844_57304_57465  /5Phos/GCAATTCTTACCACAGCACGAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCTAGATGAAGATAATGAAGTCG

species_NC_012731_1975396_1975559  /5Phos/GCGGCGGCAGGCGGTAACGCCAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGCGGTTATCTACCACGGCG

species_NC_003923_198857_199024  /5Phos/GCACCTACTTGTCCAGCACCAGCCATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAATACCACCACCAATACAAGCA

species_NC_010400_52102_52263  /5Phos/GCGCGGTAACATGCCATATTCTGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCCTGAATGACATCACAGTCG

species_NC_010473_3310005_3310164  /5Phos/AATCAGGTCAAGGAACTGCAAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTCTCAATCATATGCACCGGAATAC

species_FP929058_3022053_3022226  /5Phos/GAACATATGTGTATGACGATGCGCGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTACATGTCGCTTATCTGCCAGAAGGT

species_NC_009085_1010393_1010556  /5Phos/CGTGTGCGTAGTGACGAGTTGGAGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGAATACGATGATGTAAGGTACACCTA

species_CP002621_172633_172802  /5Phos/CAGGAGTTACTTCTGTTCCATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTGAACAATTAGATCACCTCG

species_FP929040_442484_442653  /5Phos/CGTAATCTCCATTACCGATGGTCAGATCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACGTATTCTACCTCCACTCTCGTCT

species_NC_003923_1334345_1334501  /5Phos/CATTCGACGTTCTGGTATTACTTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCACGCTCCGCATCAGCAGCACCACGTT

species_NC_009085_1010678_1010853  /5Phos/CTGAACCACGGATTACTGGAGTGTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCCTGTTACTACTGTACCACGAC

pseudomonas_NC_008463_4756080_4756240  /5Phos/GAATCGAACGGTCTCATTAACAGATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCTTTCCAGGGATATAAGACGC

pseudomonas_NC_002516_1063894_1064077  /5Phos/CCCGCAGAGTCACACTCGGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCACTCTTGGTACTACTCACTAGC

pseudomonas_NC_008463_3182693_3182865  /5Phos/GAGTCTCTTTCAACCTGGATTAGATATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAAGATTAATAGCGTACTTTACTCC

pseudomonas_NC_009656_2819490_2819655  /5Phos/ATCCCGCAGATACTAGGTTCTTAATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAACTATTCATATTACACCCTAAGG

pseudomonas_NC_008463_3184022_3184185  /5Phos/CAGTGGGCTATCCTAAGCCAAAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATAAGCGAACTAACTATCACTTA

pseudomonas_NC_002516_1065937_1066093  /5Phos/ACAAAGCGTTCTAAACGATTAGAACTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGAGAAAGGAAACAGGATAGTAC

pseudomonas_NC_002516_1067833_1068007  /5Phos/CCAATGGAGAAGTCTAAATGTCCAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTATCAGAGATACATGACTCTTAGG

pseudomonas_NC_008463_3182351_3182508  /5Phos/CGAATCACTGGACTACATTTATATTTCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAGCGAACCTTTATATTTGACCAT

pseudomonas_NC_008463_3184314_3184473  /5Phos/CTCAAGTCTTGCCCTGATAGAATTATGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCACGACTTATCTACTTTAGAAATC

pseudomonas_AP012280_3765216_3765383  /5Phos/GGTGATCGTTATTATGATAGTACGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTCGGTTAAGGGAATTACGAC

pseudomonas_AP012280_3765033_3765192  /5Phos/ACTCGGATGGTAGGTTTATTAAAGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTGATCGTTATTATGATAGTACGG

enterococcus_NZ_GG703715_13422_13573  /5Phos/ACAATCGTTGTCGCACTGCATAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAACTTGGTCTACCGTACCAC

enterococcus_NZ_GG703582_76982_77140  /5Phos/GGATAATACAATCCTAATACGTACGGAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCTGCTGTAACTAGGGTAGC

enterococcus_NZ_GL455004_28219_28381  /5Phos/CTATATTCAACGGGTCACGGGTAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCATTGATTCGATCTCGTAACTC

enterococcus_NZ_GG703720_94699_94852  /5Phos/AATGTTATTGTGGTTGCGTGTTCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTACTTTGGAAGTGCCCTGAC

enterococcus_NZ_GG703715_15795_15951  /5Phos/CATGTCTTCTAGTACAGGTTTGCCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGTAAGAGGCCGCTAACTTC

enterococcus_NZ_GL455899_32848_32984  /5Phos/CTCTGGCTCGTGGGCTCGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTCTTGAGATAGTCCGGTATAATC

enterococcus_NZ_GG692918_325104_325257  /5Phos/ATTCGATCACGATGGGCTGGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAATTTCCTGTGTCATACACGC

enterococcus_NC_004668_920608_920750  /5Phos/CAATTGATTTAGCCACTACACCTTACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCACTATTCTGGCGACCACC

enterococcus_NZ_GG703575_78829_78963  /5Phos/GATAAAGAAGCGTCTTGACCCAGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATCTGGTGCTCCTTGACGC

enterococcus_NZ_GL455931_26355_26493  /5Phos/GCAAATTTAGAGAGTGCATGCATGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGAAGAGGACGGCATACAAC

enterococcus_NZ_GG669058_207026_207172  /5Phos/CATTTCATCTAGACCGCTCGTGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCTTGAAGTGTATGTTGGGAC

proteus_NZ_GG661998_111187_111342  /5Phos/GTCGCCCTCGTGCTAACGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGGTTCTTTGATGTACCGGTT

proteus_NC_010554_2037943_2038091  /5Phos/GCTGATGACGGTGAAGTTTATCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCATTATCGCACATATTGACCAC

proteus_NZ_GG668576_810893_811054  /5Phos/GAAATTAGCTAAAGGGATATCGCGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCAACTTTCCGCCAATCCTGC

proteus_NZ_GG668594_760_939  /5Phos/CACCTACGTTCTCACCTGCACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCATTCGATAGTACCAGTTACGTC

proteus_NZ_GG668579_22072_22234  /5Phos/GTTGCTTATAGCGTCGCTGCTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTGGTTATCGAGAAGATAAAGG

proteus_NC_010554_2448957_2449119  /5Phos/GTAAGCGTAGCGATACGTTGAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAGTGAACGCACCACTGG

proteus_NC_010554_3033758_3033936  /5Phos/TCAGGTAGAGAATACTCAGGCGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGGAGAAGGCTAGGTTGTC

proteus_NC_010554_454391_454540  /5Phos/GCAACCCACTCCCATGGTGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGTTCTTCATCAGACAATCTG

gyrB_NC_015663_1455472_1455621  /5Phos/GCCCTTTCAGGACTTTGATACTGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGTACGGAGACGGAGTTATCG

gyrB_NC_010410_4215_4366  /5Phos/ACACTGACCGATTCATCCTCGTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTTGAAAGTGCGTTAACAACC

gyrB_NC_005773_4904_5052  /5Phos/CGGAAGCCCACCAAGTGAGTACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGAAACCAGTTTGTCCTTAGTC

gyrB_NC_016514_5343_5487  /5Phos/ACCAGCTTGTCTTTAGTCTGAGAGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTTTACGACGGGTCATTTCAC

gyrB_NC_016603_2631439_2631616  /5Phos/CATTGGTTTGTTCTGTTTGAGAGGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGATTCATCTTCGTGAATTGTGAC

gyrB_NC_009436_4366_4524  /5Phos/GGACTTTGATACTGGAGGAGTCATAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTGTACGGAAACGGAGTTATCG

gyrB_NC_009512_4203_4373  /5Phos/ATGCTGGAGGAGTCGTACGTTTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTCGCGCACACTAATAGATTC

pseudomonas_NC_009085_307050_307218  /5Phos/AACTAAACCTACACGGAATTGGTTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGCAGATACACGACGTTTATGT

pseudomonas_NC_009085_308225_308377  /5Phos/GCCGCTTCACCTACGTTAGGAAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGTAAAGATGAGTCTTTAACGTC

pseudomonas_NC_016612_1674334_1674490  /5Phos/GACGTTTGTGCGTAATCTCAGACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAGGAAACCGTATTCGTTCGT

pseudomonas_NC_016603_3425179_3425337  /5Phos/ACAACACTTTACCACTTGAGTGGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGTAACTGCCCATGTCAAGATAC

pseudomonas_NC_016603_3427629_3427808  /5Phos/CCACGTTTAGTTGAACCACCGCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCAATACGCCAGTTGTTAGTTC

pseudomonas_NC_010410_3543925_3544088  /5Phos/AATCGATAATAAGTACGGTGCATCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGAAGAATACATTCGCGTACATC

pseudomonas_NC_005966_304936_305079  /5Phos/AAGCAAGATCGAGTCTTCATAGTTGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGATATACACGATACCTGATTCGT

pseudomonas_NC_008593_226005_226171  /5Phos/CCGATATTCATACGAGAAGGTACACGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCAGTAACTCTATTGTCAAACGGT

pseudomonas_NC_016514_213592_213738  /5Phos/GTAGTGAGTCGGGTGTACGTCTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCTTCGATAGCAGACAGATAGT

pseudomonas_NC_005966_303883_304054  /5Phos/ACCTACACGGAATTGGTTCTCAGTGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGATACACGACGTTTGTGTGTA

enterobacter_NC_014618_3997909_3998085  /5Phos/CAACATCATTAGCTTGGTCGTGGGGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTTGCGTGTTACCAACTCGTC

enterobacter_NZ_GL892086_615149_615324  /5Phos/CGGCACGTCCGAATCGTATCAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCTCGTGTCCCGTATATGTTGG

enterobacter_NZ_GL892086_1664663_1664834  /5Phos/AATAGAGGCCCACAAGTCTTGTTCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCGCTCTCCACTATGGGTAGT

enterobacter_NZ_GG704865_427821_427978  /5Phos/GCTACATTAATCACTATGGACAGACAGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCGATGGTCGATCTATCGTCTCT

enterobacter_NZ_GL892087_1610708_1610874  /5Phos/GAAGTGTTATTCAAACTTTGGTCCCGTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTCCTTGAACCCTTGGTTCAAGGT

Table 2: A list of organisms that the methods and kits described herein have been validated to detect using the compositions described herein

Acinetobacter baumannii 1656-2 Escherichia coli O111:H- str. 11128

Acinetobacter baumannii AB0057 Escherichia coli O127:H6 str. E2348/69

Acinetobacter baumannii AB307-0294 Escherichia coli O157:H7 str. EC4115

Acinetobacter baumannii ACICU Escherichia coli O157:H7 str. EDL933

Acinetobacter baumannii ATCC 17978 Escherichia coli O157:H7 str. Sakai

Acinetobacter baumannii AYE Escherichia coli O157:H7 str. TW14359

Acinetobacter baumannii MDR-ZJ06 Escherichia coli O26:H11 str. 11368

Acinetobacter baumannii SDF Escherichia coli O55:H7 str. CB9615

Acinetobacter baumannii TCDC-AB0715 Escherichia coli O7:K1 str. CE10

Acinetobacter calcoaceticus PHEA-2 Escherichia coli O83:H1 str. NRG 857C

Acinetobacter sp. ADP1 Escherichia coli S88

Acinetobacter sp. DR1 Escherichia coli SE11

Clostridium acetobutylicum ATCC 824 Escherichia coli SE15

Clostridium acetobutylicum DSM 1731 Escherichia coli SMS-3-5

Clostridium acetobutylicum EA 2018 Escherichia coli str. 'clone D i14'

Clostridium beijerinckii NCIMB 8052 Escherichia coli str. 'clone D i2'

Clostridium botulinum A2 str. Kyoto Escherichia coli str. K-12 substr. DH10B

Clostridium botulinum A3 str. Loch Maree Escherichia coli str. K-12 substr. MDS42

Clostridium botulinum A str. ATCC 19397 Escherichia coli str. K-12 substr. MG1655

Clostridium botulinum A str. ATCC 3502 Escherichia coli str. K-12 substr. W3110

Clostridium botulinum A str. Hall Escherichia coli UM146

Clostridium botulinum B1 str. Okra Escherichia coli UMN026

Clostridium botulinum Ba4 str. 657 Escherichia coli UMNK88

Clostridium botulinum BKT015925 Escherichia coli UTI89

Clostridium botulinum B str. Eklund 17B Escherichia coli W

Clostridium botulinum E3 str. Alaska E43 Escherichia fergusonii ATCC 35469

Clostridium botulinum F str. 230613 Klebsiella pneumoniae 342

Clostridium botulinum F str. Langeland Klebsiella pneumoniae KCTC 2242

Clostridium botulinum H04402 065 Klebsiella pneumoniae NTUH-K2044

Clostridium cellulolyticum H10 Klebsiella pneumoniae subsp. pneumoniae MGH 78578

Clostridium cellulovorans 743B Klebsiella variicola At-22

Clostridium clariflavum DSM 19732 Proteus mirabilis HI4320

Clostridium difficile 630 Pseudomonas aeruginosa LESB58

Clostridium difficile BI1 Pseudomonas aeruginosa M18

Clostridium difficile BI9 Pseudomonas aeruginosa NCGM2.S1

Clostridium difficile CD196 Pseudomonas aeruginosa PA7

Clostridium difficile strain 2007855 Pseudomonas aeruginosa PAO1

Clostridium difficile strain CF5 Pseudomonas aeruginosa UCBPP-PA14

Clostridium difficile strain M120 Pseudomonas brassicacearum subsp. brassicacearum NFM421

Clostridium difficile M68 Pseudomonas entomophila L48

Clostridium difficile R20291 Pseudomonas fluorescens F113

Clostridium kluyveri DSM 555 Pseudomonas fluorescens Pf0-1

Clostridium kluyveri NBRC 12016 Pseudomonas fluorescens Pf-5

Clostridium lentocellum DSM 5427 Pseudomonas fluorescens SBW25

Clostridium ljungdahlii DSM 13528 Pseudomonas fulva 12-X

Clostridium novyi NT Pseudomonas mendocina NK-01

Clostridium perfringens ATCC 13124 Pseudomonas mendocina ymp

Clostridium perfringens SM101 Pseudomonas putida BIRD-1

Clostridium perfringens str. 13 Pseudomonas putida F1

Clostridium phytofermentans ISDg Pseudomonas putida F1

Clostridium saccharolyticum-like 10 Pseudomonas putida GB-1

Clostridium saccharolyticum WM1 Pseudomonas putida KT2440

Clostridium sp. SY8519 Pseudomonas putida S16

Clostridium sticklandii DSM 519 Pseudomonas putida W619

Clostridium tetani E88 Pseudomonas stutzeri A1501

Clostridium thermocellum ATCC 27405 Pseudomonas stutzeri ATCC 17588 = LMG 11199

Clostridium thermocellum DSM 1313 Pseudomonas stutzeri DSM 4166

Enterobacter aerogenes KCTC 2190 Pseudomonas syringae pv. phaseolicola 1448 A

Enterobacter asburiae LF7a Pseudomonas syringae pv. syringae B728a

Enterobacter cloacae SCF1 Pseudomonas syringae pv. tomato str. DC3000

Enterobacter cloacae subsp. cloacae ATCC 13047 Shigella boydii CDC 3083-94

Enterobacter cloacae subsp. cloacae NCTC 9394 Shigella boydii Sb227

Enterobacter sp. 638 Shigella dysenteriae Sd197

Enterococcus faecalis 62 Shigella flexneri 2002017

Enterococcus faecalis OG1RF Shigella flexneri 2a str. 2457T

Enterococcus faecalis V583 Shigella flexneri 2a str. 301

Enterococcus sp. 7L76 Shigella flexneri 5 str. 8401

Escherichia coli 042 Shigella sonnei Ss046

Escherichia coli 536 Staphylococcus aureus

Escherichia coli 55989 Staphylococcus carnosus subsp. carnosus

Staphylococcus epidermidis

Escherichia coli ABU 83972 Staphylococcus haemolyticus JCSC1435

Escherichia coli APEC O1 Staphylococcus lugdunensis

Escherichia coli ATCC 8739 Staphylococcus pseudintermedius

Escherichia coli BL21(DE3) Staphylococcus saprophyticus subsp.

Escherichia coli 'BL21-Gold(DE3)pLysS AG' Staphylococcus aureus

Escherichia coli B str. REL606 Staphylococcus saprophyticus

Escherichia coli BW2952 Staphylococcus epidermidis

Escherichia coli CFT073 Acinetobacter baumannii

Escherichia coli DH1 (ME8569) Enterococcus faecalis

Escherichia coli E24377A Enterobacter cloacae

Escherichia coli ED1a Enterobacter aerogenes

Escherichia coli ETEC H10407 Enterococcus faecium

Escherichia coli HS Candida albicans

Escherichia coli IAI1 Klebsiella pneumoniae

Escherichia coli IAI39 Escherichia coli

Escherichia coli IHE3034 Clostridium difficile

Escherichia coli KO11 Proteus mirabilis

Escherichia coli LF82 Pseudomonas aeruginosa

Escherichia coli NA114

Escherichia coli O103:H2 str. 12009

Table 3: Genus level regions can be used for coarse discrimination of organisms.

Probe Coordinates  Gene

species_NC_004741_4338803_4338982  rpsC, S4416, 30S ribosomal protein S3

species_NC_009648_4535521_4535683  atpA, KPN_04139, F0F1 ATP synthase subunit alpha

species_NC_010410_3677607_3677782  int, ABAYE3575, integrase/recombinase (E2 protein)

species_CP001844_589057_589217  tufA, SA2981_0525, Translation elongation factor Tu

species_CP002110_2761329_2761492  tuf, HMPREF0772_12641, elongation factor EF1A

species_NC_010473_3546640_3546818  rplB, ECDH10B_3492, 50S ribosomal protein L2

species_CP001844_57304_57465  tnpB, SA2981_0055, Transposase B from transposon Tn554; tnpB, SA2981_1617, Transposase B from transposon Tn554

species_NC_012731_1975396_1975559  putA, KP1_2030, trifunctional transcriptional regulator/proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase

species_NC_003923_198857_199024  MW0166, hypothetical protein

species_NC_010400_52102_52263  rph, ABSDF0051, ribonuclease PH

species_NC_010473_3310005_3310164  rpoD, ECDH10B_3242, RNA polymerase sigma factor RpoD

species_FP929058_3022053_3022226  ENT_30090, GTP cyclohydrolase I

species_NC_009085_1010393_1010556  A1S_0279, elongation factor Tu

species_CP002621_172633_172802  rplO, OG1RF_10170, 50S ribosomal protein L15

species_FP929040_442484_442653  ENC_04200, proton translocating ATP synthase, F1 alpha subunit

species_NC_003923_1334345_1334501  katA, 1221, catalase

species_NC_009085_1010678_1010853  A1S_0279, elongation factor Tu
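
The probe identifiers in Table 3 appear to follow a group_accession_start_end pattern, for example species_NC_004741_4338803_4338982. The following Python sketch splits an identifier into those parts. The field names, the regular expression, and the reading of the last two numbers as start and end coordinates on the named sequence record are illustrative assumptions, not definitions taken from this document.

```python
import re
from typing import NamedTuple


class ProbeRegion(NamedTuple):
    """Hypothetical container for the parts of a probe identifier."""
    group: str       # e.g. "species", "staph", "gyrB"
    accession: str   # e.g. "NC_004741", "CP001844", "NZ_GG703715"
    start: int       # first coordinate in the identifier
    end: int         # second coordinate in the identifier


# Assumed layout: <group>_<accession>_<start>_<end>, where the accession
# may itself contain underscores (as in "NC_004741" or "NZ_GG703715").
_PROBE_ID = re.compile(
    r"^(?P<group>[A-Za-z]+)_(?P<accession>.+)_(?P<start>\d+)_(?P<end>\d+)$"
)


def parse_probe_id(probe_id: str) -> ProbeRegion:
    """Split a probe identifier into group, accession, start and end."""
    match = _PROBE_ID.match(probe_id.strip())
    if match is None:
        raise ValueError(f"unrecognised probe identifier: {probe_id!r}")
    return ProbeRegion(
        group=match["group"],
        accession=match["accession"],
        start=int(match["start"]),
        end=int(match["end"]),
    )


if __name__ == "__main__":
    region = parse_probe_id("species_NC_004741_4338803_4338982")
    print(region.accession, region.start, region.end)
```

Because the two trailing fields must be purely numeric, the accession is whatever remains between the group prefix and those fields, which lets the same pattern handle identifiers such as enterococcus_NZ_GG703715_13422_13573.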

Table 4: Genus level probes

Probe Coordinates  Binding region 1  Binding region 2

species_NC_004741_4338803_4338982  GAACATAACGCGACGTTCCAGCTG  GCTTCAGAGGTGTTGTAGTCG

species_NC_009648_4535521_4535683  GCGCTGGCGCAGTATCGTGAACTGG  ACCAACGTAATCTCTATTACCG

species_NC_010410_3677607_3677782  GCTGTAATGCAAGTAGCGTATGCGCTCA  AAGGCCGCCAATGCCTGACG

species_CP001844_589057_589217  GCCTGTAGCAACAGTACCACGACCAGT  CACCACGTAATAATGCACCAA

species_CP002110_2761329_2761492  ACTACGCTGAAGCTGGTGACAACATTG  GTTGAGGACGTATTCTCAATC

species_NC_010473_3546640_3546818  GCTGGTACTTACGTTCAGAT  ACGGTGAACGCCGTTACATCC

species_CP001844_57304_57465  GCAATTCTTACCACAGCACGAA  ATCTAGATGAAGATAATGAAGTCG

species_NC_012731_1975396_1975559  GCGGCGGCAGGCGGTAACGCCAG  ACGCGGTTATCTACCACGGCG

species_NC_003923_198857_199024  GCACCTACTTGTCCAGCACCAGCCAT  AATACCACCACCAATACAAGCA

species_NC_010400_52102_52263  GCGCGGTAACATGCCATATTCTGC  CCTGAATGACATCACAGTCG

species_NC_010473_3310005_3310164  AATCAGGTCAAGGAACTGCAAGC  GTCTCAATCATATGCACCGGAATAC

species_FP929058_3022053_3022226  GAACATATGTGTATGACGATGCGCGG  GTACATGTCGCTTATCTGCCAGAAGGT

species_NC_009085_1010393_1010556  CGTGTGCGTAGTGACGAGTTGGAGA  AGAATACGATGATGTAAGGTACACCTA

species_CP002621_172633_172802  CAGGAGTTACTTCTGTTCCAT  TTGAACAATTAGATCACCTCG

species_FP929040_442484_442653  CGTAATCTCCATTACCGATGGTCAGATC  ACGTATTCTACCTCCACTCTCGTCT

species_NC_003923_1334345_1334501  CATTCGACGTTCTGGTATTACTT  CACGCTCCGCATCAGCAGCACCACGTT

species_NC_009085_1010678_1010853  CTGAACCACGGATTACTGGAGTGTC  GCCTGTTACTACTGTACCACGAC
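
The binding regions in Table 4 can be read as the two ends of the corresponding full probe sequences, which share a common middle segment. The Python sketch below recovers the two binding regions from a full probe sequence by locating that shared segment. The BACKBONE constant was read off the probe sequences listed in this section, and treating it as a fixed segment, like the function and variable names, is an assumption made for illustration.

```python
from typing import Tuple

# Shared middle segment observed in the listed probe sequences.
# Assumption: every probe contains exactly this segment between its arms.
BACKBONE = (
    "GTTGGAGGCTCATCGTTCCTATATTC"
    "CACACCACTTATTATTACAGATGTTATGCTCGCAGGTC"
)


def split_probe(probe: str, backbone: str = BACKBONE) -> Tuple[str, str]:
    """Return (binding region 1, binding region 2) for a probe sequence.

    The 5' phosphate tag and any whitespace are stripped first.
    """
    seq = probe.replace("/5Phos/", "").replace(" ", "").upper()
    idx = seq.find(backbone)
    if idx < 0:
        raise ValueError("shared backbone segment not found in probe")
    return seq[:idx], seq[idx + len(backbone):]


if __name__ == "__main__":
    # Probe species_NC_004741_4338803_4338982 as listed in this section.
    probe = (
        "/5Phos/GAACATAACGCGACGTTCCAGCTG"
        "GTTGGAGGCTCATCGTTCCTATATTCCACACCACTTATTATTACAGATGTTATGCTCGCAGGTC"
        "GCTTCAGAGGTGTTGTAGTCG"
    )
    region1, region2 = split_probe(probe)
    print(region1)
    print(region2)
```

For this probe the two returned strings match the first row of Table 4, which is a convenient consistency check between the probe list and the binding-region table.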

Table 5: Species/strain level regions can be used for discrimination at the level of species and strains.

Probe Coordinates  Gene

acinetobacter_NC_010611_627997_628164  ACICU_00572, pyridine nucleotide transhydrogenase (proton pump) subunit alpha (part2)

acinetobacter_NC_010611_2417580_2417755  pepN, ACICU_02288, aminopeptidase N; trpC, ACICU_02557, indole-3-glycerol-phosphate synthase

acinetobacter_CP002522_11753_11931  recF, ABTW07_0010, recombination protein F

acinetobacter_NC_011586_3908329_3908508  gshB, AB57_3788, glutathione synthetase

acinetobacter_NC_010611_145181_145340  ACICU_00129, NAD-dependent aldehyde dehydrogenase

acinetobacter_NC_010611_3854494_3854662  ACICU_03630, A/G-specific DNA glycosylase

acinetobacter_NC_010400_56216_56383  nadC, ABSDF0056, nicotinate-nucleotide pyrophosphorylase (quinolinate phosphoribosyltransferase)

acinetobacter_NC_010611_1454960_1455136  near ACICU_01347, carbonic anhydrase

acinetobacter_NC_009085_255964_256143  A1S_0230, phosphoglyceromutase

clostridium_NC_013974_3097606_3097772

clostridium_FN665653_103469_103631

clostridium_NC_013974_117188_117346

clostridium_NC_013316_3012882_3013047  nifJ, CDR20291_2570, pyruvate-flavodoxin oxidoreductase

clostridium_FN668375_1212250_1212413

clostridium_NC_013315_3754484_3754640  pykF, CD196_3170, pyruvate kinase

clostridium_FN665654_3239860_3240039

clostridium_FN668941_3228320_3228491

clostridium_NC_013974_1962664_1962825

clostridium_NC_003366_2769687_2769851  tuf, CPE2407, elongation factor Tu

clostridium_FN665653_127741_127918

clostridium_NC_013316_2259929_2260107  clpB, CDR20291_1933, chaperone

clostridium_NC_009089_94774_94937  rpoC, CD0067, DNA-directed RNA polymerase subunit beta'

clostridium_NC_013315_2044225_2044389  CD196_1764, cell surface protein

clostridium_NC_013315_2299408_2299586  near CD196_1987, multiprotein-complex assembly protein

clostridium_FN668941_3244255_3244408

clostridium_NC_013316_3610909_3611065  gpmI, CDR20291_3027, phosphoglyceromutase

clostridium_FN665653_1104859_1105031

clostridium_NC_003366_2753681_2753838  tuf, CPE2407, elongation factor Tu

clostridium_FN665653_710906_711080

clostridium_NC_009089_3706562_3706720  gpmI, CD3171, phosphoglyceromutase

clostridium_NC_013316_1372812_1372968  dnaF, CDR20291_1146, DNA polymerase III PolC-type

clostridium_FN665652_676696_676895

clostridium_NC_013316_2641651_2641808  CDR20291_2249

clostridium_FN668375_3595870_3596026

clostridium_FN668941_1105700_1105868

clostridium_NC_013974_2505182_2505359

clostridium_NC_013315_1077126_1077298  potA, CD196_0900, spermidine/putrescine ABC transporter ATP-binding protein

clostridium_NC_009089_2182303_2182482  CD1878A

clostridium_FN665652_1909777_1909942

clostridium_NC_013316_3300896_3301062  ntpB, CDR20291_2788, V-type ATP synthase subunit B

clostridium_NC_013316_871338_871499  spoVAD, CDR20291_0703, stage V sporulation protein AD

clostridium_NC_013316_3608873_3609047  bclA2, CDR20291_3090, exosporium glycoprotein; eno, CDR20291_3026, enolase

clostridium_FN665654_3717059_3717221

clostridium_NC_013315_2010489_2010657  CD196_1739, hypothetical protein

clostridium_NC_013315_3236301_3236474  adhE, CD196_2753, bifunctional acetaldehyde-CoA/alcohol dehydrogenase; CD196_2095, sodium:solute symporter; spoVD, CD196_2497, stage V sporulation protein D (sporulation specific penicillin-binding protein)

clostridium_NC_013315_1095924_1096090  CD196_0911, N-acetylmuramoyl-L-alanine amidase

enterobacter_NC_014121_4735453_4735632  ECL_04612, 50S ribosomal subunit protein L13

enterobacter_NC_015663_1014187_1014345  EAE_24795, hemagluttinin domain-containing protein; rplR, EAE_04875, 50S ribosomal protein L18

enterobacter_FP929040_3448334_3448513

enterobacter_NC_009436_4051820_4051985  rplD, Ent638_3750, 50S ribosomal protein L4

enterococcus_FP929058_1738439_1738606  ENT_17660, hypothetical protein

enterococcus_CP002621_1819224_1819388  OG1RF_11736, group 2 glycosyl transferase

enterococcus_FP929058_904007_904173  near ENT_09350, Uncharacterized protein conserved in bacteria

enterococcus_FP929058_551757_551920

enterococcus_NC_004668_1122345_1122507

klebsiella_NC_009648_2885456_2885620  mqo, KPN_02629, malate:quinone oxidoreductase

klebsiella_NC_009648_3899012_3899182  garL, KPN_03538, alpha-dehydro-beta-deoxy-D-glucarate aldolase

klebsiella_NC_009648_4980596_4980757  frdB, KPN_04552, fumarate reductase iron-sulfur subunit

klebsiella_NC_009648_3266359_3266519  KPN_02970, integral transmembrane protein acridine resistance

klebsiella_NC_012731_2557467_2557634  KP1_2672, putative malate dehydrogenase

klebsiella_NC_012731_4857136_4857315  glpR, KP1_5123, DNA-binding transcriptional repressor GlpR

proteus_NC_010554_547938_548117  PMI0497, phage terminase large subunit

pseudomonas_NC_008463_658500_658676  PA14_07660, hypothetical protein

pseudomonas_NC_008463_753931_754099  rpoC, PA14_08780, DNA-directed RNA polymerase subunit beta'

pseudomonas_NC_009656_6431649_6431828  oadA, PSPA7_6223, pyruvate carboxylase subunit B

pseudomonas_NC_008463_560357_560534  PA14_06330, serine/threonine protein kinase

pseudomonas_NC_010322_5224859_5225023  PputGB1_0612, arginine decarboxylase; PputGB1_4676, ketol-acid reductoisomerase

pseudomonas_NC_008463_4839746_4839924  dadA, PA14_70040, D-amino acid dehydrogenase small subunit; ung, PA14_54590, uracil-DNA glycosylase

staph_FN433596_2844085_2844263  SATW20_26770, putative acetyltransferase

staph_NC_009632_1198350_1198529  SaurJH1_1177, branched-chain alpha-keto acid dehydrogenase subunit E2

staph_FN433596_2521244_2521419  rplB, SATW20_23810, 50S ribosomal protein L2

staph_NC_009487_430842_431017  SaurJH9_0396, hypothetical protein

staph_NC_009782_2086681_2086849  SAHV_1928, truncated amidase

staph_NC_009782_58256_58423  tnpB, SAHV_1645, transposase B

staph_NC_013450_991049_991222  SAAV_0970, ribosomal large subunit pseudouridine synthase D

staph_NC_013450_1360842_1361008  opuD1, SAAV_1329, BCCT family osmoprotectant transporter

staph_AM990992_2526026_2526192  SAPIG2450, nitrate reductase, alpha subunit

staph_NC_010079_361284_361447  near USA300HOU_0330, PfoR family transcriptional regulator

staph_NC_007795_2085723_2085901  SAOUHSC_02251, hypothetical protein

staph_NC_009641_23125_23297  purA, NWMN_0016, adenylosuccinate synthetase

staph_FN433596_2144570_2144734  hlb, SATW20_19320, phospholipase C precursor (pseudogene); SATW20_19830, phage protein

staph_NC_009782_54857_55020  SAHV_0049, hypothetical protein

staph_AM990992_1656616_1656789  proC, SAPIG1569, pyrroline-5-carboxylate reductase

staph_NC_007793_44227_44395  SAUSA300_0036, hypothetical protein

staph_NC_009641_1102949_1103116  NWMN_0995, phage anti-repressor protein

staph_NC_009641_1137731_1137898  NWMN_0310, phage tail fiber

staph_FN433596_2715713_2715871  SATW20_25670, putative amino acid permease

staph_NC_009782_606652_606825  rpoB, SAHV_0540, DNA-directed RNA polymerase subunit beta

staph_FN433596_657625_657803  rpoB, SATW20_06120, DNA-directed RNA polymerase beta chain protein

pseudomonas_NC_008463_4756080_4756240

pseudomonas_NC_002516_1063894_1064077

pseudomonas_NC_008463_3182693_3182865  PA14_35780, hypothetical protein

pseudomonas_NC_009656_2819490_2819655  PSPA7_0044, filamentous hemagglutinin

pseudomonas_NC_008463_3184022_3184185  PA14_35790, homospermidine synthase

pseudomonas_NC_002516_1065937_1066093  PA0984, colicin immunity protein

pseudomonas_NC_002516_1067833_1068007  near pyoS5, PA0985, pyocin S5

pseudomonas_NC_008463_3182351_3182508  PA14_35780, hypothetical protein

pseudomonas_NC_008463_3184314_3184473  PA14_35790, homospermidine synthase

pseudomonas_AP012280_3765216_3765383

pseudomonas_AP012280_3765033_3765192

enterococcus_NZ_GG703715_13422_13573

enterococcus_NZ_GG703582_76982_77140

enterococcus_NZ_GL455004_28219_28381

enterococcus_NZ_GG703720_94699_94852

enterococcus_NZ_GG703715_15795_15951

enterococcus_NZ_GL455899_32848_32984

enterococcus_NZ_GG692918_325104_325257

enterococcus_NC_004668_920608_920750  EF0957, maltose phosphorylase

enterococcus_NZ_GG703575_78829_78963

enterococcus_NZ_GL455931_26355_26493

enterococcus_NZ_GG669058_207026_207172

proteus_NZ_GG661998_111187_111342

proteus_NC_010554_2037943_2038091 lepA, PMI1890, GTP-binding protein LepA

proteus_NZ_GG668576_810893_811054

proteus_NZ_GG668594_760_939

proteus_NZ_GG668579_22072_22234

proteus_NC_010554_2448957_2449119 PMIr002

proteus_NC_010554_3033758_3033936 PMIr002

proteus_NC_010554_454391_454540 PMIr006

pseudomonas_NC_009085_307050_307218 rpoB, A1S_0287, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_009085_308225_308377 rpoB, A1S_0287, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_016612_1674334_1674490 rpoB, KOX_07910, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_016603_3425179_3425337 rpoB, BDGL_003192, RNA polymerase subunit B

pseudomonas_NC_016603_3427629_3427808 rpoB, BDGL_003193, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_010410_3543925_3544088 rpoB, ABAYE3489, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_005966_304936_305079 rpoB, ACIAD0307, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_008593_226005_226171 rpoB, NT01CX_1107, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_016514_213592_213738 rpoB, EcWSU1_00211, DNA-directed RNA polymerase subunit beta

pseudomonas_NC_005966_303883_304054 rpoB, ACIAD0307, DNA-directed RNA polymerase subunit beta

enterobacter_NC_014618_3997909_3998085 Entcl_3718, two component transcriptional regulator, winged helix family

enterobacter_NZ_GL892086_615149_615324

enterobacter_NZ_GL892086_1664663_1664834

enterobacter_NZ_GG704865_427821_427978

enterobacter_NZ_GL892087_1610708_1610874

Table 6: Species/strain level probes

Probe Coordinates Binding region 1 Binding region 2

acinetobacter_NC_010 GCAGCACTTGACCGCCATGAGTGACCA CATCGCACCAACAACAATAATCG 611 627997 628164

acinetobacter_NC_010 GTGATCACTGATGCACCAGATGAAGT ATCTTGATATTCAAGTCTATGAC 611 2417580 2417755 G

acinetobacter CP0025 GATATTATTGATCATGGTGCCAAGCCA CAATATGAAGCTGACGACGCG 22 11753 11931 A

acinetobacter_NC_011 GCTGAGCGTGAAGGTTCATGGATTATT GGTAAGGCTTACGGTCTCAT 586 3908329 3908508 A

acinetobacter_NC_010 GCATCTTGTGCAGCCTGAATAGCAGCG ACCACGTTGAATATCACCTTCGG 611 145181 145340 T CAT

acinetobacter NC 010 AAGTCCATAATTGCTTGAGTGTAGTCA ATCTTCGCACTGAATAATAAGAA 611 3854494 3854662 T CAT

acinetobacter_NC_010 GCTTGCTGGTTCTGCACGTAGCTTACT AAGATGAACAGGCTACTGCAA 400 56216 56383 G

acinetobacter NC 010 GCAGCGCTGTGCAAGTTCAATGTATTC CTCGTGCGAGTATTCCTTAAGTG 611 1454960 1455136 T T

acinetobacter NC 009 GTATAACACTCGGCCAGCGCCAAGGTT GTTCACACATCGCCACAATATGA 085 255964 256143 C T

clostridium_NC_01397 ACCATGCAGATACAATGAACCA GGATGATAAGACACATCCAATTC 4 3097606 3097772

clostridium_FN665653 CATCAACAGCTTCTTGAAGCATTC GTCCAACAACTA AACAGAACGT 103469 103631 C

clostridium_NC_01397 AACATATCACCTGATATTCT G ATC ATTCCATTATATTCAACAGGATT 4 117188 117346 GTGA

clostridium_NC_01331 GCTGTTGCTTGCGGATACTG CGTATATGTAGCTCAAGTTGC 6 3012882 3013047

clostridium_FN668375 AAGAGCTAATGCAGCTATTGCACTTAT CATACACTTCAGCTATAAGACCA 1212250 1212413 T

clostridium_NC_01331 AACAAGAGCAGAAGTTACAGACGT GTATAATGGTGGCTAGAGGTGA 5 3754484 3754640

clostridium_FN665654 ACTCGTGAAGACCATGCAGATACAA AATACTTACAATGCCTGAGGA 3239860 3240039

clostridium_FN668941 ACCATGCAGATACAATGAACC CCTGAGGATGATAAGACACATC 3228320 3228491

clostridium_NC_01397 GCATCTGCTGCTTCTATTGCTCCTACT ACATGAACTGATATTAGTTCTCC 4 1962664 1962825 AA

clostridium_NC_00336 GCACAAGCTGGAGATAACATCGG GTAGAGGACGTATTCACAATCAC 6 2769687 2769851 T

clostridium_FN665653 CTCTATCAGCTTCTACTGCTTCTTC CCATCTCATCCACAGTTAATATA 127741 127918 TC

clostridium_NC_01331 AGATGAGATTCATACTATCGTTGGAGC AGCAGAGAGAATAGTAAGAGGAG 6 2259929 2260107 T A

clostridium_NC 00908 CATCAACAGCTTCTTGAAGCATT GTCCAACAACTATAACAGAACG 9 94774 94937

clostridium_NC_01331 GTCAGCAATACGCCACCAAGCTCCTAT GTGGTGGATATCCTGTTACC 5 2044225 2044389

clostridium_NC_01331 GCGCAATAGAGTTGTATAAGAGTGCTG AGCATTAATTATAGATTATAATG 5 2299408 2299586 TATAA

clostridium_FN668941 GGCATAATAGGATGGATAGATGA ACTAATCCAACTTCTACTGCTAT 3244255 3244408

clostridium_NC_01331 GTACATTCACATATAGACCATCTTAA ACATAGGTGCAGGTAGAATAGTA 6 3610909 3611065 TA

clostridium_FN665653 CCATACCAGTATCTTGGCATATTG ATAATGAATAACAGCAGGTGTAT 1104859 1105031 TA

clostridium_NC_00336 AGATGAAGCACAAGCTGGAGATAA AGGACGTATTCACAATCACTG 6 2753681 2753838

clostridium_FN665653 ATAATCATTCACCTCCATCATTCATAA ACTGAATATGGTTCGTCTCA 710906 711080

Clostridium NC 00908 GTACATTCACATATAGACCATCTTA ACATAGGTGCAGGTAGAATAGT 9 3706562 3706720

clostridium_NC_01331 ACTCCACCAGGATGTTGTCC GTAGGACCGTCGTGTCCAAG 6 1372812 1372968

clostridium_FN665652 GCAATATCAATGGTATCGAAGGCACTA GTATTGAAGGTACTATTAGCGAT 676696 676895 T ATGC

Clostridium NC 01331 GTGCCGGTCTCGGTTACTCAATG GGATTATTATAATGCAGCTAGAA 6 2641651 2641808 G

clostridium_FN668375 GTACATTCACATATAGACCATCTT ACATAGGTGCAGGTAGAATAGTA 3595870 3596026

clostridium_FN668941 AGTTCCTTCATATGACTCAGTTGATTG GTTATATCTTCAATTATACATTC 1105700 1105868 A CTGC

clostridium_NC_01397 CAGCAGTTGTTGCTAGAGGTATG GCATCACCAGGTGCAGCAAGT 4 2505182 2505359

clostridium_NC_01331 GCAATTCTCTGTTGTTGTCCTCCACTC AGTAAGAGCCTCTTCTTGGTCAT 5 1077126 1077298 A GA

clostridium_NC_00908 CTATTCCTGATAATAAGTGTGTCCTCA CGGCATCATCTAACAATTCTTCT 9 2182303 2182482 T

clostridium_FN665652 GTAATTCCAATTACTTCTAGCTCTGGT TACCATCTTCTCCATGTGTAT 1909777 1909942 G

clostridium_NC_01331 CCATGCAGATACAATGAACCAG GATGATAAGACACATCCAATTCC 6 3300896 3301062

clostridium_NC_01331 CCTTCTGCCATTGTAGAACAAGCTCCA CCTGTAACTGTCCACTGAGC 6 871338 871499 T

clostridium_NC_01331 CAATCATGATAGAATTAGATGGAAC AGCAATAGTTCCATCAGGAGCAT 6 3608873 3609047 C

clostridium_FN665654 AGTGGTGAAGGTGTTCAACAAG ACTGAAGCTGGATATGTTGGAG 3717059 3717221

clostridium_NC_01331 CGCCTCTTCAGAAGCGGATATCA GCCAGACTTCCGCCACAACCT 5 2010489 2010657

clostridium_NC_01331 GGCATAATAGGATGGATAGATGAGC GCAGCAGTTGTACCTACAACTAA 5 3236301 3236474

clostridium_NC_01331 AGTTCCTTCATATGACTCAGTTGATTG GTTATATCTTCAATTATACATTC 5 1095924 1096090 CTGCG

enterobacter_NC_0141 GCATGGTAGTTCGCCAGCCGCTGGAAC ACAGCAACCGCAAGTTCTTGACA 21 4735453 4735632 T

enterobacter NC 0156 AATATCATGGTCGTGTCCAGGCACTGG GTTCTGGTAGCTGCTTCTACTGT 63 1014187 1014345 C A

enterobacter_FP92904 AACTTACAACTACGCGCACTTGAATCG GAGTGTTGTATGATAGTCTCGGT 0 3448334 3448513

enterobacter_NC_0094 GCAAGTTGAGGAGATGCTGGCATGATT ACATGGCTCTGGAAGATGTGCTG 36 4051820 4051985 C ATC

enterococcus FP92905 GCGATAATTGTAATGATTCGTGGTGTT CCGTTGTCAATCCAGTTAGTAGA 8 1738439 1738606 A CT

enterococcus_CP00262 ACTGTGGCAGTCTATGTTCCAATTGTA CTTATCGACATAATCCTGATAAT 1 1819224 1819388 C

enterococcus_FP92905 GCGTCGCTTCTTGCGCTCGCC AATGTATTCATACCGTCAAGT 8 904007 904173

enterococcus_FP92905 GCCTTCACAACTACGTTGGAAGGTCTT CTAACAGTCCTGCCGACTAC 8 551757 551920 C

enterococcus_NC_0046 GCCTTCACAACTACGTTGGAAGGTCTT CTAACAGTCCTGCCGACTACT 68 1122345 1122507

klebsiella_NC_009648 GCCGCTGAGCGGCGGCAAGCCGATGGC GAATGGCAGGCCAAGCTGAAGGC 2885456 2885620 G

klebsiella_NC_009648 GCCAAGCGGCATTCTGGCGCCAGTGGA CCAGACCGGAGTGGACAACGTCG 3899012 3899182 AGGCG

klebsiella_NC_009648 GCCGTATATCATCGGCAATAACCGCAC GCATGATGGTCAACAAGGTGC 4980596 4980757 G

klebsiella_NC_009648 ACGAGCCGAGATAGGTCTGCAGCGTAC GTACTGATATTCACCATACTGCC 3266359 3266519 G

klebsiella_NC_012731 GCAATATCTTCACCGGCAGCCACCGCG GGTATATGGCACGCCAATCGC 2557467 2557634

klebsiella_NC_012731 AATAACCTTAACGTCGCCAACACG CTCGGTGAACACCTCCTGGCACG 4857136 4857315

proteus_NC_010554_54 GCGGAACTGCTTGGCGTAGTAAGC CATGTAGTGCCGTAGACCTTCAC 7938 548117 CA

pseudomonas_NC_00846 GCGAGACCGGCGGCACCATCGTCTCCA TTCTGCCTGATGGACGTCTCCGG 3 658500 658676 G CTCG

pseudomonas_NC_00846 GCGGTTCACCTGTTCGCCTTCGAACAC GCGCAGCATCTGACGCAGGATGG 3 753931 754099 G TCTCG

pseudomonas_NC_00965 ACTCCATCGCCATCAAGGACATGGCCG ATCGACGTGTTCCGCATCTTCGA 6 6431649 6431828 G CGCG

pseudomonas_NC_00846 GCCTGATGCACTACAGCGCCTGG TACCACATGGTCGATCTCGACGA 3 560357 560534 CTGC

pseudomonas NC 01032 GCGCATCCAGGACGGCGAGTACG CTTCGAGTGCCTGCACGAGCTGA 2 5224859 5225023 A

pseudomonas_NC_00846 GCTGGAGAACGTCAAGGTGGTGATCAT ACCGATAACGACGACCGCATCAA 3 4839746 4839924 C

Staph_FN433596_28440 ACGATTGGAGAAGGCAGTGTGATTGG GGACAGATTACAATTGGCG 85 2844263

Staph_NC_009632_1198 GCCGCAATACCGATATTCCA CCATTGTCCACCAGCTGAACCG 350 1198529

Staph_FN433596_25212 GTGAAGGTCGTGCTCCTATCGGT AGATCTGGTGAAGTTCGTATGAT 44 2521419

Staph_NC_009487_4308 GCTGGTACTTGTACTTATATCGA ATCAGAAGATGATATCGTTACGT 42 431017 CAT

staph_NC_009782_2086 GCGCATATTGCATTAATGGCTATAGAT GCCAGCAGGTTATACACTCG 681 2086849

Staph_NC_009782_5825 GCAATTCTTACCACAGCACGAAGAACA ATCTAGATGAAGATAATGAAGTC 6 58423 G G

staph_NC_013450_9910 GCATCTTCATACAATACTTCTAGCTTA CACAATACCAGTTGTATTACG 49 991222 C

staph_NC_013450_1360 GCTTCAGCGCCATTACCGCCACCAGCT ACTCTTGATATATTCTTGTAAGC 842 1361008 G

staph_AM990992_25260 GTTCACACAACGCGCCGACTAGAATCC CACGATATCCAAGATAATGATTG 26 2526192 GCTA

Staph_NC_010079_3612 GCGCACCTACAATCGCCATTACTACAC ACTCATTATCGACTGTTACATCG 84 361447 ACTGA

Staph_NC_007795_2085 AGCGCACATGTGACAGCGTGTAGGTTA GTGCCTTAGATTGTTCAGAACAA 723 2085901 T

staph_NC_009641_2312 CGAATGGATATGTACCATGGTCGATAT CTCTCTAATATGATGTCCAT 5 23297 C

Staph_FN433596_21445 ACTACAACAGCAACCGCATTACAATGG GGTGCTAAGAGGTCATCGGA 70 2144734 C

staph_NC_009782_5485 AGCTTCAGATAAGTACCTATCTGA GGAAGAATAGTTATTCTTGATAA 7 55020 TGTAT

staph_AM990992_16566 CGTATTGCTCGAATACATGATA ACAATGTATCAAGGCCAGCT 16 1656789

staph_NC_007793_4422 GCGACCAGTTGTTATCGACCGTGT CAGAACGATACGGTGCTGTATA 7 44395

staph_NC_009641_1102 CAATTACATTGTCTGTTGCGTAGATAC GTTGTGGCTAATGTGCCAGTT 949 1103116 C

Staph_NC_009641_1137 GCACCACTCTATAGCAGTAGCGTATTG ACAGCCAATGTCACCTAAGTCAA 731 1137898 CA

Staph_FN433596_27157 ACAGTCCGAATAAGATACGACTATTCG CGTTGTAACGTATATGAATAGTT 13 2715871 A GA

staph_NC_009782_6066 AGATGCAATAACAGGTCGAATATTAAT GCCA AGTGAGAGTAGTGAA 52 606825 T

Staph_FN433596_65762 AGATGCAATAACAGGTCGAATATTAA ACACATACGGCCATAGTGAGAG 5 657803

pseudomonas_NC_00846 GAATCGAACGGTCTCATTAACAGAT GCTTTCCAGGGATATAAGACGC 3 4756080 4756240

pseudomonas_NC_00251 CCCGCAGAGTCACACTCGGA ACTCTTGGTACTACTCACTAGC 6 1063894 1064077

pseudomonas_NC_00846 GAGTCTCTTTCAACCTGGATTAGATAT AAGATTAATAGCGTACTTTACTC 3 3182693 3182865 C

pseudomonas NC 00965 ATCCCGCAGATACTAGGTTCTTAAT GAACTATTCATATTACACCCTAA 6 2819490 2819655 GG

pseudomonas NC 00846 CAGTGGGCTATCCTAAGCCAAAG CATAAGCGAACTAACTATCACTT 3 3184022 3184185 A

pseudomonas NC 00251 ACAAAGCGTTCTAAACGATTAGAACT CGAGAAAGGAAACAGGATAGTAC 6 1065937 1066093

pseudomonas_NC_00251 CCAATGGAGAAGTCTAAATGTCCAA TTATCAGAGATACATGACTCT A 6 1067833 1068007 GG

pseudomonas_NC_00846 CGAATCACTGGACTACATTTATATTTC AGCGAACCTTTATATTTGACCAT 3 3182351 3182508 T

pseudomonas NC 00846 CTCAAGTCTTGCCCTGATAGAATTAT TCACGACTTATCTACTTTAGAAA 3 3184314 3184473 TC

pseudomonas_AP012280 GGTGATCGTTATTATGATAGTACGGC CTCGGTTAAGGGAATTACGAC 3765216 3765383

pseudomonas_AP012280 ACTCGGATGGTAGGTTTATTAAAGC GTGATCGTTATTATGATAGTACG 3765033 3765192 G

enterococcus_NZ_GG70 ACAATCGTTGTCGCACTGCATAG GAACTTGGTCTACCGTACCAC 3715 13422 13573

enterococcus_NZ_GG70 GGATAATACAATCCTAATACGTACGGA GCTGCTGTAACTAGGGTAGC 3582 76982 77140

enterococcus_NZ_GL45 CTATATTCAACGGGTCACGGGTAG TCATTGATTCGATCTCGTAACTC 5004 28219 28381

enterococcus_NZ_GG70 AATGTTATTGTGGTTGCGTGTTCG TACTTTGGAAGTGCCCTGAC 3720 94699 94852

enterococcus_NZ_GG70 CATGTCTTCTAGTACAGGTTTGCCG TGTAAGAGGCCGCTAACTTC 3715 15795 15951

enterococcus_NZ_GL45 CTCTGGCTCGTGGGCTCGG TTCTTGAGATAGTCCGGTATAAT 5899 32848 32984 C

enterococcus_NZ_GG69 ATTCGATCACGATGGGCTGGG AATTTCCTGTGTCATACACGC 2918 325104 325257

enterococcus NC 0046 CAATTGATTTAGCCACTACACCTTAC CACTATTCTGGCGACCACC 68 920608 920750

enterococcus NZ GG70 GATAAAGAAGCGTCTTGACCCAGT ATCTGGTGCTCCTTGACGC 3575 78829 78963

enterococcus_NZ_GL45 GCAAATTTAGAGAGTGCATGCATG GGAAGAGGACGGCATACAAC 5931 26355 26493

enterococcus NZ GG66 CATTTCATCTAGACCGCTCGTGT GCTTGAAGTGTATGTTGGGAC 9058 207026 207172

proteus_NZ_GG661998_ GTCGCCCTCGTGCTAACGT GGTTCTTTGATGTACCGGTT 111187 111342

proteus_NC_010554_20 GCTGATGACGGTGAAGTTTATCA CATTATCGCACATATTGACCAC 37943 2038091

proteus_NZ_GG668576_ GAAATTAGCTAAAGGGATATCGCG AACTTTCCGCCAATCCTGC 810893 811054

proteus_NZ_GG668594_ CACCTACGTTCTCACCTGCAC ATTCGATAGTACCAGTTACGTC 760 939

proteus_NZ_GG668579_ GTTGCTTATAGCGTCGCTGCT CTGGTTATCGAGAAGATAAAGG 22072 22234

proteus_NC_010554_24 GTAAGCGTAGCGATACGTTGAG GAGTGAACGCACCACTGG 48957 2449119

proteus_NC_010554_30 TCAGGTAGAGAATACTCAGGCGC CGGAGAAGGCTAGGTTGTC 33758 3033936

proteus_NC_010554_45 GCAACCCACTCCCATGGTGT CGTTCTTCATCAGACAATCTG 4391 454540

pseudomonas_NC_00908 AACTAAACCTACACGGAATTGGTTC GCAGATACACGACGTTTATGT 5 307050 307218

pseudomonas_NC_00908 GCCGCTTCACCTACGTTAGGAA CGTAAAGATGAGTCTTTAACGTC 5 308225 308377

pseudomonas NC 01661 GACGTTTGTGCGTAATCTCAGAC GAGGAAACCGTATTCGTTCGT 2 1674334 1674490

pseudomonas NC 01660 ACAACACTTTACCACTTGAGTGGG GTAACTGCCCATGTCAAGATAC 3 3425179 3425337

pseudomonas NC 01660 CCACGTTTAGTTGAACCACCGC TCAATACGCCAGTTGTTAGTTC 3 3427629 3427808

pseudomonas NC 01041 AATCGATAATAAGTACGGTGCATCC GAAGAATACATTCGCGTACATC 0 3543925 3544088

pseudomonas_NC_00596 AAGCAAGATCGAGTCTTCATAGTTG GATATACACGATACCTGATTCGT 6 304936 305079

pseudomonas_NC_00859 CCGATATTCATACGAGAAGGTACAC CAGTAACTCTATTGTCAAACGGT 3 226005 226171

pseudomonas NC 01651 GTAGTGAGTCGGGTGTACGTCTC TCTTCGATAGCAGACAGATAGT 4 213592 213738

pseudomonas_NC_00596 ACCTACACGGAATTGGTTCTCAGT GATACACGACGTTTGTGTGTA 6 303883 304054

enterobacter NC 0146 CAACATCATTAGCTTGGTCGTGGG TTGCGTGTTACCAACTCGTC 18 3997909 3998085

enterobacter NZ GL89 CGGCACGTCCGAATCGTATCA TCGTGTCCCGTATATGTTGG 2086 615149 615324

enterobacter NZ GL89 AATAGAGGCCCACAAGTCTTGTTC CGCTCTCCACTATGGGTAGT 2086 1664663 1664834

enterobacter_NZ_GG70 GCTACAT AATCAC ATGGACAGACA GATGGTCGATCTATCGTCTCT 4865 427821 427978

enterobacter NZ GL89 GAAGTGTTATTCAAACTTTGGTCCC CTTGAACCCTTGGTTCAAGGT 2087 1610708 1610874

Table 7: Marker regions are highly polymorphic regions (e.g., VNTRs) that provide fine resolution.

Table 8: Marker probes

Probe Coordinates Binding region 1 Binding region 2

plasmids_NC_011980 GCAGTCGGTAACCTCGCGC GCGCTATCTCTGCTCTCACTGC 58308 58487

plasmids_NC_015599 GCTGTAATGCAAGTAGCGTATGCGCTC GAACAGCAAGGCCGCCAATGCCTGACG 37281 37455

plasmids_NC_007351 CGCATATGCTGAATGATTATCTCGTTG ATCTTGCTCAATGAGGTTATTCA 37979 38146 C

plasmids_FN822749 GACGACAGATGCAGGTTGA CGCATCGCCGATGCTCATC 1846 2009

plasmids_NC_004851 CGCCTGCTCCAGTGCATCCAGCACGAA ATGCTCTCCGCCATCGCGTTGTCA 143949 144109 T

plasmids_NC_010558 AGTGCGTTCACCGAATACGTGCGCA CAGGTTATGCCGCTCAATTC 156799 155957

plasmids_NC_012547 CGCATATGCTGAATGATTATCTCGTTG ACGGTGATCTTGCTCAATGAGGTTATT 53585 53752 C

plasmids_NC_013950 GCTGTGGCACAGGCTGAACGCCG GGTGATGTCATTCTGGTTAAGA 91008 91174

plasmids_NC_002698 ACATAATCTGAATCTGAGACAACATC ACGCACTCTGGCCACACTGG 168967 169123

CMY_AB061794_343_4 CATCACGAAGCCCGCCACA GCCCTTGAGCGGAAGTATC

89

IMG_AY033653_1343_ CGGAAGTATCCGCGCGCC TTCGATCACGGCACGATC

1500

TEM_U36911_4374_45 CATTCTCTCGCTTTAATTTATTAACCT ATCGACCTTCTGGACATTATC 51

TEM_U36911_7596_77 CGTTGCTTACGCAACCAAATATC TGATCTTGCTCAATGAGGTTA 62

Table 9: Resistance regions can be used to detect one or more genes associated with resistance to antimicrobial compounds, such as antibiotic resistance genes.

Probe Coordinates Gene

plasmids_NC_013950_90185_90338 pKF94_115, beta-lactamase

plasmids_NC_013452_4052_4209 SAAV_b4, tetracycline resistance protein

plasmids_NC_014208_52313_52469 pKOX105p23, VIM-1, pKOX105p24, IntIA, pKOX105p67, truncated AadA

betalactamase_AB372224_738_905 blaCMY-39, class C beta-lactamase CMY-39

betalactamase_EF685371_398_548 beta-lactamase CMY-29

betalactamase_DQ149247_231_371 bla-OXA-86, OXA-86

betalactamase_AY750911_244_414 bla-oxa-69, beta-lactamase OXA-69

betalactamase_DQ519087_417_575 blaOXA-93, beta-lactamase OXA-93

betalactamase_AM231719_379_537 blaOXA-90, class D beta lactamase

betalactamase_Y14156_663_819 CTX-M-4, beta lactamase

betalactamase_JN227085_763_931 blaCTX-M-117, CTX-M-117 beta-lactamase

betalactamase_EU259884_1030_1170 aacA4, AacA4 aminoglycoside (6') acetyltransferase

betalactamase_HQ913565_578_730 blaCTX-M-106, beta-lactamase CTX-M-106

betalactamase_AY524988_385_552 blaVIM-9, VIM-9

CARB_AF030945_646_795 CARB-6, class A beta-lactamase

CARB_U14749_1227_1390 blaCARB-4, CARB-4 precursor

CARB_AF313471_2731_2906 aadA1a, AAD(3'') aminoglycoside (3'') adenylyltransferase

CMY_DQ463751_613_790 blaCMY-23, hypothetical CMY-23 protein precursor

CMY_EF685371_397_552 beta-lactamase CMY-29

CMY_EU515251_583_733 blaCMY-40, AmpC beta-lactamase

CMY_JN714478_1882_2055 blaCMY-66, AmpC beta-lactamase CMY-66

CMY_X91840_1872_2046 blaCMY-2, extended spectrum beta-lactamase

CTXM_EF219134_13713_13858 AadA2 aminoglycoside adenylyltransferase; confers resistance to streptomycin and spectinomycin

CTXM_HQ398215_802_947 blaCTX-M-98, beta-lactamase CTX-M-102

CTXM_AM982522_639_788 blaCTX-M-78, CTX-M-78 beta-lactamase

GES_HM173356_1163_1321 blaGES-16, carbapenem-hydrolyzing extended-spectrum beta lactamase GES-16

GES_AF156486_1754_1905 ges-1, beta-lactamase GES-1

GES_HQ874631_571_748 extended-spectrum beta-lactamase GES-17

GES_FJ820124_1174_1338 beta-lactamase GES10

IMG_DQ361087_489_645 blaIMP-22, metallo-beta-lactamase IMP-22

IMG_JN848782_301_475 blaIMP-33, metallo-beta-lactamase IMP-33

IMG_EF192154_182_328 blaIMP-24, metallo-beta-lactamase IMP-24

IMG_AF318077_871_1047 aacC4, aminoglycoside 6'-N-acetyltransferase

IMG_AF318077_515_657 aacC4, aminoglycoside 6'-N-acetyltransferase

KPC_HM066995_226_375 blaKPC, beta-lactamase KPC-11

KPC_GQ140348_624_799 KPC-10, beta-lactamase KPC-10

KPC_EU729727_683_840 carbapenem-hydrolyzing beta-lactamase KPC-7

KPC_FJ234412_691_839 blaKPC-8, beta-lactamase KPC-8

NDM_JN104597_64_211 blaNDM-5, NDM-5 metallo-beta-lactamase

NDM_FN396876_2744_2885 blaNDM-1, metallo-beta-lactamase

NDM_FN396876_2958_3117 blaNDM-1, metallo-beta-lactamase

NDM_JN104597_314_465 blaNDM-5, NDM-5 metallo-beta-lactamase

NDM_FN396876_2382_2548 blaNDM-1, metallo-beta-lactamase

OXA_EF650035_239_388 bla-OXA-109, beta-lactamase OXA-109

OXA_EU019535_389_537 bla-OXA-80, beta-lactamase OXA-80

OXA_EF650035_423_594 bla-OXA-109, beta-lactamase OXA-109

OXA_DQ309276_232_380 bla-OXA-84, beta-lactamase OXA-84

OXA_DQ445683_232_380 bla-OXA-89, oxacillinase OXA-89

OXA_X75562_201_366 OXA-7, beta lactamase OXA-7

OXA_M55547_995_1154 tnpR, aac, Aac

OXA_AY445080_313_469 blaOXA-56, restricted-spectrum beta-lactamase OXA-56

PER_Z21957_217_371 PER-1, extended-spectrum beta-lactamase PER-1

PER_HQ713678_6002_6167 blaPER-7, blaPER-7

PER_GQ396303_667_844 blaPER-6, extended-spectrum beta-lactamase PER-6

PER_X93314_954_1122 bla(per-2), extended-spectrum beta-lactamase

PER_HQ713678_4517_4674 transposase

PER_HQ713678_5074_5219 transposase

PER_GQ396303_254_399 blaPER-6, extended-spectrum beta-lactamase PER-6

SHV_AY661885_656_806 blaSHV-30, beta-lactamase SHV-30

SHV_AF535128_587_761 blaSHV-40, beta-lactamase SHV-40

SHV_U92041_406_579 SHV-8, beta-lactamase

SHV_AY288915_617_764 blaSHV-50, beta-lactamase SHV-50

SHV_HQ637576_88_245 blaSHV-135, beta-lactamase SHV-135

SHV_AF535128_188_362 blaSHV-40, beta-lactamase SHV-40

SHV_X98102_763_913 blaSHV-2a, beta-lactamase SHV-2a

TEM_GQ149347_3605_3747 near kanamycin resistance protein

TEM_GU371926_11801_11944 traN, TraN

TEM_J01749_766_908 tet, tetracycline resistance protein

VEB_EU259884_6947_7094 blaVEB-6, VEB-6 extended-spectrum beta-lactamase

VEB_EF136375_596_738 blaVEB-4, extended-spectrum beta-lactamase VEB-4

VEB_EF420108_234_380 blaVEB-5, extended spectrum beta-lactamase VEB-5

VEB_AF010416_89_230 veb-1, extended spectrum beta-lactamase

VIM_AY524988_385_552 blaVIM-9, VIM-9

VIM_Y18050_3464_3614 blaVIM, beta-lactamase VIM-1

VIM_AY635904_58_203 blaVIM-11, metallo-beta-lactamase

VIM_HM750249_275_454 bla, metallo-beta-lactamase VIM-25

VIM_AJ536835_313_481 blaVIM-7, metallo-b-lactamase

VIM_EU118148_131_300 near intI1, DNA integrase INTI1

VIM_DQ143913_921_1063 near intI1, IntI1

VIM_EU118148_1060_1229 blaVIM-17, metallo-beta-lactamase VIM-17

van_NC_008821.1_11898_12045 vanB, VEF236, D-alanine--D-lactate ligase

mecA_AY820253.1_1431_1608 mecA, PBP2a-like protein

mecA_AY952298.1_130_302 Pbp2'

erm_NC_002745.2_871803_871973

erm_NC_002745.2_871666_871841 ermA, SA1951, rRNA methylase Erm(A)
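The probe identifiers in the tables above appear to combine a group label, a sequence accession, and start and end coordinates, joined by underscores (for example, erm_NC_002745.2_871666_871841). The following is a minimal sketch of how such an identifier could be split into its parts, assuming that convention holds for every row; the names `ProbeRegion` and `parse_probe_id` are hypothetical and are not part of the disclosure.

```python
from typing import NamedTuple


class ProbeRegion(NamedTuple):
    """One probe target region, as inferred from its identifier."""
    group: str      # leading label, e.g. "plasmids" or "erm"
    accession: str  # sequence accession; may itself contain "_" or "."
    start: int      # first coordinate in the identifier
    end: int        # second coordinate in the identifier

    @property
    def length(self) -> int:
        """Distance between the two coordinates."""
        return self.end - self.start


def parse_probe_id(probe_id: str) -> ProbeRegion:
    """Split an identifier of the assumed form group_accession_start_end."""
    # The coordinates are the last two underscore-separated fields.
    prefix, start, end = probe_id.rsplit("_", 2)
    # The group label is everything before the first underscore.
    group, accession = prefix.split("_", 1)
    return ProbeRegion(group, accession, int(start), int(end))
```

Splitting from the right keeps accessions such as NC_002745.2 intact even though they contain an underscore of their own.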

Table 10: Resistance probes

Probe Coordinates Binding region 1 Binding region 2

plasmids_NC_013950 GAGGACCGAAGGAGCTAACCG CGCCGCATACACTATTCTC 90185 90338

plasmids_NC_013452 CTCATTCCAGAAGCAACTTCTTCTT GGATAGCCATGGCTACAAGAATA 4052 4209

plasmids_NC_014208 GGTTCTGGACCAGTTGCGTGAGCGC CGTAACATCGTTGCTGCTCCAT 52313 52469

betalactamase_AB37 CGCTGGATTTCACGCCATAGGC TGTCGCTACCGTTGATGATT 2224 738 905

betalactamase_EF68 CGTATAGGTGGCTAAGTGCAGC GTAACTCATTCCTGAGGGTTTC 5371 398 548

betalactamase DQ14 GTACATACTCGATCGAAGCACGA CCGGAATAGCGGAAGCTTTC 9247 231 371

betalactamase AY75 AAGGTCGAAGCAGGTACATACTCG AGACATGAGCTCAAGTCCAAT 0911 244 414

betalactamase DQ51 GAAGCTTTCATAGCGTCGCCTAG TTAGCTAGCTTGTAAGCAAATTG 9087 417 575

betalactamase_AM23 GAAGCTTTCATGGCATCGCCTAG AGCTAGCTTGTAAGCAAACTG 1719 379 537

betalactamase_Y141 CGCTACCGGTAGTATTGCCCTT AGAATATCCCGACGGCTTTC 56 663 819

betalactamase_JN22 ATCGCCACGTTATCGCTGTACT TTTACCCAGCGTCAGATTCC 7085 763 931

betalactamase_EU25 CAAGTACTGTTCCTGTACGTCAGC TCGCCAGTAACTGGTCTATTC 9884 1030 1170

betalactamase HQ91 CAACGTCTGCGCCATCGCC CGCAATATCATTGGTGGTGC 3565 578 730

betalactamase AY52 GCCGCCCGAAGGACATCAAC CAGACGGGACGTACACAAC 4988 385 552

CARB_AF030945_646_ CGTGCTGGCTATTGCCTTAGG GTAATACTCCTAGCACCAAATC 795

CARB_U14749_1227_1 CATTAGGAGTTGTCGTATCCCTCA AATACTCCGAGCACCAAATC 390

CARB_AF313471_2731 AAATTGCAGTTCGCGCTTAGC GTTCCATAGCGTTAAGGTTTC 2906

CMY_DQ463751_613_7 GCGCCAAACAGACCAATGCT GATTTCACGCCATAGGCTC 90

CMY_EF685371_397_5 GTATAGGTGGCTAAGTGCAGCA TCGTAACTCATTCCTGAGGG 52

CMY_EU515251_583_7 GTCATCGCCTCTTCGTAGCTC GCCATATCGATAACGCTGG 33

CMY_JN714478_1882_ ACCAATACGCCAGTAGCGAGA GCAACGTAGCTGCCAAATC 2055

CMY_X91840_1872_20 CAATCAGTGTGTTTGATTTGCACC TACCCGGAATAGCCTGCTC 46

CTXM_EF219134_1371 CGGATAACGCCACGGGATGA ACCGGGTCAAAGAATTCCTC 3 13858

CTXM_HQ398215_802_ GCGGCGTGGTGGTGTCTC CGCTGCCGGTCTTATCAC 947

CTXM_AM982522_639_ GCCACGTCACCAGCTGCG CGGCTGGGTGAAGTAAGTC 788

GES_HM173356_1163_ GCTCGTAGCGTCGCGTCTC TTGACCGACAGAGGCAAC 1321

GES_AF156486_1754_ CAGCAGGTCCGCCAATTTCTC AGTGGACGTCAGTGCGC 1905

GES_HQ874631_571_7 CCATAGAGGACTTTAGCCACAGT TACACCGCTACAGCGTAAT 48

GES_FJ820124_1174_ CATATGCAGAGTGAGCGGTCC TCAATTCTTTCAAAGACCAGC 1338

IMG_DQ361087_489_6 CCATTAACTTCTTCAAACGATGTATG ACCCGTGCTGTCGCTAT 45

IMG_JN848782_301_4 GTGCTGTCGCTATGGAAATGTG AACCAAACCACTAGGTTATCTT 75

IMG_EF192154_182_3 GTCAGTGTTTACAAGAACCACCA ATGCATACGTGGGAATAGATT 28

IMG_AF318077_871_1 CGAACCAGCTTGGTTCCCAAG TCACTGCGTGTTCGCTC 047

IMG_AF318077_515_6 GATGCTGTACTTTGTGATGCCTA CGCTTGGCAAGTACTGTTC 57

KPC_HM066995_226_3 GCAAGAAAGCCCTTGAATGAGC GCGTTATCACTGTATTGCAC 75

KPC_GQ140348_624_7 AATCAACAAACTGCTGCCGCT GCTGTACTTGTCATCCTTGT 99

KPC_EU729727_683_8 CCAGTCTGCCGGCACCGC TCGAGCGCGAGTCTAGC 40

KPC_FJ234412_691_8 CCGACTGCCCAGTCTGCCG CGAGCGCGAGTCTAGCC 39

NDM_JN104597_64_21 GTAAATAGATGATCTTAATTTGGTTCA TTGCTGGCCAATCGTCG 1 C

NDM_FN396876_2744_ CACAGCCTGACTTTCGCCGC CAAGCAGGAGATCAACCTGC 2885

NDM_FN396876_2958_ GGTGGTCGATACCGCCTGG GTGAAATCCGCCCGACG 3117

NDM_JN104597_314_4 CATGTCGAGATAGGAAGTGTGC TGATGCGCGTGAGTCAC 65

NDM_FN396876_2382_ CAATCTGCCATCGCGCGATT CGGCAATCTCGGTGATGC 2548

OXA_EF650035_239_3 CGAAGCAGGTACATACTCGGTC ACGAGCTAAATCTTGATAAACTT 88

OXA_EU019535_389_5 TAGAATAGCGGAAGCTTTCATGG AGCTAGCTTGTAAGCAAACTG 37

OXA_EF650035_423_5 CAAGTCCAATACGACGAGCTAAA GAATAGCATGGATTGCACTTC 94

OXA_DQ309276_232_3 GGTACATACTCGGTCGAAGCAC AATCTTGATAAACTGAAATAGCG 80

OXA_DQ445683_232_3 GGTACATACTCGGTCGATGCAC TCTTGATAAACCGGAATAGCG 80

OXA_X75562_201_366 GTAATTGAACTAGCTAATGCCGTAC TTATGACACCAGTTTCTAGGC

OXA_M55547_995_115 CAAGTACTGTTCCTGTACGTCAG GCCCAGTTGTGATGCATTC 4

OXA_AY445080_313_4 TCTCTTTCCCATTGTTTCATGGC TGCGGAAATTCTAAGCTGAC 69

PER_Z21957_217_371 GTAGGTTATGCAGTTATTAGGTTCAG GACTCAGCCGAGTCAAGC

PER_HQ713678_6002_ GCAGTACCAACATAGCTAAATGC AAA ACAAATCACAGGCCAC 6167

PER_GQ396303_667_8 GGTCCTGTGGTGGTTTCCACC CGCGATAATGGCTTCATTGG 44

PER_X93314_954_112 TAACCGCTGTGGTCCTGTGG TGCGCAATAATAGCTTCATTG 2

PER_HQ713678_4517_ GGAAGCGTTGCTTGCCATAGT AACCGAAGCACCATGTAATT 4674

PER_HQ713678_5074_ GTTCGGTGCAAAGACGCCG TCGCAGACTTCAATATCAATATT 5219

PER_GQ396303_254_3 CACCTGATGCAGAACCAGCAT AGGCCACGTTATCACTGTG 99

SHV_AY661885_656_8 CAGCTGCCGTTGCGAACG CGCAGATAAATCACCACAATC 06

SHV_AF535128_587_7 GCTCAGACGCTGGCTGGTC CCGCAGATAAATCACCACG 61

SHV_U92041_406_579 GCCAGTAGCAGATTGGCGGC GAACGGGCGCTCAGACG

SHV_AY288915_617_7 CCACTGCAGCAGATGCCGT GTATCCCGCAGATAAATCACC 64

SHV_HQ637576_88_24 TTAATTTGCTTAAGCGGCTGCG CCAGCTGTTCGTCACCG 5

SHV_AF535128_188_3 GGGAAAGCGTTCATCGGCG TCGCTCATGGTAATGGCG 62

SHV_X98102_763_913 TCTTATCGGCGATAAACCAGCC CGTTGCCAGTGCTCGAT

TEM_GQ149347_3605_ GTCGGAAAGTTGACCAGACATTA ATACTAGGAGAAGTTAATAAATACG 3747

TEM_GU371926_11801 GTGAAGTGAATGGTCAGTATGTTG AGTGCGCAGGAGATTAGC 11944

TEM_J01749_766_908 CCTGTCCTACGAGTTGCATGAT ATAATGGCCTGCTTCTCGC

VEB_EU259884_6947_ CAAATACTAAATTATACAGTATCAGAG ATGCAAAGCGTTATGAAATTTC 7094 AG

VEB_EF136375_596_7 GTTCTTATTATTATAAGTATCTATTAA CATTAGTGGCTGCTGCAAT 38 CAGTT

VEB_EF420108_234_3 CATCGGGAAATGGAAGTCGTTAT GTTCAATCGTCAAAGTTGTTC 80

VEB_AF010416_89_23 CGTGGTTTGTGCTGAGCAAAG CAAAGTTAAGTTGTCAGTTTGAG 0

VIM_AY524988_385_5 GCCGCCCGAAGGACATCAA AGACGGGACGTACACAAC 52

VIM_Y18050_3464_36 GCAACTCATCACCATCACGGA TGATGCGTACGTTGCCAC 14

VIM_AY635904_58_20 GCGACAGCCATGACAGACGC GGACAATGAGACCATTGGAC 3

VIM_HM750249_275_4 AAACGACTGCGTTGCGATATG TTCCGAAGGACATCAACGC 54

VIM_AJ536835_313_4 ATGCGACCAAACGCCATCGC ATCGTCATGGAAGTGCGTA 81

VIM_EU118148_131_3 GAACAGGCTTATGTCAACTGGG CATAACATCAAACATCGACCC 00

VIM_DQ143913_921_1 ACGAACCGAACAGGCTTATGTC TAACGCGCTTGCTGCTT 063

VIM_EU118148_1060_ CATCATAGACGCGGTCAAATAGA ACTCATCACCATCACGGAC 1229

van_NC_008821.1_11 CAGGCTGTTTCGGGCTGTGA GGGTTATTAATAAAGATGATAGGC 898 12045

mecA_AY820253.1_14 TAATTCAAGTGCAACTCTCGCAA TTTATTCTCTAATGCGCTATATATT 31 1608

mecA_AY952298.1_13 GGATAGTTACGACTTTCTGCTTCA TGTATTGCTATTATCGTCAACG 0 302

erm_NC_002745.2_87 GTCAGGCTAAATATAGCTATCTTATCG TCAGTTACTGCTATAGAAATTGAT 1803 871973

erm_NC_002745.2_87 CATCCTAAGCCAAGTGTAGACTC AAGATATATGGTAATATTCCTTATA 1666 871841 AC

Table 11: Additional regions may be used for additional discrimination and characterization of organisms.

Probe Coordinates Gene

peGFP_N1_730_925

CMY_X92508_126_301

TEM_X64523_2037_2191 near tnpR, resolvase

TEM_J01749_2068_2239 near ROP protein

TEM_AF091113_1529_1699

TEM_J01749_1634_1783

TEM_U36911_6901_7069

TEM_GU371926_33909_34082 klcA, KlcA

VIM_EU118148_2821_2961 qacEdelta1, quaternary ammonium compound-resistance protein QacEdelta1, sul1, dihydropteroate synthase SUL1

van_DQ018710.1_6481_6652

van_DQ018710.1_6764_6926

van_AY926880.1_3640_3785

van_FJ545640.1_517_690

van_AE017171.1_34715_34859

van_FJ349556.1_5601_5765

mecA_AM048806.2_1574_1720

mecA_EF692630.1_239_405

mex_AF092566.1_371_520

mex_AF092566.1_50_193

mex_CP000438.1_487178_487357

mex_NZ_AAQW01000001.1_461304_461466

erm_EU047809.1_79_229

gyrB_NC_015663_1455472_1455621 EAE_24795, hemagglutinin domain-containing protein, gyrB, EAE_07020, DNA gyrase subunit B

gyrB_NC_010410_4215_4366 gyrB, ABAYE0004, DNA gyrase, subunit B

gyrB_NC_005773_4904_5052 gyrB, PSPPH_0004, DNA gyrase subunit B

gyrB_NC_016514_5343_5487 gyrB, EcWSU1_00004, DNA gyrase subunit B

gyrB_NC_016603_2631439_2631616 gyrB, BDGL_00243, DNA gyrase, subunit B

gyrB_NC_009436_4366_4524 gyrB, Ent638_0004, DNA gyrase subunit B

gyrB_NC_009512_4203_4373 gyrB, Pput_0004, DNA gyrase subunit B

Table 12: Additional arms

Probe Coordinates Binding region 1 Binding region 2

peGFP_N1_730_925 GTGGTATGGCTGATTATGATCTAGAGT GAGTTTGGACAAACCACAACTAGAA

CMY_X92508_126_301 AGTATCTTACCTGAAATTCCCTCAC CCTCTCGTCATAAGTCGAATG

TEM_X64523_2037_21 CAGTCCCTCGATATTCAGATCAGA TTAACAATTTCGCAACCGTC 91

TEM_J01749_2068_22 CAGCTGCGGTAAAGCTCATCA CATAGTTAAGCCAG ATACACTC 39

TEM_AF091113_1529_ GTAACAACTTTCATGCTCTCCTAAA CGGTAACTGATGCCGTATTT 1699

TEM_J01749_1634_17 CGTTTCCAGACTTTACGAAACAC ACGTTGTGAGGGTAAACAAC 83

TEM_U36911_6901_70 CATCATGTTCATATTTATCAGAGCTC TAGATTTCATAAAGTCTAACACAC 69

TEM_GU371926_33909 GTTTCCACATGGTGAACGGTG AAACCTGTCACTCTGAATGTT 34082

VIM_EU118148_2821_ GCTGTAATTATGACGACGCCG CTCGGTGAGATTCAGAATGC 2961

van_DQ018710.1_648 GTGTATGTCAGCGATTTGTCCAT TGTCATATTGTCTTGCCGATT 1 6652

van_DQ018710.1_676 GTCCACCTCGCCAACAATCAA ATATCAACACGGGAAAGACCT 4 6926

van_AY926880.1_364 GCGTGATTATCACGTTCGGCA CTTGCAGATTTAACCGACAC 0 3785

van_FJ545640.1_517 GGCTCGACTTCCTGATGAATACG TGAAACCGGGCAGAGTATT 690

van_AE017171.1_347 CAACGATGTATGTCAACGATTTGT ATTGCGTAGTCCAATTCGTC 15 34859

van_FJ349556.1_560 GGCTCGGCTTCCTGATGAATAC AGGCATGGTATTGACTTCATT 1 5765

mecA_AM048806.2_15 CAGTATTTCACCTTGTCCGTAACC GTTTACGACTTGTTGCATGC 74 1720

mecA_EF692630.1_23 AATGTTTATATCTTTAACGCCTAAACT ATGCTTTGGTCTTTCTGCAT 9 405

mex_AF092566.1_371 CTGGCCCTTGAGGTCGCGG CGGTCTTCACCTCGACAC 520

mex_AF092566.1_50_ GACGTAGATCGGGTCGAGCT ACGGAAACCTCGGAGAATT 193

mex_CP000438.1_487 GGCGTACTGCTGCTTGCTCA TGACGTCGACGTAGATCG 178 487357

mex_NZ_AAQW0100000 CCTGTTCCTGGGTCGAAGCC CTTCGGTCACCGCGGA 1.1 461304 461466

erm_EU047809.1_79_ GTTTATAAGTGGGTAAACCGTGAAT GAAACGAGCTTTAGGTTTGC 229

gyrB_NC_015663_145 GCCCTTTCAGGACTTTGATACTGG TGTACGGAGACGGAGTTATCG 5472 1455621

gyrB_NC_010410_421 ACACTGACCGATTCATCCTCGTG CTTGAAAGTGCGTTAACAACC 5 4366

gyrB_NC_005773_490 CGGAAGCCCACCAAGTGAGTAC CGAAACCAGTTTGTCCTTAGTC 4 5052

gyrB_NC_016514_534 ACCAGCTTGTCTTTAGTCTGAGAG CTTTACGACGGGTCATTTCAC 3 5487

gyrB_NC_016603_263 CATTGGTTTGTTCTGTTTGAGAGGC GATTCATCTTCGTGAATTGTGAC 1439 2631616

gyrB_NC_009436_436 GGACTTTGATACTGGAGGAGTCATA TGTACGGAAACGGAGTTATCG 6 4524

gyrB_NC_009512_420 ATGCTGGAGGAGTCGTACGTTT GTCGCGCACACTAATAGATTC 3 4373

Table 13: Plasmid regions can be used for identification purposes and can provide evidence of horizontal gene transfer.

Table 14: Plasmid arms

Probe Coordinates Binding region 1 Binding region 2

plasmids_NC_01066 GCTGTCACCGTCCAGACGCTGTTGGC TCCGTGCCTTCAAGCGCG 0 187035 187205

plasmids_NC_01423 GACTCCGCAGAATACGGCACCGTGCGC GCGTACAGGCCAGTCAGC 2 5501 5677 A

plasmids_NC_01183 GCTGTCCTGGCTGCAAGCCTGG CCGAACTGCTGATGGACGT 8 178818 178996

plasmids_FN554767 GACAGCAGACTCACCGGCTGGTTCCGC GCAAGATGCTGCTGGCCACACTG 13017 13190 T

plasmids_NC_01365 GACAGAACAAGTTCCGCTCCGG CACGGATACGCCGCGCAT 5 115365 115542

plasmids_NC_01395 GAACGTCTGGCGCTGGTCGCCTGCC GCACAGGTGCTGACGTGGT 1 69899 70067

plasmids_NC_00763 AATCCAGGTCCTGACCGTTCTGTCCGT ACCTCCGTTGAGCTGATGGA 5 38395 38566

plasmids_NC_00978 GAGGTGGCCAACACCATGTGTGACC GACGCCGGTATATCGGTATCGAGCT 7 17946 18116 GCT

plasmids_NC_00667 GAAGTGCCGGACTTCTGCAGA GCACGGCCTGATGGAGGCCGC 1 56259 56438

plasmids_NC_01438 GCTAATCGCATAACAGCTAC CATCACGTAACTTATTGATGATATT 5 53151 53310

plasmids_FN649418 GCTGCGGTATTCCACGGTCGGCC GCAGGAACGCTGCCTGTGGTC 57169 57339

plasmids_NC_00501 GAATCAATTATCTTCTTCATTATTGAT CTGCGGCTCAACTCAAGCA 1 8620 8785

plasmids_NC_01484 GTCACACGTCACGCAGTCC GCATTCATGGCGCTGATGGC 3 98413 98578

plasmids_NC_00849 GTGTTACTCGGTAGAATGCTCGCAAGG ACTAGATGACATATCATGTAAGTT 0 5165 5334

plasmids_NC_01596 CGGAACTGCCTGCTCGTAT AACGATATAGTCCGTTAT 3 147516 147686

plasmids_NC_00736 GCTCTCCGACTCCTGGTACGTCAG GCGCGCATTAATGAAGCAC 5 100545 100708

plasmids_NC_00983 GATGTTGCGATTACTTCGCCAACTATT GCTGTAATTATGACGACGCCG 8 104163 104332 G

plasmids_NC_01040 GCAATACCAGGAAGGAAGTCTTACTG GTCATTGGAGAACAGATGATTGATG 9 39768 39935 T

plasmids_NC_01423 GTATCGCCACAATAACTGCCGGAA AACGATATAGTCCGTTATG 3 50337 50492

plasmids_NC_01336 GTGAAGCGCATCCGGTCACC ATGGCATAGGCCAGGTCAATAT 2 56651 56805

Table 15: A list of antibiotic resistance genes for which probes can be used to identify, distinguish and/or sequence

Source Sample ID

CARB

CMY

CTX-M

GES

IMP

KPC

NDM

Other ampC

OXA

PER

SHV

VEB

VIM

ermA

vanA

vanB

mecA

mexA

[00127] In some embodiments, the oligonucleic acid probes provided by the invention are molecular inversion probes (MIPs). Advantages that the MIPs described herein offer over PCR include:

[00128] 1) Multiplexing: published studies have used 10k+ inversion probes to genotype humans, including: http://www.ncbi.nlm.nih.gov/pubmed/17934468 (Porreca et al., 55k probes), http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2715272/?tool=pubmed (30k probes), and http://www.ncbi.nlm.nih.gov/pubmed/19329998 (10k probes).

[00129] This offers a huge capability to expand panels. First uses might be to capture rarer strains or variants that work poorly with current PCR primers. Later uses might involve genotyping HIV and human loci as well as testing for diseases common in HIV patients; such a test can still be performed in a single tube with a minimal per-test increase in reagent cost.

[00130] 2) Specificity: the probes described herein are less likely to produce off-target products because the two probe arms must bind together. This provides a thermodynamic advantage for on-target binding compared to mis-priming. Furthermore, the exonuclease step will eliminate extension products that occur when only a single probe arm binds.

[00131] PCR primers can create long extension products that serve as templates for mis-priming in later rounds. This is particularly a problem when there is a lot of background (e.g., human) DNA compared to the target sequence, such as when the exonuclease step did not remove all of the template and the amplification/barcoding primers misprimed against human DNA. This ends up wasting reads and would have been worse had enrichment for the circularized probes not been performed. Preventing such reads in a PCR-only system is difficult.

[00132] 3) Design optimization: the large published datasets provide good training data for a probe-picking algorithm. These large datasets can be useful for picking probe sets that will work reliably and with uniform efficiency. Furthermore, we can generate a set of 10k+ probes on a microarray to generate datasets using preferred enzymes. The entire set of 10k+ probes is currently being tested in a single reaction, with the read counts then analyzed to see what made a good probe and what did not.
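The read-count analysis mentioned under design optimization could be sketched as follows. This is only a minimal illustration, assuming per-probe read counts are already available as a simple mapping; the probe names, the counts, and the fold-of-median cutoff are hypothetical and are not taken from this application.

```python
from statistics import median

def classify_probes(read_counts, fold=10.0):
    """Split probes into 'good' and 'poor' relative to the median count.

    read_counts: dict mapping probe name -> number of reads (hypothetical input).
    fold: a probe is called 'poor' when its count falls below median / fold
          (an arbitrary illustrative cutoff, not a value from the application).
    """
    med = median(read_counts.values())
    good, poor = [], []
    for name, count in sorted(read_counts.items()):
        if med > 0 and count >= med / fold:
            good.append(name)
        else:
            poor.append(name)
    return good, poor

def uniformity(read_counts, fold=10.0):
    """Fraction of probes whose count lies within `fold` of the median."""
    med = median(read_counts.values())
    if med == 0:
        return 0.0
    within = [c for c in read_counts.values() if med / fold <= c <= med * fold]
    return len(within) / len(read_counts)

# Hypothetical counts for five probes from a single pooled reaction.
counts = {"probe_a": 950, "probe_b": 1200, "probe_c": 40,
          "probe_d": 1100, "probe_e": 0}
good, poor = classify_probes(counts)
```

A median-based cutoff is used here only because it is robust to a few very high or very low probes; any other summary of probe performance could be substituted.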

[00133] Understanding the probe behavior is important for pathogens as it helps to understand the sensitivity and specificity, particularly when considering rare strains or the possibility of previously unknown strains. Pathogenica has thermodynamic models of probe behavior that provide quantitative predictions of how well a probe will work against a target.
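As a rough illustration of what a quantitative prediction for a probe arm can look like, the sketch below estimates a melting temperature from base composition. It is not the thermodynamic model referred to above, which is not described in this application; it uses the common approximation Tm = 64.9 + 41 * (G + C - 16.4) / N for oligonucleotides longer than about 13 nt. The function names are hypothetical, and the example sequence is one binding region listed in Table 14.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def basic_tm(seq):
    """Rough melting temperature in degrees C from base composition.

    Uses Tm = 64.9 + 41 * (G + C - 16.4) / N. This ignores salt and
    probe concentration and nearest-neighbour effects, so it is only
    a coarse first estimate.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)

# One binding region from Table 14, used purely as example input.
arm = "GCTAATCGCATAACAGCTAC"
arm_gc = gc_fraction(arm)
arm_tm = basic_tm(arm)
```

A nearest-neighbour model would normally be preferred for real probe design; the composition-based formula is shown only because it is short and self-contained.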

[00134] 4) Simplicity: the probe protocol can be one-tube all the way through, adding reagents until all of the samples are pooled. PCR protocols often require multiple tubes to purify intermediate or final product from the template (e.g., AmpliSeq requires 7, PCR + Nextera likely requires 3+). The protocol also uses standard reagents (enzymes and oligos) and equipment (a thermal cycler).

[00135] The following references are incorporated by reference in their entirety: Roberts RR, et al., "Costs attributable to healthcare-acquired infection in hospitalized adults and a comparison of economic methods," Medical Care, 48(11): 1026-1035, Nov. 2010; Scott, R.D., II., "The Direct Medical Costs of Healthcare-Associated Infections in U.S. Hospitals and the Benefits of Prevention," U.S. Centers for Disease Control and Prevention, Mar. 2009; and Edwards, J.R., et al., National Healthcare Safety Network (NHSN) report: data summary for 2006 through 2008, issued December 2009, American Journal of Infection Control. 37:783-805, Dec. 2009.

[00136] It should be understood that for all numerical bounds describing some parameter in this application, such as "about," "at least," "less than," and "more than," the description also necessarily encompasses any range bounded by the recited values. Accordingly, for example, the description at least 1, 2, 3, 4, or 5 also describes, inter alia, the ranges 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, and 4-5, et cetera.

[00137] For all patents, applications, or other references cited herein, such as non-patent literature and reference sequence information, it should be understood that each is incorporated by reference in its entirety for all purposes as well as for the proposition that is recited.

Where any conflict exists between a document incorporated by reference and the present application, this application will control. All information associated with reference gene sequences disclosed in this application, such as GeneIDs, Unigene IDs, or HomoloGene IDs, or accession numbers (typically referencing NCBI accession numbers), including, for example, genomic loci, genomic sequences, functional annotations, allelic variants, and reference mRNA (including, e.g., exon boundaries or response elements) and protein sequences (such as conserved domain structures) are hereby incorporated by reference in their entirety.

[00138] Headings used in this application are for convenience only and do not affect the interpretation of this application.

[00139] Preferred features of each of the aspects provided by the invention are applicable to all of the other aspects of the invention mutatis mutandis and, without limitation, are exemplified by the dependent claims and also encompass combinations and permutations of individual features (e.g., elements, including numerical ranges and exemplary embodiments) of particular embodiments and aspects of the invention including the working examples. For example, particular experimental parameters exemplified in the working examples can be adapted for use in the claimed invention piecemeal without departing from the invention. For example, for materials that are disclosed, while specific reference to each of the various individual and collective combinations and permutations of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. Thus, if a class of elements A, B, and C is disclosed as well as a class of elements D, E, and F, and an example of a combination of elements, A-D, is disclosed, then even if each is not individually recited, each is individually and collectively contemplated. Thus, in this example, each of the combinations A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F is specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. Likewise, any subset or combination of these is also specifically contemplated and disclosed. Thus, for example, the sub-group of A-E, B-F, and C-E is specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. This concept applies to all aspects of this application, including elements of a composition of matter and steps of methods of making or using the compositions.

[00140] The foregoing aspects of the invention, as recognized by the person having ordinary skill in the art following the teachings of the specification, can be claimed in any combination or permutation to the extent that they are novel and non-obvious over the prior art; thus, to the extent an element is described in one or more references known to the person having ordinary skill in the art, it may be excluded from the claimed invention by, inter alia, a negative proviso or disclaimer of the feature or combination of features.

[00141] While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Examples

[00142] Procedure:

1) Remove the DxSeq Kit from the -20°C freezer.

2) Remove one Reagent Set Pack from the DxSeq Kit, and place the tubes on ice.

3) Remove two blue FrameStrips and matching strip caps from the kit. [The Break-A-Way Plate with primers is not needed at this point in the protocol.]

4) Label the FrameStrips and the strip caps #1 and #2 with a permanent marker. [Both the FrameStrips and strip caps should be labeled to avoid cross-contamination during subsequent handling steps.]

5) Return the kit to the -20°C freezer for later use.

6) After the components have thawed, pulse-spin any droplets from the cap or sidewalls to the bottom of the tubes using a microcentrifuge.

7) Using barrier pipette tips, prepare 75 μL of Hybridization Master Mix for 12 samples and 2 controls, as follows:

a. 22.5 μL of 10X Buffer A

b. 15 μL of MIP Probe mixture

c. 37.5 μL of nuclease-free water

8) Using barrier pipette tips, pipette 5 μL of Hybridization Master Mix into wells A - G of two blue FrameStrip PCR 8-strips (n=14 wells). [Do not pipette Hybridization Master Mix into wells H: these are reserved for negative controls.]

9) Being very careful not to cross-contaminate the wells, add 10 μL of each DNA sample to the A - F wells of the two FrameStrips (n=12 wells). [Do not pipette your DNA samples into the G & H wells: these four wells are reserved for control reactions.]

10) Add 10 μL of nuclease-free water to the G wells (n=2 wells). These will serve as the "no target DNA" negative controls.

11) Add 13.5 μL of nuclease-free water and 1.5 μL of 10X Buffer A to the H wells (n=2 wells). These will serve as the "no probe" negative controls.

12) Seal the two FrameStrips with the flat strip caps.

13) Vortex the sealed FrameStrips briefly to mix the contents, and then pulse-spin down the contents in a microcentrifuge with a rotor that accommodates 8-well strip PCR tubes.

14) Enter the following program into a thermocycler, using the heated lid option.

15) Place the sealed FrameStrips in the thermocycler, and begin the hybridization portion of the MIP Program.

16) While the hybridization is underway, prepare the Polymerase/Ligase Master Mix on ice:

a. 5 μL Polymerase

b. 5 μL 10X Buffer A

c. 1 μL Ligase

d. 1.25 μL dNTPs

e. 37.75 μL nuclease-free water

17) When the hybridization reaction reaches the 60°C hold step (approximately 26 minutes into the program), add 2 μL of the Polymerase/Ligase Master Mix to every well (n=16 wells).

18) Reseal the FrameStrips with the same strip caps as before and mix. [Special care needs to be taken not to cross-contaminate the samples.]

19) Advance the thermocycler to the next step in the MIP Program (60°C for 10 min).

20) When the thermocycler reaches the 15°C hold step, advance the thermocycler to the next step (94°C for 2 min) in the MIP Program.

21) When the thermocycler reaches the 37°C hold step, immediately add 1 μL of Exonuclease to each sample.

22) Reseal the FrameStrips with the same strip caps as before and mix. [Special care needs to be taken not to cross-contaminate the samples.]

23) Advance the thermocycler to the next step (37°C for 30 min) in the MIP Program.

24) While the reactions are incubating at 37°C, prepare the amplification mix:

a. (components of the PCR reaction)

25) Remove the Purple Break-A-Way 96 Well Plate containing PCR primers from the -20°C freezer. Break off three columns from the left side of the plate.

26) Return the unused portion of the Break-A-Way 96 Well Plate to the freezer (before the primers thaw).

27) When the thermocycler reaches the 4°C hold, add 2.5 μL of tube-specific barcoding primer and 29.5 μL of amplification mix.

28) Begin the PCR Amplification Program on the thermocycler:

a. 94°C, 3 min

b. 30 cycles of:

i) 94°C, 15 sec

ii) 60°C, 15 sec

iii) 72°C, 30 sec

c. 72°C, 4 min

d. 4°C hold

29) Purify the PCR amplicons using AMPure beads (Beckman Coulter).

30) Proceed to the Ion Torrent template preparation workflow.
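The Hybridization Master Mix volumes in the procedure above can be checked, or rescaled to a different number of wells, with a short calculation. The sketch below is illustrative only: the per-well volumes are obtained by dividing the listed recipe by 15 (75 μL of mix at 5 μL per well), which is an inference from the stated volumes and not a figure given in the protocol, and the function name and the one-well overage are assumptions.

```python
# Per-well volumes (μL), derived by dividing the listed 75 μL recipe by 15.
# The divisor is inferred from 75 μL total / 5 μL dispensed per well.
HYB_PER_WELL = {
    "10X Buffer A": 22.5 / 15,
    "MIP Probe mixture": 15.0 / 15,
    "nuclease-free water": 37.5 / 15,
}

def master_mix(per_well, n_wells, spare_wells=1):
    """Scale a per-well recipe to n_wells plus spare wells of overage."""
    factor = n_wells + spare_wells
    return {name: round(vol * factor, 2) for name, vol in per_well.items()}

# 12 samples + 2 "no target DNA" controls receive the mix (wells A - G).
mix = master_mix(HYB_PER_WELL, n_wells=14)
total = sum(mix.values())

# The same helper rescaled to a hypothetical single strip of 6 wells.
small_mix = master_mix(HYB_PER_WELL, n_wells=6)
```

With 14 wells and one spare, the scaled volumes reproduce the listed recipe, which suggests the protocol already includes about one well of pipetting overage.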

Pathogenica Software installed on the Ion Torrent PGM reports the results.
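For planning purposes, the programmed duration of the PCR Amplification Program listed in the procedure above can be totalled as in the sketch below. This is a hedged illustration: the class and variable names are hypothetical, and the result counts only the programmed hold times, excluding temperature ramping and the final 4°C hold.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees C
    seconds: int   # programmed hold time

# Values transcribed from the PCR Amplification Program in the procedure.
INITIAL = Step(94, 3 * 60)
CYCLE = [Step(94, 15), Step(60, 15), Step(72, 30)]
N_CYCLES = 30
FINAL_EXTENSION = Step(72, 4 * 60)

def programmed_seconds():
    """Sum of programmed hold times, ignoring ramp time and the 4°C hold."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds

total_minutes = programmed_seconds() / 60
```

Actual run time on an instrument will be longer, since ramp rates differ between thermocyclers.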