Title:
INJECTION SYRINGE
Document Type and Number:
WIPO Patent Application WO/1993/010842
Kind Code:
A1
Abstract:
A syringe, comprising a casing (1) with a piston/piston rod assembly (9, 10) slidable therein and a needle foot (6) latched to a first end of the casing, means being provided to couple the piston with the needle foot and to unlatch the needle foot in an inner extreme position of this assembly in order to be able to withdraw the needle foot with the piston into the casing and thereby to shield off the needle (7) fixed to the needle foot and, as the case may be, to destroy this needle, stroke limiting means (15, 18) being provided which are adapted to limit the stroke length in such a way that the piston remains at a predetermined distance from the needle foot when the piston is pushed inward for the first time to eject the air, and, after the withdrawal of the piston for drawing in the injection fluid, to increase the stroke length in such a way that the piston can be coupled with the needle foot when the piston is thereafter pushed inwards again for the ejection of the injection fluid. The means to limit the stroke comprise a ring (18) which is either slidable, by interaction with this assembly, over a predetermined distance with respect to this assembly or to the casing after the assembly has been drawn outwards at least once, or is rotatable and provided with abutments staggered relative to each other over the predetermined distance, so that, when the assembly slides, this ring is turned in such a way that, finally, the assembly can be pushed inwards up to the deepest abutment.

Inventors:
VAN DEN HAAK ABRAHAM (NL)
Application Number:
PCT/NL1992/000217
Publication Date:
June 10, 1993
Filing Date:
November 27, 1992
Assignee:
ADVANCED PROTECTIVE INJECTION (NL)
International Classes:
A61M5/315; A61M5/32; (IPC1-7): A61M5/315; A61M5/50
Domestic Patent References:
WO1991004065A1 1991-04-04
WO1991012842A1 1991-09-05
WO1990001962A1 1990-03-08
Foreign References:
EP0423347A1 1991-04-24
Claims:
WHAT IS CLAIMED IS:
1. A method of detecting the presence of an organism, infectious agent, or biological component of a cell or organism in a biological sample containing polynucleotides, comprising the steps of: (a) immobilizing a first polynucleotide probe to a solid support, wherein the nucleotide sequence of said first polynucleotide probe is sufficiently complementary to a first nucleotide sequence contained in an analyte polynucleotide in said organism, infectious agent, or biological component that said first polynucleotide probe can hybridize to said first nucleotide sequence of said analyte polynucleotide of said organism, infectious agent, or biological component; (b) contacting the polynucleotides present in said sample with said first polynucleotide probe; (c) hybridizing said analyte polynucleotide in said sample to said first polynucleotide probe, if said analyte polynucleotide is present in said sample; (d) contacting a second polynucleotide probe with said analyte polynucleotide hybridized to said first polynucleotide probe, if said analyte polynucleotide from said sample has hybridized to said first polynucleotide probe, wherein the nucleotide sequence of said second polynucleotide probe is sufficiently complementary to a second nucleotide sequence contained in said analyte polynucleotide of said organism, infectious agent, or biological component that said second polynucleotide probe can hybridize to said second nucleotide sequence; (e) hybridizing said second polynucleotide probe to said analyte polynucleotide hybridized to said first polynucleotide probe, if said analyte polynucleotide has hybridized to said first polynucleotide probe; and (f) determining the presence of said organism, infectious agent, or biological component in said sample by detecting the presence of said second polynucleotide probe hybridized to said analyte polynucleotide which has hybridized to said first polynucleotide probe.
2. The method of Claim 1, wherein said second polynucleotide probe has the same or lower Tm as said first polynucleotide probe.
3. The method of Claim 1, wherein said first polynucleotide probe has a Tm within the range of from approximately 48°C to approximately 60°C.
4. The method of Claim 1, wherein said first nucleotide sequence of said analyte polynucleotide from said sample is common to a plurality of organisms, infectious agents, or biological components of a cell or organism.
5. The method of Claim 1, wherein said first nucleotide sequence of said analyte polynucleotide from said sample is specific to a particular organism, infectious agent, or biological component of a cell or organism.
6. The method of Claim 4, wherein said first polynucleotide probe has a sequence complementary to an rRNA sequence that is specific to a particular fungal species.
7. The method of Claim 1, wherein said second nucleotide sequence of said analyte polynucleotide from said sample is common to a plurality of organisms, infectious agents, or biological components of a cell or organism.
8. The method of Claim 7, wherein said second polynucleotide probe has a sequence complementary to an rRNA sequence that is common to a plurality of fungal species.
9. The method of Claim 1, wherein said second nucleotide sequence of said analyte polynucleotide from said sample is specific to a particular organism, infectious agent, or biological component of a cell or organism.
10. The method of Claim 1, wherein a label is attached to said second polynucleotide probe.
11. The method of Claim 10, wherein said label is selected from the group consisting of a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, and a fluorescent material.
12. The method of Claim 11, wherein said label is a nucleic acid stain selected from the group consisting of ethidium bromide, YOYO-1, and TOTO-1.
13. The method of Claim 11, wherein said label comprises alkaline phosphatase, and wherein step (f) comprises adding ATTOPHOS and measuring the fluorescence emitted using a fluorimeter.
14. The method of Claim 10, wherein said label can be measured by light emitted therefrom, and wherein step (f) comprises measuring the amount of light emitted by said label.
15. The method of Claim 14, wherein the step of measuring the amount of light emitted by said label comprises: recording the amount of light on film; and measuring the exposure of the film using a densitometer.
16. The method of Claim 1, wherein said solid support comprises a microtiter plate having a plurality of wells, each of said wells having a specific polynucleotide probe immobilized thereon.
17. The method of Claim 1, wherein said first polynucleotide probe comprises DNA.
18. The method of Claim 1, wherein said first and second polynucleotide probes comprise DNA.
19. The method of Claim 1, additionally comprising the step of washing said solid support after hybridizing said analyte polynucleotide in said sample to said first polynucleotide probe so that substantially all of said biological sample not annealed to said first polynucleotide probe is removed from said solid support.
20. The method of Claim 1, additionally comprising the step of washing said solid support after hybridizing said second polynucleotide probe with said analyte polynucleotide in said sample which is hybridized to said first polynucleotide probe so that substantially all of said second polynucleotide probe not hybridized with said analyte polynucleotide is removed from said solid support.
21. The method of Claim 1, wherein the polynucleotide hybridized to said first polynucleotide probe is selected from the group consisting of mRNA, rRNA, and genomic DNA.
22. The method of Claim 1, additionally comprising the step of identifying said first and second polynucleotide probes.
23. The method of Claim 22, wherein said identifying step is performed by means of a computer-assisted method.
24. The method of Claim 23, wherein said identifying step comprises the use of an H-site model.
25. The method of Claim 24, wherein the identification of said first polynucleotide probe using said H-site model comprises the steps of: specifying a minimum melting temperature for the first nucleotide probe and the nucleotide sequence specific to said organism; specifying a nucleation threshold that places a minimum value on the number of base pairs at any nucleation site; determining the melting temperatures (Tm) of the first nucleotide probe and said sequence specific to said organism at every possible hybridization point; and selecting the nucleotide probe having the highest Tm value.
26. The method of Claim 25, wherein said melting temperature is determined by the formula: Tm = 81.5 - 16.6(log[Na]) - 0.63%(formamide) + 0.41(%(G + C)) - 600/N, wherein Log[Na] is the log function of the sodium concentration, 0.063% (formamide) is the concentration of formamide, %(G + C) is the percentage of matched GC base pairs, and N is the probe length.
27. The method of Claim 24, wherein the identification of said second polynucleotide probe using said H-site model comprises the steps of: specifying a minimum melting temperature for the second nucleotide probe and the nucleotide sequence specific to said organism; specifying a nucleation threshold that places a minimum value on the number of base pairs at any nucleation site; determining the melting temperatures (Tm) of the first nucleotide probe and said sequence specific to said organism at every possible hybridization point; and selecting the nucleotide probe of the proper length having the lowest Tm value.
28. The method of Claim 27, wherein said melting temperature is determined by the formula: Tm = 81.5 - 16.6(log[Na]) - 0.63%(formamide) + 0.41(%(G + C)) - 600/N, wherein Log[Na] is the log function of the sodium concentration, 0.063% (formamide) is the concentration of formamide, %(G + C) is the percentage of matched GC base pairs, and N is the probe length.
29. The method of Claim 1, wherein said organism or infectious agent to be detected is a variety of fungus, said first polynucleotide probe being a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:81, SEQ ID NO:104, SEQ ID NO:131 through SEQ ID NO:133, SEQ ID NO:154 through SEQ ID NO:156, SEQ ID NO:176, SEQ ID NO:199, SEQ ID NO:267, SEQ ID NO:290, SEQ ID NO:312, SEQ ID NO:335, SEQ ID NO:364 through SEQ ID NO:376, SEQ ID NO:391 through SEQ ID NO:392, a sequence homologous to any of the foregoing sequences, and a sequence capable of hybridizing to any of the foregoing sequences.
30. The method of Claim 1, wherein said organism or infectious agent to be detected is a variety of fungus, said second polynucleotide probe being a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:1 through SEQ ID NO:80, a sequence homologous to any of SEQ ID NO:1 through SEQ ID NO:80, and a sequence capable of hybridizing to any of the foregoing sequences.
31. The method of Claim 1, wherein said biological component to be detected is a jun oncogene, said first polynucleotide probe being a polynucleotide strand comprising a sequence selected from the group consisting of SEQ ID NO:473, SEQ ID NO:600, SEQ ID NO:607, SEQ ID NO:615, SEQ ID NO:622, SEQ ID NO:637, SEQ ID NO:730, SEQ ID NO:747, SEQ ID NO:748, SEQ ID NO:488, SEQ ID NO:513, SEQ ID NO:630, and SEQ ID NO:639, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
32. The method of Claim 1, wherein said biological component to be detected is a jun oncogene, said second polynucleotide probe being a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:728, SEQ ID NO:729, SEQ ID NO:733, SEQ ID NO:734, SEQ ID NO:739, SEQ ID NO:740, SEQ ID NO:741, SEQ ID NO:742, SEQ ID NO:743, and SEQ ID NO:744, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
33. The method of Claim 1, wherein said biological component to be detected is a Substance P receptor, said first polynucleotide probe being a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:758, a sequence complementary to SEQ ID NO:758, and a sequence capable of hybridizing with SEQ ID NO:758.
34. The method of Claim 1, wherein said biological component to be detected is a β receptor, said first polynucleotide probe being a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:759, a sequence complementary to SEQ ID NO:759, and a sequence capable of hybridizing with SEQ ID NO:759.
35. The method of Claim 1, wherein said biological component to be detected is a G protein, said first polynucleotide probe being a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:751, SEQ ID NO:553, SEQ ID NO:670, SEQ ID NO:752, SEQ ID NO:753, SEQ ID NO:565, SEQ ID NO:678, SEQ ID NO:686, SEQ ID NO:754, SEQ ID NO:577, SEQ ID NO:697, SEQ ID NO:704, SEQ ID NO:755, SEQ ID NO:756, SEQ ID NO:732, SEQ ID NO:642, SEQ ID NO:652, SEQ ID NO:757, SEQ ID NO:593, SEQ ID NO:710, and SEQ ID NO:721, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
36. The method of Claim 1, wherein said biological component to be detected is a G protein, said second polynucleotide probe being a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:528, SEQ ID NO:731, SEQ ID NO:749, and SEQ ID NO:750, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
37. A solid support-polynucleotide structure for identifying the presence of an organism, infectious agent, or biological component of a cell or organism in a biological sample containing polynucleotides, comprising: a solid support having immobilized thereto a first polynucleotide probe, said first polynucleotide probe having a sequence complementary to a first nucleotide sequence specific to said organism, infectious agent, or biological component; an analyte polynucleotide from said cell, organism, or infectious agent containing said first nucleotide sequence, said analyte polynucleotide being hybridized to said first polynucleotide probe at said first nucleotide sequence; a second polynucleotide probe complementary to a second nucleotide sequence present on said analyte polynucleotide from said cell, organism, or infectious agent which is hybridized to said first polynucleotide probe, said second polynucleotide probe being hybridized to said analyte polynucleotide at said second nucleotide sequence.
38. The solid support-polynucleotide structure of Claim 37, wherein said second polynucleotide probe includes a label.
39. The solid support-polynucleotide structure of Claim 38, wherein said label is selected from the group consisting of a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, and a fluorescent material.
40. The solid support-polynucleotide structure of Claim 37, wherein said second polynucleotide probe is common to polynucleotides contained in a plurality of organisms, infectious agents, or biological components of a cell or organism.
41. The solid support-polynucleotide structure of Claim 37, wherein said first and second polynucleotide probes are determined through the use of a computer system for designing oligonucleotide probes for use with a gene sequence data source, said computer system comprising: an input means for retrieving said gene sequence data; a processor; instructions directing said processor to determine said first oligonucleotide probe;.
42. The solid support-polynucleotide structure of Claim 37, wherein said polynucleotide is selected from the group consisting of mRNA, rRNA, and genomic DNA.
43. The solid support-polynucleotide structure of Claim 37, wherein said first polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:81, SEQ ID NO:104, SEQ ID NO:131 through SEQ ID NO:133, SEQ ID NO:154 through SEQ ID NO:156, SEQ ID NO:176, SEQ ID NO:199, SEQ ID NO:267, SEQ ID NO:290, SEQ ID NO:312, SEQ ID NO:335, SEQ ID NO:364 through SEQ ID NO:376, SEQ ID NO:391 through SEQ ID NO:392, a sequence homologous to any of the foregoing sequences, and a sequence capable of hybridizing to any of the foregoing sequences.
44. The solid support-polynucleotide structure of Claim 37, wherein said second polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:1 through SEQ ID NO:80, a sequence homologous to any of SEQ ID NO:1 through SEQ ID NO:80, and a sequence capable of hybridizing to any of the foregoing sequences.
45. The solid support-polynucleotide structure of Claim 37, wherein said first polynucleotide probe is a polynucleotide strand comprising a sequence selected from the group consisting of SEQ ID NO:473, SEQ ID NO:600, SEQ ID NO:607, SEQ ID NO:615, SEQ ID NO:622, SEQ ID NO:637, SEQ ID NO:730, SEQ ID NO:747, SEQ ID NO:748, SEQ ID NO:488, SEQ ID NO:513, SEQ ID NO:630, and SEQ ID NO:639, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
46. The solid support-polynucleotide structure of Claim 37, wherein said second polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:728, SEQ ID NO:729, SEQ ID NO:733, SEQ ID NO:734, SEQ ID NO:739, SEQ ID NO:740, SEQ ID NO:741, SEQ ID NO:742, SEQ ID NO:743, and SEQ ID NO:744, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
47. The solid support-polynucleotide structure of Claim 37, wherein said first polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:758, a sequence complementary to SEQ ID NO:758, and a sequence capable of hybridizing with SEQ ID NO:758.
48. The solid support-polynucleotide structure of Claim 37, wherein said first polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:759, a sequence complementary to SEQ ID NO:759, and a sequence capable of hybridizing with SEQ ID NO:759.
49. The solid support-polynucleotide structure of Claim 37, wherein said first polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:751, SEQ ID NO:553, SEQ ID NO:670, SEQ ID NO:752, SEQ ID NO:753, SEQ ID NO:565, SEQ ID NO:678, SEQ ID NO:686, SEQ ID NO:754, SEQ ID NO:577, SEQ ID NO:697, SEQ ID NO:704, SEQ ID NO:755, SEQ ID NO:756, SEQ ID NO:732, SEQ ID NO:642, SEQ ID NO:652, SEQ ID NO:757, SEQ ID NO:593, SEQ ID NO:710, and SEQ ID NO:721, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
50. The solid support-polynucleotide structure of Claim 37, wherein said second polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:528, SEQ ID NO:731, SEQ ID NO:749, and SEQ ID NO:750, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
51. A kit for identifying the presence of an organism, infectious agent, or biological component of a cell or organism in a biological sample, comprising the following components: a specific polynucleotide probe, said specific polynucleotide probe being complementary to or homologous to a first nucleotide sequence in an analyte polynucleotide specific to a particular organism, infectious agent, or biological component to be detected; and a common polynucleotide probe complementary to or homologous to a second nucleotide sequence in said analyte polynucleotide of said organism, infectious agent, or biological component, said common polynucleotide probe being complementary to polynucleotides contained in a plurality of organisms, infectious agents, or biological components.
52. The kit of Claim 51, additionally comprising a solid support to which a polynucleotide can be immobilized.
53. The kit of Claim 52, wherein said specific polynucleotide probe is immobilized to said solid support.
54. The kit of Claim 53, wherein said solid support has a plurality of specific polynucleotide probes immobilized thereto, each of said probes specific to a different organism, infectious agent, or biological component.
55. The kit of Claim 53, wherein said solid support comprises a plurality of wells, each of said specific polynucleotide probes being immobilized to a different well.
56. The kit of Claim 55, additionally comprising a buffer appropriate for the hybridization of said probes and polynucleotides, said polynucleotides being selected from the group consisting of mRNA, rRNA, and genomic DNA.
57. The kit of Claim 51, wherein said second polynucleotide probe bears a label.
58. The kit of Claim 57, wherein said label is selected from the group consisting of a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, and a fluorescent material.
59. The kit of Claim 51, wherein said specific polynucleotide probe comprises a first specific primer which is complementary or homologous to a sequence specific to a particular organism, infectious agent, or biological component, said kit additionally comprising a second specific polynucleotide primer which is complementary or homologous to a different sequence specific to said organism, infectious agent, or biological component.
60. The kit of Claim 59, additionally comprising at least one of the following: dNTP's, a reverse transcriptase, a polymerase, and a buffer appropriate for addition of dNTP's to a primer using a reverse transcriptase or polymerase.
61. The kit of Claim 60, including a DNA polymerase that has significant polymerase activity at temperatures above 50 °C.
62. The kit of Claim 61, additionally comprising at least one of the following: dNTP's, a reverse transcriptase, a polymerase, and a buffer appropriate for addition of dNTP's to a primer using a reverse transcriptase or polymerase.
63. The kit of Claim 51, wherein said specific polynucleotide probe comprises a sequence complementary to or homologous to a sequence selected from the group consisting of SEQ ID NO:81, SEQ ID NO:104, SEQ ID NO:131 through SEQ ID NO:133, SEQ ID NO:154 through SEQ ID NO:156, SEQ ID NO:176, SEQ ID NO:199, SEQ ID NO:267, SEQ ID NO:290, SEQ ID NO:312, SEQ ID NO:335, SEQ ID NO:364 through SEQ ID NO:376, and SEQ ID NO:391 through SEQ ID NO:392.
64. The kit of Claim 51, wherein said common polynucleotide probe comprises a sequence complementary to or homologous to a sequence selected from the group consisting of SEQ ID NO:1 through SEQ ID NO:80.
65. The kit of Claim 51, wherein said specific polynucleotide probe is a polynucleotide strand comprising a sequence selected from the group consisting of SEQ ID NO:473, SEQ ID NO:600, SEQ ID NO:607, SEQ ID NO:615, SEQ ID NO:622, SEQ ID NO:637, SEQ ID NO:730, SEQ ID NO:747, SEQ ID NO:748, SEQ ID NO:488, SEQ ID NO:513, SEQ ID NO:630, and SEQ ID NO:639, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
66. The kit of Claim 51, wherein said common polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:728, SEQ ID NO:729, SEQ ID NO:733, SEQ ID NO:734, SEQ ID NO:739, SEQ ID NO:740, SEQ ID NO:741, SEQ ID NO:742, SEQ ID NO:743, and SEQ ID NO:744, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
67. The kit of Claim 51, wherein said specific polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:758, a sequence complementary to SEQ ID NO:758, and a sequence capable of hybridizing with SEQ ID NO:758.
68. The kit of Claim 51, wherein said specific polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:759, a sequence complementary to SEQ ID NO:759, and a sequence capable of hybridizing with SEQ ID NO:759.
69. The kit of Claim 51, wherein said specific polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:751, SEQ ID NO:553, SEQ ID NO:670, SEQ ID NO:752, SEQ ID NO:753, SEQ ID NO:565, SEQ ID NO:678, SEQ ID NO:686, SEQ ID NO:754, SEQ ID NO:577, SEQ ID NO:697, SEQ ID NO:704, SEQ ID NO:755, SEQ ID NO:756, SEQ ID NO:732, SEQ ID NO:642, SEQ ID NO:652, SEQ ID NO:757, SEQ ID NO:593, SEQ ID NO:710, and SEQ ID NO:721, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
70. The kit of Claim 51, wherein said common polynucleotide probe is a polynucleotide strand comprising a sequence complementary to a sequence selected from the group consisting of SEQ ID NO:528, SEQ ID NO:731, SEQ ID NO:749, and SEQ ID NO:750, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
71. An isolated oligonucleotide eight nucleotides or longer that is useful in the detection and quantification of jun oncogenes, said nucleotide containing a sequence specific to a particular jun oncogene, said sequence being selected from the group consisting of SEQ ID NO:473, SEQ ID NO:600, SEQ ID NO:607, SEQ ID NO:615, SEQ ID NO:622, SEQ ID NO:637, SEQ ID NO:730, SEQ ID NO:747, SEQ ID NO:748, SEQ ID NO:488, SEQ ID NO:513, SEQ ID NO:630, and SEQ ID NO:639, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
72. An isolated oligonucleotide eight nucleotides or longer that is useful in the detection and quantification of jun oncogenes, said nucleotide containing a sequence common to a plurality of jun oncogenes, said sequence being selected from the group consisting of SEQ ID NO:728, SEQ ID NO:729, SEQ ID NO:733, SEQ ID NO:734, SEQ ID NO:739, SEQ ID NO:740, SEQ ID NO:741, SEQ ID NO:742, SEQ ID NO:743, and SEQ ID NO:744, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
73. An isolated oligonucleotide eight nucleotides or longer that is useful in the detection and quantification of β receptors, said sequence being selected from the group consisting of SEQ ID NO:759, a sequence complementary to SEQ ID NO:759, and a sequence capable of hybridizing with SEQ ID NO:759.
74. An isolated oligonucleotide eight nucleotides or longer that is useful in the detection and quantification of Substance P receptors, said sequence being selected from the group consisting of SEQ ID NO:758, a sequence complementary to SEQ ID NO:758, and a sequence capable of hybridizing with SEQ ID NO:758.
75. An isolated oligonucleotide eight nucleotides or longer that is useful in the detection and quantification of G proteins, said nucleotide containing a sequence specific to a particular G protein, said sequence being selected from the group consisting of SEQ ID NO:751, SEQ ID NO:553, SEQ ID NO:670, SEQ ID NO:752, SEQ ID NO:753, SEQ ID NO:565, SEQ ID NO:678, SEQ ID NO:686, SEQ ID NO:754, SEQ ID NO:577, SEQ ID NO:697, SEQ ID NO:704, SEQ ID NO:755, SEQ ID NO:756, SEQ ID NO:732, SEQ ID NO:642, SEQ ID NO:652, SEQ ID NO:757, SEQ ID NO:593, SEQ ID NO:710, and SEQ ID NO:721, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
76. An isolated oligonucleotide eight nucleotides or longer that is useful in the detection and quantification of G proteins, said nucleotide containing a sequence common to a plurality of G proteins, said sequence being selected from the group consisting of SEQ ID NO:528, SEQ ID NO:731, SEQ ID NO:749, and SEQ ID NO:750, a sequence complementary to any of such sequences, and a sequence capable of hybridizing with any of these sequences.
77. An isolated segment of polynucleotide specific to the rRNA of a particular fungus, said sequence being complementary or homologous to a sequence selected from the group consisting of SEQ ID NO:81, SEQ ID NO:104, SEQ ID NO:131 through SEQ ID NO:133, SEQ ID NO:154 through SEQ ID NO:156, SEQ ID NO:176, SEQ ID NO:199, SEQ ID NO:267, SEQ ID NO:290, SEQ ID NO:312, SEQ ID NO:335, SEQ ID NO:364 through SEQ ID NO:376, SEQ ID NO:391 through SEQ ID NO:392 and a sequence homologous to any of the foregoing sequences.
78. An isolated segment of polynucleotide coding for or complementary to a sequence common to a plurality of fungal species, said sequence being complementary to or homologous to a sequence selected from the group consisting of SEQ ID NO:1 through SEQ ID NO:80 and a sequence homologous to any of the foregoing sequences.
79. A method of detecting the presence of one or more organisms, infectious agents, or biological components in a biological sample containing polynucleotides, wherein at least one of said polynucleotides is indicative of the presence of said one or more organisms, infectious agents or biological components and is present in minute quantities, comprising the steps of: (a) obtaining a biological sample containing polynucleotides; (b) contacting said sample with a first polynucleotide primer, said first primer having a nucleotide sequence complementary to a nucleotide sequence common to a plurality of organisms, infectious agents, or biological components; (c) hybridizing said first primer to an analyte polynucleotide present in said sample that is complementary to said first primer, if such an analyte polynucleotide is present; (d) extending said first primer, thereby producing a double-stranded polynucleotide including a complementary nucleotide strand comprising said first primer and having a nucleotide sequence complementary to said analyte polynucleotide; (e) contacting said sample with a second polynucleotide primer, said second primer being complementary to a sequence contained in said complementary nucleotide strand; (f) hybridizing said second primer to said complementary nucleotide strand; (g) extending said second primer to form a nucleotide strand homologous to said analyte polynucleotide; (h) contacting said sample with a third polynucleotide primer, said third primer having a sequence complementary to said homologous nucleotide strand, wherein said third primer has a nucleotide sequence complementary to a sequence that is specific to a particular organism, infectious agent, or biological component whose presence is to be determined; (i) hybridizing said third primer to said homologous nucleotide strand; (j) extending said third primer, thereby producing a double-stranded polynucleotide; and (k) determining the presence of the particular organism, infectious agent, or biological component in said sample by detecting the extension of said third primer.
80. The method of Claim 79, wherein sleps (c). ( ). (f), (g), (i), and (j) are repeated a plurality of times.
81. The method of Claim 79, wherein step (d) comprises extension with a reverse transcriptase, and step (g) comprises extension with a DNA polymerase.
82. The method of Claim 81, wherein said DNA polymerase has significant polymerase activity at temperatures above 50 °C.
83. The method of Claim 79, wherein the nucleotide sequence of said first primer is determined by a computer-assisted method.
84. The method of Claim 83, wherein said computer-assisted method determines the sequence of said first nucleotide probe using an H-Site model.
85. The method of Claim 79, wherein the nucleotide sequence of said second primer is determined by a computer-assisted method.
86. The method of Claim 85, wherein said computer-assisted method determines the sequence of said second nucleotide probe using an H-Site model.
87. The method of Claim 79, wherein said third primer includes a label, and wherein step (k) comprises detecting the extension of said labeled primer.
88. 1 1 X.
89. The method of Claim 87, wherein said label is selected from the group consisting of a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, and a fluorescent material.
90. The method of Claim 79, wherein the nucleotide sequence of said third primer is determined by a computer-assisted method.
91. The method of Claim 89, wherein said computer-assisted method determines the sequence of said third primer using an H-Site model.
92. The method of Claim 79, wherein said second primer has a nucleotide sequence that is common to a plurality of the organisms, infectious agents, or biological components whose presence is being determined.
93. The method of Claim 79, additionally comprising the steps of: contacting said sample with a fourth primer, said fourth primer having a sequence complementary to said complementary nucleotide strand; hybridizing said fourth primer to said complementary nucleotide strand; and extending said fourth primer.
Description:
GENE DETECTION SYSTEM

Field of the Invention

The present invention relates to methods for detecting the presence of an organism or a member of a group of organisms in a biological sample by probing the sample for polynucleotides indicative of the presence of such organisms. The present invention also relates to methods for detecting the presence of other infectious agents or biological components in a biological sample which comprises polynucleotides.

Background of the Invention

At the present time, the identity of an organism or infectious agent suspected of infecting a subject is normally determined by culturing a sample of biological material from the subject. For example, if it is suspected that a subject is suffering from an infection of the lung caused by the fungus Candida albicans, a sputum sample can be cultured. After a period of time, the culture is visually observed, and if a fungus grows in the culture in numbers sufficient to indicate a fungal infection, that fungus is identified by observing its morphological characteristics.

This method of confirming a diagnosis, however, has serious drawbacks. For example, it requires that the biological sample be cultured for a long enough period of time to allow a detectable amount of the organism to grow. This method also requires that the cultured sample be inspected by a technician trained in identifying different varieties of organisms. There is therefore a great need for an assay which can quickly and specifically identify an organism or infectious agent or a group of organisms or infectious agents. An assay which does not require a great deal of training to perform and interpret would also be advantageous.

There is also a need for an improved method for identifying other biological components present in a biological sample, where such components comprise polynucleotides or where the presence of such components is indicated by the presence of a polynucleotide. Present methods for detecting polynucleotides in a cell or tissue sample, such as the Northern blot method, require a relatively large amount of starting material. The Northern blot method is a widely accepted method of detecting specific genes (Sambrook J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed., pp. 7.39-7.52) (hereinafter "Molecular Cloning"), and has been adapted for detecting cellular components such as jun oncogenes (Sherman, et al., Proc. Natl. Acad. Sci. USA, 87:5663-5666 (1990); Oursler MJ et al., Proc. Natl. Acad. Sci. USA, 88:6613-6617 (1991)). In this method, mRNA is first purified from a tissue or cell culture, such as through electrophoresis on an agarose gel. Following electrophoresis, the mRNA is transferred onto membranes and hybridized with radioactive probes to identify positive band(s) through autoradiography. However, this method is not sensitive enough to identify signals corresponding to specific genes or gene products if the cells from which the mRNA is extracted have only a small quantity of genetic material.

Alternatively, reverse PCR (discussed in Molecular Cloning) can also be used to detect a wide variety of genes from different tissues and cells. In this method, mRNA is first converted to cDNA by reverse transcriptase, and specific gene fragments are then amplified by PCR using a set of primers (sense and anti-sense primers). The amplified gene can then be seen through agarose gel electrophoresis.

Summary

The present invention provides an improved method of detecting the presence of an organism, an infectious agent, or a biological component of a cell or organism in a biological sample. In this method, a polynucleotide probe is hybridized to an analyte polynucleotide in the biological sample which belongs to the organism, infectious agent, or biological component or which is indicative of the presence of the organism, infectious agent or biological component. When only small amounts of such an analyte polynucleotide are available, an alternative method can be used which employs the Polymerase Chain Reaction (PCR). In addition, the present invention embodies polynucleotide probes and primers for use in the present methods, as well as kits which incorporate such probes and primers.

In one embodiment, the present invention includes a method of detecting the presence of a particular organism, infectious agent, or biological component of a cell or organism in a biological sample that contains polynucleotides. This method involves detecting an analyte polynucleotide in the sample that is indicative of the presence of the organism, infectious agent, or biological component and comprises the steps of:

(a) identifying a first polynucleotide probe, wherein the nucleotide sequence of the first polynucleotide probe is sufficiently complementary to a first nucleotide sequence contained in the analyte polynucleotide that the first polynucleotide probe can hybridize to the first nucleotide sequence of the analyte polynucleotide, the first nucleotide sequence of the analyte polynucleotide being specific to the particular organism, infectious agent, or biological component;

(b) immobilizing the first polynucleotide probe to a solid support;

(c) hybridizing the analyte polynucleotide in the sample with the first polynucleotide probe;

(d) identifying a second polynucleotide probe, wherein the nucleotide sequence of the second polynucleotide probe is sufficiently complementary to a second nucleotide sequence contained in the analyte polynucleotide that the second polynucleotide probe can hybridize to the second nucleotide sequence, the second nucleotide sequence being common to a plurality of organisms, infectious agents, or biological components including the particular organism, infectious agent, or biological component;

(e) hybridizing the second polynucleotide probe with the analyte polynucleotide which is hybridized to the first polynucleotide probe; and

(f) determining the presence of the particular organism, infectious agent, or biological component in the sample by detecting the presence of the second polynucleotide probe on the solid support.

In another embodiment, the present method for detecting the presence of an organism, infectious agent, or biological component of a cell or organism in a biological sample comprises the steps of:

(a) immobilizing a first polynucleotide probe to a solid support, wherein the nucleotide sequence of the first polynucleotide probe is sufficiently complementary to a first nucleotide sequence contained in an analyte polynucleotide in the organism, infectious agent, or biological component that the first polynucleotide probe can hybridize to the first nucleotide sequence of the analyte polynucleotide of the organism, infectious agent, or biological component;

(b) contacting the polynucleotides present in the sample with the first polynucleotide probe;

(c) hybridizing the analyte polynucleotide in the sample to the first polynucleotide probe, if the analyte polynucleotide is present in the sample;

(d) contacting a second polynucleotide probe with the analyte polynucleotide hybridized to the first polynucleotide probe, if the analyte polynucleotide from the sample has hybridized to the first polynucleotide probe, wherein the nucleotide sequence of the second polynucleotide probe is sufficiently complementary to a second nucleotide sequence contained in the analyte polynucleotide of the organism, infectious agent, or biological component that the second polynucleotide probe can hybridize to the second nucleotide sequence;

(e) hybridizing the second polynucleotide probe to the analyte polynucleotide hybridized to the first polynucleotide probe, if the analyte polynucleotide has hybridized to the first polynucleotide probe; and

(f) determining the presence of the organism, infectious agent, or biological component in the sample by detecting the presence of the second polynucleotide probe hybridized to the analyte polynucleotide which has hybridized to the first polynucleotide probe.

In this method, the second polynucleotide probe preferably has a Tm equal to or lower than that of the first polynucleotide probe. Preferably, the first polynucleotide probe also has a Tm within the range of from approximately 48 °C to approximately 60 °C. The first nucleotide sequence of the analyte polynucleotide can also in one embodiment be common to a plurality of organisms, infectious agents, or biological components of a cell or organism. Alternatively, the first nucleotide sequence of the analyte polynucleotide can be specific to a particular organism, infectious agent, or biological component of a cell or organism.
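The melting-temperature preferences stated above can be expressed as a simple check. The following Python sketch is illustrative only: the function name `probe_pair_ok` is hypothetical, and treating the "approximately" 48 °C and 60 °C limits as hard bounds is an assumption rather than part of the described method.

```python
def probe_pair_ok(first_tm, second_tm, low=48.0, high=60.0):
    """Return True when the stated Tm preferences hold (temperatures in °C).

    Assumption: the 'approximately 48 °C to approximately 60 °C' range is
    applied as a hard, inclusive bound on the first probe.
    """
    # Preference 1: the first probe's Tm lies within the stated range.
    first_in_range = low <= first_tm <= high
    # Preference 2: the second probe's Tm is the same as or lower than the first's.
    second_not_higher = second_tm <= first_tm
    return first_in_range and second_not_higher
```

For instance, a first probe at 55 °C paired with a second probe at 52 °C satisfies both preferences, whereas a second probe at 58 °C does not.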

The second nucleotide sequence of the analyte polynucleotide can have a sequence common to a plurality of organisms, infectious agents, or biological components of a cell or organism in this embodiment of the present method. In an alternative embodiment, the second nucleotide sequence of the analyte polynucleotide can have a sequence that is specific for a particular organism, infectious agent, or biological component of a cell or organism. Additionally, a label can advantageously be attached to the second polynucleotide probe. Any of a number of polynucleotide labels known to the art can be used, including a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, or a fluorescent material. If a nucleic acid stain is used as the label, the stain can be ethidium bromide, YOYO-1, or TOTO-1. When the label is a light-emitting substance, the label can advantageously be detected by measuring the amount of light emitted therefrom. When measuring the amount of light emitted by the label, light can be recorded on film, after which the amount of exposure of the film is measured using a densitometer. In an even more preferable embodiment, the label comprises alkaline phosphatase and the label is detected by adding ATTOPHOS to the solution containing the labeled probe and then measuring the fluorescence emitted using a fluorimeter.

A solid support such as a microtiter plate having a plurality of wells can be used to perform the present method. Preferably, each of the wells has a specific polynucleotide probe immobilized thereon. The first polynucleotide probe, which can be immobilized on the microtiter plate, advantageously comprises DNA, and more advantageously both the first and second polynucleotide probes comprise DNA.

A further preferable step of the method of the present invention involves washing the solid support after hybridizing the analyte polynucleotide in the sample to the first polynucleotide probe. In this way, substantially all of the biological sample not annealed to the first polynucleotide probe is removed from the solid support. Yet another step of the method includes the step of washing the solid support after hybridizing the second polynucleotide probe with the analyte polynucleotide, which is itself hybridized to the first polynucleotide probe. After such washing, substantially all of the second polynucleotide probe not hybridized with the analyte polynucleotide is removed from the solid support. The analyte polynucleotide in this method can be selected from the group consisting of mRNA, rRNA, and genomic DNA.
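The capture, wash, and detect logic of this embodiment can be illustrated with a toy model. The Python sketch below is not the described assay: it assumes that "hybridization" means an exact reverse-complement match, ignores temperature and partial mismatches, and uses hypothetical names (`reverse_complement`, `hybridizes`, `sandwich_assay`) introduced only for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def hybridizes(probe, analyte):
    """Toy rule (assumption): a probe binds when its exact reverse
    complement occurs somewhere in the analyte."""
    return reverse_complement(probe) in analyte.upper()

def sandwich_assay(sample, first_probe, second_probe):
    """Return True when a signal would remain on the solid support.

    sample: list of analyte sequences present in the biological sample.
    """
    # Hybridize to the immobilized first probe, then wash:
    # only analytes bound to the first probe stay on the support.
    captured = [a for a in sample if hybridizes(first_probe, a)]
    # Hybridize the second (labeled) probe, then wash again:
    # the label is retained only on captured analytes that it binds.
    return any(hybridizes(second_probe, a) for a in captured)
```

In this model an analyte that carries only the second probe's site gives no signal, because it is washed away before the second probe is added.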

Another embodiment of the present invention is a method of detecting the presence of an organism, infectious agent, or biological component of a cell or organism in a biological sample containing polynucleotides, comprising the steps of:

(a) identifying a first polynucleotide probe and a second polynucleotide probe, wherein the nucleotide sequence of the first polynucleotide probe is sufficiently complementary to a first nucleotide sequence contained in an analyte polynucleotide of the organism, infectious agent, or biological component that the first polynucleotide probe can hybridize to the first nucleotide sequence of the analyte polynucleotide of the organism, infectious agent, or biological component, and wherein the nucleotide sequence of the second polynucleotide probe is sufficiently complementary to a second nucleotide sequence contained in the analyte polynucleotide of the organism, infectious agent, or biological component that the second polynucleotide probe can hybridize to the second nucleotide sequence, the second nucleotide sequence being common to a plurality of organisms, infectious agents, or biological components;

(b) immobilizing the first polynucleotide probe to a solid support;

(c) contacting the polynucleotides present in the sample with the first polynucleotide probe;

(d) hybridizing an analyte polynucleotide in the sample to the first polynucleotide probe, if the analyte polynucleotide is present in the sample;

(e) contacting the second polynucleotide probe with the analyte polynucleotide hybridized to the first polynucleotide probe, if the analyte polynucleotide from the sample has hybridized to the first polynucleotide probe;

(f) hybridizing the second polynucleotide probe to the analyte polynucleotide hybridized to the first polynucleotide probe, if the analyte polynucleotide has hybridized to the first polynucleotide probe; and

(g) determining the presence of the organism, infectious agent, or biological component in the sample by detecting the presence of the second polynucleotide probe hybridized to the analyte polynucleotide which has hybridized to the first polynucleotide probe.

In this embodiment, the identifying step can advantageously comprise the use of a computer, preferably one which uses an H-site model to identify the first polynucleotide probe. Using the H-site model to identify the first polynucleotide probe involves the steps of: specifying a minimum melting temperature for the first nucleotide probe and the nucleotide sequence specific to the organism; specifying a nucleation threshold that places a minimum value on the number of base pairs at any nucleation site; determining the melting temperatures (Tm) of the first nucleotide probe and the sequence specific to the organism at every possible hybridization point; and selecting the nucleotide probe having the highest Tm value. When using the H-site model, the melting temperature is preferably determined by the formula: Tm = 81.5 - 16.6(log[Na]) - 0.63(%formamide) + 0.41(%(G + C)) - 600/N, wherein log[Na] is the log function of the sodium concentration, %formamide is the concentration of formamide, %(G + C) is the percentage of matched GC base pairs, and N is the probe length.
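The melting temperature formula can be evaluated directly. The Python sketch below is a minimal illustration under stated assumptions: the logarithm is taken as base 10, the sodium concentration is in mol/L, and the function name `melting_temperature` is hypothetical. The sodium coefficient defaults to the value as printed in the text (-16.6); widely published forms of this empirical relation add the sodium term instead, so the coefficient is exposed as the parameter `na_coeff` rather than fixed.

```python
import math

def melting_temperature(na_molar, formamide_pct, gc_pct, probe_len,
                        na_coeff=-16.6):
    """Evaluate Tm (°C) from the formula as printed in the text.

    na_molar:      sodium concentration (assumed mol/L)
    formamide_pct: formamide concentration in percent
    gc_pct:        percentage of matched G+C base pairs
    probe_len:     probe length N in nucleotides
    na_coeff:      coefficient of log10[Na]; -16.6 as printed (assumption:
                   pass +16.6 to use the commonly published form)
    """
    if na_molar <= 0 or probe_len <= 0:
        raise ValueError("sodium concentration and probe length must be positive")
    return (81.5
            + na_coeff * math.log10(na_molar)
            - 0.63 * formamide_pct
            + 0.41 * gc_pct
            - 600.0 / probe_len)
```

At 1 mol/L sodium the logarithmic term vanishes, so a 20-nucleotide probe with 50% G+C and no formamide evaluates to 81.5 + 20.5 - 30 = 72.0 °C regardless of the sign chosen for `na_coeff`.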

In this method the second polynucleotide probe can also be identified using a computer which makes use of the H-site model. Identifying the second polynucleotide probe with the H-site model is preferably accomplished by following the steps of: specifying a minimum melting temperature for the second nucleotide probe and the nucleotide sequence specific to the organism; specifying a nucleation threshold that places a minimum value on the number of base pairs at any nucleation site; determining the melting temperatures (Tm) of the first nucleotide probe and the sequence specific to the organism at every possible hybridization point; and selecting the nucleotide probe of the proper length having the lowest Tm value.
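The selection rule (highest Tm for the first probe, lowest acceptable Tm for the second) can be sketched as a scan over candidate sites. The Python code below is a deliberately simplified illustration, not the H-site model itself: it scores only perfectly matched, fixed-length windows of a single target, omits the nucleation threshold and any scan against other sequences, and assumes 1 mol/L sodium with no formamide. All function names are hypothetical.

```python
import math

def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def tm(seq, na_molar=1.0, formamide_pct=0.0, na_coeff=-16.6):
    """Tm (°C) from the formula as printed in the text (base-10 log assumed)."""
    return (81.5 + na_coeff * math.log10(na_molar) - 0.63 * formamide_pct
            + 0.41 * gc_percent(seq) - 600.0 / len(seq))

def candidate_sites(target, length, min_tm):
    """List (position, site, Tm) for every window meeting the minimum Tm."""
    sites = []
    for i in range(len(target) - length + 1):
        window = target[i:i + length]
        t = tm(window)
        if t >= min_tm:
            sites.append((i, window, t))
    return sites

def pick_first_probe_site(target, length, min_tm):
    """First probe: candidate site with the highest Tm (None if no candidate)."""
    sites = candidate_sites(target, length, min_tm)
    return max(sites, key=lambda s: s[2]) if sites else None

def pick_second_probe_site(target, length, min_tm):
    """Second probe: candidate site with the lowest Tm above the minimum."""
    sites = candidate_sites(target, length, min_tm)
    return min(sites, key=lambda s: s[2]) if sites else None
```

The functions return the target site; the probe itself would be the sequence complementary to that site.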

The melting temperature of the second polynucleotide probe can likewise be determined by the formula: Tm = 81.5 - 16.6(log[Na]) - 0.63(%formamide) + 0.41(%(G + C)) - 600/N, wherein log[Na] is the log function of the sodium concentration, %formamide is the concentration of formamide, %(G + C) is the percentage of matched GC base pairs, and N is the probe length. In this method, the second polynucleotide probe can advantageously have a Tm equal to or lower than that of the first polynucleotide probe. The first polynucleotide probe also preferably has a Tm within the range of from approximately 48 °C to approximately 60 °C. The first nucleotide sequence of the analyte polynucleotide can also in one embodiment be common to a plurality of organisms, infectious agents, or biological components of a cell or organism. Alternatively, the first nucleotide sequence of the analyte polynucleotide can be specific to a particular organism, infectious agent, or biological component of a cell or organism.

The second nucleotide sequence of the analyte polynucleotide can have a sequence common to a plurality of organisms, infectious agents, or biological components of a cell or organism in this embodiment of the present method. In an alternative embodiment, the second nucleotide sequence of the analyte polynucleotide can have a sequence that is specific for a particular organism, infectious agent, or biological component of a cell or organism. Additionally, a label can advantageously be attached to the second polynucleotide probe. Any of a number of polynucleotide labels known to the art can be used, including a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, or a fluorescent material. If a nucleic acid stain is used as the label, the stain can be ethidium bromide, YOYO-1, or TOTO-1. When the label is a light-emitting substance, the label can advantageously be detected by measuring the amount of light emitted therefrom. When measuring the amount of light emitted by the label, light can be recorded on film, after which the amount of exposure of the film is measured using a densitometer. In an even more preferable embodiment, the label comprises alkaline phosphatase and the label is detected by adding ATTOPHOS to the solution containing the labeled probe and then measuring the fluorescence emitted using a fluorimeter.

A solid support such as a microtiter plate having a plurality of wells can be used to perform the present method. Preferably, each of the wells has a specific polynucleotide probe immobilized thereon.

The first polynucleotide probe, which can be immobilized on the microtiter plate, advantageously comprises DNA, and more advantageously both the first and second polynucleotide probes comprise DNA.

A further preferable step of the method of the present invention involves washing the solid support after hybridizing the analyte polynucleotide in the sample to the first polynucleotide probe. In this way, substantially all of the biological sample not annealed to the first polynucleotide probe is removed from the solid support. Yet another step of the method includes the step of washing the solid support after hybridizing the second polynucleotide probe with the analyte polynucleotide, which is itself hybridized to the first polynucleotide probe. After such washing, substantially all of the second polynucleotide probe not hybridized with the analyte polynucleotide is removed from the solid support. The analyte polynucleotide in this method can be selected from the group consisting of mRNA, rRNA, and genomic DNA.

Another embodiment of the present invention comprises a solid support-polynucleotide structure for identifying the presence of an organism, infectious agent, or biological component of a cell or organism in a biological sample containing polynucleotides. This structure comprises a solid support having immobilized thereto a first polynucleotide probe, the first polynucleotide probe having a sequence complementary to a first nucleotide sequence specific to the organism, infectious agent, or biological component. The structure also includes an analyte polynucleotide from the cell, organism, or infectious agent which contains the first nucleotide sequence, the analyte polynucleotide being hybridized to the first polynucleotide probe at the first nucleotide sequence. The structure includes as well a second polynucleotide probe, preferably having a label, which is complementary to a second nucleotide sequence present on the analyte polynucleotide from the cell, organism, or infectious agent which is hybridized to the first polynucleotide probe, the second polynucleotide probe being hybridized to the analyte polynucleotide at the second nucleotide sequence.

The label of the above method is preferably selected from the group consisting of a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, and a fluorescent material. Additionally, the solid support-polynucleotide structure can even more advantageously have a second polynucleotide probe that is common to polynucleotides contained in a plurality of organisms, infectious agents, or biological components of a cell or organism. In still another preferred embodiment of this method, the first and second polynucleotide probes are determined through the use of a computer system for designing oligonucleotide probes for use with a gene sequence data source. This computer system can comprise an input means for retrieving the gene sequence data, a processor, and instructions directing the processor to determine the first oligonucleotide probe. Yet still more preferably, the solid support-polynucleotide structure of the above method has a polynucleotide selected from the group consisting of mRNA, rRNA, and genomic DNA.

Another embodiment of the present invention is a kit for identifying the presence of an organism, infectious agent, or biological component of a cell or organism in a biological sample which contains: a specific polynucleotide probe, the specific polynucleotide probe being complementary to or homologous to a first nucleotide sequence in an analyte polynucleotide specific to a particular organism, infectious agent, or biological component to be detected; and a common polynucleotide probe complementary to or homologous to a second nucleotide sequence in the analyte polynucleotide of the organism, infectious agent, or biological component, the common polynucleotide probe being complementary to polynucleotides contained in a plurality of organisms, infectious agents, or biological components.

In addition, the above kit can advantageously have a solid support to which a polynucleotide can be immobilized. Even more preferably, the kit has a specific polynucleotide probe immobilized to the solid support. Still more advantageously, the above kit has a solid support with a plurality of specific polynucleotide probes immobilized thereto, each of the probes specific to a different organism, infectious agent, or biological component. Most preferably, the solid support of the kit has a plurality of wells, each of the specific polynucleotide probes being immobilized to a different well, and a buffer appropriate for the hybridization of the probes and polynucleotides, with the polynucleotides being selected from the group consisting of mRNA, rRNA, and genomic DNA.

Similarly, the second polynucleotide probe of the above kit can advantageously bear a label, with the label being selected from the group consisting of a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, and a fluorescent material. Further, the above kit can have a specific polynucleotide probe comprising a first specific primer which is complementary or homologous to a sequence specific to a particular organism, infectious agent, or biological component, the kit additionally comprising a second specific polynucleotide primer which is complementary or homologous to a different sequence specific to the organism, infectious agent, or biological component. When the above kit is designed to be used with PCR, it should have one or more of the following: dNTPs, a reverse transcriptase, a polymerase, and a buffer appropriate for the addition of dNTPs to a primer using a reverse transcriptase or polymerase. Additionally, the kit can include a DNA polymerase that has significant polymerase activity at temperatures above 50 °C.

In yet another embodiment, the present method comprises a means of detecting the presence of one or more organisms, infectious agents, or biological components in a biological sample containing polynucleotides, wherein at least one of the polynucleotides is indicative of the presence of the one or more organisms, infectious agents or biological components and is present in minute quantities. This method makes use of PCR and employs the steps of:

(a) obtaining a biological sample containing polynucleotides;

(b) contacting the sample with a first polynucleotide primer, the first primer having a nucleotide sequence complementary to a nucleotide sequence common to a plurality of organisms, infectious agents, or biological components;

(c) hybridizing the first primer to an analyte polynucleotide present in the sample that is complementary to the first primer, if such an analyte polynucleotide is present;

(d) extending the first primer, thereby producing a double-stranded polynucleotide including a complementary nucleotide strand comprising the first primer and having a nucleotide sequence complementary to the analyte polynucleotide;

(e) contacting the sample with a second polynucleotide primer, the second primer being complementary to a sequence contained in the complementary nucleotide strand;

(f) hybridizing the second primer to the complementary nucleotide strand;

(g) extending the second primer to form a nucleotide strand homologous to the analyte polynucleotide;

(h) contacting the sample with a third and a fourth polynucleotide primer, the third and fourth primers having sequences complementary to the homologous nucleotide strand and the complementary nucleotide strand, respectively, wherein the third primer has a nucleotide sequence complementary to a sequence common to a plurality of organisms, infectious agents, or biological components whose presence is to be determined and wherein the sequence of the third primer is different from that of the first primer;

(i) hybridizing the third and fourth primers to the complementary nucleotide strand and the homologous nucleotide strand;

(j) extending the third and fourth primers, thereby producing double-stranded polynucleotides; and

(k) determining the presence of the one or more organisms, infectious agents, or biological components in the sample by detecting the extension of at least one of the first, second, third, or fourth primers.

In this method, the second primer can have a nucleotide sequence that is common to a plurality of the organisms, infectious agents, or biological components whose presence is being determined. This method is preferably practiced such that the extending and hybridizing steps are repeated a plurality of times. In this method, the extension step can be accomplished with a reverse transcriptase when the primer is bound to RNA, while this step is accomplished with a DNA polymerase when the bound polynucleotide is DNA. If a DNA polymerase is used, it preferably has significant polymerase activity at temperatures above 50 °C. In the present method the nucleotide sequences of the first and second primers can be determined by a computer-assisted method, and preferably by a computer-assisted method which determines the sequences of the first and second nucleotide probes using an H-Site model.

Yet another method of detecting the presence of one or more organisms, infectious agents, or biological components in a biological sample containing polynucleotides, wherein at least one of the polynucleotides is indicative of the presence of the one or more organisms, infectious agents or biological components and is present in minute quantities, comprises the steps of: obtaining a biological sample containing polynucleotides; contacting the sample with a first polynucleotide primer, the first primer having a nucleotide sequence complementary to a nucleotide sequence common to a plurality of organisms, infectious agents, or biological components; hybridizing the first primer to an analyte polynucleotide present in the sample that is complementary to the first primer, if such an analyte polynucleotide is present; extending the first primer, thereby producing a double-stranded polynucleotide including a complementary nucleotide strand comprising the first primer and having a nucleotide sequence complementary to the analyte polynucleotide; contacting the sample with a second polynucleotide primer, the second primer being complementary to a sequence contained in the complementary nucleotide strand; hybridizing the second primer to the complementary nucleotide strand; extending the second primer to form a nucleotide strand homologous to the analyte polynucleotide; contacting the sample with a third polynucleotide primer, the third primer having a sequence complementary to the homologous nucleotide strand, wherein the third primer has a nucleotide sequence complementary to a sequence that is specific to a particular organism, infectious agent, or biological component whose presence is to be determined; hybridizing the third primer to the homologous nucleotide strand; extending the third primer, thereby producing a double-stranded polynucleotide; and determining the presence of the particular organism, infectious agent, or biological component in the sample by detecting the extension of the third primer.

This method is preferably practiced such that the extending and hybridizing steps are repeated a plurality of times. The extension step in particular can be accomplished with a reverse transcriptase when the primer is bound to RNA, while a DNA polymerase is used when the bound polynucleotide is DNA. Such a DNA polymerase preferably has significant polymerase activity at temperatures above 50 °C. In the present method the nucleotide sequences of the first and second primers can be determined by a computer-assisted method, and preferably by a computer-assisted method which determines the sequences of the first and second nucleotide probes using an H-Site model. In this method, the second primer can have a nucleotide sequence that is common to a plurality of the organisms, infectious agents, or biological components whose presence is being determined. The third primer also preferably includes a label, so that the detecting step comprises detecting the extension of this labeled primer. The label used can be selected from the group consisting of a radionuclide, an enzyme, an enzyme substrate, a specific binding moiety, a binding partner for a specific binding moiety, biotin, avidin, a nucleic acid stain, and a fluorescent material.

Yet another method of detecting the presence of a particular organism, infectious agent, or biological component in a biological sample containing polynucleotides, wherein at least one of the polynucleotides is indicative of the presence of the organism, infectious agent, or biological component and is present in minute quantities, comprises the steps of:

(a) obtaining a biological sample containing polynucleotides;

(b) contacting the sample with a first polynucleotide primer, the first primer having a nucleotide sequence complementary to a nucleotide sequence that is specific to the particular organism, infectious agent, or biological component, wherein the nucleotide sequence of the first primer is determined by means of a computer-assisted method;

(c) hybridizing the first primer to a sample polynucleotide present in the sample that is complementary to the first primer, if such a sample polynucleotide is present;

(d) extending the first primer, thereby producing a double-stranded polynucleotide including a complementary nucleotide strand comprising the first primer and having a nucleotide sequence complementary to the sample polynucleotide;

(e) contacting the sample with a second polynucleotide primer, the second primer being complementary to a sequence contained in the complementary nucleotide strand;

(f) hybridizing the second primer to the complementary nucleotide strand;

(g) extending the second primer to form a nucleotide strand homologous to the sample polynucleotide; and

(h) determining the presence of the particular organism, infectious agent, or biological component in the sample by detecting the extension of at least one of the first or second primers.
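The primer logic of steps (b) through (g) can be sketched in silico. What follows is only a minimal illustration under stated assumptions, not the disclosed laboratory procedure: the flanking sample sequence and the second primer are hypothetical, the function names are introduced here for illustration, and only exact matches are modeled (no mismatches, temperatures, or enzymes). The specific site used is the sequence given in this document as SEQ ID NO:81.

```python
# Watson-Crick pairing for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def extend_primer(template: str, primer: str):
    """Anneal `primer` to its exact complementary site on `template` and
    extend it to the 5' end of the template.

    Returns the newly made strand written 5'->3', or None if the template
    contains no site complementary to the primer.
    """
    site = reverse_complement(primer)
    pos = template.find(site)
    if pos == -1:
        return None
    # The new strand is complementary to everything from the 5' end of the
    # template through the annealing site, so it begins with the primer.
    return reverse_complement(template[: pos + len(site)])


# Hypothetical sample strand: made-up flanks around SEQ ID NO:81.
SPECIFIC_SITE = "GCGCAACTGATCCTTCCC"
sample = "ATGGCTAGCTTGACCGTA" + SPECIFIC_SITE + "TTAG"

# Steps (b)-(d): the first primer is complementary to the specific site.
first_primer = reverse_complement(SPECIFIC_SITE)
complementary_strand = extend_primer(sample, first_primer)

# Steps (e)-(g): the second primer (hypothetical) is complementary to the
# new strand, which means it matches the sample upstream of the site.
second_primer = "GGCTAGCTTG"
homologous_strand = extend_primer(complementary_strand, second_primer)

print(complementary_strand)
print(homologous_strand)
```

In this toy model the strand produced in step (g) is the stretch of the sample bounded by the two primer sites, which is the region a detection step would look for.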

We have also discovered a number of useful probes and primers for use in the foregoing methods, including primers for detecting jun oncogenes, G proteins, β receptors, and Substance P receptors, such as those identified with a sequence identifier herein. Among the sequences we have discovered for use as probes and primers in the present method are SEQ ID NO:473, SEQ ID NO:600, SEQ ID NO:615, SEQ ID NO:622, SEQ ID NO:622, SEQ ID NO:730, SEQ ID NO:747, SEQ ID NO:748, SEQ ID NO:488, SEQ ID NO:513, SEQ ID NO:630, SEQ ID NO:639, SEQ ID NO:728, SEQ ID NO:729, SEQ ID NO:733, SEQ ID NO:734, SEQ ID NO:739, SEQ ID NO:740, SEQ ID NO:741, SEQ ID NO:742, SEQ ID NO:743, SEQ ID NO:744, SEQ ID NO:759, SEQ ID NO:758, SEQ ID NO:751, SEQ ID NO:553, SEQ ID NO:670, SEQ ID NO:752, SEQ ID NO:753, SEQ ID NO:565, SEQ ID NO:678, SEQ ID NO:686, SEQ ID NO:754, SEQ ID NO:577, SEQ ID NO:697, SEQ ID NO:704, SEQ ID NO:755, SEQ ID NO:756, SEQ ID NO:732, SEQ ID NO:642, SEQ ID NO:652, SEQ ID NO:757, SEQ ID NO:593, SEQ ID NO:710, SEQ ID NO:721, SEQ ID NO:528, SEQ ID NO:731, SEQ ID NO:749, and SEQ ID NO:750.

Brief Description of the Figures

Figure 1 is a schematic representation of one embodiment of the present inventive method.

Figure 1a is a picture of a gel showing the effect of various RNase inhibitors on mRNA preparations containing SDS and EDTA.

Figure 1b is a graph showing the relationship between YOYO-1 fluorescence and amount of immobilized oligonucleotide.

Figure 1c is a graph showing the time course of YOYO-1 analysis.

Figure 1d is a graph showing the dose response of YOYO-1 concentration.

Figure 1e is a bar graph showing the reproducibility of YOYO-1 staining based on quantitation of immobilized oligonucleotides.

Figure 2 is a schematic representation of the common and specific sequences identified in various species of fungi.

Figure 3 is a schematic representation of an embodiment of the method of the present invention in which PCR is used.

Figure 4 is a picture of a gel showing the outcome of an experiment in which jun-D, c-jun, and jun-D jun oncogene subtypes were amplified using SEQ ID NO:728 and SEQ ID NO:729.

Figure 5 is a picture of a gel showing the outcome of an experiment in which cloned rat Gi-1, Gi-2, Gi-3, Gs, and Go G protein subtypes were amplified using SEQ ID NO:528 and SEQ ID NO:731.

Figure 6 is a schematic representation of an example of a microtiter plate used in the methods of the present invention.

Figure 7 is a picture of the gel referred to in Figure 7 which also provides Southern blots using each of the five G protein sequences as a probe.

Figure 7A-7C is a graphic representation of the results of an experiment which shows that SEQ ID NOS:470, 488 and 730 can be used to detect specific subtypes of jun oncogenes.

Figure 8 shows a gel of samples containing varying amounts of each of five different G protein oligonucleotides (as indicated by the number of "+" symbols) amplified with the G2 and G4 PCR primers and also provides Southern blots using each of the five G protein sequences as a probe.

Figure 9 shows a gel of λZAP cDNA libraries from rat pituitary (P), kidney (K) and intestinal (I) amplified with each of five different G protein primers, as indicated.

Figure 10 shows a gel of 500 bp DNA from cDNAs of human 1 9 and Jurkat cells amplified with G2 and G4 PCR primers.

Figure 11 is a simplified block diagram of a computer system illustrating the overall design of this invention;

Figures 12A-12C show display screen representations of the main oligoprobe design station dialog windows of this invention;

Figures 13A and 13B are flow charts of the overall invention illustrating the program and the invention's sequence and structure;

Figure 14 is a display screen representation of the Mitsuhashi probe selection diagram;

Figure 15 is a display screen representation of the probeinfo and matchinfo window;

Figure 16 is a display screen representation of the probesedit window;

Figures 16A and 16B are printouts of the probesedit output file;

Figure 17 is a flow chart of the overall k_diff program of the Mismatch Model of this invention, including its sequence and structure;

Figures 18A and 18B are flow charts of the k_diff module of this invention;

Figure 19 is a flow chart of the hashing module of this invention;

Figure 20 is a flow chart of the Iran module of this invention;

Figure 21 is a flow chart of the let_dig module of this invention;

Figure 22 is a flow chart of the update module of this invention;

Figures 23A and 23B are flow charts of the assembly module of this invention;

Figures 24A and 24B are flow charts of the seqload module of this invention;

Figures 25A and 25B are flow charts of the rea l module of this invention;

Figure 26 is a flow chart of the dig el module of this invention;

Figures 27A and 27B are flow charts of the q_colour module of this invention;

Figure 28 is a flow chart of the hit_ext module of this invention;

Figure 29 is a flow chart of the colour module of this invention;

Figure 30 is the first page of a printout of a sample file containing the output of the Mismatch Model program of this invention;

Figure 31 is a flow chart of the H-Site Model, stage I, covering the creation of a preprocessed preparation file of this invention;

Figure 32 is a flow chart of the H-Site Model, stage II, covering the preparation of the target sequence(s);

Figures 33A-33D are flow charts of the H-Site Model, stage III, covering the calculation of MPSD data;

Figure 34A is the first page of a printout of a sample file containing output of Mismatch Model program;

Figure 34B is the first page of a printout of a sample file containing output of H-Site Model program;

Figure 35 is a flow chart of the processing used to create the Mitsuhashi probe selection diagram (MPSD);

Figure 36 is a flow chart of processing used to create the matchinfo window;

Figure 37 is a printout of a sample target species file;

Figures 38A-38C are printouts of a sample preparation file.

Detailed Description of the Invention

The present method of detecting organisms, infectious agents, or other biological components in a biological sample which contains polynucleotides represents an advance over prior art diagnostic tests in its speed, ease of use, and accuracy. For example, an accurate identification of an organism in a biological sample using prior art methods required enough time to culture the organism, which could take from hours to days. Such methods also required that a technician trained to differentiate between different disease-causing organisms examine cultures of a biological sample. The present method, on the other hand, can be accomplished by someone without such specialized training. The present method can also be performed so as to identify either a specific organism or a group of organisms, such as a genus of fungus or bacteria, without the possibility of error due to the mis-identification of an organism by a technician examining a culture of a biological sample. Infectious agents such as viruses can also be detected.

The present method also includes an improved way of detecting other biological components which comprise polynucleotides or whose presence is indicated by a polynucleotide. A biological sample can thus be probed with oligonucleotides specific to a biological component, such as a jun oncogene or G protein mRNA. In the present application, "biological component" means a component of a cell or organism which comprises a polynucleotide, such as mRNA, rRNA, and genomic DNA. A component which does not contain a polynucleotide but whose presence is indicated by presence of a polynucleotide in the cell or organism is also included where the detection of a polynucleotide is an indication of the presence of the component. Other probes which can be specific to the biological component or which can be common to the biological component and other biological components can then be used to detect the binding of the specific probe to the biological component. Alternatively, a biological sample can be probed with oligonucleotides capable of binding to a number of polynucleotides with related gene sequences in order to increase the sensitivity of the assay. These and other aspects of the present invention will be described in greater detail below.

I. Common and Specific Sequences

We have discovered a number of specific sequences that are unique to the rRNA of a single fungal species or genus which can be used in the present method. These sequences are SEQ ID NO:81, SEQ ID NO:104, SEQ ID NO:131 through SEQ ID NO:133, SEQ ID NO:154 through SEQ ID NO:156, SEQ ID NO:176, SEQ ID NO:199, SEQ ID NO:267, SEQ ID NO:290, SEQ ID NO:312, SEQ ID NO:335, SEQ ID NO:364 through SEQ ID NO:376, and SEQ ID NO:391 through SEQ ID NO:392. Reference can be made to Tables V through XVI for a comparison of these specific sequences to corresponding sequences in other species or genera.

SEQ ID NO:227 and SEQ ID NO:250 are sequences that are specific to the rRNA of certain strains of C. albicans. However, reported sequences in other strains of the same species have slight changes in these sequences in their rRNA, as seen in SEQ ID NO:226 and SEQ ID NO:249, respectively. Accordingly, these particular sequences are less preferred for use as specific sequences within the context of the present invention. However, these sequences can be useful for identifying the particular strain of C. albicans in a sample.

We have also discovered a number of common sequences that are common to the rRNA of several fungal species and genera. These sequences are SEQ ID NO:1 through SEQ ID NO:80. Reference can be made to Tables 1 through 4 to see that these sequences are common to all species shown.

In a further discovery, we have discovered sequences common to a number of rat and human G protein sequences, as well as sequences that are specific to particular G protein subtypes. Oligonucleotides containing such sequences can be useful as probes or primers for identifying the presence of such sequences in a sample in which such sequences are present in only small quantities.

As will be discussed infra in more detail, oligonucleotides containing the inventive sequences can be used as primers in the Polymerase Chain Reaction (PCR) to detect G protein sequences in a biological sample. Sequences have also been identified which are common to the different jun oncogenes which have been identified in humans and in mice. Such sequences can also be used in PCR primers to identify the presence of jun gene sequences in a biological sample. In addition, sequences specific to particular subtypes of jun oncogenes have also been discovered. Other sequences useful in the present method are described below. Advantageously, all of these probes, both common and specific, have been designed to have approximately the same melting temperature (Tm) when annealed to a complementary sequence.

Thus, various procedures requiring annealing of these sequences can all be performed under the same conditions. Those of skill in the art will recognize that longer or shorter sequences, with a correspondingly higher or lower Tm, respectively, can also be obtained upon reference to the full-length sequences available from GenBank. When the term "Tm" is used herein in connection with a single-stranded polynucleotide, this term refers to the melting temperature of that single-stranded polynucleotide when it is annealed to a complementary strand.

As is known to those of skill in the art, the Tm of a polynucleotide strand can be determined using the following formulas:

(a) $T_m = 69.3 + 0.41\,(G + C)\% - 650/L$

(where G is the number of guanine residues in the strand, C is the number of cytosine residues, (G + C)% is their combined percentage of the strand, and L is the total length, in bases, of the polynucleotide);

(b) $(T_m)_{\mu_2} - (T_m)_{\mu_1} = 18.5 \log_{10}(\mu_2/\mu_1)$

(where $\mu_1$ and $\mu_2$ are the ionic strengths of two solutions); and

(c) The Tm of duplex DNA decreases 1 °C with every increase of 1% in the number of mismatched base pairs.
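As a worked illustration of formulas (a) through (c), the short sketch below evaluates them for the 18-base sequence given in this document as SEQ ID NO:81. The function names are introduced here for illustration only, the logarithm in (b) is assumed to be base 10, and the ionic strengths in the example call are arbitrary values chosen for the arithmetic.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of residues in the strand that are G or C."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def tm_formula_a(seq: str) -> float:
    """Formula (a): Tm = 69.3 + 0.41 * (G + C)% - 650 / L, in degrees C."""
    return 69.3 + 0.41 * gc_percent(seq) - 650.0 / len(seq)


def tm_shift_formula_b(mu1: float, mu2: float) -> float:
    """Formula (b): (Tm at mu2) - (Tm at mu1) = 18.5 * log10(mu2 / mu1)."""
    return 18.5 * math.log10(mu2 / mu1)


def tm_after_mismatch(tm: float, mismatch_percent: float) -> float:
    """Formula (c): Tm falls 1 degree C per 1% of mismatched base pairs."""
    return tm - mismatch_percent


seq_81 = "GCGCAACTGATCCTTCCC"          # 18 bases, 11 of them G or C
tm = tm_formula_a(seq_81)
print(round(tm, 2))                                  # formula (a)
print(round(tm_shift_formula_b(0.2, 0.5), 2))        # formula (b), example ionic strengths
print(round(tm_after_mismatch(tm, 5.0), 2))          # formula (c), 5% mismatches
```

For this sequence formula (a) gives roughly 58 °C, since (G + C)% is 11/18, or about 61.1%, and 650/L is about 36.1.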

In a preferred embodiment of the present invention, a plurality of probes that have the same Tm are immobilized to one or more solid supports. When probes having the same Tm are used, such probes can be hybridized together under the same conditions because they require the same reaction temperatures. Preferably, the specific probes in this embodiment have a Tm between approximately 48°C and 60°C. Other probes that have the same Tm and are within this range can be determined by using the formulas above and by performing routine experimentation.
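A simple way to screen candidate probes against the preferred 48°C to 60°C window is sketched below. It is an illustration only: it uses formula (a) for Tm, the helper names are assumptions introduced here, and apart from SEQ ID NO:81 the candidate sequences are made-up strings chosen to fall outside the window.

```python
def tm_formula_a(seq: str) -> float:
    """Formula (a): Tm = 69.3 + 0.41 * (G + C)% - 650 / L, in degrees C."""
    seq = seq.upper()
    gc_pct = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    return 69.3 + 0.41 * gc_pct - 650.0 / len(seq)


def probes_in_window(candidates: dict, low: float = 48.0, high: float = 60.0) -> dict:
    """Return {name: Tm} for the candidates whose Tm lies within [low, high]."""
    selected = {}
    for name, seq in candidates.items():
        tm = tm_formula_a(seq)
        if low <= tm <= high:
            selected[name] = round(tm, 1)
    return selected


candidates = {
    "SEQ ID NO:81": "GCGCAACTGATCCTTCCC",      # from this document
    "toy AT-rich": "ATATATATATATATAT",          # hypothetical, Tm far too low
    "toy GC-rich": "GCGCGGCCGCGGCCGCGCGG",      # hypothetical, Tm far too high
}

print(probes_in_window(candidates))
```

Probes that pass the screen could then be compared with one another so that only those with closely matching Tm values are immobilized together.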

Unless otherwise specified, in the present application the term "specific sequence" denotes a sequence of nucleic acids which is present only in a specific organism or infectious agent, or which is present in a biological sample only as a result of the presence of a specific organism or infectious agent. A "specific sequence" can also be one present in a particular kind of biological component, such as the mRNA of a subtype of jun oncogene. Of course, the sequence complementary to a specific sequence can also be said to be specific. The term "complementary" is used herein to describe a polynucleotide sequence in which adenine is replaced by thymine or a nucleotide that reacts in an equivalent way to thymine such as uracil, and in which thymine (or uracil) is replaced by adenine or a nucleotide that reacts in a similar way to adenine. In such a complementary molecule the guanine residues would also be replaced by cytosines or equivalent nucleotide molecules, and the cytosine residues would be replaced by guanine or equivalent nucleotides.

In the present application, the term "homologous" is used to describe a polynucleotide having a sequence which contains the same nucleotides or equivalent nucleotides, in the same order, as another polynucleotide. For example, a second polynucleotide having the same sequence of nucleotides as a first polynucleotide but in which uracil residues have been substituted for the thymine residues is homologous to the first polynucleotide. Other equivalent nucleotide substitutions known to the art are also included.
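The definitions of "complementary" and "homologous" given above can be expressed as two small string operations. This is a minimal sketch with hypothetical function names; it treats uracil as the only equivalent of thymine and, as an assumed convention, writes the complementary strand in the 5' to 3' direction (that is, as the reverse complement).

```python
# Base pairing; uracil is treated as equivalent to thymine.
PAIR = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}


def complementary(seq: str, as_rna: bool = False) -> str:
    """Return the strand complementary to `seq`, written 5'->3'."""
    comp = "".join(PAIR[base] for base in reversed(seq.upper()))
    return comp.replace("T", "U") if as_rna else comp


def homologous(a: str, b: str) -> bool:
    """True if the two strands carry the same or equivalent nucleotides
    in the same order (T and U are treated as equivalent)."""
    def normalize(s: str) -> str:
        return s.upper().replace("U", "T")
    return normalize(a) == normalize(b)


seq_81 = "GCGCAACTGATCCTTCCC"
print(complementary(seq_81))                   # DNA probe complementary to the sequence
print(complementary(seq_81, as_rna=True))      # the same probe written as RNA
print(homologous("GCGCAACUGAUCCUUCCC", seq_81))  # RNA form of the same sequence
```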

Sequences useful in the methods of the present invention can be determined in any way known to the art. Preferably, such sequences are identified with a computer program, as will be detailed infra. Therefore, the sequences discussed herein are only examples of sequences which will work in the present invention and are not the only sequences which can be used.

II. Fungus Assay

One example of the method of the present invention involves the detection of a particular species of fungus in a biological sample. We have discovered that the presence of a particular species of fungus can be determined by probing the ribosomal RNA of a sample for sequences specific to the ribosomal RNA of a particular species of fungus. Probes which carry sequences complementary to sequences found in a group of fungi which include the specific fungus probed for can then be used to detect the specific species of fungus. Those of skill in the art will recognize that other organisms, infectious agents, and biological components present in a biological sample can also be detected using the present method. Those of skill in the art will similarly recognize that other kinds of nucleic acids present in a biological sample can also be probed, including mRNA and genomic DNA.

In the present example, it has been found that each one of a plurality of fungal species carries ribosomal RNA sequences specific to only one species of fungus. The presence of a particular species of fungus in a biological sample can thus be detected by probing that sample for a sequence of ribosomal RNA found only in the ribosomal RNA of that species of fungus. The specific ribosomal RNA sequences found in a number of species of fungus appear to occur in regions that pick up mutations at a relatively high rate. Thus, many species of fungi are likely to have different nucleotide sequences in those regions. Although ribosomal RNA is not expressed, such regions would be analogous to unexpressed regions of genomic DNA, which pick up mutations at a relatively faster rate than expressed regions.

It has been further discovered that a number of species of fungi share sequences of ribosomal RNA common to all of those species. Thus, the presence of any of those fungal species in a biological sample can be determined by probing the sample for polynucleotides having such common sequences. If a fungus contains both specific and common sequences, probing for such common sequences can be used to detect the presence of a variety of that species of fungus which contains one or more mutations in its specific ribosomal RNA sequences. The existence of common sequences can also be exploited by annealing labeled probes to those sequences in order to facilitate the detection of polynucleotides which carry such common sequences.

In one group of pathogenic fungal species, two separate ribosomal RNA sequences have been identified in each of the species which are specific to the individual species carrying such sequences. This group comprises the following fungal species: Pneumocystis carinii, Aspergillus fumigatus, Cryptococcus neoformans, Coccidioides immitis, Blastomyces dermatitidis, and a number of species in the Candida group, including Candida albicans and Candida tropicalis. The GenBank accession numbers of the ribosomal RNA of the fungal species of this group are shown in Table 1 below:

Table 1

Fungal Species               Accession No.
Aspergillus fumigatus        M55626
Aspergillus fumigatus        M60300
Aspergillus fumigatus        M60301
Blastomyces dermatitidis     M55624
Candida albicans             M60302
Candida albicans             X53497
Candida guilliermondii       M60304
Candida glabrata             X51831
Candida kefyr                M60303
Candida krusei               M55528
Candida krusei               M60305
Candida lusitaniae           M55526
Candida lusitaniae           M60306
Candida parapsilosis         M60307
Candida tropicalis           M55527
Candida tropicalis           M60308
Candida viswanathii          M60309
Pneumocystis carinii         X12708
Coccidioides immitis         M55627
Cryptococcus neoformans      M55625

In the embodiment of the present invention comprising a fungal assay, the term "specific sequence" is used to indicate a sequence of ribosomal RNA specific to one species of fungus. Such a sequence is specific to that fungus species and thus is not found in the ribosomal RNA of any other fungal species. The sequence is also different from other RNA sequences found in the cells being tested. A probe which has a sequence complementary to one of these specific sequences will therefore anneal only to the ribosomal RNA of a particular species of fungus, or to a polynucleotide homologous to such ribosomal RNA.

In the group of pathogenic fungi containing specific sequences referred to above, four common sequences of ribosomal RNA have also been identified. "Common sequences" are those common to a group of organisms, such as those common to a particular genus or family of organisms. The term "common" can also denote sequences shared by a group of infectious agents or biological components, such as sequences shared by different subtypes of jun oncogene. As shown in Figure 1, two such common sequences for the group of fungi listed above in Table 1 occur 5' of the specific sequences identified in such fungi, while two other common sequences are located 3' of these specific sequences.
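The distinction between common and specific sequences can be illustrated by comparing short subsequences (k-mers) across a set of aligned-free toy sequences. The sketch below is not the computer-assisted method of this invention; the three sequences are invented strings rather than real ribosomal RNA, and the function names are hypothetical. Each toy sequence shares the same 5' and 3' flanks and differs only in a short central region.

```python
def kmers(seq: str, k: int) -> set:
    """All distinct subsequences of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def common_and_specific(seqs: dict, k: int):
    """Return (common, specific).

    common   : k-mers present in every sequence
    specific : {name: k-mers present in that sequence and in no other}
    """
    sets = {name: kmers(seq, k) for name, seq in seqs.items()}
    common = set.intersection(*sets.values())
    specific = {}
    for name, own in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        specific[name] = own - others
    return common, specific


# Hypothetical sequences: shared flanks, species-specific centre.
toy = {
    "species_a": "GGATCCAAGT" + "AAAAC" + "CTTGACGGTA",
    "species_b": "GGATCCAAGT" + "TTTTG" + "CTTGACGGTA",
    "species_c": "GGATCCAAGT" + "CGCGC" + "CTTGACGGTA",
}

common, specific = common_and_specific(toy, k=5)
print(sorted(common))
print({name: sorted(s) for name, s in specific.items()})
```

In this toy layout the common k-mers fall in the flanks on either side of the variable centre, mirroring the arrangement described above in which common sequences lie 5' and 3' of the specific sequences.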

Thus, in one embodiment of the present invention, primers complementary to the common sequences located 3' of the specific sequences in the ribosomal RNA of a fungal species of this group are used to create a polynucleotide, preferably a strand of cDNA, complementary to the portion of the ribosomal RNA of the species that contains the specific sequences. A probe homologous to one of the common sequences located 5' of the specific sequences can also be annealed to a strand complementary to the ribosomal RNA of a species of fungus and then extended in order to create a polynucleotide strand homologous to the strand of fungal ribosomal RNA that contains at least one of the sequences specific to a particular fungus. Further aspects of the method of detecting fungi of the present invention are detailed in the examples below.

III. Obtaining a Biological Sample

In order to obtain a biological sample containing an organism, an infectious agent, or a biological component of a cell or organism to be detected according to the method of the present invention, an organism or tissue suspected of harboring such an infectious agent, organism, or biological component can be identified. The identification can be made in any way known to the art. Preferably, the organism suspected of carrying an infectious agent or other organism is a human, and the identification is made by a physician who observes symptoms indicative of the presence of such an organism or agent in such a human. For example, a patient diagnosed as having AIDS who comes down with pneumonia and who does not respond to anti-bacterial agents is identified by a physician as possibly harboring the fungus Pneumocystis carinii.

Alternatively, a biological sample taken from a host with no overt signs of having a medical condition or harboring an organism or agent can be tested. For example, a food sample or tissue from an AIDS patient without signs of a fungal growth can be tested for the presence of a fungus. In this embodiment, any biological sample can be tested, even though overt signs of the presence of a fungus are lacking in that sample. Appropriate action may thereby be taken if a fungus is in fact found in such a biological sample.

The biological sample to be tested can be obtained by any means known to the art. For example, if an AIDS patient is suspected of suffering from interstitial plasma pneumonia caused by the fungus Pneumocystis carinii, a sputum sample can be taken from the lungs of that patient. The sputum can be obtained by having the patient cough up phlegm from the lungs and deposit it into a cup. Alternatively, a sputum sample can be obtained by scraping the bronchial passage with a sterile swab, or by any other means known to the art. Any other biological sample which could possibly carry a fungus is likewise obtained in an appropriate fashion.

IV. Preparing the Biological Sample

The biological sample is next prepared so that the RNA and/or DNA present in the biological sample can be probed. When a fungus is being probed for, ribosomal RNA of any fungi present in the sample can be probed in accordance with the methods of the present invention. In order to probe the ribosomal RNA of any fungal cells present, these cells should first be lysed. Lysis of fungal cells or of other cells containing RNA or DNA of interest can be accomplished by any of a number of methods known to the art, including those set out in Molecular Cloning.

In one embodiment, the cells are lysed before they come into contact with the solid support. This embodiment might be used, for example, when the solid support is one which is not designed to hold a sample of lysed cells, such as a nitrocellulose filter. In this embodiment, the cells are contacted with the solid support after they have been lysed.

A variety of techniques can be used for cell lysis. When the ribosomal RNA of fungi is being probed, for example, techniques that separate the ribosomal RNA from the ribosomal proteins are preferred. Example 1 is provided to show one technique believed to be useful in obtaining ribosomal RNA. However, techniques for obtaining ribosomal RNA are well known. Techniques for obtaining DNA and other kinds of RNA are also well known to those of skill in the art. Thus, the technique of Example 1 is not necessarily a preferred method of obtaining ribosomal RNA-containing samples. Example 1, like all of the examples provided herein, is provided merely to illustrate certain aspects of the present invention. As such, the examples are not intended to limit the invention in any way.

EXAMPLE 1 Lysing Cells in a Biological Sample

Cells present in a biological sample can be lysed by treatment with a solution of 10 mM ethylenediaminetetraacetic acid (EDTA) (pH 8.0), 0.2 M NaCl, 0.5% sodium dodecyl sulfate (SDS), 500 units/ml of RNase inhibitor, 10 mM of vanadyl ribonucleoside complex, and 200 μg/ml of Proteinase K (hereafter called Lysis Buffer). After lysis of the cells, the NaCl concentration of the resulting cell lysate in solution is adjusted to 0.5 M.
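The adjustment of the NaCl concentration to 0.5 M is a simple mixing calculation. The sketch below is only an arithmetic illustration under stated assumptions: the lysate is taken to be at the 0.2 M NaCl of the Lysis Buffer (ignoring any dilution by the sample itself), the adjustment is made with a concentrated NaCl stock whose 5 M concentration is a hypothetical choice, and volumes are treated as additive. The function name is introduced here for illustration.

```python
def stock_volume_needed(lysate_volume: float, c_start: float,
                        c_target: float, c_stock: float) -> float:
    """Volume of stock to add so that the mixture reaches c_target.

    Derived from the mass balance
        c_start * V + c_stock * v = c_target * (V + v)
    which gives v = V * (c_target - c_start) / (c_stock - c_target).
    Concentrations share one unit (e.g. M); volumes share one unit (e.g. ml).
    """
    if not c_start <= c_target < c_stock:
        raise ValueError("need c_start <= c_target < c_stock")
    return lysate_volume * (c_target - c_start) / (c_stock - c_target)


# 1 ml of lysate at 0.2 M NaCl, adjusted to 0.5 M with an assumed 5 M stock.
v = stock_volume_needed(1.0, c_start=0.2, c_target=0.5, c_stock=5.0)
print(round(v * 1000, 1), "microlitres of stock per ml of lysate")
```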

V. Solid Support

In a preferred embodiment, the solid support is capable of containing a biological sample and is resistant to the reagents used to lyse cells in the biological sample. The sample can thus be lysed in the solid support. However, the use of such a solid support is not necessary if the sample can be obtained without lysis or is lysed in a separate container. An example of a solid support that is resistant to a large number of treatments is a microtiter well or a plate made from a resistant plastic material.

The solid support can also be any of a variety of other solid supports known to the art, such as a membrane filter, a bead, or any other solid, insoluble support to which polynucleotides can be attached. The solid support is preferably made of a material which can immobilize a polynucleotide probe. Immobilization can be through covalent bonds or through any of a variety of interactions that are known to those having skill in the art. Plastic materials containing carboxyl or amino groups on their surfaces, such as polystyrene, are preferred for the solid support of the present invention because polynucleotide probes can be immobilized on their surfaces, because they are inexpensive and easy to make, and because they are resistant to the reagents used to lyse the cells of the biological samples used in the present invention. For example, the Sumilon microtiter plate MS-3796F made by Sumitomo Bakelite, which has a carboxyl group on its surface, can be used in such a preferred embodiment. A plastic plate having an amino group on its surface, such as the Sumilon microtiter plate MS-3696, can also be used.

VI. Contacting the Sample and First Polynucleotide Probe

After the cells in the biological sample have been lysed, the RNA and/or DNA contained in such cells is substantially released into solution or otherwise made available for probing. If the biological sample was not lysed in the solid support, the cell lysate is next brought into contact with the solid support. Immobilized to the solid support is a first polynucleotide probe which preferably contains a sequence complementary to a specific sequence in the RNA and/or DNA of a particular organism, infectious agent, or biological component of a cell or organism in the biological sample. In an alternate embodiment, the first polynucleotide probe can also contain a sequence common to a plurality of organisms, infectious agents, or biological components when any of a group of such organisms, infectious agents, or biological components is sought to be identified. When the RNA and/or DNA present in the cell lysate contacts the solid support, therefore, it also comes into contact with the first polynucleotide probe.

Preferably, the first polynucleotide probe is an oligodeoxyribonucleotide (DNA) rather than an oligoribonucleotide (RNA), since DNA is more stable than RNA. The number of nucleotides in the polynucleotide probe is not restricted. However, if an oligodeoxynucleotide is used as the specific polynucleotide probe, a preferred length for the oligodeoxynucleotide is from 15 to 100 nucleotides. Lengths longer than 100 nucleotides are usable within the scope of the present invention. However, lengths of 100 nucleotides or less are preferable because many automated polynucleotide synthesizers have a limit of 100 nucleotides. Longer sequences can be obtained by ligating two sequences of less than 100 nucleotides.

In one embodiment, the first polynucleotide probe is an oligonucleotide complementary to one of the following sequences: SEQ ID NO:81, SEQ ID NO:104, SEQ ID NO:131 through SEQ ID NO:133, SEQ ID NO:154 through SEQ ID NO:156, SEQ ID NO:176, SEQ ID NO:199, SEQ ID NO:267, SEQ ID NO:290, SEQ ID NO:312, SEQ ID NO:335, SEQ ID NO:364 through SEQ ID NO:376, or SEQ ID NO:391 through SEQ ID NO:392. These sequences are specific to the ribosomal RNA of various pathogenic species of fungi, as listed in the sequence listing and Tables V through XII.

Example 2 is provided to show one particular probe that is useful for determining the presence of Pneumocystis carinii in a biological sample.

EXAMPLE 2 Preparing the First Polynucleotide Probe

A first polynucleotide probe that is specific to a sequence of ribosomal RNA in Pneumocystis carinii is prepared. The probe is produced with a DNA synthesizer such as a DNA synthesizer made by Applied Biosystems of Menlo Park, CA. The probe is complementary to a polynucleotide having the following sequence where A, T, G, and C stand for adenine, thymine, guanine and cytosine, respectively: 5'-GCGCAACTGATCCTTCCC-3' (SEQ ID NO:81).

VII. Immobilizing the First Polynucleotide Probe

Various methods of immobilizing polynucleotides to a solid support are known to the art, including covalent binding, ionic binding, and the physical absorbance method. In certain embodiments of the present invention, the polynucleotides, such as the first polynucleotide probe, are immobilized to microtiter wells which exhibit functional groups such as carboxyl residues, amine residues, or hydroxyl residues on the surfaces thereof. Thus, in one procedure for the immobilization of the first polynucleotide probe to a solid support, the solid support exhibits a functional group and the 5'-terminal end of the polynucleotide is covalently linked to the functional group. Any of a variety of methods for the covalent binding of polynucleotides to these functional groups can be used. Examples of preferred, well-known methods include the maleimide method and the carbodiimide method.

The maleimide method involves a reaction between a substance containing a maleimide group and another material containing a sulfhydryl residue (SH). The 5' end of the specific polynucleotide probe is immobilized on a solid support in this method by reacting the 5' end of the polynucleotide with a maleimide compound. A suitable maleimide compound is sulfosuccinimidyl-4-(N-maleimidomethyl) cyclohexane-1-carboxylate (sulfo-SMCC).

The SH residue is provided on the support by a reaction between a support having an amino group and succinimidyl-S-acetylthioacetate (SATA), followed by deacetylation using hydroxylamine (NH2OH). Sulfo-SMCC and SATA are readily available from a variety of commercial sources, including the Pierce Company. The resulting SH group on the support is reacted with the maleimide group on the 5' end of the first polynucleotide probe when these groups are brought into contact under the appropriate conditions, thereby immobilizing the first polynucleotide probe to a solid support.

One problem we have experienced in the use of the maleimide method is that the SH group on the support can react not only with an amino group at the 5' end of the first polynucleotide probe, but also with primary amino groups on the purine bases, adenine and guanine. In order to assure that the polynucleotides are immobilized at their 5' ends, so that the sequences complementary to the ribosomal RNA sequences specific to a particular species of fungus are available for hybridization, the amino groups on the purine bases can be protected by pairing the specific polynucleotide probe to a complementary polynucleotide prior to immobilization. After immobilization, the complementary polynucleotide can be removed through denaturation, such as through heating, leaving the single-stranded probe immobilized to the solid support.

Another method of immobilizing a polynucleotide to a solid support is the carbodiimide method. This method involves a reaction between an amino group and a material containing a carboxyl residue using a carbodiimide compound. An example of a carbodiimide compound is 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide hydrochloride (hereafter called EDC). This reaction can be enhanced with N-hydroxysulfosuccinimide (hereafter called Sulfo-NHS). Both EDC and Sulfo-NHS are available from well-known commercial sources, including the Pierce Company.

In the practice of a preferred carbodiimide method for attaching polynucleotides to a solid support, a support having a carboxyl residue attached is used. Before contacting EDC with the support, the EDC is activated by reacting it with Sulfo-NHS. This activated EDC is then reacted with the solid support containing surface-bound carboxyl residues. The support, after being so treated, can be reacted with strands of the first polynucleotide probe, which have an amino group at their 5'-terminal ends, thereby immobilizing the specific polynucleotide probe to the support.

In order to assure that the first polynucleotide probe is immobilized at its 5' end, the primary amino groups on the probe (the adenyl, guanyl and cytosyl groups) can be protected by hybridizing the nucleotide to a complementary polynucleotide prior to immobilization. After immobilization, the complementary polynucleotide can then be removed through denaturation, such as through heating, leaving the single-stranded probe immobilized to the solid support. In order to further prevent the non-specific binding of activated amino or carboxyl residues on solid supports, the solid supports to which the specific polynucleotide probes are immobilized can be treated with a primary amine compound, preferably glycine.

Example 3 is provided as an indication to those of skill in the art of but a single method of immobilizing a probe to a solid support. Those of skill in the art will recognize that any of a variety of methods of so immobilizing the probe can be used, including those described above.

EXAMPLE 3 Immobilizing the First Polynucleotide Probe onto a Solid Support with the Carbodiimide Method

Both EDC and sulfo-NHS (Pierce, IL) are dissolved in DEPC-treated water at concentrations of 20 mM and 10 mM, respectively. EDC/Sulfo-NHS solution is then prepared by mixing equal volumes of both EDC and sulfo-NHS. The specific nucleotide probe is dissolved in DEPC-treated water at a concentration of 1 μg/μl and then mixed with the EDC/Sulfo-NHS solution in the ratio 1:25 (vol:vol). 50 μl of this probe solution is added to each well of a microtiter plate (MS-3796F, Sumitomo Bakelite, JAPAN), which is known to have carboxyl groups on the surface of the plate. After incubation at room temperature overnight, the reaction solution is removed by aspiration.

VIII. Hybridizing the First Polynucleotide Probe

In a preferred embodiment, the cell lysate or other sample containing polynucleotides is next hybridized to a first, specific polynucleotide probe immobilized to the solid support. The first polynucleotide probe can, however, also be a common probe, depending on the purpose of a particular assay performed according to the present invention. Hybridization can be accomplished by incubating the cell lysate and the first polynucleotide probe at a temperature dependent on a variety of factors, as is well known to those with ordinary skill in the art. These factors include the length of complementary nucleotide sequences, the ratio of guanine and cytosine bases to the entire base content in the complementary nucleotide sequences (the GC content), the NaCl concentration in the buffer solution, the number of bases which mismatch in the complementary nucleotide sequence, and the type of nucleotide. In a preferred form of this invention, the following equation can be used to calculate the preferred incubation temperature (T_inc):

T_inc = 16.6 x log(M) + 0.41 (GC) + 81.5 - 675/n - 15 (°C).

In the equation shown above, M is the NaCl concentration in solution, GC represents the GC content (the percentage of guanine and cytosine residues in the sequence), and n represents the length of the nucleotide sequences (the number of hybridizing nucleotides). The incubation temperature can also be determined according to methods described in Molecular Cloning. The time for incubation is preferably from 1 hour to overnight, and the sample is preferably gently rocked during incubation. Incubation is preferably performed in an appropriate buffer solution. The same buffer used to hybridize RNA and DNA in the Northern Blot or the Dot Blot methods, as described in the Maniatis treatise, can be used. The buffer is preferably prepared in a way so as not to contaminate it with RNase. If any RNase contamination is present, the activity of the RNase should be controlled so as to be as low as possible.
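As a rough illustration of the equation above, the following Python sketch computes the GC content and length of a sequence and then evaluates T_inc. It assumes that "log" means the base-10 logarithm and that M is expressed in mol/l; the function names are hypothetical and chosen only for this sketch. The inputs used here are the Example 2 target sequence and a 0.5 M NaCl concentration, and the result need not reproduce the temperatures quoted in the worked examples, which may rest on different inputs.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C residues in the sequence (the GC term)."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """DNA strand complementary to seq, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def incubation_temperature(nacl_molar: float, gc_pct: float, n: int) -> float:
    """T_inc = 16.6 x log(M) + 0.41 (GC) + 81.5 - 675/n - 15, in degrees C.

    Assumption: log is taken as base 10 and M is in mol/l.
    """
    return 16.6 * math.log10(nacl_molar) + 0.41 * gc_pct + 81.5 - 675.0 / n - 15.0


# Target sequence from Example 2 (SEQ ID NO:81); the probe is its complement.
target = "GCGCAACTGATCCTTCCC"
probe = reverse_complement(target)
t_inc = incubation_temperature(0.5, gc_percent(target), len(target))
print(probe, round(t_inc, 1))
```

The complement has the same GC content and length as the target, so either strand can be passed to `gc_percent`.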

Example 4 illustrates one method of hybridizing ribosomal RNA in a sample to an immobilized probe.

EXAMPLE 4 Hybridizing the First Polynucleotide Probe to Ribosomal RNA

RNase is removed from a microtiter well to which a first polynucleotide probe having a sequence complementary to SEQ ID NO:81 has been bound by adding 250 μl of Lysis Buffer containing 0.5 M NaCl and incubating the well at 45°C for one hour. The buffer is removed from individual wells by aspiration, and 50 μl of Lysis Buffer containing the biological sample is added to each well. These solutions are incubated at 39°C for one hour (Tm = 54°C) to allow hybridization and then slowly cooled over the course of 20-30 minutes.

In the methods of the present invention that involve probing samples containing RNA, or in any procedure in which it is desired to prevent the degradation of RNA, it is advantageous to inhibit the activity of any ribonucleases (RNases) which may be present. One RNase inhibitor which can be used is Vanadyl Ribonucleoside Complex (VRC) (Bethesda Research Laboratories, Gaithersburg, MD). VRC has been reported to be useful during cell fractionation and in the preparation of RNA, and has been shown not to interfere with the phenol extraction or ethanol precipitation of RNA. In addition, VRC does not affect other cytoplasmic components of cells. Therefore, VRC is an ideal inhibitor of RNase in many experimental procedures.

However, prior art procedures for inhibiting RNases with VRC taught that VRC should not be used in buffer systems containing EDTA or SDS, which are commonly used in the field of molecular biology. The reason for this prohibition was that it was believed that the Complex would dissociate in the presence of these buffers, leading to a loss of RNase inhibiting activity. In fact, the BRL insert accompanying the VRC product recommends that a five- to ten-fold molar excess of EDTA be added to an RNA solution containing VRC in order to "destroy" the VRC prior to ethanol extraction of the RNA from the solution. The apparent inability to use VRC together with common buffers that include EDTA and SDS thus presented a major impediment to the exploitation of VRC as an RNase inhibitor.

We have discovered, however, that VRC is in fact an effective RNase inhibitor even in the presence of SDS and/or EDTA. Thus, VRC can be used in assays which use buffers including SDS or EDTA, where heretofore it was believed that VRC would not be effective in such systems. VRC is an effective RNase inhibitor at the concentrations of EDTA and SDS that are normally used when manipulating RNA or when performing a variety of other molecular biology techniques. For example, we have found that VRC effectively inhibits RNase in a buffer solution comprising approximately 1 mM EDTA. We also found VRC to be effective in 0.5% SDS solutions, and it is believed to be effective in solutions ranging up to 2% SDS, more preferably up to 1% SDS. VRC can, of course, also be used in other solutions including EDTA and SDS.

We have found that VRC is a particularly potent inhibitor of RNases when used in combination with Proteinase K. Proteinase K is also available from BRL. As shown in the gel in Figure 1a, mRNA prepared from U937 cells (human macrophage cell line) was protected from RNase degradation by a combination of VRC and Proteinase K. Lane 1 of the gel shows the mRNA from a cell preparation which included VRC, Proteinase K, and RNasin, while lane 2 represents the mRNA from a cell preparation that included VRC and Proteinase K. RNasin is available from Promega of Madison, WI.

The distinct band 10 in lanes 1 and 2 matches the band seen in lane 7, which contains pure clonal cDNA from an RNase-free preparation of U937 cells, thus showing that the mRNA in the preparations of lanes 1 and 2 did not experience substantial mRNA degradation.

A comparison of lanes 1 and 2 with lanes 3-6 shows that VRC in combination with Proteinase K inhibits RNase activity in the above-mentioned mRNA preparation from U937 cells to a far greater extent than either Proteinase K alone (lane 4), Proteinase K in combination with RNasin (lane 3), or a commercial RNase-inhibiting preparation sold under the name "FastTrack" (available from Invitrogen of San Diego, CA) (lane 5). None of lanes 3-5 exhibits the distinct band 10 representing undegraded mRNA. On the contrary, lane 4 (Proteinase K alone) and lane 5 (FastTrack) have the same smear of degraded mRNA as band 20 in lane 6 (no inhibitors). The gel shown in Figure 1a also shows the effectiveness of VRC in a buffer solution of 1 mM EDTA and 0.5% SDS, which is the buffer used in the U937 cell preparations tested, since lane 2 (VRC and Proteinase K) shows less mRNA degradation than lane 4 (Proteinase K alone) or lane 3 (Proteinase K and RNasin).

In order to eliminate RNase activity from water used in the methods of this invention, the water is preferably treated with diethylpyrocarbonate (DEPC). The preferred DEPC treatment involves addition of 0.1% DEPC to the water, followed by storage overnight at 37°C and sterilization in an autoclave. The DEPC is deactivated by such autoclaving so that it does not interfere with the enzymatic processes of the methods of the present invention. Alternatively, if the water is sterilized in some other manner, the DEPC in the water can be deactivated by other means known to the art.

IX. Washing the Solid Support

Following hybridization, the non-hybridized portions of the biological sample are preferably separated from the solid support, so that substantially all of the biological sample not annealed to the first polynucleotide probe is removed from the solid support. If the solid support is a microtiter well, for example, and the first polynucleotide probe is immobilized to the walls or bottom of the well, the non-hybridized cell lysate can be removed by pouring the lysate out of the well or by aspirating the cell lysate. The well itself can then be "washed" or rinsed with a washing solution such as the Lysis Buffer by applying the washing solution to the walls of the well and then removing the washing solution through aspiration. Any washing solution known to the art can be used, provided that the salt content and other parameters of the solution are controlled so that the washing solution does not remove any polynucleotide hybridized to the polynucleotide probe.

X. Contacting and Hybridizing the Second Polynucleotide Probe

When the solid support has been washed, a second polynucleotide probe is then contacted with and hybridized to the polynucleotide strand of RNA or DNA which is hybridized to the immobilized first polynucleotide probe, if such a polynucleotide was present in the cell lysate. The contacting and hybridization steps are performed as with the first polynucleotide probe, above, or by any other methods known to the art.

In a preferred embodiment, the second polynucleotide probe contains a polynucleotide sequence complementary to a sequence which is common to the RNA and/or DNA of a group of organisms, infectious agents, or biological components. For example, if a particular organism is being probed for and that organism is a fungus, the second probe can contain a sequence common to a plurality of species of fungus including the species being probed for. If an infectious agent such as a virus is being assayed for, the second probe can contain a sequence common to a number of related viruses. In this way, the same second polynucleotide probe can be hybridized to the RNA and/or DNA of any of a group of organisms, infectious agents, or biological components. In an even more preferred embodiment, the second, common probe has the same or lower Tm as the first, specific probe so that the hybridization of the second probe can be performed under the same conditions as the conditions used to hybridize the first probe.

This embodiment of the present invention is preferred when a plurality of specific probes are used to assay for the presence of a plurality of organisms, infectious agents, or biological components, because then the same second probe can be annealed to the RNA and/or DNA of the plurality of such organisms, agents, or biological components. The common polynucleotide probe of this embodiment can also advantageously be included in a kit in which a plurality of specific probes are used to detect the presence of a plurality of organisms or agents. The second polynucleotide probe can alternatively comprise a second sequence specific to the polynucleotide of the particular organism, infectious agent, or biological component sought to be identified in the sample. This second sequence is complementary to a different specific sequence of the RNA and/or DNA of the particular organism, infectious agent, or biological component than the sequence to which the first polynucleotide probe is complementary.

XI. Label

The second polynucleotide probe is preferably labeled in order to easily detect its presence and facilitate the detection of the RNA and/or DNA hybridized to the first probe. A variety of chemical substances are available which can label a polynucleotide probe when attached to that probe. For example, a variety of radionuclides can be used, such as radioisotopes of P, S, H, and I. Enzymes or enzyme substrates can also be attached to the second polynucleotide probe in order to label it. Suitable enzymes include alkaline phosphatase, luciferase, and peroxidase.

Labels which provide a colorimetric indication or a radionuclide are preferred. Other labels which can be used include chemical compounds such as biotin, avidin, streptavidin, and digoxigenin. Colorimetric labels, such as fluorescein, are especially preferred because they avoid the health hazards and disposal problems associated with the use of radioactive materials. In a preferred embodiment, biotin is attached to the nucleotide probe, followed by an avidin, such as streptavidin, which is itself conjugated to an enzyme such as alkaline phosphatase. The presence of the enzyme can be detected by various substrates, such as ATTOPHOS, which provide a fluorescent marker that can be detected through fluorimetry.

Nucleic acids can also be "labeled" by staining them with a nucleic acid stain. Thus, where a relatively large amount of nucleic acids is present, ethidium bromide (EtBr) can be used to identify the presence of such nucleic acids. However, more sensitive stains are preferable. Sensitive stains include various cyanine nucleic acid stains, such as POPO, BOBO, YOYO and TOTO, available from Molecular Probes (California). These stains are described, e.g., in Science, 257:8S5 (1992). Particularly preferred stains for use in the context of the present invention include the shorter wavelength forms, TOTO-1 and YOYO-1, still more preferably YOYO-1. As little as four picograms of stained DNA can be detected by visible fluorescence upon stimulation with a transilluminator or hand-held UV lamp. Thus, these stains provide a particularly easy and sensitive method of identifying the presence of nucleic acids.

We tested the ability of YOYO-1 to stain oligonucleotides immobilized to wells as discussed herein. We first immobilized various known amounts of oligonucleotides to wells and washed to remove non-immobilized oligonucleotides. We then added a 1:1000 dilution of YOYO-1 in water. We then used a fluorimeter directly without washing. The relation between pmoles of oligonucleotide and fluorescence is shown in Figure 1b with circles. We also incubated the TOTO-1 stained immobilized oligonucleotides for ten minutes, washed and added water, followed by use of the fluorimeter. The washed results are shown in dark circles in Figure 1b, while results without washing are shown in open circles. It can be seen that washing does not significantly affect the amount of staining and that the amount of staining is related to the amount of oligonucleotide.

We also tested the time course of YOYO-1 staining over the course of one hour. We included a control in which no oligonucleotide was immobilized to the plate. The oligonucleotide-immobilized plate is shown in open circles in Figure 1c and the control in dark circles in that figure. It can be seen from Figure 1c that after an initial spike, relatively constant staining is found.

We further tested the dose response of constant amounts of oligonucleotides with oligonucleotides immobilized in wells (open circles in Figure 1d) and control wells with no oligonucleotide (closed circles in Figure 1d). The difference between the immobilized and non-immobilized is shown in Figure 1d as triangles. It can be seen that a sharp increase in fluorescence occurs between 10^-4 and 10 dilution.

We also tested the reproducibility of staining of both oligonucleotide-immobilized wells and non-immobilized wells. We repeated the experiment five times and graphically depicted the data for both oligonucleotide-immobilized (+) and non-immobilized (-) wells in Figure 1e. It can be seen that the data were substantially similar for each experiment.

Thus, the data depicted in Figures 1b-1e show that use of YOYO-1 as a staining agent provides a reliable indication of the amount of oligonucleotide present.
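One way such a stain signal could be turned into an estimate of oligonucleotide amount is a standard curve built from wells holding known amounts. The Python sketch below fits a straight line by ordinary least squares and inverts it for an unknown well. It assumes the fluorescence response is approximately linear over the working range, which the text does not state; the readings are hypothetical placeholder values, not data from the figures, and the function names are invented for this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_amount(signal, slope, intercept):
    """Invert the standard curve to estimate pmol from a fluorescence reading."""
    return (signal - intercept) / slope


# Hypothetical standards: known pmol of immobilized oligonucleotide per well
# and the fluorescence read from each well (arbitrary units).
standards_pmol = [0.0, 1.0, 2.0, 4.0, 8.0]
standards_signal = [50.0, 150.0, 250.0, 450.0, 850.0]

slope, intercept = fit_line(standards_pmol, standards_signal)
unknown_pmol = estimate_amount(350.0, slope, intercept)
print(round(unknown_pmol, 2))
```

The intercept plays the role of the background reading from a well with no oligonucleotide, so a no-oligonucleotide control can be included as the zero standard.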

Other labels known to the art can also be used. For example, a specific binding moiety which binds to a binding partner for that moiety can be used. For example, a binding moiety such as an antibody can be labeled with a marker and then bound to a binding partner, such as an antigen, on the second polynucleotide probe. An agonist and its receptor can also be used, as is known to the art. A receptor such as the norepinephrine receptor attached to the second polynucleotide probe can be detected by the addition of a labeled agonist for that receptor, in this case norepinephrine.

Example 5 shows one method of using a labeled common polynucleotide probe to identify the presence of rRNA.

EXAMPLE 5 Preparing, Labeling, and Hybridizing the Second Nucleotide Probe

The second nucleotide probe is an oligodeoxynucleotide prepared as in Example 1 comprising a sequence complementary to the following sequence: 5'-GAGGGAGCCTGAGAAACG-3' (SEQ ID NO:1). This sequence is complementary to the ribosomal RNA of a number of fungal species, including the ribosomal RNA to which the specific polynucleotide probe of Example 1 is complementary. This second nucleotide probe is labeled with fluorescein at either the 3' or 5' end of the probe. The 3' end is labeled using terminal transferase and FITC-dUTP; alternatively, the 5' end is labeled by chemical reaction with FITC. Fifty μl of Lysis Buffer including 0.5 M NaCl and 1 μl of a solution containing the second polynucleotide probe to which fluorescein has been attached is added into each well and incubated at 39°C for one hour, after which it is allowed to slowly cool (over 20-30 minutes).

XII. Determining the Presence of Organisms, Infectious Agents, or Biological Components in the Sample

After the second polynucleotide probe has been hybridized to any RNA and/or DNA of the organism, infectious agent, or biological component present in the biological sample, substantially all of the solution containing the second probe is removed from the solid support by aspiration or in any other appropriate way, and the solid support is again washed to remove any unhybridized second polynucleotide. The presence of the particular organism, infectious agent, or biological component sought to be detected is then finally determined by detecting the presence of the second polynucleotide probe immobilized to the solid support via the RNA and/or DNA strand of the organism, infectious agent, or biological component being probed for, which is itself hybridized to the first polynucleotide probe. If the second polynucleotide probe is detected, this indicates that the biological sample contained RNA and/or DNA being tested for, and hence that the biological sample harbors the organism, infectious agent, or biological component sought to be identified in the sample.

In one embodiment of the present invention, a negative control experiment is performed when the biological sample is tested. The control experiment is run by performing the present method with the same steps and the same materials as when the biological sample is tested, except that no first polynucleotide is attached to the solid support on which the control experiment is performed. A positive control experiment is also preferably run when performing the methods of the present invention. A positive control is run by performing the present method with the same materials and using the same steps, with the exception that the first polynucleotide probe immobilized to the solid support contains a sequence common to the RNA and/or DNA of a group of organisms, infectious agents, or biological components of which the organism, infectious agent, or biological component being probed for is a member. For example, if a species of fungus is being probed for, the first polynucleotide probe immobilized to the solid support of the positive control can contain a sequence common to the genus of fungi of which the species being probed for is a member. Thus, if any of the species of this genus of fungus are present in the biological sample, the solid support comprising the positive control will indicate the presence of a fungus in the sample.

EXAMPLE 5a

Identification of Subtypes of Jun Oncogenes in a Sample

In order to identify subtypes of jun oncogenes using subtype-specific probes, approximately 10 picomoles (2 μl) of oligonucleotides specific to one of the jun-B, c-jun, and jun-D jun oncogene subtypes (B-1258 (SEQ ID NO:730), C2147 (SEQ ID NO:470), and HU D965 (SEQ ID NO:488)) were added into each of 3 wells of a plastic microtiter plate (No. 3490, made by Costar, Cambridge, MA) and mixed with 25 mM EDC (Pierce, Rockford, IL) overnight at 37°C. After the oligonucleotide-EDC solution was removed from the wells, 10 mM glycine was added into each well and incubated at 37°C for 2 hours. Each of the wells was washed with 200 μl of water six times and then stored at 4°C until use. Approximately 50 μl of a solution containing approximately 1 mg/ml of each of jun-B, c-jun, and jun-D mouse jun oncogene cDNA (biotinylated) in reaction buffer (10 mM Tris, pH 8.0, 1 mM EDTA, and 0.5 M NaCl) was placed in each of three wells in a microtiter plate. After incubating the wells at 37°C for 2 hours, the wells were washed with the reaction buffer 3 times. Alkaline phosphatase-conjugated streptavidin (Clontech, Palo Alto, CA) was diluted 1:1000 with the reaction buffer, and 50 μl of the diluted solution was added into each well and incubated at room temperature for an additional 30 minutes. After the wells were washed with the reaction buffer 3 more times, 50 μl of ATTOPHOS (JBL, San Luis Obispo, CA) was added into each well and incubated at room temperature for 10 minutes. The fluorescence of the individual wells was measured by CytoFluor 2300 (Millipore, Bedford, MA) at a filter setting of 485 nm for excitation and 590 nm for emission. As shown in Figures 7A-7C, each subtype-specific oligonucleotide hybridized only to the corresponding mouse jun oncogene subtype. Also, no appreciable cross-hybridization between subtypes occurred. Therefore, each of these probes is specific to only one subtype of jun oncogene.
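The specificity conclusion drawn above amounts to checking that, for each immobilized probe, the well containing the matching subtype gives a much stronger signal than the wells containing the other subtypes. The Python sketch below shows one way such a check could be written. The fluorescence values and the five-fold cutoff are hypothetical and are not taken from Figures 7A-7C; the function name is invented for this sketch.

```python
def is_subtype_specific(signals, fold=5.0):
    """Return True if every probe's matched signal is at least `fold` times
    each of its mismatched signals.

    signals[probe][target] is the fluorescence read from a well holding
    `probe` (immobilized) and `target` (labeled cDNA). The `fold` cutoff
    is an assumed, illustrative threshold.
    """
    for probe, row in signals.items():
        matched = row[probe]
        for target, value in row.items():
            if target != probe and matched < fold * value:
                return False
    return True


# Hypothetical readings in arbitrary fluorescence units.
signals = {
    "jun-B": {"jun-B": 900.0, "c-jun": 40.0, "jun-D": 35.0},
    "c-jun": {"jun-B": 30.0, "c-jun": 850.0, "jun-D": 45.0},
    "jun-D": {"jun-B": 25.0, "c-jun": 50.0, "jun-D": 800.0},
}

print(is_subtype_specific(signals))
```

In practice the cutoff would be set from the background seen in control wells rather than fixed in advance.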

XIII. Quantifying the Amount of an Organism, Infectious Agent, or Biological Component Contained in the Sample

The amount of RNA and/or DNA in the biological sample from a particular organism or agent, or the amount of such RNA and/or DNA indicative of the presence of a particular biological component, can be quantitated by measuring the amount of second polynucleotide probe bound to the solid support after being applied thereto in accordance with the method of the present invention. When testing for an organism or infectious agent, the amount of RNA and/or DNA in the sample can give a rough measurement of the number of organisms or infectious agents contained in the sample, and thus a rough estimate of the extent of an infection if the sample is from a diseased organism. To quantify the amount of second polynucleotide probe attached to the solid support, a physical or chemical quantity or activity of the label on the second polynucleotide is measured. A number of techniques for measuring the label on a polynucleotide probe known to the art can be used, the technique used depending on the kind of label. Such techniques include measuring the optical density of the buffer solution, the emitted-light intensity of the buffer solution, or the amount of radiation given off by the immobilized second polynucleotide probe. The label itself can provide this indication or can require other compounds which bind thereto or which catalyze the label. Other mechanisms for detecting label include the use of compounds that can chemically react with the label, the detection of a colored label, the detection of light emission, the detection of radiation, or the catalytic ability of the label.

In one measurement technique, the label on the second polynucleotide is biotin. The presence of this label can be detected by reacting it with the enzymes peroxidase or alkaline phosphatase. These enzymes can be specifically directed to biotin by conjugation with avidin or streptavidin. The presence of the enzymes is then detected by the addition of an appropriate substrate to provide a detectable color-developing or light-emitting reaction. Alkaline phosphatase-labeled streptavidin can be readily obtained from the commercial market. One substrate for alkaline phosphatase is adamantyl-1,2-dioxetane phosphate (AMPPD). Upon reaction with the alkaline phosphatase, AMPPD will emit light at a wavelength of 477 nm. This light can be detected in accordance with techniques known in the art. A more preferred light-emitting substrate for alkaline phosphatase is ATTOPHOS.

In the reaction between alkaline phosphatase and AMPPD, an enhancer such as 5-N-tetradecanoyl-amino-fluorescein can be added. 5-N-tetradecanoyl-amino-fluorescein has the ability to convert light of 477 nm wavelength to light of 530 nm wavelength, which is more readily detectable.

Other labels include an antigen, such as digoxigenin, or an antibody. An antigen can be detected by its ability to bind to an antibody directed thereto. Such antibodies, or an antibody directly used to label the second polynucleotide probe, can be detected by their ability to bind a protein. The antibody itself can be labeled directly with a radionuclide such as an iodine radioisotope, or can be labeled by binding thereto a protein labeled with the radionuclide. The radionuclide can then be detected in accordance with techniques well known in the art, such as using X-ray film or a radiation counter.

Well-known techniques for the detection of a label include detecting a label in a color-developing reaction with a spectrophotometer or fluorimeter. For example, fluorescein, which gives off a fluorescent pigment, can be used as a label, and the intensity of the pigment can be used to gauge the amount of label hybridized to target polynucleotides. The use of such color-developing reactions is preferred to using radioactive labels in the present methods because the problems associated with using radioactive nuclides are thereby avoided.

Other such techniques include the detection of a light-emitting reaction using X-ray film or an instant camera. The emission reactions are recorded by X-ray film or instant camera film in the dark room. The X-ray film which is exposed by emission reactions is recorded as a blot, so that the shading of the blot can be measured by a densitometer. If one uses an instant camera such as a Polaroid, the picture is read by a scanner to decide the location of the blot on the computer, and the shading of the blot is determined using graphic analysis software.

An example of one embodiment of the present invention is illustrated in Figure 2. As shown in that Figure, following the collection and lysis of a sample of biological materials from a human patient, the ribosomal RNA of a species of fungus sought to be detected is hybridized. Following hybridization, a specific sequence 24 of the strand of ribosomal RNA 22 to be detected is annealed to a complementary sequence 28 on a first polynucleotide probe 26. The first probe 26 is immobilized on a solid support 20. After removing substantially all of the unhybridized portions of the sample, a second polynucleotide probe 34 is hybridized to the strand of ribosomal RNA 22 at a different sequence 30, which is preferably one that is common to a plurality of fungal species. The second probe 34 contains a sequence 32 that is complementary to the sequence 30. The second probe, in this diagram, also comprises a label 36 attached to the probe which facilitates the detection of the complex formed by the first probe 26, the second probe 34, the strand of ribosomal RNA 22, and the solid support 20.

Example 6 shows one method of measuring the amount of rRNA of a particular fungal species using a labeled common polynucleotide probe.

EXAMPLE 6

Measurement of Chemical Activities of the Labeled Second Nucleotide Probe

Following the hybridization described in Example 5, the hybridization solution (Lysis Buffer containing the labeled second polynucleotide probe) is removed by aspiration and the microtiter plate is washed once with 250 μl of fresh Lysis Buffer. A blocking buffer consisting of 0.05% (w/v) Tween 20, 500 mM NaCl, and 100 mM Tris-HCl, pH 7.5 is added into each well and incubated at room temperature for five minutes to reduce nonspecific binding. These solutions are then removed by aspiration.

The fluorescein on the second polynucleotide probe which is bound to the ribosomal RNA of the species of fungus being tested for, if present, is then visually detected to determine whether a fungus of that species was present in the biological sample. The approximate quantity of fungus present in the sample is then measured by determining the amount of fluorescein bound to the microtiter plate with a spectrophotometer or in a fluorimeter.

XIV. Using PCR to Detect Small Quantities of DNA or RNA in a Sample

We have also discovered an alternative procedure which is especially useful in the detection of minute quantities of an organism, infectious agent, or biological component in a sample. This alternative procedure makes use of the polymerase chain reaction (PCR) procedure to create multiple copies of a polynucleotide strand which is complementary and/or homologous to a biological component or a strand of RNA and/or DNA which belongs to an organism or infectious agent in the sample.

In an example of this embodiment of the present invention, shown in Figure 3, a biological sample is first obtained as described previously. The sample is lysed so that the RNA and/or DNA of any organisms, infectious agents, or cells carrying biological components which are present in the sample can be probed. The lysed sample is then contacted with a first polynucleotide primer 42. This primer 42 is complementary to a sequence contained in an analyte polynucleotide of an organism, infectious agent, or biological component to be detected in the sample. The primer 42 is then contacted with such an analyte polynucleotide 40 with which it is complementary and hybridized to that polynucleotide. The hybridizing of the primer 42 can be accomplished in the same manner as was previously described in relation to the hybridizing or annealing of polynucleotide probes.

The sequence complementary to the primer 42 can comprise, for example, a sequence specific to the DNA or RNA of an organism, such as the ribosomal RNA of a species of fungus. Such a sequence can also comprise a sequence specific to another infectious agent or to a biological component, such as an oncogene sequence. However, in another embodiment the primer 42 is complementary to a sequence located 3' of such a specific sequence, so that the primer 42 is positioned 5' of the specific sequence when it is annealed to a strand containing the specific sequence. In this way the sequence complementary to the specific sequence is incorporated when the primer 42 is extended. In yet another embodiment, the primer 42 is complementary to a sequence common to a group of organisms, infectious agents, or biological components, such as a sequence common to a plurality of fungal species. In another example, the primer 42 is common to a number of related biological components, such as a plurality of subtypes of jun oncogenes. Thus, in this embodiment, the first polynucleotide primer 42 can be referred to as a common PCR primer.

After the common PCR primer 42 has been annealed to the analyte polynucleotide 40 present in the sample, this primer is extended using the four nucleotide triphosphates and a polymerase enzyme, thereby producing a double-stranded polynucleotide including a complementary nucleotide strand 41 comprising cDNA and having a sequence complementary to the analyte polynucleotide in the sample. If the nucleotide sequence being probed for is contained in RNA, the polymerase is preferably a reverse transcriptase and the nucleotide triphosphates are deoxynucleotide triphosphates (dNTPs) so that cDNA complementary to the rRNA is produced. Further rounds of amplification can be accomplished by reannealing additional primer 42 to the analyte polynucleotide 40 and then extending that primer. As will be clear to those of skill in the art, the first polynucleotide primer 42 can also have a sequence specific to a particular organism, infectious agent, or biological component.

The first primer 42 is preferably extended using deoxynucleotides to form a strand of cDNA 41 that is complementary to the strand of RNA or DNA 40 in the sample. Using cDNA 41, amplification can also be accomplished by hybridizing to the strand of cDNA 41 a second polynucleotide primer 44 that is complementary to the newly synthesized complementary nucleotide strand 41. Thus, this second primer 44 is homologous to a portion of the analyte polynucleotide 40. This second primer 44 can preferably be common to the RNA or DNA of a plurality of organisms, infectious agents, or biological components and comprise a sequence different from that to which the first polynucleotide primer is complementary. Thus, in a preferred embodiment, the second primer 44 is a second common PCR primer, and the RNA and/or DNA of a plurality of organisms, infectious agents, or biological components can be amplified with it. Amplification is completed by extending the second primer 44 to produce a strand of cDNA 43 that is homologous to at least a portion of the analyte polynucleotide 40 in a sample. This amplification step is also preferably repeated a plurality of times with further second primers 44.

These later rounds of amplification preferably use the four dNTPs in combination with a DNA polymerase so that during amplification, double-stranded DNA is produced which contains one strand of cDNA 43 homologous to the analyte polynucleotide 40 present in the biological sample and one strand of cDNA 41 complementary to such homologous cDNA 43. Preferably, approximately 20 to 40 cycles of DNA synthesis are performed in order to produce an adequate amount of cDNA homologous to the RNA and/or DNA present in the sample for later detection. A DNA polymerase is used in the synthesis of such cDNA. This polymerase preferably has significant polymerase activity at temperatures above 50°C, such as Taq DNA polymerase.
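To give a sense of why roughly 20 to 40 cycles are preferred, the following is a minimal sketch of the idealized copy-number arithmetic. It assumes each cycle multiplies the number of copies by (1 + efficiency); the function name and the `efficiency` parameter are illustrative assumptions and are not taken from the specification. Real reactions plateau well before the upper figures are reached.

```python
def ideal_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in copy number after `cycles` rounds of synthesis.

    Assumption: every cycle multiplies the copy number by (1 + efficiency),
    where efficiency = 1.0 means perfect doubling. This ignores plateau
    effects, primer depletion, and enzyme decay.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare the lower and upper ends of the preferred cycle range.
    for n in (20, 30, 40):
        print(f"{n} cycles: about {ideal_fold_amplification(n):.3g}-fold")
```

Under perfect doubling, 20 cycles already gives about a million-fold increase, which is consistent with the text's aim of producing an adequate amount of cDNA from minute starting quantities.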

Following these rounds of amplification, another round of amplification is preferably performed. In this second round of PCR, a third polynucleotide primer is used and if desired a fourth primer can also be used. One or both of these primers preferably contain a sequence that is homologous or complementary to a sequence specific to a particular organism, infectious agent, or biological component, in which case such primers can be called specific primers. However, when it is desired to identify or quantitate the presence of a group of organisms, infectious agents, or biological components, one or both of the third and fourth primers can contain a sequence complementary or homologous to a sequence common to a plurality of organisms, infectious agents, or biological components. When PCR is performed using third and/or fourth primers containing sequences specific to an organism, infectious agent, or biological component, amplification will occur in significant amounts only if RNA and/or DNA of the particular organism, infectious agent, or biological component being tested for is present in the biological sample and the initial amplification has produced cDNA corresponding to such RNA and/or DNA.

The amplification of sequences contained in the RNA and/or DNA of an organism, infectious agent, or biological component can be detected in several ways, as will be appreciated by those having ordinary skill in the art. For example, a common or specific primer can be labeled and the presence of the label in an extended polynucleotide can be detected. As shown in Figure 3, for example, a label 45 can be attached to the primer 44, which is then extended to produce the homologous polynucleotide 43.

When it is desired to detect a specific organism, infectious agent, or biological component, the homologous polynucleotide strand 43 can be contacted with and hybridized to a sequence 48 on a specific polynucleotide probe 46 which is complementary to a sequence 49 that is specific to a particular organism, infectious agent, or biological component and which is immobilized on a solid support 47, as in other embodiments of the present invention. Following this, the unhybridized portions of the sample are preferably washed from the solid support 47, and the labeled homologous polynucleotide 43 immobilized on the solid support 47 is detected. Alternatively, if the homologous polynucleotide 43 is not labeled, following the washing of the solid support 47 a second polynucleotide probe (not shown in Figure 3) carrying a label can be hybridized to the homologous polynucleotide 43. This can also be followed by washing. After washing unhybridized second probe from the solid support, the labeled second probe can be detected using any of a variety of techniques. The homologous polynucleotide 43 will only be detected, of course, if it is hybridized to the immobilized polynucleotide probe 46.

As will be appreciated by those of skill in the art, the complementary cDNA strand 41 rather than the homologous polynucleotide 43 can also be annealed to a polynucleotide probe 46 immobilized on the solid support. In this method, the primer 42 is labeled rather than the primer 44. The strand 41 can then be detected directly, as described above, or can be detected by hybridizing a labeled polynucleotide to the strand 41.

Example 7 illustrates one method of identifying small quantities of the fungus Pneumocystis carinii in a sample.

EXAMPLE 7

Amplifying Ribosomal RNA Present in Minute Quantities with PCR

A sputum sample from a patient suspected of having pneumonia caused by the fungus Pneumocystis carinii is first lysed with Lysis Buffer and brought to a total volume of sample and lysis buffer of 50 μl. This mixture is then added to a well of a microtiter plate to which a common probe (SEQ ID NO: 1) has been immobilized, thereby contacting the probe with the mixture. The common probe is a polydeoxynucleotide complementary to a sequence common to the ribosomal RNA of a number of fungal species, including P. carinii. The sequence on the ribosomal RNA of Pneumocystis carinii to which this probe is complementary is located 3' of a specific sequence (SEQ ID NO: 81) of such ribosomal RNA.

The common probe can then be hybridized to the fungal ribosomal RNA in the sample by incubating the mixture in the well at 39°C for one hour and then cooling the mixture over 20-30 minutes. Following the annealing of the common probe to the ribosomal RNA, the probe is extended with a reverse transcriptase to produce a cDNA strand having a sequence complementary to the ribosomal RNA. The complementary strand is then melted off of the ribosomal RNA by heating the mixture to 94°C for 1 to 2 minutes. This process is repeated several times in order to amplify the number of complementary strands.

Following the denaturation of the complementary strand from the ribosomal RNA, a second primer homologous to a sequence of the ribosomal RNA that is specific to Pneumocystis carinii is added to the mixture. The four dNTPs and a DNA polymerase capable of polymerase activity above 50°C are added to the mixture. The mixture is then heated to a temperature below the Tm of the second primer but high enough to assure specific binding of the primer, in this case approximately 50°C. After allowing enough time for the second primer to be extended, about 1-2 minutes, the mixture is heated to 94°C for 1 to 2 minutes to melt the newly synthesized strand homologous to the ribosomal RNA of the sample from the complementary strand. This process is repeated between 20 and 40 times, with the addition of primer as necessary.

Following this round of amplification, two labeled, specific primers are added to the mixture. One is complementary to a specific sequence on the complementary strand, while the other is complementary to a sequence on the homologous strand. After annealing such primers to the complementary and homologous strands, these primers are extended with a polymerase in the same fashion as the second primer, after which such strands are melted off by raising the temperature of the mixture to 94°C for 1 to 2 minutes. This process is also repeated between 20 and 40 times.

After the final round of amplification, the mixture is heated to melt off any newly synthesized strands from the templates from which they were produced, and the mixture is cooled to 37°C and incubated at that temperature for an hour in order to allow the strands present in the mixture to anneal to a specific probe immobilized on the microtiter well. This probe is complementary to a specific sequence in the ribosomal RNA of Pneumocystis carinii and anneals to the ribosomal RNA of that fungus in addition to the synthesized sequences homologous to such ribosomal RNA.

Once the polynucleotide strands in the mixture have been allowed to hybridize to the specific probe, the non-hybridized portions of the mixture are removed by aspiration. The wells of the microtiter plate are then washed with Lysis Buffer to remove any non-specifically bound nucleotide strands from the well. The label attached to the specific primer is next detected in order to detect the presence of Pneumocystis carinii in the biological sample. If the label is detected, this indicates that the biological sample contained this fungus.

XV. Primers for Use in PCR

We have identified several probes and PCR primers for use in the present invention. In particular, sequences have been identified for detecting jun oncogenes and G protein sequences in humans and other animal species, as well as sequences for detecting substance P and Beta-receptor sequences. The design and use of such primers is described below.

A. Primers for Detecting Jun Oncogenes

It is well known that certain oncogenes, such as the jun oncogenes, are most rapidly expressed when cells are stimulated to proliferate. Therefore, the detection or quantification of the expression of jun oncogenes is a good marker for cellular mitogenic activity. Jun oncogenes were first reported by Maki, et al. (Proc. Natl. Acad. Sci. USA 84:2848-2852 (1987)), and are currently known to exist in at least three different forms or subtypes, jun-B, c-jun, and jun-D. It is still unclear how these three jun subtypes are involved in cell growth, however, so that one has to analyze all three genes during cell growth in order to detect mitogenic activity. In addition, although it has been discovered that mice also carry three different jun oncogene subtypes, the nucleotide sequences of these three subtypes differ from the subtypes present in human cells (Ryder K., Nathans D., Proc. Natl. Acad. Sci. USA 85:8464-8467, 1988; Ryder K. et al., Proc. Natl. Acad. Sci. USA 85:1487-1491, 1988; Hattori K. et al., Proc. Natl. Acad. Sci. USA 85:9148-9152, 1988; Schuette J. et al., Cell 59:987-997, 1989).

In order to design sense and anti-sense PCR primers for detecting jun oncogenes in both humans and mice, sequences were designed with the following considerations in mind:

(a) The nucleotide sequences should be common to the three subtypes of jun oncogenes (jun-B, c-jun and jun-D), with a maximum of 4 base mismatches.

(b) The nucleotide sequences should be common to both humans and mice, with a maximum of 4 base mismatches.

(c) At least 5 bases at the 3' end of the primers should be 100% identical to sequences in all three jun subtypes in both humans and mice.

(d) The length of the nucleotide sequences of both the sense primers and the anti-sense primers should be between about 17 and 50 bases.

(e) The difference in Tm between the sense primers and anti-sense primers and their corresponding sequences should be within 2°C.

(f) There should be no complementary structure more than 4 bases long in either the sense or the anti-sense primer.

(g) There should be no complementary structure more than 4 bases long between the sense and the anti-sense primer.

(h) Both the sense and the anti-sense primers should span part of the coding sequence of their complementary polynucleotides.

(i) The length of polynucleotide to be amplified should be greater than 200 bases.

(j) The nucleotide sequence homology of amplified genes among the three subtypes of jun oncogenes should not be greater than 80%.
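Several of the considerations above are mechanical enough to be checked by a short program. The following is a minimal sketch, not the program used by the inventors: all function names are hypothetical, target sites are assumed to be supplied already aligned to the primer at equal length, and the Tm estimate uses the simple Wallace rule (2°C per A/T, 4°C per G/C), which is only a rough approximation for primers of this length. It covers the mismatch, 3'-end, length, Tm-difference, and complementarity criteria.

```python
# Sketch of primer screening. Thresholds follow the criteria in the text;
# helper names and the Wallace-rule Tm estimate are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(primer: str, site: str) -> int:
    """Count differing positions between a primer and an aligned target site."""
    if len(primer) != len(site):
        raise ValueError("primer and target site must be aligned to equal length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def three_prime_identical(primer: str, site: str, n: int = 5) -> bool:
    """True if the last n bases (3' end) of primer and site are identical."""
    return primer.upper()[-n:] == site.upper()[-n:]


def has_complementary_run(a: str, b: str, min_len: int = 5) -> bool:
    """True if any min_len-base window of `a` can pair antiparallel with `b`.

    min_len = 5 corresponds to 'more than 4 bases long'.
    """
    a, b = a.upper(), b.upper()
    return any(
        reverse_complement(a[i:i + min_len]) in b
        for i in range(len(a) - min_len + 1)
    )


def wallace_tm(seq: str) -> int:
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (assumption)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def screen_primer(primer: str, target_sites: list, max_mismatches: int = 4) -> dict:
    """Apply the mismatch, 3'-end, length, and self-complementarity checks."""
    return {
        "length_ok": 17 <= len(primer) <= 50,
        "mismatch_ok": all(
            count_mismatches(primer, s) <= max_mismatches for s in target_sites
        ),
        "three_prime_ok": all(three_prime_identical(primer, s) for s in target_sites),
        "no_self_complement": not has_complementary_run(primer, primer),
    }


def pair_ok(sense: str, antisense: str, max_tm_delta: int = 2) -> bool:
    """Check the Tm-difference and sense/anti-sense complementarity criteria."""
    tm_close = abs(wallace_tm(sense) - wallace_tm(antisense)) <= max_tm_delta
    return tm_close and not has_complementary_run(sense, antisense)
```

A caller would pass one candidate primer together with the aligned sites from each jun subtype in each species, and keep only candidates for which every check returns true. Considerations (h) through (j) depend on the full gene sequences and are not covered by this sketch.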

DNA fragments (both sense and anti-sense primers) which satisfied the above conditions were investigated and several candidate oligonucleotides were synthesized. One set of primers which fit these parameters particularly well is the sense primer 5'-CCCTGAAGGAGGAGCCGCAGAC-3' (SEQ ID NO:733) and the anti-sense primer 5'-CGTGGGTCAA( ) ACTCTGCTTG AGCTG-3' (SEQ ID NO:734). The homology between the sense primer SEQ ID NO:733 and several jun gene subtypes is shown in Table 2 below.

Table 2

Sense primer: 5'-CCCTGAAGGAGGAGCCGCAGAC-3'

Mouse jun-B (SEQ ID NO:735): --T-T--A

Mouse c-jun (SEQ ID NO:736): A

Human jun-B (SEQ ID NO:737): --T-C A

Human c-jun (SEQ ID NO:738): T

-: indicates a base identical to the sense primer.

As an alternative to the sense primer SEQ ID NO:733 shown in Table 2, 5'-XCCCTGAAGGAGGAGCCGCAGAC-3' (SEQ ID NO:739) can also be used as a primer for detecting jun oncogenes. In this sequence, X represents a primary amine residue, a nucleotide sequence recognized by a restriction endonuclease, or an RNA promoter sequence. The attachment of a restriction site to the 5' end of the PCR primers is useful for the cloning of amplified genes, while the attachment of RNA promoter sequences at such 5' ends is useful in RNA transcription and RNA transcription-based amplification, as is known to those of skill in the art. Preferably, the RNA promoter is either a T7, SP6, or T3 RNA promoter sequence. The attachment of a primary amine at the 5' end can also be useful for coupling reactions, whereby the primer can be attached to labeling compounds or to solid supports via the primary amine group. Such primary amine residues can be added onto the 5' end of a nucleotide during oligonucleotide synthesis. The antisense analog 5'-XCGTGGGTCAAGACTTCTGCTTGAGCTG-3' (SEQ ID NO:740) can also be used to probe for the different jun gene subtypes, where X represents a primary amine residue, a nucleotide sequence recognized by a restriction enzyme, or an RNA promoter sequence.

The DNA fragments for the sense primer and anti-sense primer in this invention can be easily synthesized using a DNA synthesizer. These synthesized oligonucleotides can be purified by high pressure liquid chromatography or gel electrophoresis. The test material to be analyzed is usually total RNA or purified mRNA from cells or tissues.

If desired, cells or tissues can be tested in their natural state without any pretreatment. In the case of drug testing, a drug can first be administered to cells or to tissues in a test tube. A drug can alternatively be administered (through intravenous injection, subcutaneous injection, intramuscular injection, oral administration, or intra-abdominal injection) to a laboratory animal, after which cells or tissues are removed from the animal.

Total RNA or mRNA can be purified through standard protocols, such as those described in Molecular Cloning, or by using a commercially available kit such as FastTrack from Invitrogen (San Diego). In either case, in order to avoid introducing any RNase into a solution containing RNA, a researcher's hands should be protected with vinyl gloves, and the instruments used for experiments should not be touched with bare hands. Also, any glass containers to be used in the experiments should be heated prior to use at approximately 250°C for at least 4 hours. Furthermore, any water to be used in the procedures should be treated with 0.1% diethyl pyrocarbonate (DEPC), incubated overnight at 37°C, and autoclaved.

The cDNA to be synthesized from mRNA is made using reverse transcriptase, as described in Molecular Cloning. Once such cDNA has been generated, it is mixed with sense primer, antisense primer, 4 types of deoxynucleotides (dATP, dCTP, dGTP and dTTP), Taq polymerase, inorganic salts, and other necessary materials, and a PCR reaction is undertaken in a thermal cycler (Perkin-Elmer Cetus).

In order to analyze the genes amplified in this way, it is appropriate to use electrophoresis. After the amplified gene undergoes electrophoresis in an agarose gel, the DNA is stained with ethidium bromide. The amplified DNA band will then be visible under fluorescent light. After taking photographs of the DNA band, it is also possible to quantify the intensity of each DNA band by scanning such photographs and analyzing the scanned picture with a commercially available system such as Stratascan (Stratagene, La Jolla). Furthermore, after agarose gel electrophoresis, amplified genes can be transblotted onto membranes, and subtypes of specific jun genes can be detected by hybridizing the blotted membranes with labeled probes followed by exposing labeled signals, such as ³²P or chemiluminescence, to either Polaroid films or X-ray films (i.e., doing a Southern blot).

As with the other polynucleotides that can be detected in accordance with the methods of the present invention, a variety of jun sequences can serve as sense or antisense primers for PCR methods or as probes for the detection of DNA or RNA as described herein. A method for the identification of such sequences that are either common to a variety of jun oncogenes or specific to a particular species is provided hereinbelow. In the preferred embodiment of this method of identification, a computer program is used to identify the sequences. Through use of such a program, we have identified a large number of both common and specific primers and probes. Provided as Tables XVII through XXII and XXX through XXXII are various sense sequences, i.e., sequences homologous or approximately homologous to sequences found in an organism, infectious agent, or biological component, which have been identified through the use of such a computer program as being useful as jun gene probes and primers.

All of the sequences listed in these tables are useful within the context of the PCR methods of the present invention. The complementary antisense sequences are also useful as probes and/or PCR primers in certain aspects of the invention. As will be known by those having ordinary skill in the art, for common probes that are similar but not identical to target sequences, stringency conditions can be varied (e.g., by changes in temperature and salinity) so that such probes will hybridize or fail to hybridize with a particular target sequence. Thus, also included within the present invention are sequences that are capable of hybridizing with the same sequences as either the sense sequences listed or their anti-sense counterparts. Additional probes for jun genes include the following:

Common jun gene probes

5'-CCATGTCGATGGGGGACAGCGG-3' SEQ ID NO:741

5'-CTGTTTAAGCTGCGCCACCTG-3' SEQ ID NO:742

5'-GTCTGCGGCTCCTCCTTCAGGG-3' SEQ ID NO:743

5'-CGTGGGTCAAGACTTTCTCΪCTTCϊAGCTG-3' SEQ ID NO:744

Specific probes

B type: 5'-CACTTGGTGGCCGCCAG-3' SEQ ID NO:745

C type: 5'-GAGCATGTTGGCCGTGG-3' SEQ ID NO:746

Human D type: 5'-GATGCGCTCCTGCGTGT-3' SEQ ID NO:747

Mouse D type: 5'-GCCTGTTCTGGCTTTTGAGGG-3' SEQ ID NO:748
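The relationship between sense sequences and their complementary antisense counterparts can be checked directly. The short sketch below, with a hypothetical helper name, confirms that the common probe SEQ ID NO:743 listed above is the reverse complement of the sense primer SEQ ID NO:733; both sequences are copied from the text.

```python
# Complement table for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense (reverse-complement) strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


SENSE_PRIMER = "CCCTGAAGGAGGAGCCGCAGAC"  # SEQ ID NO:733, from the text
COMMON_PROBE = "GTCTGCGGCTCCTCCTTCAGGG"  # SEQ ID NO:743, from the text

if __name__ == "__main__":
    # Prints True: the common probe is the antisense counterpart of the primer.
    print(reverse_complement(SENSE_PRIMER) == COMMON_PROBE)
```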

EXAMPLE 8

Synthesis of DNA fragments (sense and anti-sense primer) and amplification of mouse clones of jun oncogenes.

SEQ ID NO:728 (S943-2) and SEQ ID NO:729 (AS1132-2) were synthesized with a type 380B DNA synthesizer (Applied Biosystems Co.). After treatment with ammonium hydroxide at 55°C overnight, the synthesized oligonucleotides were dried in a Speed-Vac (Savant Co.), and the concentration of each oligonucleotide was adjusted to 1 microgram/ml with water. These oligonucleotides were then stored at -20°C until use.

One microliter (containing approximately 10 ng) of one of the three types of mouse jun clones (jun-B, c-jun, or jun-D, obtained from ATCC) was placed in each of three reaction tubes. To each of these reaction tubes were then added 1 microliter of sense primer SEQ ID NO:728, 1 microliter of anti-sense primer SEQ ID NO:729, 5 microliters of 10X buffer for PCR (Promega), 1 microliter of 25 mM magnesium chloride, 4 microliters of 10 mM dNTP mix, and 0.5 microliters of Taq polymerase (Promega). These were mixed, and water was added up to a total volume of 50 microliters. After adding two drops of mineral oil to each tube, PCR was undertaken using the thermal cycler, model 480 (Perkin-Elmer Cetus). After the reaction mixture was heated at 95°C for 10 minutes, PCR was carried out with the following cycles 30 times: annealing at 55°C for 1.5 minutes, extension at 72°C for 4 minutes, and denaturing at 95°C for 1.5 minutes.
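The reaction set-up above implies a few derived quantities that are easy to verify. The sketch below is illustrative only: the variable and function names are assumptions, any magnesium already present in the 10X buffer is not counted, the dNTP stock is taken at its stated 10 mM without assuming whether that is per nucleotide or total, and the cycling time ignores temperature ramping between steps.

```python
FINAL_VOLUME_UL = 50.0  # total reaction volume stated in the text

# Volumes added to each tube, in microliters, as listed in the text.
components_ul = {
    "jun clone template": 1.0,
    "sense primer": 1.0,
    "anti-sense primer": 1.0,
    "10X PCR buffer": 5.0,
    "25 mM MgCl2": 1.0,
    "10 mM dNTP mix": 4.0,
    "Taq polymerase": 0.5,
}


def final_concentration(stock: float, volume_ul: float,
                        final_volume_ul: float = FINAL_VOLUME_UL) -> float:
    """Concentration after dilution, in the same units as `stock`."""
    return stock * volume_ul / final_volume_ul


# Water needed to bring the reaction up to the final volume.
water_ul = FINAL_VOLUME_UL - sum(components_ul.values())

# Final concentrations of the added magnesium chloride and dNTP mix (mM).
mgcl2_mM = final_concentration(25.0, components_ul["25 mM MgCl2"])
dntp_mM = final_concentration(10.0, components_ul["10 mM dNTP mix"])

# Hold time: 10 min initial heating plus 30 cycles of 1.5 + 4 + 1.5 min.
cycling_min = 10.0 + 30 * (1.5 + 4.0 + 1.5)

if __name__ == "__main__":
    print(f"water to add: {water_ul} microliters")
    print(f"added MgCl2: {mgcl2_mM} mM, dNTP mix: {dntp_mM} mM")
    print(f"programmed hold time: {cycling_min} minutes")
```

On these assumptions the water volume is 36.5 microliters, the added magnesium chloride comes to 0.5 mM, the dNTP mix to 0.8 mM, and the programmed hold time to 220 minutes.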

After PCR, 10 microliters of the sample was mixed with 1 microliter of 10x loading buffer (0.25% bromophenol blue, 0.25% xylene cyanol FF, and 15% Ficoll, Type 400), and electrophoresis was carried out on a 1.5% agarose gel containing 5 microgram/ml ethidium bromide. After electrophoresis, the amplified DNA bands were visualized under ultraviolet light.

As shown in Figure 4, the clones jun-B, c-jun, and jun-D were amplified using SEQ ID NO:728 and SEQ ID NO:729, and a single band for the amplified DNA was observed at the position of approximately 270 bp in each case. This indicates that the above-mentioned set of primers can recognize and amplify all three types of mouse jun oncogenes, jun-B, c-jun and jun-D.

EXAMPLE 9

The effect of pretreatment with EGF on expression of jun oncogenes in human mononuclear leukocytes.

The following protocol describes the effect of using EGF to stimulate the production of jun oncogene mRNA:

(1) Pretreatment of human leukocytes with EGF. 40 ml of phosphate buffered saline (PBS) is added to 20 ml of heparinized human blood and mixed. 10 ml each of this sample is overlayered onto 3 ml of IsoLymph, then centrifuged for 30 minutes at 400 x g. After washing the pellet three times with PBS, the pellet is resuspended in 3 ml of PBS. 1 ml each of this sample is then placed into three tubes (No. 1 to No. 3). 1 ml of PBS is placed in a fourth tube as control. The four tubes are incubated for 10 minutes at 37°C, and EGF (Epidermal Growth Factor) is then placed in tube No. 1 at a final concentration of 30 ng/ml. After 15 minutes, EGF is placed in the same manner in tube No. 2, and incubated for another 5 minutes. After these 5 minutes, RNA is extracted from all 4 tubes simultaneously. Thus, the time period of the pretreatment of human leukocytes by EGF is 20 minutes for tube No. 1, 5 minutes for tube No. 2, and 0 minutes for tube No. 3.
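The staggered additions in step (1) are what produce the three exposure times with a single, simultaneous extraction. A minimal sketch of that bookkeeping follows; the function and variable names are hypothetical, and time zero is taken as the moment EGF is added to tube No. 1.

```python
from typing import Optional

EXTRACTION_TIME_MIN = 20.0  # RNA extracted 15 + 5 minutes after the first addition

# Minute at which EGF is added to each tube; None means EGF is never added.
egf_added_at = {
    "tube No. 1": 0.0,
    "tube No. 2": 15.0,
    "tube No. 3": None,
}


def exposure_minutes(added_at: Optional[float],
                     extracted_at: float = EXTRACTION_TIME_MIN) -> float:
    """Minutes of EGF exposure before extraction (0 if EGF was never added)."""
    if added_at is None:
        return 0.0
    return extracted_at - added_at


exposures = {tube: exposure_minutes(t) for tube, t in egf_added_at.items()}

if __name__ == "__main__":
    for tube, minutes in exposures.items():
        print(f"{tube}: {minutes} minutes of EGF pretreatment")
```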

(2) Extraction of RNA from cells. The above four tubes are taken out and subjected to centrifugation in a microfuge for 10 seconds. Then, the supernatant is discarded and the lysis buffer described below is added to the pellet. The lysis buffer and pellet are thoroughly mixed, after which the mixture is incubated for 30 minutes at 45°C.

Contents of lysis buffer:

10 mM EDTA pH 8.0

0.5% SDS (with bacteria removed by non-bacterial filter)

0.2 M NaCl

DEPC treated water

RNA inhibitor 500 units/ml

Vanadyl Complex 10 mM

Proteinase K 200 microgram/ml

(3) Purification of mRNA. After 5 M NaCl is added to the above cell lysates to obtain a final concentration of 0.5 M, oligo (dT) cellulose (Stratagene Co.) is added, and is reacted for 30 minutes at room temperature. Then, after washing the cellulose in 10 ml binding buffer (20 mM Tris-HCl, pH 7.6, 1 mM EDTA, 0.5 M NaCl) 5 times, 0.35 ml of DEPC treated water is added, and mRNA is eluted from the solid phase. Then, 53 microliters of 2 M sodium acetate and 2.5 times the volume of ethanol are added, and after cooling in dry ice for 20 minutes, this mixture is centrifuged at 15,000 rpm for 20 minutes. After washing the pellet in 75% ethanol one time, the pellet is dried and then dissolved in 10 microliters of DEPC treated water. The resulting mixture contains mRNA from the cells being tested.

(4) Synthesis of cDNA. 50 mM Tris-HCl (pH 8.3), 75 mM KCl, 3 mM magnesium chloride, 10 mM DTT, 0.5 mM dNTP (dATP, dCTP, dGTP, dTTP), 50 micrograms/ml oligo (dT) primer, and 10,000 units/ml reverse transcriptase are added to 10 microliters of the mRNA obtained above to a total volume of 20 microliters, and this undergoes a reaction for one hour at 37°C. After the reaction, 20 microliters of a phenol:chloroform:isoamyl alcohol mixture is added, and the mixture is cooled for 20 minutes in dry ice to precipitate the cDNA. After centrifugation for 20 minutes at 10,000 rpm, the pellet is washed one time in 75% ethanol. Then, after drying, the pellet is dissolved in 20 microliters of autoclaved water to form a cDNA solution, and stored at -20°C.
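The salt adjustments in step (3) are ordinary dilution arithmetic, sketched below. The function names are hypothetical, and several inputs are assumptions rather than values from the text: the 1000 microliter lysate volume is illustrative, the lysate is assumed to start at the 0.2 M NaCl of the lysis buffer, and "2.5 times the volume" of ethanol is read as 2.5 times the combined eluate and sodium acetate volume.

```python
def stock_volume_to_add(sample_volume: float, initial_conc: float,
                        stock_conc: float, final_conc: float) -> float:
    """Volume of stock needed to raise a sample to `final_conc`.

    Solves stock_conc*V_add + initial_conc*V = final_conc*(V + V_add).
    """
    if not initial_conc <= final_conc < stock_conc:
        raise ValueError("need initial_conc <= final_conc < stock_conc")
    return sample_volume * (final_conc - initial_conc) / (stock_conc - final_conc)


def concentration_after_mixing(stock_conc: float, stock_volume: float,
                               sample_volume: float) -> float:
    """Concentration of a solute after adding stock to a solute-free sample."""
    return stock_conc * stock_volume / (stock_volume + sample_volume)


# 5 M NaCl needed to bring an assumed 1000 microliters of lysate
# (0.2 M NaCl from the lysis buffer) to 0.5 M.
nacl_ul = stock_volume_to_add(1000.0, initial_conc=0.2,
                              stock_conc=5.0, final_conc=0.5)

# Sodium acetate concentration after adding 53 microliters of 2 M stock
# to the 0.35 ml (350 microliter) eluate.
acetate_M = concentration_after_mixing(2.0, 53.0, 350.0)

# Ethanol volume, assuming 2.5 volumes of the eluate plus acetate.
ethanol_ul = 2.5 * (350.0 + 53.0)

if __name__ == "__main__":
    print(f"5 M NaCl to add: {nacl_ul:.1f} microliters")
    print(f"sodium acetate: {acetate_M:.3f} M")
    print(f"ethanol: {ethanol_ul} microliters")
```

On these assumptions the sodium acetate ends up at roughly 0.26 M before ethanol is added, and about 1 ml of ethanol is required.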

(5) PCR. One microliter each of sense primer SEQ ID NO:733 and anti-sense primer SEQ ID NO:740 (1 mg/ml) for jun gene amplification are added with 2 microliters of the cDNA solution above. After mixing this with 50 mM KCl, 10 mM Tris-HCl (pH 8.4), 2.0 mM magnesium chloride, 100 micrograms/ml gelatin, and 0.2 mM dNTP, 2.5 units of Taq polymerase are added (final volume is 50 ml). After the reaction mixture is heated at 95°C for 10 minutes, PCR is carried out with the following cycles 30 times: annealing at 55°C for 1.5 minutes, extension at 72°C for 4 minutes, and denaturing at 95°C for 1.5 minutes.

(6) Agarose gel electrophoresis. After completing PCR, 10 microliters of the reacted solution is taken and electrophoresed by the same method as described in example 1. The results show that the leukocytes which did not undergo treatment with EGF (i.e., the sample in tube No. 3, with a treatment time of 0 minutes) have a minimal band for the amplified DNA at the position of about 270 bp. However, the band of amplified DNA increases after 5 minutes of treatment and then returns to basal levels within 20 minutes.

EXAMPLE 10

Effect of pretreatment with PHA on jun gene expression in human leukocytes.

In the place of EGF, PHA (at a final concentration of 10 micrograms/ml) is utilized and the pretreatment times are set at 0 minutes, 5 minutes, 15 minutes, and 30 minutes. The procedures of Example 2 are then followed. As a result, it was found that the band for the amplified DNA at the position of approximately 270 bp was maximized with a PHA pretreatment of 15 minutes, but for longer time periods this band decreased in intensity and therefore in the quantity of DNA it contained.

B. Primers for Detecting G Protein Sequences

Cell surface receptors for hormones and neurotransmitters are known to be coupled to intracellular heterotrimeric GTP-binding proteins (G proteins) composed of α, β and γ subunits. Once receptors are activated by specific ligands, receptor-coupled G proteins transduce signals to intracellular secondary effector systems, such as adenylyl cyclase, phospholipase C, and ion channels.

G proteins are believed to be involved in causing various disease states. For example, a genetic deficiency of Gs proteins is the molecular basis of hereditary osteodystrophy. Pituitary tumors in acromegalic patients have been shown to contain mutant Gs proteins. G proteins are also involved in invasive and metastatic melanoma cells. Rat models of streptozotocin-induced experimental diabetes suggest that the levels of mRNA for various subclasses of Gα proteins are significantly altered from normal control rats. Furthermore, cellular functions of pertussis toxin-sensitive G proteins were shown to be significantly impaired in atherosclerotic porcine coronary arteries, while G protein function in leukocytes of patients with mania was hyperfunctional. However, currently available immunological detection methods (Western blots) and mRNA detection methods (Northern blots) are not sensitive and require a large amount of cellular material, making it difficult to study the role of G proteins in such diseases.

Although G proteins have been analyzed extensively from a biochemical and immunological point of view using various antibodies, antibodies without any cross-reactivity among the various subclasses, or with high species specificity, have been quite difficult to obtain. Therefore, recent experiments have focused on Northern blot analyses to identify G protein-specific mRNA from various tissues or cells in different species. However, Northern blots require experienced handling and protection from RNase contamination, in addition to a large amount of starting cellular material.

In contrast to these conventional methods, PCR technology is more convenient and practically useful, because it requires less material than a Northern blot analysis and has great sensitivity. However, it is difficult for PCR to quantify the amount of DNA or mRNA in starting materials.

We have identified two highly conserved oligonucleotide sequences among five different α subunits of G proteins, SEQ ID NO:528 and SEQ ID NO:731, which can be used as PCR primers.

These sequences are able to amplify the sequences of all the subclasses of G proteins under the same PCR conditions, including G protein sequences obtained from a mixture of rat Gα protein clones and cDNAs derived from various human tissues. Interestingly, the final PCR products obtained using the novel G-protein PCR primers of the present invention reflect the relative composition of each of the subclasses of Gα proteins present in the starting materials. This is probably because the five different Gα protein cDNAs are amplified at a similar rate with a single set of PCR primers under the same PCR conditions. If known mixtures of each of the subclasses of Gα protein clones are assayed together with unknown test samples, as shown in Fig. 6, the relative composition of Gα proteins can be determined fairly precisely. Therefore, the present method is ideal for the characterization of Gα proteins in various tissues and cells.

Performing PCR with the primers of the present invention is also useful in clinical and diagnostic assays in the detection of disease. Since G protein abnormalities have been associated with hereditary diseases, cancer, forms of diabetes, and other diseases, the present PCR primers for detecting and quantifying G proteins can be used to detect these diseases and assess their severity. We have identified Gi-1, Gi-2, Gi-3, Gs and Go using the PCR primers of the present invention (SEQ ID NO:528 and SEQ ID NO:731). Although recent cloning has identified more subclasses of G proteins, all of these newly identified G protein cDNAs showed a high degree of homology to other known G proteins. Therefore, it is expected that the primers of the present invention will amplify these subclasses as well, and that the present PCR technique can also be applied to these new G proteins. Moreover, this PCR method can be utilized to clone unique G protein genes as well.

We designed two 22-mer oligonucleotides, G2 (SEQ ID NO:528) and G4 (SEQ ID NO:731), as PCR primers for the detection of G protein sequences. As shown in Table 3, these oligonucleotides contain sequences which are highly conserved among five different Gα protein cDNAs, having only 0-4 base mismatches per sequence. No mismatch was found in the 4 bases at the 3' end of the G2-sense and G4-antisense sequences. Furthermore, G2 and G4 have no self-complementary sequences more than 3 base pairs in a row (data not shown). In order to analyze whether G2 and G4 are common to all the Gα proteins, but not to other unrelated sequences, a homology search (DNASIS) of the G2 and G4 sequences was carried out against all mammalian sequences in GenBank. As a result, G2 and G4 were found to be common to all the types of Gα proteins and rhodopsins of various species, but less homologous to other unrelated sequences (data not shown).

Table 3. Two consensus oligonucleotides (G2 and G4) among five different cDNAs of G protein α subunits. Lowercase letters mark bases that differ from the consensus sequence; the number of mismatches is given in parentheses.

            G2                            Length (bp)   G4
Consensus   AGCACCATTGTGAAGCAGATGA                      TGTTTGATGTGGGAGGCCAGAG
Gi-1        AGCACaATTGTGAAGCAGATGA (1)    476           TGTTTGAcGTGGGAGGCCAGAG (1)
Gi-2        AGCACCATcGTcAAGCAGATGA (2)    479           TGTTTGATGTGGGtGG CAGcG (3)
Gi-3        AGtACtATTGTGAAaCAGATGA (3)    476           TGTTTGATGTaGGtGGCCAaAG (3)
Gs          AGCACCATTGTGAAGCAGATGA (0)    524           TGTTcGATGTGGGcGGCCAGcG (3)
Go          AGCACCATTGTGAAGCAGATGA (0)    479           TGTTTGAcGUGGaGGCCAGcG (4)
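The mismatch counts in the G2 column of Table 3 can be reproduced with a position-by-position comparison. The following is a minimal Python sketch, not part of the described method; the names `count_mismatches` and `G2_COLUMN` are illustrative assumptions. Only the G2 column is used, because every entry in that column is legible in this text.

```python
# Consensus G2 sequence and the subclass sequences from the G2 column of Table 3.
G2_CONSENSUS = "AGCACCATTGTGAAGCAGATGA"
G2_COLUMN = {
    "Gi-1": "AGCACaATTGTGAAGCAGATGA",
    "Gi-2": "AGCACCATcGTcAAGCAGATGA",
    "Gi-3": "AGtACtATTGTGAAaCAGATGA",
    "Gs": "AGCACCATTGTGAAGCAGATGA",
    "Go": "AGCACCATTGTGAAGCAGATGA",
}


def count_mismatches(consensus, sequence):
    """Count positions where two equal-length sequences differ (case-insensitive)."""
    if len(consensus) != len(sequence):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(consensus.upper(), sequence.upper()))


mismatches = {name: count_mismatches(G2_CONSENSUS, seq)
              for name, seq in G2_COLUMN.items()}
print(mismatches)
```

The lowercase letters in the table are only a visual marker, so the comparison is done on uppercased strings.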

PCR was first carried out at different annealing temperatures ranging from 37°C to 65°C using the λgt10 library of human HL-60 cells. As a result, PCR products were seen only at 45°C and 55°C, with a size of approximately 500 bp (data not shown), which was similar to the theoretical values (476 to 524 bp; see Table 3 above). Therefore, all subsequent PCR was carried out at an annealing temperature of 45°C.

As shown in Fig. 5, cloned rat Gα protein cDNAs (Gi-1, Gi-2, Gi-3, Gs, Go) were successfully amplified using the same set of PCR primers (G2 and G4), giving products with a size of approximately 500 bp in 1.2% agarose gels stained with ethidium bromide. According to the computer analysis (DNASIS), the nucleotide sequences of the amplified PCR products were less homologous among the five Gα proteins, with the percentage of similarity ranging from 76.8% to 47.6%. This indicates that after PCR amplification, each of the components of Gα proteins can be identified by Southern blot analysis, even though the sizes of the PCR products generated are very similar among the five Gα proteins. Therefore, another PCR was carried out in which 35% of the dTTP was replaced with biotin-conjugated dUTP in order to prepare subclass-specific, biotin-labeled probes. Southern membranes were then probed with these biotin-PCR products. As shown in Fig. 5, these biotin-PCR probes were highly specific to each Gα protein subclass with a washing temperature of 65°C. With low-stringency washing, these probes cross-hybridized with other subclasses of Gα proteins (data not shown).

By using the G2 and G4 sequences, all the subclasses of Gα protein cDNA were amplified with PCR when equal amounts of Gi-1, Gi-2, Gi-3 and Go were present in test samples (Fig. 6, lanes 4, 5, 10). However, if all the concentrations of Gα protein cDNA are abundant, Go is less amplified (Fig. 6, lane 10), probably because the number of mismatches between Go and G4 is higher than the number between the other subclasses and the G2 and G4 sequences. If 1 or 2 of the 5 Gα protein cDNAs were present in smaller quantities than the others, the amounts of amplified cDNA were relatively correlated with the starting concentrations of cDNAs (Fig. 6, lanes 1, 2, 3, 8, 9). Furthermore, if the cDNA of 1 of the 5 Gα proteins is more abundant than that of the others, this G protein gene was amplified more than the others (Fig. 6, lanes 6, 7).

Using this PCR method, Gα protein genes were amplified not only from cloned cDNAs, but also from various rat cDNAs (Fig. 7). In λZAP cDNA libraries from rat pituitary glands and cDNA from rat kidney KNRK cells, Go was more abundant than Gs, Gi-2 and Gi-3, and Gi-1 was undetectable (Fig. 7, lanes 1, 2). The λZAP cDNA library of rat intestine contained more Gi-2, Gi-3, Gs and Go (Fig. 7, lane 3).

According to the sequence analyses, the PCR products of rat Gi-1, Gi-2, Gi-3, Gs and Go sequence amplification exhibited a high degree of homology to human G protein cDNAs (see Table 4 below). Furthermore, as shown in Fig. 8, PCR with the pair of G2 and G4 primers could amplify 500 bp DNA from cDNAs of human IM9 and Jurkat cells. Unlike rat cDNAs (Fig. 7), both IM9 and Jurkat cells contained all the subclasses of Gα proteins (Fig. 8). However, Gi-3 was relatively more abundant in IM9 cells, while Gs and Go were more abundant in Jurkat cells than in IM9 cells (Fig. 8).

Table 4. Nucleotide sequence similarity of PCR products between rat and human G proteins.

Example 10 describes a method of amplifying and detecting G proteins with another set of PCR primers of the present invention, G2-S and G4-AS.

EXAMPLE 10

Amplifying G Proteins With PCR Primers

Materials. The cDNAs of rat G protein α subunits (Gi-1, Gi-2, Gi-3, Gs and Go) were provided by Dr. R.R. Reed (Johns Hopkins Univ., MD). λZAP libraries of rat pituitary and intestine were provided by Dr. D.G. Payan (Univ. Calif. San Francisco). Kirsten murine sarcoma virus transformed rat kidney cells (KNRK), human IM9 B-lymphocytes and human Jurkat T-lymphocytes were obtained from American Type Tissue Culture Collection (Rockville, MD). Cell culture media, Superscript (Gibco/BRL, Gaithersburg, MD), reagents for PCR (Promega, Madison, WI), ECL (Amersham, Arlington Heights, IL), Genius, Lumi-Phos 530 (Boehringer-Mannheim, Indianapolis, IN), FastTrack (Invitrogen, San Diego, CA), λgt10 library of human HL-60 cells, biotin-dUTP, alkaline phosphatase-conjugated streptavidin (Clontech, Palo Alto, CA), and dNTP (Pharmacia, Piscataway, NJ) were obtained from the designated suppliers. Other chemicals were purchased from Sigma (St. Louis, MO).

Cell culture. KNRK cells were grown in Dulbecco's modified Eagle's medium containing 10% fetal calf serum, 100 U/ml penicillin and 100 μg/ml streptomycin at 37°C in 5% CO2/95% air. Cells were fed every other day and passaged at 70-90% confluency with 0.1% trypsin in Ca2+- and Mg2+-free saline containing 0.02% EDTA. IM9 and Jurkat cells were grown in RPMI 1640 containing 10% fetal calf serum, 100 U/ml penicillin and 100 μg/ml streptomycin at 37°C in 5% CO2/95% air. Cell viability was more than 90% as assessed by the exclusion of trypan blue.

Primer design. Rat clones of Gα proteins (Gi-1 (RATBPGTPB), Gi-2 (RATBPGTPA), Gi-3 (RATBPGTP), Gs (RATBPGTPD), and Go (RATBPGTPC)) were retrieved from GenBank release 65.0 (HIBIO, Hitachi America, Brisbane, CA). The nucleotide sequence similarity among these clones was then analyzed with the multiple alignment program (DNASIS, Hitachi). We initially identified 7 highly conserved areas among them. These conserved nucleotide sequences were then analyzed against all mammalian sequences in GenBank in order to identify other similar sequences. The designed oligonucleotides G2-S (SEQ ID NO:528) and G4-AS (SEQ ID NO:731) were synthesized by Genosys Biotechnologies (Woodlands, TX), and suspended in water at 100 pg/ml.

PCR. One μl of the template DNA was mixed with 1 mM each of dATP, dGTP, dCTP and dTTP, 1 μl of each PCR primer, 1 μl of 25 mM MgCl2, 5 μl PCR buffer, and 0.5 μl of Taq polymerase (18). PCR was then carried out in a DNA thermal cycler (model 480, Perkin-Elmer Cetus, Norwalk, CT) with 30 cycles of annealing at temperatures ranging from 37°C to 65°C for 1.5 min, 72°C extension for 4 min, followed by 95°C denaturation for 1.5 min. In separate experiments, 35% of the dTTP was replaced with biotin-dUTP in order to prepare biotin-labeled probes.

Southern blot. PCR products were separated by electrophoresis in 1.2% agarose, and stained with ethidium bromide (19). Gels were then depurinated in 0.25 N HCl for 30 minutes and denatured in 0.5 N NaOH containing 1.5 M NaCl for 30 minutes. The gels were then neutralized with 1.0 M Tris, pH 7.6, containing 1.5 M NaCl for 30 minutes. Gels were then placed onto nylon membranes (MagnaGraph, MSI, Westboro, MA) prewetted in 10X SSPE for 10 min, and DNA was transferred onto the membranes by positive pressure at 75 mmHg for 60 minutes (Posiblot, Stratagene, La Jolla, CA). The DNA from the gel was then cross-linked to the membranes with ultraviolet light at 120 mjoules (Stratalinker, Stratagene), and the membranes were incubated with hybridization buffer (ECL) containing 5% blocking reagent (ECL) and 0.5 M NaCl at 4 °C for more than 1 hour. Heat-denatured biotin-labeled PCR probes were then added, and hybridization was continued overnight. The membranes were washed four times for 15 minutes each time with primary wash buffer (0.5x SSPE, 36 w/v% urea, 0.4 w/v% SDS) at 45-65°C, then washed twice for 5 minutes with secondary wash buffer (2x SSPE) at room temperature, and were incubated with the blocking buffer (Genius) for at least 3 hours at room temperature. Alkaline phosphatase-conjugated streptavidin (1:5,000 dilution) was then added, and incubation was continued for an additional 30-60 minutes at room temperature. The membranes were washed four times for 15 minutes with buffer A (100 mM Tris, pH 7.5, 150 mM NaCl) at room temperature, were washed once for 2 minutes with buffer C (100 mM Tris, pH 9.5, 100 mM NaCl, 50 mM MgCl2), and soaked in Lumi-Phos 530 for approximately 1-2 minutes. The membranes were then wrapped with transparency films, and chemiluminescent signals were allowed to expose X-ray films (XAR-5, Kodak, Rochester, NY) for between 10 minutes and 1 hour.

mRNA preparation and cDNA synthesis. The cells were washed with phosphate buffered saline three times, homogenized in lysis buffer (FastTrack), and then incubated at 45°C for 1 hour to eliminate any RNase activity. NaCl concentrations were adjusted to 0.5 M, and an oligo (dT) cellulose tablet was added to the lysis buffer. Incubation was then continued at room temperature for an additional 40 minutes. After the oligo (dT) cellulose was washed with binding buffer (FastTrack) four times, bound mRNA was eluted with DEPC-treated water. Concentrations of mRNA were determined in a spectrophotometer (Hitachi, U-2000, Irvine, CA) at OD260. The first strand cDNA was synthesized from a template mRNA in the presence of 50 mM Tris, pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, 0.5 mM each of dATP, dCTP, dGTP, and dTTP, poly (dT) as a primer, and reverse transcriptase (Superscript) at 37°C for 1 hour. Second strand cDNA was then synthesized in the same tube, containing 25 mM Tris, pH 7.5, 100 mM KCl, 5 mM MgCl2, 10 mM (NH4)2SO4, 0.15 mM β-NAD+, 250 μM each of dATP, dGTP, dCTP and dTTP, 1.2 mM DTT, 65 U/ml DNA ligase, 250 U/ml DNA polymerase, and 13 U/ml RNase H (Superscript) for 2 hours at 16°C. Synthesized cDNAs were then extracted once with an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1), precipitated with ethanol, and resuspended in H2O.

Graphic presentation. Data on Polaroid films and X-ray films were scanned by Stratascan (Stratagene) with optimization of the signal-to-noise ratio, then edited with desktop publishing software (PageMaker, Aldus, Seattle, WA).

As shown in Figure 5, the combination of SEQ ID NO:528 and SEQ ID NO:731 can amplify all of the subtypes of G protein α subunits. It will be evident to one having ordinary skill in the art that a variety of sequences could serve as sense or antisense primers for PCR methods or as probes for the detection of DNA or RNA as described herein. A method for identification of such sequences that are either common to a variety of G proteins or specific to a particular species is provided hereinbelow. In the preferred embodiment of this method of identification, a computer program is used to identify the sequences. Through use of such a program, we have identified a large number of both common and specific primers and probes.

Provided as Tables XXIII through XXIX and XXXIII through XXXVII are various sense sequences identified through the use of such a program that are useful as G protein probes and primers.

All of the sequences listed in these tables are useful within the context of the PCR methods of the present invention. The complementary antisense sequences are also useful in certain aspects of the invention. As will be known to one having ordinary skill in the art, for common probes that are similar, but not identical, to target sequences, stringency conditions can be varied (e.g. by changes in temperature and salinity) so that such probes will hybridize or fail to hybridize with a particular target sequence.

Thus, also included within the present invention are sequences that are capable of hybridizing with the same sequences as either the sense sequences listed or their anti-sense counterparts. Additional probes for G protein also include the following:

Common G protein probes
      5'-CTCTGGCCTCCCACATCAAACA-3'        SEQ ID NO:749
      5'-TCATCTGCTTCACAATGGTGCT-3'        SEQ ID NO:750

Specific probes (Human & Rat common)
Gi-1  5'-GTTTTCACTCTAGTTCTGAGAACATC-3'    SEQ ID NO:751
Gi-2  5'-CAAAGTCGATCTGCAGGTTGC-3'         SEQ ID NO:752
      5'-ATGGTCAGCCCAGAGCCTCCGG-3'        SEQ ID NO:753
Gi-3  5'-GTCTTCACTCTCGTCCGAAGA-3'         SEQ ID NO:754
Gs    5'-GCCTTGGCATGCTCATAGAATT-3'        SEQ ID NO:755
      5'-TTCATCCTCCCACAGAGCCTTG-3'        SEQ ID NO:756
Go    5'-CGCATCATGGCAGAAAGCAG-3'          SEQ ID NO:757
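The relationship between a sense sequence and its complementary antisense counterpart can be checked mechanically. As a minimal Python sketch (the helper name `reverse_complement` is an illustrative assumption), the two common G protein probes listed above turn out to be the reverse complements of the G4 and G2 consensus sequences of Table 3:

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


G2_CONSENSUS = "AGCACCATTGTGAAGCAGATGA"  # G2 consensus from Table 3
G4_CONSENSUS = "TGTTTGATGTGGGAGGCCAGAG"  # G4 consensus from Table 3

print(reverse_complement(G4_CONSENSUS))  # compare with SEQ ID NO:749
print(reverse_complement(G2_CONSENSUS))  # compare with SEQ ID NO:750
```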

C. Primers for Detecting Other Biological Components

Another example of a primer or probe for detecting a biological component is a sequence specific for the mRNA of substance P. Substance P is a neurotransmitter expressed by nerves that are involved in pain receptor pathways. We have discovered that the sequence 5'-TGGTACGCTTTCTCATAAGTCC-3' (SEQ ID NO:758) is very specific for substance P.

Another biological component which can be probed for is the mRNA for the β receptor. The β receptor is a protein located in human nerve tissue. In particular, abnormalities in the β2 receptor have been found to be closely correlated with asthma. Thus, measuring the mRNA for the β2 receptor can be used to determine the pathophysiology of asthma patients, and could also be used to assess the effectiveness of anti-asthma agents. We have found that the sequence 5'-ATGCTGGCCGTGACGCACAGCA-3' (SEQ ID NO:759) is common to a number of human subtypes of β receptor, including β1 (only one mismatch), β2 (no mismatches), and β3 (2 mismatches). Thus, SEQ ID NO:759 can be used to probe for all three of these subtypes of β receptor.

XVI. Identifying PCR Primers and Probes

PCR primers and probes for use in the methods of the present invention can be identified in any way known to the art. Preferably, however, such probes and primers are identified by a computer. We have developed a novel computer system for identifying the sequences to be used in such probes and primers. This system is an automated system which allows the user to calculate and design extremely accurate oligonucleotide probes and PCR primers. The software of the present invention runs under Microsoft Windows® on IBM® compatible personal computers (PC's). This invention allows a researcher to design oligonucleotide probes based on the GenBank database of DNA and mRNA sequences. The present invention further allows examination of probes for specificity or commonality with respect to user-selected target gene sequences. Hybridization strength between a probe and a target subsequence of DNA or mRNA can be estimated through a hybridization strength model. Quantitatively, hybridization strength is given as the melting temperature (Tm).

Two models for estimating hybridization strength are supported by this invention: 1) the Mismatch Model and 2) the H-Site Model. In either case, the user can select the following calculations for each probe, the results of which are then made available for display and analysis: 1) Sequence, Melting Temperature (Tm) and Hairpin characteristics (a hairpin is a nucleotide sequence that is homologous to itself and can "fold back", with one portion of the probe hybridizing to another portion of the same probe); 2) Hybridization to other species within the preparation mixture; and 3) Location and Tm for the strongest hybridizations. The results of the invention's calculations are then displayed on a Mitsuhashi Probe Selection Diagram (MPSD), which is a graphic display of all potential hybridizations between the target mRNA and the probe sequences in the preparation.

The Main OligoProbe DesignStation dialog window controls all user-definable settings in the program. The user is offered a number of options at this window. The File option allows the user to print, print in color, save selected probes, and exit the program. The Preparation option allows the user to open and create preparation (PRP) files. The Models option allows the user to choose between the two hybridization models currently supported by the OligoProbe DesignStation: 1) the H-Site Model and 2) the Mismatch Model.

If the user selects the H-Site Model option, the melting temperature for each probe and the nucleation threshold parameters can be set. The nucleation threshold is the number of base pairs constituting a nucleation site (a subsequence with an exact match). If the user selects the Mismatch Model option, the probe length and mismatches (N) can be set.

Mismatch Model

The Mismatch Model is used for designing DNA and mRNA probes utilizing sequence database information from sources such as GenBank. In this Model, hybridization strength is related only to the number of base pair mismatches between a probe and its target. Generally, the more mismatches a user allows when setting parameters, the more probes will be identified. The Mismatch Model does not take into account the GC content of candidate probes, so there is no calculation of the probe's binding strength.

The basic technologies employed by the Mismatch Model are hashing and continuous seed filtration. Hashing involves the application of an algorithm to the records in a set of data to obtain a symmetric grouping of the records. When using an indexed set of data such as a database, hashing is the process of transforming a record key to an index value for storing and retrieving a record. The Mismatch Model is essentially a quick process for determining exact and inexact matching between DNA and mRNA sequences to support the Mitsuhashi Probe Selection Diagram (MPSD).
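To illustrate what "transforming a record key to an index value" can look like for nucleotide data, the following minimal Python sketch encodes each fixed-length subsequence as an integer using two bits per base and stores its positions in a table. This is an assumed, simplified scheme for illustration only; the names `kmer_to_index` and `build_hash_table` are hypothetical and are not taken from the program described here.

```python
# Two-bit code for each base; a k-base key maps to an integer in [0, 4**k).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_to_index(kmer):
    """Transform a subsequence (the record key) into a table index."""
    index = 0
    for base in kmer:
        index = index * 4 + BASE_CODE[base]
    return index


def build_hash_table(sequence, k):
    """Store, for every k-base subsequence, the positions where it occurs."""
    table = [[] for _ in range(4 ** k)]
    for pos in range(len(sequence) - k + 1):
        table[kmer_to_index(sequence[pos:pos + k])].append(pos)
    return table


table = build_hash_table("ACGTACGT", 4)
print(table[kmer_to_index("ACGT")])
```

The table has 4**k slots, so its memory footprint grows quickly with the key length.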

The algorithm used by the Mismatch Model is based on the Waterman-Pevzner Algorithm (WPALG), which is a computer-based probe selection process. Essentially, this is a combination of new and improved pattern matching processes. See Hume and Sunday (1991, Ref. 4), Landau et al (1986-1990, Refs. 6, 7, 8), Grossi and Luccio (1989, Ref. 3), and Ukkonen (1982, Ref. 14). There are three principal programs that make up the Mismatch Model in this implementation of the invention. The first is designated by the inventors as "k_diff." WPALG uses k_diff to find all locations of matches of length greater than or equal to l (the length l is user-specified) with less than or equal to k mismatches (k is also user-specified) between the two sequences. If a candidate oligonucleotide probe fails to match that well, it is considered unique. k_diff uses hashing and continuous seed filtration, and looks for homologs by searching GenBank and other databases with similar file formats. The technique of continuous seed filtration allows for much more efficient searching than previously implemented techniques.

A seed is defined in this invention to be a subsequence having a length equal to the longest exact match in the worst case scenario. For example, suppose the user selects a probe length (l) of 18, with 2 or fewer mismatches (k). If a match exists with 2 mismatches, then there must be a perfectly matching subsequence of length equal to 6. Once the seed length has been determined, the Mismatch Model looks at all substrings of that seed length (in this example, the seed length would be 6), finds the perfectly matched base pair subsequence of length 6, and then looks to see if this subsequence extends to a sequence of length equal to the user-selected probe length (i.e., 18 in this example). If so, a candidate probe has been found that meets the user's criteria.
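The seed-length argument above is a pigeonhole bound: k mismatches split a probe of length l into at most k + 1 exactly matching runs, so at least one run has length floor(l / (k + 1)), which gives 6 for l = 18 and k = 2. The following Python sketch shows one way such seed filtration could work under simplifying assumptions (ungapped alignment, a plain dictionary as the hash index). It is not the k_diff program itself, and the function names are illustrative.

```python
from collections import defaultdict


def seed_length(probe_length, max_mismatches):
    """Length of the exact run guaranteed by the pigeonhole argument."""
    return probe_length // (max_mismatches + 1)


def find_matches(probe, target, max_mismatches):
    """Return target start positions where probe aligns with <= k mismatches."""
    s = seed_length(len(probe), max_mismatches)
    if s < 1:
        raise ValueError("too many mismatches allowed for this probe length")
    # Hash every seed-length substring of the target to its positions.
    index = defaultdict(list)
    for i in range(len(target) - s + 1):
        index[target[i:i + s]].append(i)
    hits = set()
    # Look up every seed-length substring of the probe, then try to extend.
    for offset in range(len(probe) - s + 1):
        for pos in index.get(probe[offset:offset + s], ()):
            start = pos - offset
            if start < 0 or start + len(probe) > len(target):
                continue
            window = target[start:start + len(probe)]
            if sum(a != b for a, b in zip(probe, window)) <= max_mismatches:
                hits.add(start)
    return sorted(hits)


probe = "AGCACCATTGTGAAGCAG"                  # 18 bases
target = "TTT" + "AGCACAATTGTCAAGCAG" + "GGG"  # same site with 2 mismatches
print(seed_length(18, 2), find_matches(probe, target, 2))
```

Only windows that share an exact seed with the probe are ever compared in full, which is what makes the filtration cheaper than testing every alignment.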

Where the seed size is large (i.e., a long string of unique nucleotides), the program allocates a relatively large amount of memory for the hash table. This invention has an option that allows memory allocation for GenBank entries just once at the beginning of the program, instead of reallocating memory for each GenBank entry. This reduces input time for GenBank entries by as much as a factor of two (2), but this method requires the user to know the maximum GenBank entry size in advance.

A probe is found to hybridize if it has k or fewer mismatches with a target sequence from the database or file searched. The hit extension time for all appropriate parameters of the Mismatch Model has been found by experimentation to be less than thirty-five (35) seconds, except in one case where the minimum probe length (l) was set to 24 and the maximum number of mismatches (k) was set to four (4). This situation would rarely be used in real gene localization experiments because the hybridization conditions are too weak.

H-Site Model

In this embodiment of the invention, the second hybridization strength model is termed the H-Site Model. One aspect of the H-Site Model uses a generalization of an experimental formula to analyze nucleotide binding strength. The basic formula on which this aspect of the model is built is as follows:

Tm = 81.5 - 16.6 (log[Na]) - 0.63 %(formamide) + 0.41 (%(G + C)) - 600 / N

In this formula, log[Na] is the log of the sodium concentration, %(G + C) is the fraction of matched base pairs which are G-C complementary, and N is the probe length. This formula expresses the fact that melting temperature is a function of both probe length and percent GC content. This basic formula has been modified in this invention to account for the presence of mismatches. Each percent of mismatch reduces the melting temperature by an average of 1.25°C (2°C for an AT mismatch, and 4°C for a GC mismatch). This formula is, however, an approximation. The actual melting temperature might differ from this approximation, especially for short probes or probes with a relatively large number of mismatches.
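As a worked illustration, the formula and the average mismatch correction can be written as a small Python function. This is a sketch under stated assumptions, not the program's actual code: the logarithm is taken as base 10, %(G + C) and the mismatch level are passed as percentages, and the salt coefficient is exposed as the hypothetical parameter `salt_coeff` so that it follows the formula as printed here by default (commonly cited forms of this empirical relation add the salt term instead, which a caller can select with `salt_coeff=16.6`).

```python
import math


def basic_tm(gc_percent, probe_length, na_molar, formamide_percent=0.0,
             mismatch_percent=0.0, salt_coeff=-16.6):
    """Estimate Tm (degrees C) from the basic formula plus mismatch correction.

    salt_coeff defaults to the sign printed in the text; log base 10 is assumed.
    """
    tm = (81.5
          + salt_coeff * math.log10(na_molar)
          - 0.63 * formamide_percent
          + 0.41 * gc_percent
          - 600.0 / probe_length)
    # Average correction from the text: 1.25 degrees C per percent of mismatch.
    return tm - 1.25 * mismatch_percent


# A 20-base probe with 50% GC at 1 M sodium (log term vanishes), no formamide.
print(basic_tm(gc_percent=50.0, probe_length=20, na_molar=1.0))
# The same probe with one mismatch in 20 bases (5% mismatch).
print(basic_tm(gc_percent=50.0, probe_length=20, na_molar=1.0,
               mismatch_percent=5.0))
```

At 1 M sodium the salt term is zero, so these two example values do not depend on the sign convention.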

Hybridization strength in the H-Site Model is related to each of the following factors: 1) "binding region"; 2) type of mismatch (GC or AT substitution); 3) length of the probe; 4) GC content of the binding region; and 5) existence of a "nucleation site" (a subsequence with an exact match). The type of mismatch and GC content of the binding region from each sequence contribute to a candidate probe's binding strength. The binding strength of each probe is thereby determined, enabling the user to select an optimal probe. The fundamental assumption of the H-Site Model is that binding strength is mostly determined by a paired subsequence of the probe and target, called the binding region. If the subsequence binding region contains more GC pairs than AT pairs, the binding strength will be higher due to the greater number of hydrogen bonds between G and C bases (three bonds) in comparison to A and T bases (two bonds). Thus, GC rich probes have a higher melting temperature and consequently form stronger hybridizations.

In the H-Site Model the program determines optimal probes, ideally without any mismatches to the target gene. With this model, however, a candidate probe can have more AT mismatches if the sequence is GC rich. The number of allowable AT mismatches in a specific sequence is determined in the present invention program by looking primarily at subsequence regions of the probe and target that match, without penalizing the probe for areas that mismatch. If the mismatches are located at either or both of the ends of the binding region, there is little effect on the overall stability of the base-pairing. Centrally located mismatches in the binding region are much more deleterious, as they significantly lower the binding strength of the probe.

The formula cited above for the melting temperature applies within the binding region. The length of the probe is used to calculate percentages, but all other parameters of the formula are applied to the binding region only. The H-Site Model further assumes the existence of a nucleation site. The length of this nucleation site may be set by the user. Typically, a value of 8 to 10 base pairs is used. To complete the H-Site Model, the binding region is chosen so as to maximize the melting temperature Tm among all regions containing a nucleation site, assuming one exists (otherwise, Tm = 0). The H-Site Model is more complex than the Mismatch Model discussed above in that hybridization strength is modeled as a sum of multiple subsequence contributions, with matches generally providing positive binding energy and mismatches generally providing negative binding energy.
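The selection of a binding region can be sketched as an exhaustive search over sub-regions of an ungapped probe/target alignment. The Python below is a simplified illustration under several assumptions that are not stated in the text: sodium is fixed at 1 M with no formamide (so those terms drop out), N is taken as the binding-region length, percentages are computed against the full probe length, and the average 1.25°C-per-percent mismatch correction is used. The function names are hypothetical.

```python
def longest_exact_run(matches):
    """Length of the longest run of True values (a candidate nucleation site)."""
    best = run = 0
    for is_match in matches:
        run = run + 1 if is_match else 0
        best = max(best, run)
    return best


def region_tm(probe, target, start, end):
    """Tm of region [start, end); percentages use the full probe length."""
    pairs = list(zip(probe[start:end], target[start:end]))
    gc_matches = sum(1 for a, b in pairs if a == b and a in "GC")
    mismatches = sum(1 for a, b in pairs if a != b)
    gc_percent = 100.0 * gc_matches / len(probe)
    mismatch_percent = 100.0 * mismatches / len(probe)
    return (81.5 + 0.41 * gc_percent - 600.0 / (end - start)
            - 1.25 * mismatch_percent)


def best_binding_region(probe, target, nucleation=8):
    """Return (Tm, start, end) maximizing Tm; (0.0, None, None) if no site."""
    if len(probe) != len(target):
        raise ValueError("probe and target window must have equal length")
    matches = [a == b for a, b in zip(probe, target)]
    best = (0.0, None, None)
    for start in range(len(probe)):
        for end in range(start + nucleation, len(probe) + 1):
            # Skip regions that do not contain a nucleation site.
            if longest_exact_run(matches[start:end]) < nucleation:
                continue
            tm = region_tm(probe, target, start, end)
            if tm > best[0]:
                best = (tm, start, end)
    return best


probe = "ACGT" * 5
print(best_binding_region(probe, probe))
print(best_binding_region(probe, "T" + probe[1:]))  # mismatch at the 5' end
```

In the second call the chosen region starts after the terminal mismatch, which mirrors the statement that mismatches at the ends of the binding region have little effect on the estimate.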

The exact binding energies to be used depend only on the matched or mismatched pair. These coefficients may be specified by the user, although in the current version of this invention these coefficients are not explicitly user-selectable, but rather are selected to best fit the hybridization strength formulas developed by Itakura et al (1984, Ref. 5), Bolton and McCarthy (1962, Ref. 2), Benner et al (1973, Ref. 1), and Southern (1975, Ref. 13).

A unique aspect of the H-Site Model is that hybridization strength is determined by the optimal binding region between the candidate probe and binding locus. This binding region is called the hybridization site, or h-site, and is selected so as to maximize overall hybridization strength, so that mismatches outside the binding region do not detract from the estimated hybridization strength. Several other unique features of the H-Site Model include the fact that it is more oriented toward RNA and especially cDNA sequences than DNA sequences, and the fact that the user has control over preparation and environmental variables.

The emphasis on RNA and cDNA sequences allows the user to concentrate on coding regions of genes, rather than necessitating sorting through all of a genomic sequence for the desired probe. The enhanced user control over environmental and preparation variables allows the user to more accurately simulate laboratory conditions that closely correspond with any experiments he or she is conducting. Further, this implementation of the invention does some preliminary preprocessing of the GenBank database to sort out and select the cDNA sequences. This is done by locating a keyword (in this case CDS) in each GenBank record, thereby eliminating any sequences containing introns.

The Mitsuhashi Probe Selection Diagram (MPSD), Figure 14, is a key feature of this invention, as it is a unique way of visualizing the results of the probes designed by the Mismatch and H-Site Models. It is a graphic display of all of the hybridizations of candidate oligonucleotide probes and the target with all sequences in the preparation. Given a gene sequence database and a target mRNA sequence, the MPSD graphically displays all of the candidate probes and their hybridization strengths with all sequences from the database. In the present implementation, each melting temperature is displayed as a different color, from red (highest Tm) to blue (lowest Tm). The MPSD allows the user to see visually the number of false hybridizations at various temperatures for all candidate probes, and the sources of these false hybridizations (with a locus and sequence comparison). A locus may be a specific site or place, or, in the genetic sense, a locus is any of the homologous parts of a pair of chromosomes that may be occupied by allelic genes.
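The CDS keyword preprocessing described above can be pictured as a simple filter over flat-file records. The Python sketch below assumes the usual GenBank flat-file layout, in which a record ends with a line containing only "//" and feature keys such as CDS begin after five leading spaces in the feature table. The two sample records are made up for illustration, and the function names are hypothetical rather than taken from the program described here.

```python
import re

# Two made-up records in GenBank-style flat-file layout (illustration only).
SAMPLE = """LOCUS       DEMO1   30 bp    mRNA
FEATURES             Location/Qualifiers
     CDS             1..30
ORIGIN
        1 atgaccatga ttacgccaag cttgcatgcc
//
LOCUS       DEMO2   12 bp    DNA
FEATURES             Location/Qualifiers
     source          1..12
ORIGIN
        1 acgtacgtac gt
//
"""

# A feature-table line whose feature key is CDS (key assumed to start in column 6).
CDS_LINE = re.compile(r"^ {5}CDS\s")


def split_records(flat_file_text):
    """Split flat-file text into records terminated by '//' lines."""
    records, current = [], []
    for line in flat_file_text.splitlines():
        if line.strip() == "//":
            if current:
                records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def has_cds(record):
    """True if the record's feature table contains a CDS feature key."""
    return any(CDS_LINE.match(line) for line in record.splitlines())


cds_records = [r for r in split_records(SAMPLE) if has_cds(r)]
print(len(cds_records))
```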

These probes may then be used to test for the presence of precursors of specific proteins in living tissues. The oligonucleotide probes designed with this invention may be used for medical diagnostic kits, DNA identification, and potentially continuous monitoring of metabolic processes in human beings. The present implementation of this computerized design tool runs under Microsoft® Windows™ v. 3.1 (made by Microsoft Corporation; Redmond, Washington) on IBM® compatible personal computers (PC's).

The H-Site Model of this invention is unique in that it offers a multitude of information on selected probes and original and distinctive means of visualizing, analyzing and selecting among candidate probes designed with the invention. Candidate probes are analyzed using the H-Site Model for their binding specificity relative to some known set of mRNA or DNA sequences, collected in a database such as the GenBank database. The first step involves selection of candidate probes at some or all of the positions along a given target. Next, a melting temperature model is selected, and an accounting is made of how many false hybridizations each candidate probe will produce and what the melting temperature of each will be. Lastly, the results are presented to the researcher along with a unique set of tools for visualizing, analyzing and selecting among the candidate probes.

This invention is both much faster and much more accurate than the methods that are currently in use. It is unique because it is the only method that can find not only the most specific and unique sequence, but also the common sequences. Further, it allows the user to perform many types of analysis on the candidate probes, in addition to comparing those probes in various ways to the target sequences and to each other. Therefore, it is the object of this invention to provide a practical and user-friendly system that allows a researcher to design both specific and common oligonucleotide probes, and to do this in less time and with much more accuracy than is currently done.

This invention is employed in the form best seen in Figure 11. There, the combination of this invention consists of an IBM® compatible personal computer (PC), running software specific to this invention, and having access to a distributed database with the file formats found in the GenBank database and other related databases.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned hereunder are incorporated herein by reference.

The preferred computer hardware capable of operating this invention consists of a system with at least the following specifications (Figure 11): 1) an IBM® compatible PC, generally designated 1A, 1B, and 1C, with an 80486 coprocessor, running at 33 MHz or faster; 2) 8 or more MB of RAM, 1A; 3) a hard disk 1B with at least 200 MB of storage space, but preferably 1 GB; 4) a VGA color monitor (1C) with graphics capabilities of a size sufficient to display the invention's output in readable format, preferably with a resolution of 1024 x 768; and 5) a 580 MB CD ROM drive 5 (1B of Figure 11 generally refers to the internal storage systems included in this PC, clockwise from upper right, two floppy drives, and a hard disk). Because the software of this invention preferably has a Microsoft® Windows™ interface, the user will also need a mouse 2, or some other type of pointing device.

The preferred embodiment of this invention would also include a laser printer 3 and/or a color plotter 4. The invention may also require a modem (which can be internal or external) if the user does not have access to the CD ROM versions of the GenBank database 8 (containing a variable number of gene sequences 6). If a modem is used, information and instructions are transmitted via telephone lines to and from the GenBank database 8. If a CD ROM drive 5 is used, the GenBank database (or specific portions of it) is stored on a number of CDs. The computer system should preferably have at least the Microsoft® DOS version 5.0 operating system running Microsoft® Windows™ version 3.1. All of the programs in the preferred embodiment of the invention were written in the Borland® C++ (Borland International, Inc.; Scotts Valley, CA) computer language. It should be noted that subsequently developed computers, storage systems, and languages may be adapted to utilize this invention and vice versa.

This inventive computer program is designed to enable the user to access DNA, mRNA and cDNA sequences stored either in the GenBank or in databases with similar file formats. GenBank is a distributed flat file database made up of records, each record containing a variable number of fields in ASCII file format. The stored database itself is distributed, and there is no one database management system (DBMS) common to even a majority of its users. One general format, called the line type format, is used both for the distributed database and for all of GenBank's internal record keeping. All data and system files and indexes for GenBank are kept in text files in this line type format.

The primary GenBank database is currently distributed in a multitude of files or divisions, each of which represents the genome of a particular species (or at least as much of it as is currently known and sequenced and publicly available). The GenBank provides a collection of nucleotide sequences as well as relevant bibliographic and biological annotation. Release 72.0 (6/92) of the GenBank CD distribution contains over 71,000 loci with a total of over ninety-two (92) million nucleotides. GenBank is distributed by IntelliGenetics, of Mountain View, CA, in cooperation with the National Center for Biotechnology Information, National Library of Medicine, in Bethesda, MD.

1. Overall Description of the OligoProbe DesignStation

a. General Theory

The intent of this invention is to provide one or more fast processes for performing exact and inexact matching between DNA sequences to support the Mitsuhashi Probe Selection Diagram (MPSD) discussed below, and other analysis with interactive graphical analysis tools. Hybridization strength between a candidate oligonucleotide probe and a target subsequence of DNA, mRNA or cDNA can be estimated through a hybridization strength model. Quantitatively, hybridization strength is given as the melting temperature (Tm). Currently, two hybridization strength models are supported by the invention: 1) the Mismatch Model and 2) the H-Site Model.

b. Inputs

i. Main OligoProbe DesignStation Dialog Window

The Main OligoProbe DesignStation dialog window, Figure 12, controls all user-definable settings. This window has a menu bar offering five options: 1) File 10; 2) Preparation 80; 3) Models 70; 4) Experiment 90; and 5) Help 50. The File 10 option allows the user to print, print in color, save selected probes, and exit the program. The Preparation 80 option allows the user to open and create preparation (PRP) files.

The Models 70 option allows the user to choose between the two hybridization models currently supported by the OligoProbe DesignStation: 1) the H-Site Model 71 and 2) the Mismatch Model 75. If the user selects the H-Site Model 71 option, the left hand menu of Figure 12C is displayed and the user sets the following model parameters: 1) the melting temperature Tm 72 for which probes are being designed (i.e., the melting temperature that corresponds to a particular experiment or condition the user desires to simulate); and 2) the nucleation threshold 73, which is the number of base pairs constituting a nucleation site. If the user selects the Mismatch Model 75 option, the right hand menu of Figure 12C is displayed and the user sets the following model parameters: 1) probe length 76, which is the number of base pairs in probes to be considered; and 2) mismatch N 77, which is the maximum number of mismatches constituting a hybridization. Computation of the user's request takes longer with the H-Site Model if the threshold 73 setting is decreased, but longer with the Mismatch Model if the number of mismatches K 77 is increased. In addition, for both Model options the user chooses the target species 11 DNA or mRNA for which probes are being designed and the preparation 12, a file of all sequences with which hybridizations are to be calculated. A sample of a target species file is shown in Figure 37 (humbjunx.cds), while a sample of a preparation file is shown in Figure 38 (junmix.seq). Each of these inputs is represented by a file name and extension in standard DOS format. In the target species and preparation fields, the file format follows the GenBank format with each of the fields having a default file extension. Pressing the "OK" button 91 (Figure 12C) will initiate processing, while pressing the "Cancel" button 93 will stop the processing.

The Experiment 90 option and the Help 50 option are expansion options not yet available in the current implementation of the invention.

c. Processing

Figures 13A and 13B are flow charts of the overall OligoProbe DesignStation Program, illustrating its sequence and structure. Generally, the main or "control" program of the OligoProbe DesignStation performs overall maintenance and control functions. This program, as illustrated in Figures 13A and 13B, accomplishes the general housekeeping functions 51, such as defining global variables. The user-friendly interface 53 carries out the user-input procedures 55, the file 57 or database 59 access procedures, calling of the model program 62 or 63 selected by the user, and the user-selected report 65 or display 67, 69, 71 and 73 features. Each of these features is discussed in more detail in later sections, with the exception of the input procedures, which involve capturing the user's set-up and control inputs.

d. Outputs

i. The Mitsuhashi Probe Selection Diagram Window

The Mitsuhashi Probe Selection Diagram (MPSD), Figure 14, is a key feature of the invention as it is a unique way of visualizing the results of the program's calculations. It is a graphic display of all of the hybridizations of probes with the target oligonucleotides in the preparation. Specifically, given a nucleotide sequence database and a target mRNA, the MPSD graphically displays all of the candidate probes and their hybridization strengths with all sequences from the nucleotide database. The MPSD allows the user to visualize the number of false hybridizations at various temperatures for all candidate probes, and the sources of these false hybridizations (with a loci and sequence comparison).

For each melting temperature selected, a graph showing the number of hybridizations for each probe is displayed. In the preferred embodiment, the graphs are color coded. In this implementation of the invention, the color red 123 identifies the highest melting temperature and the color blue 124 identifies the lowest melting temperature. Each mismatch results in a reduction in the Tm value. The melting temperature is also a function of probe length and percent GC content. Within the window, the cursor 125 shape is changed from a vertical line bisecting the screen to a small rectangle when the user selects a particular probe. The current probe is defined to be that probe under the cursor position (whether it be a line or a rectangle) in the MPSD window. More detailed information about the current probe is given in the ProbeInfo and MatchInfo windows, discussed below. Clicking the mouse button 2 once at the cursor 125 selects the current probe. Clicking the mouse button 2 a second time deselects the current probe. Moving the cursor across the screen causes the display to change and reflect the candidate probe under the current cursor position. The x-axis 110 of the MPSD, Figure 14, shows the candidate probes' starting positions along the given mRNA sequence. The user may "slide" the display to the left or right in order to display other probe starting positions. The y-axis 115 of the MPSD displays the probe specificity, which is calculated by the program.

The menu options 116, 117, 118, 119, and 120 available to the user while in the MPSD, Figure 14, are displayed along a menu bar at the top of the screen. The user can click the mouse 2 on the preferred option to briefly display the option choices, or can click and hold the mouse button on the option to allow an option to be selected. The user may also type a combination of keystrokes in order to display an option in accordance with well-known computer desk top interface operations. This combination usually involves holding down the ALT key while pressing the key representing the first letter of the desired option (i.e., F, P, M, E or H).


The File option 116 allows the user to specify input files and databases. The Preparation option 117 allows the user to create a preparation file summarizing the sequence database. The Models option 118 allows the user to specify the hybridization model (i.e., H-Site or Mismatch) and its parameters. The Experiment option 119 and the Help option 120 are not available in the current implementation of this invention. These options are part of the original Main OligoProbe DesignStation dialog window, Figure 12.

Areas on the graphical display of the MPSD, Figure 14, where the hybridizations for the optimal probes are displayed are lowest and most similar, such as shown at 121, indicate that the particular sequence displayed is common to all sequences. Areas on the graphical display of the MPSD where the hybridizations for the optimal probes are displayed are highest and most dissimilar, such as shown at 122, indicate that the particular sequence displayed is extremely specific to that particular gene fragment. The high points on the MPSD show many loci in the database to which the candidate probe will hybridize (i.e., many false hybridizations). The low points show few hybridizations, at least relative to the given database. Specifically, the sequence shown at 121 would reflect a probe common to all of the gene fragments tested, such that this probe could be used to detect each of these genes. The sequence shown at 122 would reflect a probe specific to the particular gene fragment, such that this probe could be used to detect this particular gene and no others.

ii. The ProbeInfo and MatchInfo Window

The combined ProbeInfo and MatchInfo Window, Figure 15, displays detailed information about the current candidate probe. The upper portion of the window is the ProbeInfo window, and the lower portion is the MatchInfo window. The ProbeInfo window portion displays the following types of information: the target locus (i.e., the mRNA, cDNA, or DNA from which the user is looking for probes) is displayed at 131, while the preparation used for hybridizations is displayed at 132. In the example shown in Figure 15, the target locus 131 is the file named HUMBJUNX.CDS, which is shown as being located on drive F in the subdirectory MILAN. The preparation 132 is shown as being the file designated JUNMIX.PRP, which is also shown as being located on drive F in the subdirectory MILAN. The JUNMIX.PRP preparation in this example is a mixture of human and mouse jun loci.

The current and optimal probe's starting position is shown at 135. The current candidate oligonucleotide probe is defined at 136, and is listed at 137 as having a length of 21 bases. The melting temperature for the probe 136 as hybridized with the targets is shown in column 140. The melting temperature for the optimal probe is given as 61.7 degrees C at 138. The ProbeInfo Window, Figure 15, also displays hairpin characteristics of the probe at 139. In the example shown, the ProbeInfo Window shows that there are four (4) base pairs involved in the worst hairpin, and that the worst hairpin has a length of one (1) (see Figure 15, at 139). The MatchInfo Window portion displays a list of hybridizations between the current probe and species within the preparation file, including hybridization loci and hybridization temperatures. The hybridizations are listed in descending order by melting temperature. The display shows the locus with which the hybridization occurs, the position within the locus, and the hybridization sequence.

In the MatchInfo window portion, the candidate probe 136 is shown at 150 as hybridizing completely with a high binding strength. This is because the target DNA is itself represented in the database in this case, so the candidate probe is seen at 150 to hybridize with itself (a perfect hybridization). The locus of each hybridization from the preparation 132 is displayed in column 141, while the starting position of each hybridization is given in column 142. The calculated hybridizations are shown at 145.

iii. The ProbesEdit Window

The ProbesEdit Window, Figure 16, is a text editing window provided for convenient editing and annotation of OligoProbe DesignStation text file output. It is also used to accumulate probes selected from the MPSD, Figure 14, by mouse button 2 clicks. Standard text editing capabilities are available within the ProbesEdit Window. The user may accumulate selected probes in this window (see 155 for an example) and then save them to a file (which will bear the name of the preparation sequence with the file extension of "prb" 156, or may be another file name selected by the user). A sample of this file is shown in Figures 16A and 16B.

iv. Miscellaneous Output

The present embodiment of this invention also creates two output files, currently named "test.out" and "test1.out", depending upon which model the user has selected. The first file, "test.out", is created with both the Mismatch Model and the H-Site Model. This file is a textual representation of the Mitsuhashi Probe Selection Diagram (MPSD). It breaks the probe sequence down by position, length, delta Tm, screensN, and the actual probe sequence (i.e., nucleotides). An example of this file created by the Mismatch Model is shown in Figure 30, and an example created by the H-Site Model is shown in Figure 34a. The second file, "test1.out", is created only by the H-Site Model. This file is a textual representation of the ProbeInfo and MatchInfo window that captures all hybridizations, along with their locus, starting position, melting temperature, and possible other hybridizations. A partial example of this file is shown in Figure 34b (10 pages out of a total of 190 pages created by the H-Site Model).

2. Description of the Mismatch Model Program

a. Overview

In this invention, one of the hybridization strength models is termed the Mismatch Model (see Figure 12 for selection of this model). The basic operation of this model involves the techniques of hashing and continuous seed filtration, as defined earlier but described in more detail below. The essence of the Mismatch Model is a fast process for doing exact and inexact matching between nucleotide sequences to support the Mitsuhashi Probe Selection Diagram (MPSD). There are a number of modules in the present implementation of the Mismatch Model contained in this invention, the most significant of which are shown in the flow chart in Figure 17 and in more detail in Figures 18 through 28. The main k_diff module shown in the flow chart in Figure 18 is a structured program that provides overall control of the Mismatch Model, calling various submodules that perform different functions.

b. Inputs

The user-selected input variables for this model are minimum probe length 76 (which is generally from 18 to 30) and maximum number of mismatches 77 (which generally is from 1 to 5). These inputs are entered by the user in the Main OligoProbe DesignStation Dialog Window, Figure 12C.

c. Processing

i. k_diff Program

Some terms of art need to be defined before the processing performed by this module can be explained. A hash table basically is an array or table of data. A linked list is a classical data structure which is a chain of linked entries and involves pointers to other entry structures. Entries in a linked list do not have to be stored sequentially in memory, as is the case with elements contained in an array. Usually there is a pointer to the list associated with the list, which is often initially set to point to the start of the list. A pointer to a list is useful for sequencing through the entries in the list. A null pointer (i.e., a pointer with a value of zero) is used to mark the end of the list.
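By way of illustration only, the linked-list terms defined above can be sketched in Python as follows. The class name Entry and the function name positions are hypothetical names chosen for this sketch and do not appear in the k_diff program; Python's None stands in for the null pointer that marks the end of the list.

```python
class Entry:
    """One entry in a linked list: a stored position plus a pointer to the next entry."""

    def __init__(self, position, next_entry=None):
        self.position = position
        self.next = next_entry  # None plays the role of the null pointer


def positions(head):
    """Sequence through the list from its head until the null pointer is reached."""
    found = []
    node = head
    while node is not None:
        found.append(node.position)
        node = node.next
    return found


# A three-entry chain; the entries need not be adjacent in memory.
head = Entry(3, Entry(10, Entry(17)))
```

Calling positions(head) walks the chain in order and stops at the terminating null pointer.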

As the flow charts in Figures 17 and 18 illustrate, the general process steps and implemented functions of this model can be outlined as follows:

Step 1: First, create a hash table and linked list from the query (Figure 17, hashing module 222).

Step 2: Next, while there are still GenBank entries available for searching (Figure 17, assembly module 230):

Step 2a: Read the current GenBank entry (record) sequence of user-specified length (Figure 17, seqload module 232), or read the current sequence (record) from the file selected by the user (Figure 17, read1 module 234).

Step 2b: For the current sequence, for each position of the sequence from the first position (or nucleotide) to the last position (or nucleotide) (incrementing the position number once each iteration of the loop) (Figure 17, q_colour module 242),

Step 2c: Set the variable dna_hash equal to the hash of the current position of the current sequence (Figure 17, q_colour module 242).

Step 2d: While not at the end of the linked list for dna_hash (Figure 17, q_colour module 242),

Step 2e: Set the query_pos equal to the current position of dna_hash in the linked list (Figure 17, q_colour module 242), and

Step 2f: Extend the hit with the coordinates (query_pos, dna_pos) (Figure 17, hit_ext module 244).

Step 2g: If there exists a k_mismatch in the current extended hit (Figure 17, colour module 246), then

Step 2h: Print the current hit (Figure 17, q_colour module 242), and repeat from Step 2.

As this illustrates, there are three (3) basic looping or iteration processes with functions being performed based on variables such as whether the GenBank section end has been reached (the first "WHILE" loop, Step 2), whether the end of the current DNA entry has been reached (the "FOR" loop, Step 2b), and whether the end of the dna_hash linked list has been reached (the second "WHILE" loop, Step 2d). A "hit" will only be printed if there are k_mismatches in the current extended hit.
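The three nested loops outlined above may be sketched, under stated assumptions, in Python. This is not the k_diff source: the function names build_index and k_mismatch_hits are hypothetical, a dictionary of position lists stands in for the hash table and linked list, and the hit extension of Steps 2f and 2g is simplified to a direct mismatch count over every probe-length window that contains the seed.

```python
from collections import defaultdict


def build_index(query, seed):
    """Step 1: map every seed-length substring of the query to its start positions."""
    index = defaultdict(list)
    for q in range(len(query) - seed + 1):
        index[query[q:q + seed]].append(q)
    return index


def k_mismatch_hits(query, entries, probe_len, max_mismatch):
    """Return sorted (entry name, query start, dna start) hits with at most max_mismatch mismatches."""
    # Seed size: ceiling of (probe_len - max_mismatch) / (max_mismatch + 1).
    seed = -(-(probe_len - max_mismatch) // (max_mismatch + 1))
    index = build_index(query, seed)
    hits = set()
    for name, dna in entries:                          # Step 2: each database entry
        for d in range(len(dna) - seed + 1):           # Step 2b: each DNA position
            for q in index.get(dna[d:d + seed], ()):   # Steps 2c-2e: query positions sharing the seed
                # Steps 2f-2g (simplified): test each window that contains the seed.
                for shift in range(probe_len - seed + 1):
                    qs, ds = q - shift, d - shift
                    if qs < 0 or ds < 0:
                        continue
                    if qs + probe_len > len(query) or ds + probe_len > len(dna):
                        continue
                    window_q = query[qs:qs + probe_len]
                    window_d = dna[ds:ds + probe_len]
                    if sum(a != b for a, b in zip(window_q, window_d)) <= max_mismatch:
                        hits.add((name, qs, ds))       # Step 2h: record the hit
    return sorted(hits)
```

For example, with the made-up query "ACGTACGTTGCA", a single made-up entry ("X", "GGACGTACCTTGCAGG"), a probe length of 8 and one allowed mismatch, every 8-base window of the query aligns to the entry two positions downstream with exactly one mismatch, so five hits are reported.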

Figures 18 through 28 illustrate the functions of each of the modules of the present embodiment of this invention, all of which were generalized and summarized in the description above. Figure 18, which outlines the main "k_diff" module, shows that this module is primarily a program organization and direction module, in addition to performing routine "housekeeping" functions, such as defining the variables and hash tables 251, checking if the user-selected gene sequence file is open 252, extracting needed identification information from the GenBank 253, and ensuring valid user input 254. This module also performs a one-time allocation of memory for the gene sequences, and allocates memory for hit information, hashing, hybridization and frequency length profiles and output displays, 255 & 256. The "k_diff" module also initializes or "zeros out" the hashing table, the linked hashing list and the various other variables 257 in preparation for the hashing function. In addition, this module forms the hash tables 258 and extracts a sequence and finds the sequence length 259.

One of the most important functions performed by the "k_diff" module is to define the seed (or kernel or k_tuple) size. This is done by setting the variable k_tuple equal to (min_probe_length - max_mismatch_#)/(max_mismatch_# + 1), Figure 18 at 265. Next, if the remainder of the aforementioned division is not equal to zero 266, then the value of the variable k_tuple is incremented by one 267. The resulting value is the size of the seed. The module then reads the query 268 and copies the LOCUS name 269 for identification purposes (a definition of the term locus is given earlier in the specification). The "k_diff" module Figure 18 also calls the "assembly" module 260, writes the results to a file 261a, plots the results 261b (discussed below), calculates the hairpin characteristics 262 (i.e., the number of base pairs and the length of the worst hairpin) and the melting temperature (Tm) for each candidate probe 263, and saves the results to a file 264.
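The seed size calculation amounts to an integer division rounded up. A minimal Python sketch follows; the function name seed_size and its parameter names are assumptions made for illustration. The rounding reflects a pigeonhole argument: a probe of length L with at most k mismatches has at least L - k matching positions split into at most k + 1 uninterrupted runs, so at least one run is this long.

```python
def seed_size(min_probe_length, max_mismatches):
    """Integer quotient of (length - mismatches) / (mismatches + 1), incremented if a remainder is left."""
    quotient, remainder = divmod(min_probe_length - max_mismatches, max_mismatches + 1)
    return quotient + 1 if remainder else quotient
```

For a minimum probe length of 21 and at most 2 mismatches, the division 19 / 3 gives 6 with remainder 1, so the seed size is 7. For a length of 23 with 2 mismatches, 21 / 3 divides evenly and the seed size is also 7.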

The screen graphs are plotted 261b by converting the result values to pixels, filling a pixel array and performing a binary search into the pixel array. Next, given the number of pixels per probe position and which function is of interest to the user (i.e., the three mismatch match numbers), the program interpolates the values at the value of (pixelsPerPositionN-1) and computes the array of pixel values for drawing the graph. These values are then plotted on the MPSD.

The "hashing" module, Figure 19, performs hashing of the query. In other words, it creates the hash table and linked list of query positions with the same hash. The variable hash_table[i] equals the position of the first occurrence of hash i in the query. If i does not appear in the query, hash_table[i] is set to zero.

The "tran" module, Figure 20, is called by the "hashing" module 271, and performs the hashing of the sequence of k_tuple (kernel or seed) size. If the k_tuple exists (i.e., its length is greater than zero), the variable uns is set equal to uns*ALF + p 291. The variable p represents the digit returned by the "let_dig" module Figure 21 that represents the nucleotide being examined. ALF is a constant that is set by the program in this implementation to equal four. The query pointer is then incremented, while the size of k_tuple (the seed) is decremented 292. This process is repeated until the sequence of k_tuple has been entirely hashed. Then the "tran" module returns the variable current_hash 293 to the "hashing" module Figure 19.

The "let_dig" module, Figure 21, is called by the "tran" module 291, and transforms the nucleotides represented as the characters "A", "T", "U", "G" and "C" in the GenBank and the user's query into numeric digits for easier processing by the program. This module transforms "a" and "A" into "0" 301, "t", "T", "u" and "U" into "1" 302, "g" and "G" into "2" 303, and "c" and "C" into "3" 304. If the character to be transformed does not match any one of those listed above, the module returns "-1" 305.

The "hashing" module, Figure 19, then calls the "update" module 272, Figure 22, which updates the hash with a sliding window (i.e., it forms a new hash after shifting the old hash by "1"). The remainder of old_hash divided by power_1 is calculated 311 (a modulus operation), the remainder is multiplied by ALF 312 (i.e., four), and then the digit representing the nucleotide is added to the result 313. The "update" module then returns the result 314 to the "hashing" module Figure 19. If the current hash has already occurred in the query, the program searches for the end of the linked list for the current hash 273 and marks the end of the linked list for the current hash 274. If the current hash has not already occurred in the query, the program puts the hash into the hash table 275. The resulting hash table and linked list are then returned to the "k_diff" module, Figure 18 at 258.
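The interplay of the "let_dig", "tran", "update" and "hashing" modules may be sketched in Python as follows. The sketch rests on several assumptions that the text does not confirm: power_1 is taken to be ALF raised to the power (k_tuple - 1), a dictionary of position lists replaces the hash table and linked list, positions are counted from one so that zero remains free to mean "absent", and characters that "let_dig" would reject with -1 are not handled.

```python
ALF = 4  # alphabet size stated in the text


def let_dig(ch):
    """Map a nucleotide character to a digit; -1 for any other character."""
    return {"a": 0, "t": 1, "u": 1, "g": 2, "c": 3}.get(ch.lower(), -1)


def tran(seq, start, k_tuple):
    """Hash k_tuple characters from scratch: uns = uns*ALF + p for each digit p."""
    uns = 0
    for ch in seq[start:start + k_tuple]:
        uns = uns * ALF + let_dig(ch)
    return uns


def update(old_hash, next_ch, k_tuple):
    """Slide the window by one: drop the leading digit, shift, append the new digit."""
    power_1 = ALF ** (k_tuple - 1)  # assumed meaning of power_1
    return (old_hash % power_1) * ALF + let_dig(next_ch)


def hashing(query, k_tuple):
    """Map each hash value to the 1-based query positions where it occurs."""
    table = {}
    if len(query) < k_tuple:
        return table
    current = tran(query, 0, k_tuple)
    table.setdefault(current, []).append(1)
    for pos in range(1, len(query) - k_tuple + 1):
        current = update(current, query[pos + k_tuple - 1], k_tuple)
        table.setdefault(current, []).append(pos + 1)
    return table
```

The sliding update avoids rehashing all k_tuple characters at every position: hashing "ACGT" from scratch and then updating with "A" gives the same value as hashing "CGTA" directly.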

The "assembly" module, Figure 23, extracts sequences from the GenBank and performs hit locating and extending functions. This module is called by the "k_diff" module Figure 18 at 260 if the user has chosen to use the database to locate matches. The output from the "assembly" module (Figure 23) tells the user that the section of the database searched contains E number of entries 321 of S summary length 322 with H number of hits 323. Further, the program tells the user that the number of considered 1-tuples equals T 324. The entry head line is also printed 326.

The "seqload" module, Figure 24, is called by the "k_diff" module Figure 18 at 259 once the query hash table and linked list have been formed by the "hashing" module Figure 19. The "seqload" module Figure 24 checks to see if the end of the GenBank file has been reached 327, and, if not, searches until a record is found with LOCUS in the head-line 328. Next, the LOCUS name is extracted 329 for identification purposes, and the program searches for the ORIGIN field in the record 330.

The program then extracts the current sequence 331 from the GenBank and performs two passes on each sequence. The first is to determine the sequence length 332 and allocate memory for each sequence 333, and the second pass is to read the sequence into the allocated memory 334. Since the sequences being extracted can contain either DNA nucleotides or protein nucleotides, the "seqload" module can recognize the characters "A", "T", "U", "G", and "C". The bases "A", "T", "G" and "C" are used in DNA sequences, while the bases "A", "U", "G" and "C" are used in RNA and mRNA sequences. The extracted sequence is then positioned according to the type of nucleotides contained in the sequence 335, and the process is repeated. Once the end of the sequence has been reached, the "seqload" module returns the sequence length 336 to the "k_diff" module Figure 18.
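The record scanning performed by the "seqload" module (locate LOCUS, extract the name, locate ORIGIN, read the bases) may be sketched in Python as below. The function name read_records is hypothetical, the sketch reads each record in a single pass rather than the two passes described above, and it assumes the usual GenBank flat-file convention that a line beginning with "//" ends a record.

```python
def read_records(lines):
    """Yield (locus name, sequence) pairs from GenBank-style flat-file lines."""
    name, in_origin, bases = None, False, []
    for line in lines:
        if line.startswith("LOCUS"):
            # The locus name is the second whitespace-separated token of the head-line.
            name, in_origin, bases = line.split()[1], False, []
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            if name is not None:
                yield name, "".join(bases)
            name, in_origin, bases = None, False, []
        elif in_origin:
            # Keep only nucleotide letters; position numbers and blanks are skipped.
            bases.extend(ch for ch in line if ch.upper() in "ATUGC")


# A made-up record used only to exercise the sketch.
sample = [
    "LOCUS       DEMO1        12 bp    mRNA",
    "DEFINITION  illustrative record, not taken from GenBank.",
    "ORIGIN",
    "        1 acgtacgttg ca",
    "//",
]
```

Passing sample to read_records yields one record named DEMO1 whose sequence is the twelve bases on the ORIGIN line, with the leading position number and the blanks removed.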

If the user has chosen to use one or more files to locate matches, rather than the database, the "read1" module, Figure 25, rather than the "seqload" module Figure 24, is called by the "k_diff" module Figure 18. The "read1" module, Figure 25, reads the sequence from the user specified query file 341 and allocates memory 342. This module also determines the query length 343, extracts sequence identification information 344, determines the sequence length 345, transforms each nucleotide into a digit 346 by calling the "let_dig" module Figure 21, creates the query hash table 347 by calling the "dig_let" module Figure 26, and closes the file 348 once everything has been read in.

First, the "read1" module Figure 25 allocates space for the query 342. To do this, the "ckalloc" module, Figure 25 at 342, is called. This module allocates space and checks whether this allocation is successful (i.e., is there enough memory or has the program run out of memory). After allocating space, the "read1" module Figure 25 opens the user-specified file (the "ckopen" module, Figure 25 at 349, is called to ensure that the query file can be successfully opened 349), determines the query length 343, locates a record with LOCUS in the head-line and extracts the LOCUS name 344 for identification purposes, locates the ORIGIN field in the record and then reads the query sequence from the file 341.

Next, the sequence length is determined 345, memory is allocated for the sequence 342, and the sequence is read into the query file 350. If the string has previously been found, processing is returned to 344. If not, then each character in the query file is read into memory 350.

The characters are transformed into digits 346 using the "let_dig" module, Figure 21, until a valid digit has been found, and then the hash table containing the query is set up 347 using the module "dig_let", Figure 26, which transforms the digits into nucleotides represented by the characters "A" 371, "T" 372, "G" 373, "C" 374, and "X" 375 as a default. If the end of the file has not been reached, processing is returned to 344. If it has, the file is closed 348 and the query is then returned to the "read1" module Figure 25 at 347.

The "q_colour" module, Figure 17 (Figure 23 at 325), is called by the "assembly" module Figure 23 after the current sequence has been extracted from the GenBank. The "q_colour" module Figure 27 performs the heart of the Mismatch Model process in that it performs the comparison between the query and the database or file sequences. If the module finds that there exists a long (i.e., greater than the min_hit_length) extended hit, it returns a "1" to the "assembly" module Figure 23. Otherwise, the "q_colour" module, Figure 27, returns a "0".

In the "q_colour" module, Figure 27, all DNA positions are analyzed in the following manner. First, the entire DNA sequence is analyzed 391 to see whether each position is equal to zero 392 (i.e., whether it is empty or the sequence is finished). If it is not equal to zero 393, the "q_colour" module Figure 27 calls the "tran" module, Figure 20 described above, which performs the hashing of k_tuples. The "tran" module Figure 20 calls other modules which transform the nucleotides represented by characters into digits for easier processing by the program and then updates the hash with a sliding window. If the position is equal to zero, the current hash position is set to new_hash after one shift of old_hash 390 by calling the "update" module Figure 22.

If the nucleotide at the current_hash position is equal to zero, processing is returned to 391. If not, the query position is set equal to (nucleotide at current hash position - 1). Next, the "q_colour" module Figure 27 looks for the current_hash in the hash table 394. If the current k_tuple does not match the query 395, then the next k_tuple is considered 395, and processing is returned to 391. If the current k_tuple does match the query, then the program checks the hit's (i.e., the match's) vicinity 396 by calling the "hit_ext" module, Figure 28, to determine if the hit is weak. The inventors have found that if the code for the module "hit_ext" is included within the module "q_colour", rather than being a separate module utilizing the parameter transfer machinery, 25% of CPU time can be saved. The "hit_ext" module Figure 28 determines the current query position in the hit's vicinity 421, determines the current DNA position in the hit's vicinity 422, and creates the list of mismatch positions (i.e., the mismatch_location_ahead 423, the mismatch_location_behind 423 and the kernel match location). If the hit is weak 424, the "hit_ext" module Figure 28 returns "0" to the "q_colour" module Figure 27. If the hit has a chance to contain 425, the module returns "1" to the "q_colour" module Figure 27. A hit has a chance to contain, and is therefore not considered weak, if the mismatch_location_ahead - the mismatch_location_behind is greater than the min_hit_length. If not, it is a short hit and is too weak.
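The weak-hit test performed in the hit's vicinity may be sketched in Python as follows. This is a simplified stand-in for the "hit_ext" module rather than its actual code: the helper name extend, the parameter names, and the use of a greater-than-or-equal comparison against min_hit_length are assumptions made for illustration. The sketch extends the seed in each direction until the allowed number of mismatches would be exceeded, and treats the hit as weak when even this maximal span is shorter than the minimum hit length.

```python
def extend(query, dna, q, d, step, max_mismatch):
    """Count positions reachable from (q, d) in direction step with at most max_mismatch mismatches."""
    length = mismatches = 0
    while 0 <= q < len(query) and 0 <= d < len(dna):
        if query[q] != dna[d]:
            mismatches += 1
            if mismatches > max_mismatch:
                break
        length += 1
        q += step
        d += step
    return length


def hit_ext(query, dna, q, d, k_tuple, max_mismatch, min_hit_length):
    """Return 1 if the seed at (q, d) could lie inside a long enough hit, else 0 (weak hit)."""
    ahead = extend(query, dna, q + k_tuple, d + k_tuple, +1, max_mismatch)
    behind = extend(query, dna, q - 1, d - 1, -1, max_mismatch)
    return 1 if behind + k_tuple + ahead >= min_hit_length else 0
```

Because any window with at most max_mismatch mismatches must fit inside the span measured this way, the filter can only discard seeds that cannot belong to a qualifying hit. With the made-up sequences "ACGTACGTTGCA" and "GGACGTACCTTGCAGG", a seed of 4, one allowed mismatch and a minimum length of 8, the seed at query position 0 and DNA position 2 survives, while the off-diagonal seed at query position 4 and DNA position 2 spans only 7 positions and is rejected as weak.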

If the "hit_ext" module Figure 28 tells the "q_colour" module Figure 27 that the hit was not a weak one, then the "q_colour" module determines whether the current hit is long enough 398 by calling the "colour" module Figure 29. The "colour" module Figure 29 performs query_colour modification by the hit data, starting at pos_query and described by mismatch_location_ahead and mismatch_location_behind. After the variables to be used in this module are defined, variable isw_print (which is the switch indicating the hit length) is initialized to zero 430. The cur_length is then set equal to the length of the extending hit 431 (mismatch_location_behind[i] + mismatch_location_ahead[j] - 1). Next, if cur_length is greater than or equal to the min_hit_length 432 (i.e., the minimum considered probe size), the hit is considered long and isw_print is set equal to two 433. The value of isw_print is then returned 434 to the "q_colour" module Figure 27.
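The length test performed by the "colour" module can be sketched in Python as follows. The function name and the passing of the two mismatch distances as plain integers are assumptions; only the arithmetic and the values 0 and 2 of isw_print come from the description above.

```python
def hit_length_switch(mismatch_location_behind, mismatch_location_ahead, min_hit_length):
    """Return isw_print: 2 for a long hit, 0 otherwise (sketch of steps 430-434)."""
    isw_print = 0  # step 430: the switch starts at zero
    # Step 431: length of the extended hit.
    cur_length = mismatch_location_behind + mismatch_location_ahead - 1
    # Steps 432-433: a hit at least as long as the minimum probe size is "long".
    if cur_length >= min_hit_length:
        isw_print = 2
    return isw_print


if __name__ == "__main__":
    print(hit_length_switch(5, 6, 10))
```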

If the length of the extending hit is longer than the min_hit_length, the hit is considered long 399. Otherwise, the hit is considered short. If the hit is short, nothing more is done to the current hit and the module begins again. If, on the other hand, the hit is considered long 399, the "q_colour" module Figure 27 prints the current extended hit 400. The current extended hit can be printed in ASCII, printed in a binary file, or printed to a memory file. The "q_colour" module Figure 27 then repeats until the end of the linked list is reached.

d. Outputs

The output of the k_diff program may be either a binary file containing the number of extended hits and the k_mismatch hit locations (see Figure 30), or the output may be kept in memory without writing it to a file. See Section 1(d)(iv) for more detail.

3. Description of the H-Site Model Program

a. Overview

In this invention, the second hybridization strength model is termed the H-Site Model (see Figure 12 for user selection of this model). The formula used in the H-Site Model is an expression of the fact that melting temperature Tm is a function of both probe length and percent of GC content.

This basic formula has been modified in this invention to account for the presence of mismatches. Each percent of mismatch reduces the melting temperature Tm by an average of 1.25 degrees (2 degrees C for an AT mismatch, and 4 degrees C for a GC mismatch).

In addition, this implementation of the invention does some preliminary preprocessing of the GenBank database to sort out and select the cDNA sequences. This is done by locating a keyword (in this case CDS) in each GenBank record.

There are a number of modules in the present embodiment of the H-Site Model contained in this invention. Each step of the processing involved in the H-Site Model is more fully explained below, and is accompanied by detailed flow charts.

b. Inputs

There are two basic user-selected inputs for the H-Site Model (see Figure 12C): 1) the melting temperature Tm 22 for which probes are being designed (i.e., the melting temperature that corresponds to a particular experiment or condition the user desires to simulate); and 2) the nucleation threshold 23, which is the number of base pairs constituting a nucleation site. The user is also required to select 1) the target species 11 gene sequence(s) (DNA, mRNA or cDNA) for which probes are being designed; 2) the preparation 12 of all sequences with which hybridizations are to be calculated; and 3) the probe output file 13. The preparation file is the most important, as discussed below.

c. Organization of the H-Site Model Program

The current implementation of the H-Site Model program of this invention is distributed between five files containing numerous modules. The main file is designated by the inventors as "ds.cpp" in its uncompiled version. This file provides overall control to the entire OligoProbe DesignStation invention. It is divided into six sections. Section 0 defines and manipulates global variables. Section 1 controls general variable definition and initialization (including the arrays and memory blocks). It also reads and writes buffers for user input selections, and constructs multi buffers.

Section 2 sets up and initializes various "snippet" variables (see section below for a complete definition of the term snippet), converts base pair characters to a representation that is 96 base pairs long and to ASCII base pair strings, and performs other sequence file manipulation such as comparing snippets. This section also reads the sequence format file, reads base pairs, checks for and extracts sequence identification information (such as ORIGIN and LOCUS) and filters out sequences beginning with numbers. Section 3 involves preparation file manipulation. This section performs the preprocessing on the PRP file discussed above. It also merges and sorts the snippet files, creates a PRP file and sorts it, and outputs the sorted snippets. Next, this section streams through the PRP file.

Section 4 contains the essential code for H-Site Model processing (see Figures 31 through 33 for details, discussed below). Streams are set up, and then RIBI comparisons are performed for hybridizations (see file "ribi.cpp" for definitions of RIBI search techniques). Next, probes are generated, binding strength is converted to melting temperature, and hybridizations are calculated and stored (including hybridization strength). Lastly, other H-Site calculations are performed.

Section 5 is concerned with formatting and presenting diagnostic and user file (test.out, test1.out, and test2.out files) output. This section also handles the graphing functions (the MPSD diagram in particular). In addition, this section calculates the hairpin characteristics for the H-Site Model candidate probes.

The second H-Site Model file, designated as "ds.h", defines data variables and structures. Section 1 of this file concerns generic data structures (including memory blocks and arrays, and file inputs and outputs). Section 2 defines the variables and structures used with sequences, probes and hybridizations. Section 3 defines variables and structures concerned with protocols (i.e., function prototypes, graphing, etc.).

The third H-Site Model file, designated as "funcdoc.txt", contains very detailed documentation for this implementation of the H-Site Model program. Numerous variables and structures are also defined. The flow of the program is clearly shown in this file. The fourth H-Site Model file, designated as "ribi.h", handles the sequence comparisons. The fifth and last H-Site Model file, designated as "ribi.cpp", performs internal B-Tree indexing. Definitions of Red-black Internal Binary Index (RIBI) searching are found in this file. Definitions are also included for the concepts keyed set, index, binary tree, internal binary index, paths, and red-black trees. Implementation notes are also included in this file.

d. Processing

Implementation of the H-Site Model in this invention is done in three stages. First, the invention creates the preparation (PRP) file, which contains all relevant information from the sequence database. This is the preprocessing stage discussed above. Next, the target is prepared by the program. Lastly, the invention calculates the MPSD data using the PRP file and target sequence to find probes.

i. Creation of the Preprocessed Preparation File

Figure 31. Step 1: The program first opens the sequence database for reading into memory 461, 462. Step 2: Next, as sequence base pairs are read in 462, "snippets" are saved to disk 463, along with loci information. A snippet is a fixed-length subsequence of a preparation sequence. The purpose of snippets is to allow the user to examine a small portion of a preparation sequence together with its surrounding base pairs. Snippets in the implementation of this invention are 96 base pairs long (except for snippets near the end or beginning of a sequence, which may have fewer base pairs). The "origin" of the snippet is in position 40. For snippets taken near the beginning of a sequence, some of the initial 40 bases are undefined. For snippets near the end of a sequence, some of the final 55 bases are undefined. Snippets are arranged in the preparation file (PRP) in sorted order (lexicographical order beginning at position 40). In this invention, the term "lexicographical order" means a preselected order, such as alphabetical, numeric or alphanumeric. In order to conserve space, snippets are only taken at every 4th position of the preparation sequence.
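The snippet layout described in Step 2 can be sketched in Python. This is an illustration under stated assumptions: positions are counted from zero, undefined bases are filled with the placeholder letter "N", and the function and constant names are hypothetical rather than taken from "ds.cpp".

```python
SNIPPET_LEN = 96   # snippet length in base pairs
ORIGIN = 40        # position of the snippet "origin" within the snippet
STRIDE = 4         # snippets are taken at every 4th sequence position
PAD = "N"          # assumed placeholder for undefined bases


def make_snippets(sequence):
    """Return (position, snippet) pairs sorted lexicographically from the origin."""
    snippets = []
    for pos in range(0, len(sequence), STRIDE):
        start = pos - ORIGIN
        bases = [
            sequence[i] if 0 <= i < len(sequence) else PAD
            for i in range(start, start + SNIPPET_LEN)
        ]
        snippets.append((pos, "".join(bases)))
    # Sort on the part of the snippet that begins at the origin.
    snippets.sort(key=lambda item: item[1][ORIGIN:])
    return snippets


if __name__ == "__main__":
    for pos, snippet in make_snippets("ACGGTCAATGCA" * 10)[:3]:
        print(pos, snippet[ORIGIN:ORIGIN + 12])
```

With the origin at index 40 of a 96-base snippet, 40 bases precede the origin and 55 follow it, which agrees with the counts of undefined bases given in the text.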

Step 3: The snippets are merge sorted 464 to be able to search quickly for sequences which pass the "screen", discussed below. Step 4: The merged file is prepended with identifiers for the sources of the snippets 465. This is done to identify the loci from which hybridizations arise.

ii. Target Preparation

Figure 32. Step 1: The target sequence file is opened 471 and read into memory 472. For each position in the target mRNA, the probe defined at that starting position is the shortest subsequence starting at that position whose hybridization strength is greater than the user specified melting temperature Tm. Typically, the probes are of length 18 to 50. Step 2: Four lists of "screens" are formed 473, 474, 475, each shifted by one base pair 475 to correspond to the fact that snippets are only taken at every four base pairs. A screen is a subsequence of the target mRNA of length equal to the screening threshold specified by the user. The screens are then indexed 476 and sorted in memory 477.

iii. Calculation of the MPSD Data

Figure 33. Step 3: This step is the heart of the process. Step 3a: The program streams through the following five items in sync, examining them in sequential order: the snippet file and the four lists of screens 481-484. Step 3b: Each snippet is compared to a screen 485. Step 3c: If the snippet does not match, whichever stream is behind is advanced 486 and Step 3b is repeated. If the snippet does match, Step 4 is performed.
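The four shifted screen lists of the target preparation stage and the synchronized streaming of Steps 3a through 3c can be sketched together in Python. The sketch rests on several assumptions: each list holds the screens starting at target positions shift, shift + 4, shift + 8 and so on; snippet keys are the sorted snippet strings read from the origin onward; each screen string occurs only once in the list being streamed; and all function names are hypothetical. For brevity it streams the snippets against one screen list rather than all four at once.

```python
def build_screen_lists(target, screen_len, stride=4):
    """Return `stride` sorted lists of (screen, start) pairs, one per shift."""
    lists = []
    for shift in range(stride):
        starts = range(shift, len(target) - screen_len + 1, stride)
        lists.append(sorted((target[p:p + screen_len], p) for p in starts))
    return lists


def stream_matches(snippet_keys, screens):
    """Walk two sorted streams in sync and report (snippet index, screen start)."""
    matches = []
    i = j = 0
    while i < len(snippet_keys) and j < len(screens):
        screen, start = screens[j]
        prefix = snippet_keys[i][:len(screen)]
        if prefix == screen:
            matches.append((i, start))  # a match: Step 4 would run here
            i += 1
        elif prefix < screen:
            i += 1                      # the snippet stream is behind
        else:
            j += 1                      # the screen stream is behind
    return matches


if __name__ == "__main__":
    print(build_screen_lists("ACGTACGTAC", 4))
    print(stream_matches(["AAC", "ACG", "ACT", "GGA"], [("AC", 0), ("GG", 5)]))
```

Because both streams are sorted, each is read once from start to end, which is why the merge sort of the snippets in the preprocessing stage pays off here.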

Step 4: If a snippet and a matching screen were found in Step 3b 487, the hybridization strength of the binding between the sequence containing the snippet and all of the probes containing the screen is calculated (see Step 5). Double counting is avoided by doing this only for the first matched screen containing the probe. Each pair of bases is examined and assigned a numerical binding strength. An AT pair would be assigned a lower binding strength than a GC pair because AT pairs have a lower melting temperature Tm. The process is explained more fully below at Step 5b.

Step 5: The hybridization strengths between sequence and all the probes containing it are calculated using a dynamic programming process. The process is as follows: Step 5a: Begin at the position of the first probe containing the given screen but not containing any other screens which start at an earlier position and also match the sequence. This is done to avoid double counting. Two running totals are maintained: a) boundStrength, which represents the hybridization strength contribution which would result if the sequence and probe were to match exactly for all base pairs to the right of the current position, and b) unboundStrength, which represents the strength of the maximally binding region. Step 5b: At each new base pair, the variable boundStrength is incremented by 71 if the sequence and probe match and the matched base pair is GC 489, incremented by 30 if the matched base pair is AT 490 (i.e., this number is about 42.25% of the first number 71), and decremented by 74.5 if there is not a match 488 (i.e., this number is about 5% larger than the first number 71). Step 5c: If the current boundStrength exceeds the current unboundStrength 491 (which was originally initialized to zero), a new binding region has been found, and unboundStrength is set equal to boundStrength 492. Step 5d: If the current boundStrength is negative, boundStrength is reset to zero 493. Step 5e: If the current position is at the end of a probe, the results (the hybridization strengths) are tallied for that probe. Step 5f: If the current position is at the end of the last probe containing the screen, the process stops.
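Steps 5b through 5d amount to tracking a running sum that is reset whenever it becomes negative, together with the largest value that sum has reached. A minimal Python sketch follows, assuming the probe and sequence are already aligned position by position and that a "match" means the same letter at the aligned position; the function name and the use of plain strings are illustrative and are not taken from "ds.cpp". The per-probe tallying of Steps 5e and 5f is omitted.

```python
GC_MATCH = 71.0    # increment for a matched GC base pair
AT_MATCH = 30.0    # increment for a matched AT base pair
MISMATCH = 74.5    # decrement for a mismatched position


def max_binding_strength(probe, sequence):
    """Return the strength of the maximally binding region (unboundStrength)."""
    bound_strength = 0.0
    unbound_strength = 0.0
    for probe_base, seq_base in zip(probe, sequence):
        # Step 5b: update the running total for this base pair.
        if probe_base == seq_base:
            bound_strength += GC_MATCH if probe_base in "GC" else AT_MATCH
        else:
            bound_strength -= MISMATCH
        # Step 5c: record a new best binding region.
        if bound_strength > unbound_strength:
            unbound_strength = bound_strength
        # Step 5d: a negative total cannot help any later region.
        if bound_strength < 0:
            bound_strength = 0.0
    return unbound_strength


if __name__ == "__main__":
    print(max_binding_strength("GCAGC", "GCTGC"))
```

In the second test case below, a single mismatch is bridged because the flanking GC matches outweigh the penalty; two adjacent mismatches drive the total negative, so the regions on either side are scored separately.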

Step 6: A tally is kept of the number and melting temperature of the matches for each candidate probe, and the location of the best 20 candidates, using a priority queue (reverse order by hybridization strength number) 494. Step 7: A numerical "score" is kept for each preparation sequence by tallying the quantity exp (which can be expressed as ∑e ) for each match 495, where Tm is the melting temperature for the "perfect" match, the probe itself. In other words, the probe hybridizes "perfectly" to its target.

Step 8: Hairpins are calculated by first calculating the complementary probe. In other words, the order of the bases in the candidate probe is reversed (CTATAG to GATATC), and complementary base pairs are substituted (A for T, T for A, G for C, and C for G, changing GATATC to CTATAG in the above example). Next, the variable representing the maximum hairpin length for a candidate probe is initialized to zero, as is the variable representing a hairpin's distance. For each offset, the original candidate probe and the complementary probe just created are then aligned with each other and compared. The longest match is then found. If any two matches have the same length, the one with the longest hairpin distance (i.e., the number of base pairs separating the match) is then saved.
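The hairpin search of Step 8 can be sketched in Python. Aligning the probe with its reversed complement at a given offset is the same as pairing position i of the probe with position m - i for a fixed sum m, so the sketch loops over m instead of sliding two strings. Two points are assumptions rather than statements from the text: only pairs with i < m - i are counted, so that the two arms of a hairpin do not overlap, and the hairpin distance is taken to be the number of bases lying between the innermost pair. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(probe):
    """Reverse the probe and substitute complementary bases."""
    return "".join(COMPLEMENT[base] for base in reversed(probe))


def longest_hairpin(probe):
    """Return (length, distance) of the longest self-complementary match."""
    n = len(probe)
    best = (0, 0)  # maximum hairpin length and its distance, both start at zero
    for m in range(1, 2 * n - 2):       # one value of m per alignment offset
        run = 0
        i = max(0, m - (n - 1))
        while 2 * i < m:                # keep the two arms from overlapping
            if COMPLEMENT[probe[i]] == probe[m - i]:
                run += 1
                distance = m - 2 * i - 1
                # Longer matches win; ties go to the longer distance.
                best = max(best, (run, distance))
            else:
                run = 0
            i += 1
    return best


if __name__ == "__main__":
    print(reverse_complement("CTATAG"))
    print(longest_hairpin("GGGGAAAACCCC"))
```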

Step 9: The preparation sequences are then sorted 490 and displayed in rank order, from best to worst 497. Step 10: The resulting MPSD, which includes all candidate probes, is then displayed on the screen. Step 11: The best 20 matches are also printed or displayed in rank order, as the user requests 497.

e. Outputs

The outputs of the H-Site Model are fully described in Section 1(d)(iv), above, and illustrated in Figures 14 through 16. Samples of the two output files created by the H-Site Model are shown in Figures 34A and 34B.

4. Description of the Mitsuhashi Probe Selection Diagram Processing

Once the Mitsuhashi Probe Selection Diagram (MPSD) data has been calculated by the H-Site Model program (see stage three and Figure 33, discussed above), it is necessary to convert this data to pixel format and plot a graph. An overview of this process is shown in Figure 35. First, the program calculates the output (x,y) ranges 500. Next, these are converted to a logarithmic scale 501. The values are then interpolated 502, and a bitmap is created 503. Lastly, the bitmap is displayed on the screen 504 in MPSD format (discussed above in section 3(c)(5)). A sample MPSD is shown in Figure 14.

5. Description of the MatchInfo Window Processing

The ProbeInfo and MatchInfo windows are discussed in great detail in Section 1(c)(ii), and a sample of these windows is shown in Figure 15. An overview of the processing involved in creating the MatchInfo portion of the window is given in the flow chart in Figure 36. First, as the user moves the MPSD cursor 570 (seen as a vertical line bisecting the MPSD window), the program updates the position of the candidate probe shown under that cursor position 521. Next, based upon the candidate probe's position, the program updates the sequence 522 and hairpin information 523 for that probe. This updated information is then displayed in an updated match list 524, shown in the MatchInfo window.

XVII. Detection Kits

The present invention can preferably be embodied in a kit for the detection of an organism, infectious agent, or biological component contained in a biological sample. Such a kit can take a variety of forms, as will be apparent to those of skill in the art.

In one embodiment, the present invention comprises a kit for identifying the presence of a particular species of fungus in a biological sample. Such a kit includes at least one specific polynucleotide probe and a common probe as described above. In a preferred embodiment, a plurality of specific probes are included, and such probes are preferably immobilized on one or more solid supports. In a more preferred embodiment, each of the plurality of specific probes is immobilized on a different solid support. For example, a microtiter plate having a plurality of wells can have a different polynucleotide probe immobilized to each well. If such probes contain sequences specific to the ribosomal RNA of different species of fungi, the kit can be used to test a single biological sample for the presence of a plurality of fungi.

An example of such a kit using microtiter wells as the solid supports is shown schematically in Figure 6. In this figure, the presence of the ribosomal RNA of a particular species of fungus, in this case Candida albicans, is detected in a well 50, which is darkened to show a positive result. No probe has been immobilized to well 52 in order to provide a negative control. Well 54, on the other hand, has immobilized to its walls a common probe, such as SEQ ID NO:1, which is complementary to a sequence present in the ribosomal RNA of the fungi being tested for. This provides a positive control, since well 54 should show a positive result whenever a positive result is detected in well 56 or any of the other wells in this embodiment of the kit.

Well 54 also provides a means of detecting the presence of fungi which do not contain the specific ribosomal RNA sequences being probed by the specific polynucleotide probes. For example, if a mutant strain of Candida albicans which does not contain the specific sequence complementary to the probe used in well 56 is present in a sample tested with the kit illustrated in Figure 6, well 56 would not show a positive result. Well 54 would, however, indicate the presence of a fungal pathogen, as long as the mutant strain did not contain a mutation in the common sequence detected in well 54 which interfered with the hybridization of that sequence to the common probe immobilized to the walls of well 54.

In addition to specific probes, common probes, and solid supports, other elements can also be included in the present kit. For example, appropriate buffers for hybridizing DNA or RNA to the probes in the kit can be included. Labels, as described above, can also be incorporated which are attached to a common probe.

In an alternative embodiment, the kit can include PCR primers such as those previously described. Such a kit could comprise a common primer and a specific primer, 2 common primers, or 2 specific primers identified through the method of the present invention. Other components, such as a reverse transcriptase, a DNA polymerase like Taq polymerase and dNTPs can also be included in this embodiment of the present kit.

The foregoing embodiments of the kit of the present invention can be adapted to perform the methods of the present invention that involve PCR as well. In this embodiment, the kit additionally includes a reverse transcriptase and a polymerase, preferably a DNA polymerase that has significant polymerase activity at temperatures above 50 °C, such as Taq DNA polymerase.

XVIII. Conclusion

All references cited herein are hereby explicitly incorporated by this reference thereto.

Although the invention has been described with reference to certain particular exemplary embodiments of various aspects, these embodiments are intended only to illustrate and not to limit the present invention. Accordingly, the scope of the present invention is to be determined upon reference to the appended claims.

Table I. Common probe (Com-392)* for Fungi

GenBank Sequence Com-392

Species name I.D. No. GAGGGAGCCTGAGAAACG

* Com-392 is identical among 107 different rRNAs registered in GenBank

Table II. Common probe (Com-419)* for Fungi

GenBank Sequence Com-419

Species name I.D. No. TCCAAGGAAGGCAGCAGG

I. Fungi

Pneumocystis carinii PMC16SRR1 21

Cryptococcus neoformans CPCDA 22

Coccidiodes immitis COIDA 23

Blastomyces dermatitidis BLODA 24

* Com-419 is identical among 123 different rRNAs registered in GenBank

Table III. Common probe (Com-1205)* for Fungi

GenBank Sequence Com-1205

Species name I.D. No. ACGGGGAAACTCACCAGG

I. Fungi

Pneumocystis carinii PMC16SRR1 41

Cryptococcus neoformans CPCDA 42

Coccidiodes immitis COIDA 43

Blastomyces dermatitidis BLODA 44

* Com-1205 is identical among 42 different rRNAs registered in GenBank

Table IV. Common probe (Com-1544)* for Fungi

GenBank Sequence Com-1544

Species name I.D. No. TCG7GCTGGGGATAGAGC

I. Fungi

Pneumocystis carinii PMC16SRR1 61

Cryptococcus neoformans CPCDA 62

Coccidiodes immitis COIDA 63

Blastomyces dermatitidis BLODA 64

* Com-1544 is identical among 40 different rRNAs registered in GenBank

Table V. Probes for Pneumocystis carinii (Cari-685)

GenBank Sequence Cari-685

Species name I.D. No. GCGCAACTGATCCTTCCC

I. Fungi

Pneumocystis carinii PMC16SRR1 81

Cryptococcus neoformans CPCDA 82 -T--CGGC---G AT

Coccidiodes immitis COIDA 83 A-CTGGT G--A

Blastomyces dermatitidis BLODA 84 A-CTGGT G--A

Table VI. Probes for Pneumocystis carinii (Cari-1056)

GenBank Sequence Cari-1056

Species name I.D. No. GGCGATGTTTTTTTCTTGACTCG

I. Fungi

Pneumocystis carinii PMC16SRR1 104

Cryptococcus neoformans CPCDA 105 C--CA---AAATA--T

Coccidiodes immitis COIDA 106 G--CA---AAATT--T

Blastomyces dermatitidis BLODA 107 G--CA---AAATT--T

N.tabacum


Table VII. Probes for Aspergillus (Asp-693)

GenBank Sequence Asp-693

Species name I.D. No. CTTCTGGGGAACCTCATGG

I. Fungi

Pneumocystis carinii PMC16SRR1 127 AACAC- A CCA

Cryptococcus neoformans CPCDA 128 AACAC- A CCA

Coccidiodes immitis COIDA 129 CT

Blastomyces dermatitidis BLODA 130 --C- -A-G--C

II. Highest homologous sequence in GenBank

Penicillium notatum sub PNNDA 147

Human HLA-B-AT3 HUMBAT3A 148 GC-A-

Rat olfactory protein RATOLFPRON 140 --C CA

Table VIII. Probes for Aspergillus (Asp-1046)

GenBank Sequence Asp-1046

Species name I.D. No. GGCGGTGTTTCTATGATGACC

I. Fungi

Pneumocystis carinii PMC16SRR1 150 A-----T-T-CT T

Cryptococcus neoformans CPCDA 151 TTGTTG G—CG--

Coccidiodes immitis COIDA 152 ACGT---G TT--TTG

Blastomyces dermatitidis BLODA 153 -A---G---CT-

Aspergillus fumigatus ASNDA 154

fumigatus ASNRR5SS 155

fumigatus ASNRRSSB 156

Candida albicans YSASRSUA 157 CCTTCG-GC---T TT

albicans YSAL16S 158 CCTTCG-GC---T TT

lusitaniae YSASRRNAA 159 C---CA-T-AG G

lusitaniae YSASRSUE 160 C---CA-T-AG G

kefyr YSASRSUB 161 --T T-C-T

krusei YSASRRNAC 162 -A C-AC G-A-G-

krusei YSASRSUD 163 -A C-AC G-A-G-

tropicalis YSASRRNAB 164 TCTTCG-AC---T TT

tropicalis YSASRSUG 165 TCTTCG-AC---T TT

viswanathii YSASRSUH 166 CCTTCG-GC---T TT

parapsilosis YSASRSUF 167 A--G---AT-C-AATTT

guilliermondii YSASRSUC 168 TCTTTGAGC---T TT

glabrata YS5CRRNAS 169 --T T-T-AG

II. Highest homologous sequence in GenBank

Nanochlorum eucaryotum Moraxella sp. Mspl E. coli cvaA,B operon


Table IX. Probes for Blastomyces (Blast-694)

GenBank Sequence Blast-694

Species name I.D. No. TCCTGGGAAGCCCCATG

I. Fungi

Pneumocystis carinii PMC16SRR1 173 GT---T--T TTA-

Cryptococcus neoformans CPCDA 174 -G---AA GAC

Coccidiodes immitis COIDA 175 -T G-A---T---

Blastomyces dermatitidis BLODA 176

Aspergillus fumigatus ASNDA 177 -T G-A--T-

fumigatus ASNRR5SS 178 -T G-A--T-

fumigatus ASNRRSSB 179 -T G-A--T-

Candida albicans YSASRSUA 180 -T T ATT-A

albicans YSAL16S 181 GT---T-- -TTA-

lusitaniae YSASRRNAA 182 GT---T-- -TTA-

lusitaniae YSASRSUE 183 GT---T-- -TTA-

kefyr YSASRSUB 184 GT---T-- -TTA-

krusei YSASRRNAC 185 GT---T-- -TTA-

krusei YSASRSUD 186 GT---T-- -TTA-

tropicalis YSASRRNAB 187 GT---T-- ---TTA-

tropicalis YSASRSUG 188 GT---T-- ---TTA-

viswanathii YSASRSUH 189 GT---T-- ---TTA-

parapsilosis YSASRSUF 190 GT---T-- ---TTA-

guilliermondii YSASRSUC 191 GT---T-- ---TTA-

glabrata YS5CRRNAS 192 G T- -GGTCC

II. Highest homologous sequence in GenBank

Avian influenza FLAHA5 193 A--

Mouse perlecan MUSPERPA 194 C--G--

Mouse basement membrane MUSPGCBHA 195 C--G--

Table X. Probes for Blastomyces (Blast-1046)

GenBank Sequence Blast-1046

Species name ID No. GACGGGGTTCTTATGATGACC

I. Fungi

Pneumocystis carinii PMC16SRR1 196 CG C GAGG---T

Cryptococcus neoformans CPCDA 197 -GT-AAA GAT G

Coccidiodes immitis COIDA 198 CAA---TGA--A---

Blastomyces dermatitidis BLODA 199

-G---T---TC

-G---T---TC

-G---T---TC

AT G

AT G

AG G

AG G

T

-G-A-G- -G-A-G- AT G

AT--

- AT--

- AT-- - TT-- - AG--

II. Highest homologous sequence in GenBank

Rat ITPR2 Type 2 inositol RATITPR2R 216 -CC-CT--

Canine mRNA DOGSRPR 217 CTGCTAA-

Mitochondrion Oenothera OBEMTNAD12 218 C-GTCTT-


Table XI. Probes for Candida (Cand-513)

GenBank Sequence Cand-513

Species name ID No. GAGTACAATGTAAATACCTTAACGAG

I. Fungi

Pneumocystis carinii PMC16SRR1 357 T--G

Cryptococcus neoformans CPCDA 358 T C-

Coccidiodes immitis COIDA 359 T C-

Blastomyces dermatitidis BLODA 360 C--

Aspergillus fumigatus ASNDA 361 --- ---c-

fumigatus ASNRR5SS 362 ---

fumigatus ASNRRSSB 363 --- ---c-

Candida albicans YSASRSUA 364

albicans YSAL16S 365

lusitaniae YSASRRNAA 366

lusitaniae YSASRSUE 367

kefyr YSASRSUB 368

krusei YSASRRNAC 369

krusei YSASRSUD 370

tropicalis YSASRRNAB 371

tropicalis YSASRSUG 372

viswanathii YSASRSUH 373

parapsilosis YSASRSUF 374

guilliermondii YSASRSUC 375

glabrata YS5CRRNAS 376

II. Highest homologous sequence in GenBank

Yeast 18S rRNA YSCRNA5 377

Yeast (S. cerevisiae) YSCRGEA 378

Kluyveromyces lactis YSK17SRRNA 379

Torulaspora delbrueckii TOUSRSR 380

T. glabrata rRNA YSLSRSUA 381

H. polymorpha rRNA HASSRSUA 382

S. pombe rRNA YSPRRNASS 383


Table XII. Probes for Candida (Cand-701)

GenBank Sequence Cand-701

Species name ID No. GGTAGCCATTTATGGCGAACC

I. Fungi

Pneumocystis carinii PMC16SRR1 384 TTA GC---T-T--GT

Cryptococcus neoformans CPCDA 385 TTCG---C-C T---T-

Coccidiodes immitis COIDA 386 TTA GC---T-T--GT

Blastomyces dermatitidis BLODA 387 TTA GC---T-T--GT

Aspergillus fumigatus ASNDA 388 ATA---A-- ACG-TGAA

fumigatus ASNRR5SS 389 ATA---A-- ACG-TGAA

fumigatus ASNRRSSB 390 ATA- -A -ACG-TGAA

Candida albicans YSASRSUA 391

albicans YSAL16S 392

lusitaniae YSASRRNAA 393 TTA GC---T-T--GT

lusitaniae YSASRSUE 394 -C TNG-C CG-N

kefyr YSASRSUB 395 -C NC--GC---TTN--T

krusei YSASRRNAC 396 C--TTT A--CAA G

krusei YSASRSUD 397 TTA GC---T-T--GT

tropicalis YSASRRNAB 398 -c T

tropicalis YSASRSUG 399 -c T

viswanathii YSASRSUH 400 -c T

parapsilosis YSASRSUF 401 -c T

guilliermondii YSASRSUC 402 ---CCG-C---T GTA

glabrata YS5CRRNAS 403 ATA---A ACA-TGAA

II. Highest homologous sequence in GenBank

S. enterica STYRFB 404 G CGCTT

Clostridium pasteurianum CLONIFH5 405 T AA--GG

C. pasteurianum nifH CLONIFH 406 T AA--GG

C. pasteurianum nifH CLONIFH1 407 T AA--GG

Table XIII. Probes for Coccidiodes (Cocc-659)

-ATC---TG-

Table XIV. Probes for Coccidiodes (Cocc-1050)

GenBank Sequence Cocc-1050

Species name ID No. GGCAACTTTGAATAACCCGTTC

I. Fungi

Pneumocystis carinii PMC16SRR1 288 TTACTAC---G GTGGT

Cryptococcus neoformans CPCDA 289 CTGCC T-C-CA---CC-C

Coccidiodes immitis COIDA 290

Blastomyces dermatitidis BLODA 291 --GTT---ATG--G-

Table XV. Probes for Cryptococcus (Cryp-691)

GenBank Sequence Cryp-691

Species name ID No. GTGGTCCTGTATGCTCTTTACT

I. Fungi

Pneumocystis carinii PMC16SRR1 311 -- -GG--C---GC-G--CT-

Cryptococcus neoformans CPCDA 312 --

Coccidiodes immitis COIDA 313 C- -G-CTG-AC C-

Blastomyces dermatitidis BLODA 314 C G-CCG-AC C--

II. Highest homologous sequence in GenBank

Rat o tropomyosin RATTMA3 331 T- -CGT-

Human ribonucl/angio HUMRAJ 332 C- -C-ACA-

Human ribonucl/angio inh HUMRAI 333 C- -C-ACA-

Table XVI. Probes for Cryptococcus (Cryp-1042)

GenBank Sequence Cryp-1042

Species name ID No. CACGTCAATCTCTGACTGGG

I. Fungi

Pneumocystis carinii PMC16SRR1 334 TT-T-G-T---A--GG---T

Cryptococcus neoformans CPCDA 335

Coccidiodes immitis COIDA 336 GG G-A-TC-G---TC

Blastomyces dermatitidis BLODA 337 GG G-A-TC-G---TC


Table XVII. Jun-common sense primer (S943-2, SEQ ID NO:728).

Sequences (5'-3') LocusPos SEQ ID NO: CCGCTGTCCCCCATCGACATGG

Human

B:humjunca1189 408 -G A-

C:humjuna1981 409 C

D:humjundr943 410 -T G

Mouse

B:musjunba1079 411 TG A-

C:musjunc1344 412 c T

C:muscjun1646 413 c T

C:musjun1084 414 c T

D:musjund927 415

D:musjunda782 416

D:musjundr793 417

Rat

C:atjunap11082 418 -CT--

C:ratrjg92984 419 -CT--

Chicken

C:chkjun1470 420 -C-- -T--T--

Quail

C:quljun1186 421 -T--T-

Drosophila

C:drojun1038 422 A-CG-TAAT-

Highest matched sequences in EMBL

SDNAM2G Yeast NAM2 gene 423 A-A GAAT

PRK2TRFB Plasmid PK2 trfB ope 424 GT GC-T-

DMSYT D. melanogaster synap 425 G-A UC

Table XVIII. Jun-common antisense primer (AS1132-2, SEQ ID NO:729).

Table XIX. Jun-B specific probe (B-1258, SEQ ID NO:730).

GenBank B1258 (5'-3') SEQ ID NO: CTGGCGGCCACCAAGTG

Human

B: HUMJUNCA 449

C: HUMJUNA 450 A-C--T---T

D: HUMJUNDR 451 A TCCAA

Mouse

B: MUSJUNBA 452

C: MUSJUNC 453 A-T--C---T

MUSCJUN 454 A-T--C---T

MUSJUN 455 A-T--C---T

D: MUSJUND 456 C CCCG-

MUSJUNDA 457 T GCCA-

MUSJUNDR 458 T GCCA-

Rat

C: RATRJG9 459 T GCCAA

RATJUNAP 460 A-C--T---T

Chicken

C: CHKJUN 461 -C G CCC

Quail

C: QULJUN 462 GGC AG TG-

Drosophila

C: DROJUN 463 G T--AT

Homologous sequences in GenBank

J04695 Figure 2. Nucleo. 464 G A G

M27884 Figure 2. Nucleo. 465 A GC

HUMCNPG2 green cone photo. 466 T-AA

HUMCNPR2 green cone photo. 467 T-AA

HUMPIGMF2 colour-blind pho. 468 T-AA

Table XX. c-jun specific probe (C2147).

C2147.DNA

Subtype GenBank name SEQ ID NO: CCACGGCCAACATGCTC

Homologous sequences in GenBank

MXBPALPA L.enzymogenes

MXBPALP L.enzymogenes

Table XXI. Human jun-D specific probe (HUMD965)

HUMD965

Subtype GenBank name SEQ ID NO: ACACGCAGGAGCGCATC

GG-A--T

-GT-C G---

-AGAC

-GT-T G---

-GT-T G---

GT-T G--- A--A A--A A--A

-GT-T G---

-GT-T G---

-GT A-A---

-GT A-A---

G--A--T A-AA

-ACA

C TGC

G TGC T G-G

Table XXII. Mouse jun-D specific probe (MUSD1063).

MUSD1063.DNA

Subtype GenBank name SEQ ID NO: CCCTCAAAAGCCAGAACACCG

Homologous sequences in GenBank

M27221 Figure 3.

DROAMY D.erecta

DROAMYQ D.erecta

HUMIGCMUDE Immunoglob.

Table XXIII. G protein common primer (G2-S)

G2-S

SEQ ID NO: AGCACCATTGTGAAGCAGATGA

525 --T--A 526 527 A 528 529

530 A 531 532 --T--T 533 534 535

Highest matched sequences in GenBank

HUMADECYC adenyl cyclase 536

RATACOA1 acyl-coA oxidase 537 GC G A

HUMTGASE transglutaminase 538 CT CC-AC


Table XXIV. G protein common primer (G4-AS, SEQ ID NO:731).

G4-AS

SEQ ID NO: TGTTTGATGTGGGAGGCCAGAG

Highest matched sequences in GenBank

HUMLDLRRL LDL-receptor

MUSHEPGFA hepatocyte growth

HUMKEREP epidermal keratin

Table XXV. Gi-1 protein specific, human-rat common primer (Gi-1)

Gi-1

SEQ ID NO: GATGTTCTCAGAACTAGAGTGAAAAC

553 554 TT- CT-CCCCTGTCCCCT 555 TC-G--G G-- 556 AC---G-C---T---TCCTG--C--G 557 --CA-C---C C--G--C

558 559 G--GC-G--CC-T G-- 560 TC-G--G G-- 561 A-GCACAATTA-TTA CG 562 --CA-C---C C--G--C 563 TCA GAG--C---A-CC CA

Table XXVI. Gi-2 protein specific, human-rat common primer (Gi-2)

Gi-2

SEQ ID NO: GCAACCTGCAGATCGACTTTG


Table XXVII. Gi-3 protein specific, human-rat common primer (Gi-3)

Gi-3

SEQ ID NO: TCTTCGGACGAGAGTGAAGAC

575 ---CA-A--T A-- 576 G--A CC-C--A 577 578 GGGAAATCGA---T---G--- 579 ---C GA---CGTG-A--

580 ---CA-A--T -A-- 581 G--G CC-T- 582 583 TTCCT A---T---T-TG 584 CGCAT---G--C-C CCA 585 AAC G-A CACCAT-

Table XXVIII. Gs Protein specific, human-rat common primer (Gs, SEQ ID NO:732)

Gs

AATTCTATGAGCATGCCAAGGC

GGGCGG T---C CT

---ATG GCA GCTA

C-G ACTA---T CTC

Table XXIX. Go Protein specific, human-rat common primer (Go)

Go

SEQ ID NO: CTGCTTTCTGCCATGATGCG

Table XXX. Human-rodent common jun-B specific probes.

Name in GenBank B-504 (5'-3') Seq. ID No: B-739 (5'-3') Seq. ID No: Size (bp)

CACGACTACAAACTCCTGAAAC GGACAGTACTTTTACCCCCG

600 607 251 D A-AT--A-T AT A 601 ACC T---G-G AA 608

T G-C--C-T TTCC 602 ACG T-C-C GAA 609

603 610 251

AGTA--CC---GA 604 ---A--C-GCGG AC 611

T G-C--C-T TTTG 605 TC T-C-C AA 612

-T---CG- -GTA 606 ---C- -AT---A 613

Size of PCR products using B-504 and B-739 as primers.

The highest matched sequences in GenBank (release 68.0) next to the jun genes.

Table XXXI. Human-rodent common c-jun specific probes.


Size of PCR products using C-2101 and C-2219 as primers.

The highest matched sequences in GenBank (release 68.0) next to the jun genes.

Table XXXII. Human-rodent common jun-D specific probes.


Size of PCR products using D-916 and D-1153 as primers.

The highest matched sequences in GenBank (release 68.0) next to the jun genes.

Table XXXIII. Human-rodent common Gs specific probes.

Name in Gs-246 (5'-3') Seq. ID No: Gs-824 (5'-3') Seq. ID No: Size GenBank GCCAACAAAAAGATCGAGAAGC GGACAAAGTCAACTTCCACATG (bp)

642 CG--G---G-T CCGCA 643

CAA-GG CA-A 644

TGAGGA T--A T 645 TTTGG-G-G TA T 646

647

648

649 650

A A--C A- 651 -A- 661

Size of PCR products using Gs-246 and Gs-824 as primers.

The highest matched sequences in GenBank (release 68.0) next to the Gs protein DNAs.

Table XXXIV. Human-rodent common Gi-1 specific probes.


Size of PCR products using Gi1-735 and Gi1-1131 as primers.

The highest matched sequences in GenBank (release 68.0) next to the Gs protein DNAs.

Table XXXV. Human-rodent common Gi-2 specific probes.


*: Size of PCR products using Gi2-742 and Gi2-1102 as primers.

**: The highest matched sequences in GenBank (release 68.0) next to the Gs protein DNAs.

Rat Gi-2 was shown to be specifically amplified when PCR was performed with [illegible] and SEQ ID NO:678 and SEQ ID NO:686 as PCR primers.

Table XXXVI. Human-rodent common Gi-3 specific probes.

Name in Gi3-407 (5'-3') Seq. ID No: Gi3-730 (5'-3') Seq. ID No: Size GenBank TTGTTTTAGCTGGCAGTGCTGA GAGGGAGTGACAGCAATTATCT (bp)

CC-GAAGT-GATC TC 692 A-T-AT T--C--C 699
A GA GGC-CTA-T 693 --A--C T--C--C 700
CCTACAC---A--AT C 694 T--C--G--C--C 701
695 702 345

CG T--AGTC-TTAC- TC AT TCAACGAC 703

697 704 345

G C--A T C 698 -AC---T-- 705

Size of PCR products using Gi3-407 and Gi3-730 as primers.

The highest matched sequences in GenBank (release 68.0) next to the Gs protein DNAs.

Rat Gi-3 was shown to be specifically amplified when PCR was performed with [illegible] and SEQ ID NO:697 and SEQ ID NO:704 as PCR primers.

Table XXXVII. Human-rodent common Go specific probes.


*: Size of PCR products using Go-1224 and Go-1397 as primers.

**: The highest matched sequences in GenBank (release 68.0) next to the Gs protein DNAs.