Title:
POLYMORPHISMS ASSOCIATED WITH DEVELOPING COLORECTAL CANCER, METHODS OF DETECTION AND USES THEREOF
Document Type and Number:
WIPO Patent Application WO/2010/019690
Kind Code:
A1
Abstract:
A method for assessing a genetic susceptibility to CRC and potentially other cancers in a subject includes measuring allele specific expression or presence of at-risk haplotypes, where a difference in expression or the presence of at-risk haplotypes is indicative of a colorectal cancer (CRC) or a predisposition to CRC.

Inventors:
DE LA CHAPELLE ALBERT (US)
TANNER STEPHAN M (US)
VALLE LAURA (US)
PASCHE BORIS
Application Number:
PCT/US2009/053582
Publication Date:
February 18, 2010
Filing Date:
August 12, 2009
Assignee:
UNIV OHIO STATE RES FOUND (US)
DE LA CHAPELLE ALBERT (US)
TANNER STEPHAN M (US)
VALLE LAURA (US)
PASCHE BORIS
International Classes:
C12Q1/68
Foreign References:
US6291237B12001-09-18
US20070099251A12007-05-03
Other References:
PASCHE ET AL.: "TbetaR-I(6A) is a candidate tumor susceptibility allele.", CANCER RESEARCH, vol. 59, 15 November 1999 (1999-11-15), pages 5678 - 5682
DATABASE NCBI "Single Nucleotide Polymorphism Cluster Report", Database accession no. rs334348
VALLE ET AL.: "Germline allele-specific expression of TGFBR1 confers an increased risk of colorectal cancer", SCIENCE, vol. 321, 5 September 2008 (2008-09-05), pages 1361 - 1365
Attorney, Agent or Firm:
MARTINEAU, Catherine, B. (Sobanski & Todd LLC, One Maritime Plaza, US)
Claims:
CLAIMS

What is claimed is:

1. A method for the diagnosis and identification of predisposition or susceptibility to colorectal cancer (CRC) in an individual, comprising: screening a sample from the individual for at least one allele (or underlying at-risk haplotype) associated with transforming growth factor β type 1 receptor gene (TGFβR1), wherein identifying haplotypes with lowered expression indicates lowered TGF-β signaling and a higher risk for developing CRC.

2. A method for genetically identifying an individual with respect to its potential to be diagnosed with colorectal cancer (CRC) comprising: obtaining a sample of genetic material from an animal; and assaying for the presence of a polymorphism in the transforming growth factor β1 receptor gene (TGFβ1), wherein the polymorphism is associated with CRC.

3. The method of claim 2, wherein the polymorphism is selected from the group consisting of: a single nucleotide polymorphism (SNP), a deletion, and an insertion.

4. A method of detecting a genetic predisposition in a human subject for developing colorectal cancer (CRC), comprising:

(i) collecting a biological sample from the subject;

(ii) genotyping the sample at polymorphic nucleotide positions, and

(iii) assessing whether a haplotype is present in the sample, the haplotype comprising polymorphic nucleotide positions wherein the presence of the haplotype indicates a genetic predisposition for developing CRC in the subject.

5. A method of detecting a higher than normal risk in a human subject for developing colorectal cancer (CRC), comprising:

(i) collecting a biological sample from the subject;

(ii) genotyping the sample at polymorphic nucleotide positions, and

(iii) assessing whether a haplotype is present in the sample, the haplotype comprising polymorphic nucleotide positions wherein the presence of the haplotype indicates a higher than normal risk for developing CRC in the subject, or assessing whether a haplotype is present in the sample, the haplotype comprising polymorphic nucleotide positions wherein homozygosity for the haplotype indicates a higher than normal risk for developing CRC in the subject.

6. A method of detecting a predisposition to colorectal cancer (CRC), comprising steps of

(i) providing one or more oligonucleotide primers capable of amplifying parts of human TGFβR1 gene and its genomic region,

(ii) amplifying genomic DNA of CRC patients and normal control individuals using the primers of step (i),

(iii) sequencing the amplified genomic DNA and identifying sequence variations (polymorphisms) of the amplified genomic DNA by comparing with an existing sequence of human TGFβR1 gene,

(iv) screening normal control individuals and CRC patients for the polymorphisms identified in step (iii) by sequencing or genotyping of the amplified genomic DNA of the individuals using the primers of step (i),

(v) computing risk haplotypes for CRC using the polymorphisms in the human TGFβR1 gene and its genomic region based on their frequency distribution in normal individuals and CRC patients, and

(vi) predicting the risk or susceptibility to CRC based on the haplotype present at the polymorphic sites in the individual tested.

7. The method of claim 6, wherein the oligonucleotide primers capable for amplification of TGFβR1 gene and its genomic region are designed from the sequence contig NT_008470 containing TGFβR1 reference sequence NM_004612.

8. The method of claim 5, wherein the length of the oligonucleotide primers is between 15 and 30 bases.

9. A method of determining a susceptibility to colorectal cancer (CRC) in an individual, comprising detecting an at-risk allele of a SNP associated with colorectal cancer (CRC), wherein the SNP is located within a sequence selected from the group consisting of sequences identified by SEQ ID NOs.: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

10. An isolated polynucleotide comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ. ID NOS.: 1-4.

11. A method of diagnosing a susceptibility to colorectal cancer (CRC) in an individual, comprising detecting a haplotype associated with colorectal cancer (CRC) selected from the group consisting of the haplotypes shown in Figure 4C.

12. Isolated polynucleotides comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS.: 1-4 and the complements of sequences identified by SEQ ID NOS.: 1-4; wherein the presence of a particular allele of a SNP (a particular nucleotide base) is indicative of a propensity to develop colorectal cancer (CRC) or otherwise may be used to identify an at-risk individual.

13. The polynucleotide of claim 12, selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

14. The polynucleotide of claim 14, comprising at least a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

15. Isolated polynucleotides comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4, which hybridize, are complementary, or are partially complementary to a nucleotide sequence present in a test sample.

16. An isolated polynucleotide selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4, which hybridizes, is complementary, or is partially complementary to a nucleotide sequence present in a test sample.

17. An isolated polynucleotide comprising at least a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4, which hybridizes, is complementary, or is partially complementary to a nucleotide sequence present in a test sample.

18. A SNP is located within SEQ ID NO: 1 or the complement of SEQ ID NO: 1.

19. A SNP is located within SEQ ID NO: 2 or the complement of SEQ ID NO: 2.

20. A SNP is located within SEQ ID NO: 3 or the complement of SEQ ID NO: 3.

21. A SNP is located within SEQ ID NO: 4 or the complement of SEQ ID NO: 4.

22. Isolated polynucleotides comprising one or more haplotypes selected from the group consisting of the haplotypes identified in FIGURE 4C which are indicative of a propensity to develop colorectal cancer (CRC).

23. A polynucleotide isolated using all or a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

24. Polypeptides encoded by a polynucleotide, wherein the polynucleotide comprises a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

25. A polypeptide encoded by a polynucleotide, wherein the polynucleotide is selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

26. A polypeptide encoded by a polynucleotide, wherein the polynucleotide comprises at least a portion of the sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4.

27. Antibodies that bind the polypeptides of claims 24, 25, or 29.

28. Polypeptides encoded by a polynucleotide, wherein the polynucleotide comprises a haplotype selected from the group consisting of the haplotypes identified in Figure 4C.

29. A vector comprising a haplotype identified in Figure 4C or a SNP located within a sequence selected from the group consisting of the sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4; operably linked to a regulatory sequence.

30. A method of diagnosing colorectal cancer (CRC) or a susceptibility to colorectal cancer (CRC) in an individual, comprising determining the presence or absence of particular alleles of SNPs contained in SEQ ID NOS: 1-4.

31. A method of diagnosing colorectal cancer (CRC) or a susceptibility to colorectal cancer (CRC) in an individual, comprising screening for one of the at-risk alleles associated with colorectal cancer (CRC) shown in Figure 4C.

32. The method of claim 31, wherein the SNP is located within SEQ ID NO: 1 or the complement of SEQ ID NO: 1.

33. The method of claim 31, wherein the SNP is located within SEQ ID NO: 2 or the complement of SEQ ID NO: 2.

34. The method of claim 31, wherein the SNP is located within SEQ ID NO: 3 or the complement of SEQ ID NO: 3.

35. The method of claim 31, wherein the SNP is located within SEQ ID NO: 4 or the complement of SEQ ID NO: 4.

36. A method of detecting the presence of a polynucleotide in a sample containing a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, wherein the method comprises: contacting the sample with an isolated polynucleotide comprising a sequence (or a portion of a sequence) selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, under conditions appropriate for hybridization, and assessing whether hybridization has occurred between the polynucleotide in the sample and the isolated polynucleotide; wherein if hybridization has occurred, a certain polynucleotide containing a particular allele of a SNP associated (or not associated) with colorectal cancer (CRC) is present in the sample.

37. The method of claim 36, wherein the isolated polynucleotide is completely complementary to the polynucleotide present in the sample.

38. The method of claim 36, wherein the isolated polynucleotide is partially complementary to the polynucleotide present in the sample.

39. The method of claim 36, wherein the isolated polynucleotide is at least 80% identical to the polynucleotide present in the sample and capable of selectively hybridizing to the polynucleotide.

40. A method for assaying a sample for the presence of a first polynucleotide which is at least partially complementary to a part of a second polynucleotide wherein the second polynucleotide comprises a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4 comprising: a) contacting the sample with the second polynucleotide under conditions appropriate for hybridization, and b) assessing whether hybridization has occurred between the first and the second polynucleotide, wherein if hybridization has occurred, the first polynucleotide is present in the sample.

41. The method of claim 40, wherein the presence of the first polynucleotide is indicative of colorectal cancer (CRC) or the propensity to develop colorectal cancer (CRC).

42. The method of claim 40, wherein the second polynucleotide is completely complementary to a part of the sequence of the first polynucleotide.

43. The method of claim 40, wherein the method further comprises amplification of at least part of the first polynucleotide.

44. The method of claim 40 wherein the second polynucleotide is 99 or fewer nucleotides in length and is either: (a) at least 80% identical to a contiguous sequence of nucleotides in the first polynucleotide or (b) capable of selectively hybridizing to the first polynucleotide.

45. A method of assaying a sample for the presence of a polypeptide associated with colorectal cancer (CRC) encoded by a polynucleotide, wherein the polynucleotide comprises an allele of a SNP associated with colorectal cancer (CRC) located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, the method comprising contacting the sample with an antibody that specifically binds to the polypeptide.

46. The method of claim 45, wherein the presence of a polypeptide associated with colorectal cancer (CRC) in a sample encoded by a polynucleotide (comprising a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4) is assayed by contacting the sample with an antibody that specifically binds to the polypeptide.

47. The method of claim 45, wherein the presence of a polypeptide associated with colorectal cancer (CRC) in a sample encoded by a polynucleotide (comprising at least a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4) is assayed by contacting the sample with an antibody that specifically binds to the polypeptide.

48. A reagent for assaying a sample for the presence of a first polynucleotide comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, the reagent comprising a second polynucleotide comprising a contiguous nucleotide sequence which is at least partially complementary to a part of the first polynucleotide.

49. The reagent of claim 48, wherein the second polynucleotide is completely complementary to a part of the first polynucleotide.

50. A reagent kit for assaying a sample for the presence of a first polynucleotide comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, comprising in separate containers: a) one or more labeled second polynucleotides comprising a sequence selected from the group consisting of the sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4; and b) reagents for detection of the label.

51. A kit comprising one or more polynucleotides to assay samples for the presence of polynucleotides containing an allele of a SNP associated (or not associated) with colorectal cancer (CRC) located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4.

52. A kit comprising one or more antibodies to assay samples for the presence of proteins associated (or not associated) with colorectal cancer (CRC) that are encoded by the polynucleotides containing an allele of a SNP associated (or not associated) with colorectal cancer (CRC).

53. A method of diagnosing a susceptibility to colorectal cancer (CRC) in an individual comprising: determining the expression or composition of a polypeptide in a control sample encoded by a polynucleotide containing an allele of a SNP not associated with colorectal cancer (CRC), and comparing with the expression or composition of a polypeptide in a test sample encoded by the same polynucleotide except containing an allele of a SNP associated with colorectal cancer (CRC), wherein the presence of an alteration in expression or composition of the polypeptide in the test sample compared to the control sample is indicative of a susceptibility to colorectal cancer (CRC).

54. A method of diagnosing colorectal cancer (CRC) or a susceptibility to colorectal cancer (CRC) in an individual, comprising determining the presence or absence in the individual of certain haplotypes by screening for one of the at-risk haplotypes shown in Figure 4C.

55. A method of diagnosing a susceptibility to colorectal cancer (CRC) in an individual, or for screening individuals for a susceptibility to colorectal cancer (CRC) comprising: a) obtaining a polynucleotide sample from the individual; and b) analyzing the polynucleotide sample for the presence or absence of a haplotype, comprising a haplotype shown in Figure 4C, wherein the presence of the haplotype corresponds to a susceptibility to colorectal cancer (CRC).

56. The method of claim 54, comprising detecting multiple SNPs identified in Figure 4C.

57. A method of determining the susceptibility to colorectal cancer (CRC) in an individual comprising detecting multiple SNPs identified in one or more of: SEQ ID NOS: 1, 2, 3 and/or 4.

58. The method of claim 57, wherein at least one SNP is located within SEQ ID NO: 1 or the complement of SEQ ID NO: 1.

59. The method of claim 57, wherein at least one SNP is located within SEQ ID NO: 2 or the complement of SEQ ID NO: 2.

60. The method of claim 57, wherein at least one SNP is located within SEQ ID NO: 3 or the complement of SEQ ID NO: 3.

61. The method of claim 57, wherein at least one SNP is located within SEQ ID NO: 4 or the complement of SEQ ID NO: 4.

62. A method of identifying a gene associated with colorectal cancer (CRC) comprising: a) identifying a gene containing a SNP that is located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4; and b) comparing the expression of the gene in an individual having the at-risk allele with the expression of the gene in an individual having the non-risk allele for differences indicating that the gene is associated with colorectal cancer (CRC).

63. A method of identifying a gene associated with colorectal cancer (CRC) comprising: a) identifying a gene containing an at-risk haplotype identified in Figure 4C; and b) comparing the expression of the gene in an individual having the at-risk haplotype with the expression of the gene in an individual not having the at-risk haplotype for differences indicating that the gene is associated with colorectal cancer (CRC).

Description:
TITLE

POLYMORPHISMS ASSOCIATED WITH DEVELOPING COLORECTAL CANCER, METHODS OF DETECTION AND USES THEREOF

Inventors: Albert de la Chapelle, Stephan M. Tanner, Laura Valle, Boris Pasche

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of United States Provisional Application Number 61/088,080 filed August 12, 2008, the entire disclosure of which is expressly incorporated herein by reference.

REFERENCE TO SEQUENCE LISTING

[0002] This application contains a Sequence Listing submitted as an electronic text file named "604_50402_SEQ_List_ST25_OSURF-08139.txt", having a size in bytes of 4.04 kb, and created on August 11, 2009.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0003] This invention was made with government support under National Institutes of Health Grant Nos. CA69741, CA16058, CA112520 and CA108741. The government has certain rights in this invention.

TECHNICAL FIELD AND INDUSTRIAL APPLICABILITY OF THE INVENTION

[0004] This invention relates generally to the field of molecular biology. More particularly, it concerns methods and compositions involving biomarkers for colorectal cancer (CRC). Certain aspects of the invention include application in diagnostics, therapeutics, and prognostics of CRC.

BACKGROUND OF THE INVENTION

[0005] There is no admission that the background art disclosed in this section legally constitutes prior art.

[0006] The annual worldwide incidence of colorectal cancer (CRC) exceeds one million, being the second to fourth most common cancer in industrialized countries (1). Although diet and lifestyle are thought to have a strong impact on CRC risk, genes have a key role in the predisposition to this cancer. A positive family history of CRC occurs in 20-30% of all probands. Highly penetrant autosomal dominant and recessive hereditary forms of CRC account for about 5% of all CRC cases (2). Although additional high- and low-penetrance alleles have been proposed, much of the remaining predisposition to CRC remains unexplained (3).

[0007] Aberrations in the transforming growth factor β (TGF-β) pathway are heavily involved in CRC carcinogenesis (4). While mutations in the TGF-β type II receptor gene have been explicitly associated with CRC (5), the type I receptor gene (TGFβR1) has received less attention although there is evidence that a common variant may be associated with cancer risk (6, 7). Also, there is evidence that inherited allele-specific expression of APC acts as a mechanism of predisposition to familial adenomatous polyposis (8) and that there is an analogous mechanism involving DAPK1 in chronic lymphocytic leukemia (9).

[0008] In view of this, there is a need for a method for reliably and accurately diagnosing and/or screening individuals for a predisposition to cancers associated with allele-specific expression (ASE) of particular genes, including the TGFβR1 gene.

SUMMARY OF THE INVENTION

[0009] In one aspect, it is disclosed herein that TGFβR1 is a notable candidate for a gene that, when its expression is reduced, causes predisposition to CRC or acts as a modifier of other genes resulting in a predisposition.

[0010] Much of the genetic predisposition to colorectal cancer (CRC) is unexplained. The inventors herein now show that germline allele-specific expression (ASE) of TGFβR1 is a quantitative trait that occurs in 10-20% of CRC patients and 1-3% of controls.

[0011] ASE results in reduced expression, is dominantly inherited, segregates in families, and occurs in sporadic CRC cases.

[0012] Although subtle, the reduction in constitutive TGFβR1 expression alters SMAD-mediated TGF-β signaling. Two major TGFβR1 haplotypes are predominant among ASE cases, suggesting ancestral mutations, but causative germline changes have not yet been identified. Conservative estimates suggest that ASE confers a substantially increased risk of CRC (Odds Ratio 8.7; 95% Confidence Interval: 2.6-29.1). The proportion of all CRC attributable to ASE in the Caucasian-dominated population of Central Ohio is probably at least 10% (CI: 6.1-14.9).
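As an illustration of how an odds ratio and its confidence interval can be obtained from a 2x2 table, the following is a minimal sketch and not the inventors' calculation. It assumes the Woolf (log-based) interval, and it uses the raw Figure 1 counts (29 of 138 patients and 3 of 105 controls above the cut-off). This crude estimate comes out at about 9.0 and is not expected to match the conservative estimate quoted above, whose derivation is not given in this section. The function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a: cases with ASE      b: cases without ASE
    c: controls with ASE   d: controls without ASE
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Crude counts taken from the Figure 1 description (assumption: ASE status
# is defined by the 1.5 cut-off, with no further adjustment).
or_, lo, hi = odds_ratio_ci(29, 138 - 29, 3, 105 - 3)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

The wide interval reflects the small number of ASE-positive controls, which dominates the standard error.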

[0013] In a broad aspect, there is provided herein a method for assessing a pathological condition in a subject comprising measuring the expression of one or more alleles from the subject, wherein a reduction in the expression of the alleles from the subject compared to the expression of the alleles in a healthy control is indicative of a colorectal cancer (CRC) or a predisposition to CRC.

[0014] In a particular aspect, the biomarker comprises transforming growth factor β receptor type 1 (TGFβR1).

[0015] Various objects and advantages of this invention will become apparent to those skilled in the art from the following detailed description of the preferred embodiment, when read in light of the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0016] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

[0017] Figure 1: TGFβR1 allele-specific expression (ASE) distribution in 138 CRC patients and 105 controls studied by SNaPshot. The ASE cut-off value of 1.5 chosen to categorize the cases is indicated, together with its associated P-value obtained from comparing the proportions of cases (29/138) and controls (3/105) above the indicated value.
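The comparison of proportions described for Figure 1 can be sketched as follows. The text does not state which test produced the reported P-value, so the use of a two-sided Fisher's exact test here is an assumption, and the function name `fisher_exact_two_sided` is hypothetical. Only the counts (29/138 and 3/105) come from the figure description.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# Rows: CRC patients, controls. Columns: above cut-off, at or below cut-off.
p_value = fisher_exact_two_sided(29, 138 - 29, 3, 105 - 3)
print(f"P = {p_value:.2e}")
```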

[0018] Figures 2A and 2B: ASE determination in two ASE CRC probands.

[0019] Figure 2A: ASE detection in blood DNA by SNaPshot. The ASE ratio was calculated by normalizing the ratio between the peak areas of the two alleles in cDNA with the same parameters in genomic DNA (gDNA). In both examples, the transcript from the "a" allele is reduced with respect to the other allele.
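The normalization described for Figure 2A can be written out as a short sketch. The peak areas below are hypothetical values chosen for illustration, and the function names are not taken from the source. Folding ratios below 1 onto the same scale before applying the 1.5 cut-off is an assumption made here so that the result does not depend on which allele is placed in the numerator.

```python
def ase_ratio(cdna_a, cdna_b, gdna_a, gdna_b):
    """Allele ratio in cDNA normalized by the same allele ratio in genomic DNA."""
    cdna_ratio = cdna_a / cdna_b
    gdna_ratio = gdna_a / gdna_b
    return cdna_ratio / gdna_ratio

def exceeds_cutoff(ratio, cutoff=1.5):
    """Flag allelic imbalance in either direction (assumption: symmetric folding)."""
    folded = ratio if ratio >= 1 else 1 / ratio
    return folded > cutoff

# Hypothetical peak areas: the "a" allele transcript is reduced in cDNA,
# while both alleles give equal signal in gDNA.
ratio = ase_ratio(cdna_a=600, cdna_b=1200, gdna_a=1000, gdna_b=1000)
print(ratio, exceeds_cutoff(ratio))
```

Dividing by the gDNA ratio corrects for any difference in how efficiently the assay detects the two alleles, since both are present in equal copy number in genomic DNA of a heterozygote.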

[0020] Figure 2B: Semiquantitative RT-PCR of the cDNA from monochromosomal hybrids of the same two patients. Human TGFβR1 expression (amplicon size 135 bp) was assessed and mouse Gpi used as control (176 bp). The values shown below the gel represent the ratios of the densitometric values of human TGFβR1 versus mouse Gpi, showing reduced expression of human TGFβR1 in the hybrids that contain the "a" allele.

[0021] Figures 3A to 3C: Analysis of SMAD-mediated TGF-β signaling in lymphoblastoid cell lines from ASE CRC patients and non-ASE healthy controls:

[0022] Figure 3A: SMAD2 and phosphorylated SMAD2 (pSMAD2) expression was assessed by Western blotting in lymphoblastoid cell lines from ASE patients (P-1, P-5, P-14) and non-ASE controls (C-1, C-2 and C-3), after exposure to TGF-β (100 pM), at various time points from 0 to 16 hours and using β-actin as a loading control. In all three ASE cases, lower constitutive pSMAD2 was observed when compared with non-ASE controls. The differences in pSMAD2 expression between ASE and non-ASE cell lines were further enhanced after exposure to TGF-β.

[0023] Figure 3B: SMAD2 and pSMAD2 expression 1 hour after exposure to different TGF-β concentrations. The effect shown in Figure 3A also occurs at low concentrations of TGF-β (5 pM).

[0024] Figure 3C: pSMAD3 detection in nuclear extracts from three ASE patients and three non-ASE controls after exposure to TGF-β1. The three non-ASE lymphoblastoid cell lines had pSMAD3 expression in the nucleus while nuclear pSMAD3 expression was undetectable in two ASE cases (P-1 and P-14) and barely detectable in one case (P-5).

[0025] Figure 4A: Diagram of the TGFBR1 genomic region. The uppermost line depicts the 96.5 kb region sequenced in six ASE patients (four monochromosomal hybrids and four diploid DNAs). Shown are the locations of the 2-bp CA deletion upstream of exon 1, the 9A/6A polymorphism in exon 1, and the four SNPs in the 3'-UTR used for ASE determinations.

[0026] Figure 4B: Locations of the sixty SNPs used for haplotype inference in ASE (n=31) and non-ASE (n=55) CRC patients. The arrowed shorter lines each depict a 10 SNP overlapping window. P-values indicate the significance of differences in haplotype distribution between ASE and non-ASE individuals.
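The overlapping 10-SNP windows of Figure 4B can be enumerated with a short sketch. The window size of 10 and the total of sixty SNPs come from the figure description; the step of 5 SNPs between window starts is a hypothetical value, since the amount of overlap is not stated here, and the function name is likewise an assumption.

```python
def overlapping_windows(n_snps, size=10, step=5):
    """Return (start, end) index pairs, end exclusive, for overlapping SNP windows."""
    if size >= n_snps:
        return [(0, n_snps)]
    starts = list(range(0, n_snps - size + 1, step))
    # Add a final window if the step leaves the last SNPs uncovered
    if starts[-1] + size < n_snps:
        starts.append(n_snps - size)
    return [(start, start + size) for start in starts]

# Sixty SNPs, 10-SNP windows, hypothetical step of 5
for start, end in overlapping_windows(60):
    print(f"SNPs {start + 1}-{end}")
```

Haplotype frequencies would then be compared between ASE and non-ASE individuals within each window, giving one P-value per window.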

[0027] Figure 4C: Two major haplotypes identified in ASE patients are shown ("GAAGAGCATA" disclosed as [SEQ ID NO: 5]).

[0028] Figure 5: ASE evaluation strategy in cases and controls and number of patients at each step.

[0029] Figure 6: ASE inheritance and co-segregation with CRC. Pedigrees of 4 ASE probands. Solid black circles or squares indicate CRC. ASE values are framed and in red if over 1.5. For all four kindreds, the haplotypes inferred by MERLIN from genotyping data of 35 SNPs (Figure 11 - Table 5), are represented as colored bars. For each pedigree, each colored bar represents a different haplotype, and in all of them, the red one corresponds to the down-expressed allele in the proband. Although two borderline ASE values (≈1.4) have not been marked as positive in red, both occur in individuals (red asterisks) who carry their corresponding proband's affected allele.

[0030] Figure 7: Table 1. Germline and somatic TGFβR1 characteristics of informative CRC patients (n=145).

[0031] Figure 8: Table 2. Sensitivity, specificity and Youden's index for different ASE cutoff values.
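For reference, Youden's index is sensitivity plus specificity minus one. The sketch below evaluates it at the 1.5 cut-off using the Figure 1 counts (29 of 138 patients and 3 of 105 controls above the cut-off), on the assumption that CRC status is the condition being detected. It is an illustration only and does not reproduce the values of Table 2, which are not given in this text; the function name is hypothetical.

```python
def youden_index(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, Youden's J) for one cut-off value."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity, sensitivity + specificity - 1

# Patients above the cut-off count as true positives,
# controls at or below it as true negatives.
sens, spec, j = youden_index(true_pos=29, false_neg=138 - 29,
                             true_neg=105 - 3, false_pos=3)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} J={j:.3f}")
```

Repeating the calculation over a range of cut-off values and picking the one with the largest J is one common way to choose a threshold.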

[0032] Figure 9: Table 3. Germline and tumor characteristics of informative colorectal cancer patients according to their ASE status.

[0033] Figure 10: Table 4. Promoter methylation status (BSP1 region) in blood DNA from ASE (N=16) and non-ASE (N=9) CRC probands and from healthy controls (N=69). Fisher's exact test after adjusting for multiple comparisons did not show significant differences at the 0.05 level between ASE and non-ASE CRC probands, nor between ASE cases and healthy controls.

[0034] Figure 11: Table 5. SNPs located in the TGFβR1 region used for haplotype inference by MERLIN and PHASE.

[0035] Figure 12: Table 6. Characteristics of the family members represented in Figure 6, showing their cancer affection, tumor microsatellite instability status, expression of the DNA mismatch repair proteins, current age or age at death and ASE value.

[0036] Figure 13: rs334348 [SEQ ID NO:1], rs334349 [SEQ ID NO:2], rs1590 [SEQ ID NO:3] and rs7871490 [SEQ ID NO:4].

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0037] Throughout this disclosure, various publications, patents and published patent specifications are referenced by an identifying citation. The disclosures of these publications, patents and published patent specifications are hereby incorporated by reference into the present disclosure to more fully describe the state of the art to which this invention pertains. For example, the general teaching of measuring gene expression by using PCR based techniques is disclosed in the references cited herein, the entire disclosures of which are hereby incorporated herein by reference.

[0038] A nucleic acid sequence at which more than one sequence is possible in a population (either a natural population or a synthetic population, e.g., a library of synthetic molecules) is referred to herein as a "polymorphic site." Polymorphic sites can allow for differences in sequences based on substitutions, insertions, or deletions. Such substitutions, insertions, or deletions can result in frame shifts, the generation of premature stop codons, or the deletion or addition of one or more amino acids encoded by a polynucleotide, and can alter splice sites and affect the stability or transport of mRNA. Where a polymorphic site is a single nucleotide in length, the site is referred to as a single nucleotide polymorphism ("SNP").
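The notion of a polymorphic site can be illustrated with a minimal sketch that scans equal-length, already aligned sequences for positions where more than one nucleotide is observed. The sequences are hypothetical, the function name is an assumption, and the sketch covers single-nucleotide substitutions only; insertions and deletions would require an alignment step that is not shown.

```python
def polymorphic_sites(aligned_sequences):
    """Map each variable position (0-based) to the sorted alleles seen there."""
    if len({len(seq) for seq in aligned_sequences}) > 1:
        raise ValueError("sequences must be aligned to equal length")
    sites = {}
    # zip(*...) walks the alignment column by column
    for position, column in enumerate(zip(*aligned_sequences)):
        alleles = sorted(set(column))
        if len(alleles) > 1:
            sites[position] = alleles
    return sites

# Hypothetical population sample of one short region
sample = ["ACGT", "ACCT", "ACGT"]
print(polymorphic_sites(sample))
```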

[0039] SNPs are the most common form of genetic variation responsible for differences in disease susceptibility. SNPs can directly contribute to or, more commonly, serve as markers for many phenotypic endpoints such as disease risk differences between patients.

[0040] Identification of these genetic factors can lead to diagnostic methods, reagents and reagent kits for the identification of individuals who have a propensity to develop certain diseases.

[0041] The instant invention concerns the identification of genetic factors that predispose individuals to colorectal cancer (CRC), with a focus on candidate genes and specifically, nucleic acid fragments of genes having single nucleotide polymorphisms ("SNPs").

[0042] In certain embodiments, the invention provides isolated polynucleotides containing SNPs located within sequences selected from the group consisting of sequences identified by Sequence Identification Numbers ("SEQ ID NOs.") 1-4 and the complements of the sequences identified by SEQ ID NOs: 1-4 as well as vectors, recombinant host cells, transgenic animals, and compositions containing such polynucleotides. The invention also provides methods of diagnosing a susceptibility to colorectal cancer (CRC) in an individual, by detecting one or more at-risk alleles of SNPs associated with colorectal cancer (CRC). In addition, the invention provides methods of diagnosing a susceptibility to colorectal cancer (CRC) in an individual by detecting one or more haplotypes associated with colorectal cancer (CRC).

[0043] The term "SNP" refers to a single nucleotide polymorphism at a particular position in the human genome that varies among a population of individuals. As used herein, a SNP may be identified by its name or by location within a particular sequence.

[0044] As used herein, the nucleotide sequences disclosed by the SEQ ID NO. encompass the complements of the nucleotide sequences. In addition, as used herein, the term "SNP" encompasses any allele among a set of alleles. The term "allele" refers to a specific nucleotide among a selection of nucleotides defining a SNP.

[0045] The term "at-risk allele" refers to an allele that is associated with colorectal cancer (CRC). The term "haplotype" refers to a combination of particular alleles from two or more SNPs.

[0046] The term "at-risk haplotype" refers to a haplotype that is associated with colorectal cancer (CRC).

[0047] The term "polynucleotide" refers to polymeric forms of nucleotides of any length. The polynucleotides may contain deoxyribonucleotides, ribonucleotides, and/or their analogs. Polynucleotides may have any three-dimensional structure including single-stranded, double-stranded and triple helical molecular structures, and may perform any function, known or unknown. The following are non-limiting embodiments of polynucleotides: a gene or gene fragment, exons, introns, mRNA, tRNA, rRNA, short interfering nucleic acid molecules (siNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may also comprise modified nucleic acid molecules, such as methylated nucleic acid molecules and nucleic acid molecule analogs.

[0048] The terms "individual," "host," and "subject" are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human.

[0049] The present invention employs, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the skill of the art.

[0050] As used herein, the singular form of any term can alternatively encompass the plural form and vice versa.

[0051] In addition, a polynucleotide of the present invention can be isolated using standard molecular biology techniques and the sequence information provided herein. Using all or a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4, polynucleotides can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook et al., eds., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

[0052] A polynucleotide can be amplified using cDNA, mRNA or genomic DNA as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The polynucleotide so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. Furthermore, oligonucleotides corresponding to all or a portion of a polynucleotide can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.

[0053] The term "allele specific expression," as used herein, refers to the differential expression of the two alleles from the two chromosome copies in a mammalian cell, and thereby implies that one allele is expressed at a lower level than the other.

[0054] The term "hybridization," as used herein, refers to the process of binding, annealing, or base-pairing between two single-stranded nucleic acids. The "stringency of hybridization" is determined by the conditions of temperature and ionic strength. Nucleic acid hybrid stability is expressed as the melting temperature or Tm, which is the temperature at which the hybrid is 50% denatured under defined conditions. Equations have been derived to estimate the Tm of a given hybrid; the equations take into account the G+C content of the nucleic acid, the length of the hybridization probe, etc. (e.g., Sambrook et al., 1989). To maximize the rate of annealing of the probe with its target, hybridizations are generally carried out in solutions of high ionic strength (6x SSC or 6x SSPE) at a temperature that is about 20-25 °C below the Tm. If the sequences to be hybridized are not identical, then the hybridization temperature is reduced 1-1.5 °C for every 1% of mismatch. In general, the washing conditions should be as stringent as possible (i.e., low ionic strength at a temperature about 12-20 °C below the calculated Tm). As an example, highly stringent conditions typically involve hybridizing at 68 °C in 6x SSC/5x Denhardt's solution/1.0% SDS and washing in 0.2x SSC/0.1% SDS at 65 °C. The optimal hybridization conditions generally differ between hybridizations performed in solution and hybridizations using immobilized nucleic acids. One skilled in the art will appreciate which parameters to manipulate to optimize hybridization.
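As a rough illustration of the Tm-based reasoning above, the following sketch estimates a Tm and derives a hybridization temperature from it. The formula used (Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%G+C) - 675/N) is one commonly cited approximation for long DNA duplexes; it is an assumption here and is not taken from this disclosure. The function names, the default offsets (midpoints of the ranges stated above) and the example probe are hypothetical.

```python
import math


def estimate_tm(gc_percent, length_nt, na_molar):
    """Approximate Tm in degrees C (assumed textbook formula, not from the disclosure)."""
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / length_nt


def hybridization_temperature(tm, mismatch_percent=0.0, offset=22.5, per_mismatch=1.25):
    """Temperature about 20-25 degrees C below Tm, lowered 1-1.5 degrees C per 1% mismatch.

    The defaults are the midpoints of those two ranges (an illustrative choice).
    """
    return tm - offset - per_mismatch * mismatch_percent


# Hypothetical probe: 50% G+C, 100 nt, about 1 M monovalent cation.
tm = estimate_tm(gc_percent=50.0, length_nt=100, na_molar=1.0)
perfect_match = hybridization_temperature(tm)
five_percent_mismatch = hybridization_temperature(tm, mismatch_percent=5.0)
print(tm, perfect_match, five_percent_mismatch)
```

The sketch only shows how G+C content, probe length, ionic strength and mismatch enter the estimate; actual conditions would still be optimized empirically.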

[0055] The term "nucleic acid," as used herein, refers to sequences of linked nucleotides. The nucleotides may be deoxyribonucleotides or ribonucleotides; they may be standard or non-standard nucleotides; they may be modified or derivatized nucleotides; they may be synthetic analogs. The nucleotides may be linked by phosphodiester bonds or non-hydrolyzable bonds. The nucleic acid may comprise a few nucleotides (i.e., oligonucleotide), or it may comprise many nucleotides (i.e., polynucleotide). The nucleic acid may be single-stranded or double-stranded.

[0056] Probes based on the sequence of a polynucleotide of the invention can be used to detect transcripts or genomic sequences. A probe may comprise a label group attached thereto, e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Such probes can be used as part of a diagnostic test kit for identifying cells or tissues which mis-express the protein, such as by measuring levels of a nucleic acid molecule encoding a protein in a sample of cells from a subject, e.g., detecting mRNA levels or determining whether a gene encoding a protein has been mutated or deleted.

[0057] It has now been discovered that inherited allele-specific expression (ASE) of the transforming growth factor beta (TGF-β) type I receptor gene (TGFβR1) acts as a mechanism of predisposition to familial colorectal cancers (CRC). While not wishing to be bound by theory, the inventors herein now believe that the putative change might be subtle; for instance, lowered rather than extinguished expression of one allele (referred to here as ASE for Allele-Specific Expression).

[0058] The present invention provides isolated polynucleotides comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS.: 1-4 and the complements of sequences identified by SEQ ID NOS.: 1-4; wherein the presence of a particular allele of a SNP (a particular nucleotide base) is indicative of a propensity to develop colorectal cancer (CRC) or otherwise may be used to identify an at-risk individual.

[0059] In one embodiment, the polynucleotide is selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4. In another embodiment, the polynucleotide comprises at least a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

[0060] The present invention also relates to isolated polynucleotides comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4, which hybridize, are complementary, or are partially complementary to a nucleotide sequence present in a test sample.

[0061] In one embodiment, an isolated polynucleotide is selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4, which hybridizes, is complementary, or is partially complementary to a nucleotide sequence present in a test sample.

[0062] In a further embodiment, an isolated polynucleotide comprises at least a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4, which hybridizes, is complementary, or is partially complementary to a nucleotide sequence present in a test sample.

[0063] In certain embodiments, the SNP is located within SEQ ID NO: 1 or the complement of SEQ ID NO: 1.

[0064] In certain embodiments, the SNP is located within SEQ ID NO: 2 or the complement of SEQ ID NO: 2.

[0065] In certain embodiments, the SNP is located within SEQ ID NO: 3 or the complement of SEQ ID NO: 3.

[0066] In certain embodiments, the SNP is located within SEQ ID NO: 4 or the complement of SEQ ID NO: 4.

[0067] The present invention also provides isolated polynucleotides comprising one or more haplotypes selected from the group consisting of the haplotypes identified in Figure 4C which are indicative of a propensity to develop colorectal cancer (CRC).

[0068]

[0069] In certain embodiments, the invention also provides polypeptides encoded by a polynucleotide, wherein the polynucleotide comprises a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

[0070] In one embodiment, a polypeptide is encoded by a polynucleotide, wherein the polynucleotide is selected from the group consisting of sequences identified by SEQ ID NOs: 1-4 and the complements of sequences identified by SEQ ID NOs: 1-4.

[0071] In another embodiment, a polypeptide is encoded by a polynucleotide, wherein the polynucleotide comprises at least a portion of the sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4. Also contemplated are antibodies that bind such polypeptides.

[0072] The present invention also provides polypeptides encoded by a polynucleotide, wherein the polynucleotide comprises a haplotype selected from the group consisting of the haplotypes identified in Figure 4C.

[0073] In certain embodiments, the invention also provides a vector comprising a haplotype identified in Figure 4C or a SNP located within a sequence selected from the group consisting of the sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4; operably linked to a regulatory sequence.

[0074] In other embodiments, compositions and kits are contemplated which contain the polynucleotides, proteins and/or antibodies of the present invention.

[0075] One application of the current invention involves prediction of those at higher risk of developing colorectal cancer (CRC). Diagnostic tests that define genetic factors contributing to colorectal cancer (CRC) may be used together with, or independent of, the known clinical risk factors to define an individual's risk relative to the general population. Means for identifying those individuals at risk for colorectal cancer (CRC) should lead to better prophylactic and treatment regimens, including more aggressive management of the current clinical risk factors. In certain embodiments, the present invention includes methods of diagnosing a susceptibility to colorectal cancer (CRC) in an individual, comprising detecting polymorphisms in nucleic acids of specific genes or gene segments, wherein the presence of the polymorphism in the nucleic acid is indicative of a susceptibility to colorectal cancer (CRC).

[0076] In certain embodiments, the present invention includes methods of diagnosing colorectal cancer (CRC) or a susceptibility to colorectal cancer (CRC) in an individual, comprising determining the presence or absence of particular alleles of SNPs contained in SEQ ID NOS: 1-4. In one aspect of the invention, methods comprise screening for one of the at-risk alleles associated with colorectal cancer (CRC) shown in Figure 4C. In certain embodiments, the SNP is located within SEQ ID NO: 1 or the complement of SEQ ID NO: 1. In certain embodiments, the SNP is located within SEQ ID NO: 2 or the complement of SEQ ID NO: 2. In certain embodiments, the SNP is located within SEQ ID NO: 3 or the complement of SEQ ID NO: 3. In certain embodiments, the SNP is located within SEQ ID NO: 4 or the complement of SEQ ID NO: 4.

[0077] In one embodiment, the invention provides a method of detecting the presence of a polynucleotide in a sample containing a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, wherein the method comprises contacting the sample with an isolated polynucleotide comprising a sequence (or a portion of a sequence) selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, under conditions appropriate for hybridization, and assessing whether hybridization has occurred between the polynucleotide in the sample and the isolated polynucleotide; wherein if hybridization has occurred, a certain polynucleotide containing a particular allele of a SNP associated (or not associated) with colorectal cancer (CRC) is present in the sample.

[0078] In certain embodiments of the above method, the isolated polynucleotide is completely complementary to the polynucleotide present in the sample. In other embodiments of the above method, the isolated polynucleotide is partially complementary to the polynucleotide present in the sample. In other embodiments, the isolated polynucleotide is at least 80% identical to the polynucleotide present in the sample and capable of selectively hybridizing to the polynucleotide. If desired, amplification of the polynucleotide present in the sample can be performed using known methods in the art.

[0079] The present invention further provides a method for assaying a sample for the presence of a first polynucleotide which is at least partially complementary to a part of a second polynucleotide wherein the second polynucleotide comprises a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4 comprising: a) contacting the sample with the second polynucleotide under conditions appropriate for hybridization, and b) assessing whether hybridization has occurred between the first and the second polynucleotide, wherein if hybridization has occurred, the first polynucleotide is present in the sample.

[0080] In one embodiment of the method hereinbefore described, the presence of the first polynucleotide is indicative of colorectal cancer (CRC) or the propensity to develop colorectal cancer (CRC). In a further embodiment of the method, the second polynucleotide is completely complementary to a part of the sequence of the first polynucleotide. In another embodiment, the method further comprises amplification of at least part of the first polynucleotide. In a further embodiment, the second polynucleotide is 99 or fewer nucleotides in length and is either: (a) at least 80% identical to a contiguous sequence of nucleotides in the first polynucleotide or (b) capable of selectively hybridizing to the first polynucleotide.

[0081] Also contemplated by the invention is a method of assaying a sample for the presence of a polypeptide associated with colorectal cancer (CRC) encoded by a polynucleotide, wherein the polynucleotide comprises an allele of a SNP associated with colorectal cancer (CRC) located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, the method comprising contacting the sample with an antibody that specifically binds to the polypeptide.

[0082] In one embodiment, the presence of a polypeptide associated with colorectal cancer (CRC) in a sample encoded by a polynucleotide (comprising a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4) is assayed by contacting the sample with an antibody that specifically binds to the polypeptide.

[0083] In another embodiment, the presence of a polypeptide associated with colorectal cancer (CRC) in a sample encoded by a polynucleotide (comprising at least a portion of a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4) is assayed by contacting the sample with an antibody that specifically binds to the polypeptide.

[0084] The present invention also contemplates a reagent for assaying a sample for the presence of a first polynucleotide comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, the reagent comprising a second polynucleotide comprising a contiguous nucleotide sequence which is at least partially complementary to a part of the first polynucleotide. In one embodiment of the reagent, the second polynucleotide is completely complementary to a part of the first polynucleotide.

[0085] The present invention also encompasses a reagent kit for assaying a sample for the presence of a first polynucleotide comprising a SNP located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4, comprising in separate containers: a) one or more labeled second polynucleotides comprising a sequence selected from the group consisting of the sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4; and b) reagents for detection of the label.

[0086] In other embodiments, kits are contemplated containing polynucleotides which can be used to assay samples for the presence of polynucleotides containing an allele of a SNP associated (or not associated) with colorectal cancer (CRC) located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4. Kits are also contemplated which contain antibodies which can be used to assay samples for the presence of proteins associated (or not associated) with colorectal cancer (CRC) that are encoded by the polynucleotides containing an allele of a SNP associated (or not associated) with colorectal cancer (CRC).

[0087] Other methods of diagnosing a susceptibility to colorectal cancer (CRC) in an individual comprise determining the expression or composition of a polypeptide in a control sample encoded by a polynucleotide containing an allele of a SNP not associated with colorectal cancer (CRC) and comparing it with the expression or composition of a polypeptide in a test sample encoded by the same polynucleotide except containing an allele of a SNP associated with colorectal cancer (CRC), wherein the presence of an alteration in expression or composition of the polypeptide in the test sample compared to the control sample is indicative of a susceptibility to colorectal cancer (CRC).

[0088] In certain embodiments, the invention also relates to a method of diagnosing colorectal cancer (CRC) or a susceptibility to colorectal cancer (CRC) in an individual, comprising determining the presence or absence in the individual of certain haplotypes. In one aspect of the invention, methods comprise screening for one of the at-risk haplotypes shown in Figure 4C. Thus, the present invention encompasses a method for diagnosing a susceptibility to colorectal cancer (CRC) in an individual, or a method of screening for individuals with a susceptibility to colorectal cancer (CRC), comprising detecting a haplotype associated with colorectal cancer (CRC) selected from the group consisting of the haplotypes shown in Figure 4C.

[0089] The presence or absence of the haplotype may be determined by various methods, including, for example, using enzymatic amplification of nucleic acid from the individual, electrophoretic analysis, restriction fragment length polymorphism analysis and/or sequence analysis.

[0090] A method of diagnosing a susceptibility to colorectal cancer (CRC) in an individual, or for screening individuals for a susceptibility to colorectal cancer (CRC) is also included, comprising: a) obtaining a polynucleotide sample from the individual; and b) analyzing the polynucleotide sample for the presence or absence of a haplotype, comprising a haplotype shown in Figure 4C, wherein the presence of the haplotype corresponds to a susceptibility to colorectal cancer (CRC).

[0091] In certain embodiments, a method of determining the susceptibility to colorectal cancer (CRC) in an individual is provided comprising detecting multiple SNPs identified in Figure 4C.

[0092] In certain embodiments, the method of determining the susceptibility to colorectal cancer (CRC) in an individual comprises detecting multiple SNPs identified in one or more of: SEQ ID NOS: 1, 2, 3 and/or 4.

[0093] In certain embodiments, the presence of a first polynucleotide in a sample containing one or more at-risk alleles in Figure 4C is assayed for by contacting the sample with probe polynucleotides that are complementary to the first polynucleotide. In certain embodiments, at least one SNP is located within SEQ ID NO: 1 or the complement of SEQ ID NO: 1. In certain embodiments, at least one SNP is located within SEQ ID NO: 2 or the complement of SEQ ID NO: 2. In certain embodiments, at least one SNP is located within SEQ ID NO: 3 or the complement of SEQ ID NO: 3. In certain embodiments, at least one SNP is located within SEQ ID NO: 4 or the complement of SEQ ID NO: 4.

[0094] In one embodiment, the invention pertains to a method of identifying a gene associated with colorectal cancer (CRC) comprising: (a) identifying a gene containing a SNP that is located within a sequence selected from the group consisting of sequences identified by SEQ ID NOS: 1-4 and the complements of sequences identified by SEQ ID NOS: 1-4; and (b) comparing the expression of the gene in an individual having the at-risk allele with the expression of the gene in an individual having the non-risk allele for differences indicating that the gene is associated with colorectal cancer (CRC).

[0095] In another embodiment, the invention pertains to a method of identifying a gene associated with colorectal cancer (CRC) comprising: (a) identifying a gene containing an at-risk haplotype identified in Figure 4C; and (b) comparing the expression of the gene in an individual having the at-risk haplotype with the expression of the gene in an individual not having the at-risk haplotype for differences indicating that the gene is associated with colorectal cancer (CRC).

[0096] In certain embodiments, the isolated nucleic acid can be from about 3 to 101 nucleotides in length. In particular embodiments, the isolated nucleic acid has a length selected from the group of from about 5 to 101, from about 7 to 101, from about 9 to 101, from about 15 to 101, from about 20 to 101, from about 25 to 101, from about 30 to 101, from about 40 to 101, from about 50 to 101, from about 60 to 101, from about 70 to 101, from about 80 to 101, from about 90 to 101, and from about 99 to 101 nucleotides in length. Also, in specific embodiments, the SNP is selected from the group of rs334338, rs334349, rs1590 and rs7871490.

[0097] In another broad aspect, there is provided herein a polynucleotide useful to predict colorectal cancer (CRC) risk, comprising a complement to a sequence selected from the group of SEQ ID NOs: 1-4. In certain embodiments, the complement can be from about 3 to 101 nucleotides in length. In particular embodiments, the complement can be a length selected from the group of from about 5 to 101, from about 7 to 101, from about 9 to 101, from about 15 to 101, from about 20 to 101, from about 25 to 101, from about 30 to 101, from about 40 to 101, from about 50 to 101, from about 60 to 101, from about 70 to 101, from about 80 to 101, from about 90 to 101, and from about 99 to 101 nucleotides in length. Also, in specific embodiments, the polynucleotide has a Single Nucleotide Polymorphism (SNP) selected from the group of rs334338, rs334349, rs1590 and rs7871490. In certain embodiments, the complement can be an allele-specific probe or primer.

[0098] In another broad aspect, there is provided herein an amplified polynucleotide containing a Single Nucleotide Polymorphism (SNP) selected from SEQ ID NOS: 1-4, or a complement thereof. In certain embodiments, the complement can be from about 3 to 101 nucleotides in length.

[0099] A method of distinguishing patients having an increased susceptibility to colorectal cancer (CRC) from patients who do not, includes the step of detecting at least one Single Nucleotide Polymorphism (SNP) in any of SEQ ID NOs: 1-4 in a nucleic acid sample from the patients, wherein the presence or absence of the SNP can be used to assess increased susceptibility to CRC.

[00100] In certain embodiments, the presence of the SNP is an indication that patients have an increased susceptibility to CRC. In certain other embodiments, the presence of the SNP is an indication that patients have a decreased susceptibility to CRC.

[00101] In certain embodiments, the method includes distinguishing patients where the SNP is selected from the group of rs334338, rs334349, rs1590 and rs7871490.

[00102] In certain embodiments, the method includes determining the colorectal cancer (CRC) risk in a patient, comprising the step of identifying one or more Single Nucleotide Polymorphism (SNP) in any of SEQ ID NOS: 1-4 in a nucleic acid sample from the patient.

[00103] In certain embodiments, the method includes determining CRC risk, where the presence of the SNP is an indication that the patient has a risk of CRC.

[00104] In certain embodiments, the method includes determining the CRC risk, where the presence of the SNP is an indication that the patient does not have a risk of CRC.

[00105] In certain embodiments, the method includes determining the CRC risk, where the SNP is selected from the group of rs334338, rs334349, rs1590 and rs7871490.

[00106] In certain embodiments, the method includes detecting CRC, where the genetic material is combined with one or more polynucleotide probes capable of hybridizing selectively to a SNP in any of SEQ ID NOS: 1-4.

[00107] In certain embodiments, the method includes detecting CRC, where the probes are oligonucleotides capable of priming polynucleotide synthesis in a polymerase chain reaction.

[00108] In certain embodiments, the method includes detecting CRC, where the genetic material comprises one or more of: DNA, RNA and/or where the genetic material is amplified.

[00109] Those skilled in the art will recognize that the analysis of the nucleotides present in one or several of the SNP markers in an individual's nucleic acid can be done by any method or technique capable of determining nucleotides present at a polymorphic site. One of skill in the art would also know that the nucleotides present in SNP markers can be determined from either nucleic acid strand or from both strands.
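Because the nucleotides at a SNP marker can be read from either strand, an allele reported on one strand corresponds to its complement on the other. The short sketch below illustrates that conversion; the function names and the example sequence are hypothetical and are not taken from SEQ ID NOs: 1-4.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def alleles_on_opposite_strand(alleles):
    """Convert a set of SNP alleles (single bases) to the opposite strand."""
    return {allele.translate(COMPLEMENT) for allele in alleles}


# Hypothetical example: an A/G SNP on one strand reads as T/C on the other.
print(reverse_complement("GATTACA"))
print(sorted(alleles_on_opposite_strand({"A", "G"})))
```

A genotyping pipeline would normally record which strand each assay reports so that alleles from different methods can be compared consistently.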

[00110] The present invention is further defined in the following Examples, in which all parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. All publications, including patents and non-patent literature, referred to in this specification are expressly incorporated by reference. The following examples are intended to illustrate certain preferred embodiments of the invention and should not be interpreted to limit the scope of the invention as defined in the claims, unless so specified.

[00111] The value of the present invention can thus be seen by reference to the Examples herein.

[00112] Examples

[00113] As now shown herein, to test for ASE in TGFβR1, the inventors chose three SNPs, rs334348 [SEQ ID NO:1] (see web site: ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=334348), rs334349 [SEQ ID NO:2] (see web site: ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=334349), and rs1590 [SEQ ID NO:3] (see web site: ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=1590), in the 3'-UTR region, to which primer extension with fluorescent nucleotides (SNaPshot) (10) was applied. These three SNPs are separated by 1916 and 1778 bp respectively, yet they exhibit total linkage disequilibrium.

[00114] Among a total of 242 patients with MSI-negative CRC (10), 96 (39.7%) were heterozygous for the three 3'-UTR SNPs, of whom 12 showed ASE variation ratios higher than 1.5, while no patient showed ratios below 0.67. Forty-nine additional cases were heterozygous for one further SNP rs7871490 [SEQ ID NO:4] (see web site: ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=7871490) located in the 3'-UTR region that was not in strong linkage disequilibrium with the above 3 markers, and 17/49 had ASE values higher than 1.5. Thus, 29 out of 138 (21%) informative CRC patients showed ASE in the TGFβR1 gene. Three additional cases had borderline values (Figure 5 and Table 1 (Figure 7)).
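The categorization described above can be sketched as a simple threshold test on the allelic expression ratio, followed by a tally of the reported counts. The function name `classify_ase` is hypothetical; the thresholds (1.5 and 0.67) and the counts (12, 17 and 138) are those stated in the text.

```python
def classify_ase(ratio, upper=1.5, lower=0.67):
    """Label a sample by its allelic expression ratio using the stated thresholds."""
    if ratio > upper or ratio < lower:
        return "ASE"
    return "no ASE"


# Counts as reported in the text: 12 cases from the three linked 3'-UTR SNPs
# and 17 cases from rs7871490, out of 138 informative cases.
ase_cases = 12 + 17
informative_cases = 138
fraction_with_ase = ase_cases / informative_cases

print(classify_ase(1.8), classify_ase(1.0), classify_ase(0.6))
print(ase_cases, round(100 * fraction_with_ase))
```

The ratios passed to `classify_ase` in the example are illustrative values only, not measurements from the study.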

[00115] DNA samples from blood of healthy Columbus-area controls (195 individuals) (10) were genotyped for the four SNPs. One hundred and nine (55.9%) were heterozygous and ASE analysis in 105 of them revealed ratios ranging between 0.72 and 3.25 (Figure 5). Only three controls showed ratios above 1.5. The results in both the cases and controls suggest that the degree of ASE is a quantitative trait (Figure 1).

[00116] Differences in the degree of ASE between cases and controls showed a P-value of 0.1208 when applying a Wilcoxon rank sum test and a P-value of 0.0207 when a permutation test (100,000 permutations) was applied.
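For readers unfamiliar with permutation testing, the following is a minimal sketch of a two-sample permutation test. The text does not state which test statistic was permuted, so the absolute difference in group means used here is an assumption, and the function name and the small example data set are hypothetical rather than study data.

```python
import random


def permutation_p_value(cases, controls, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means (assumed statistic)."""
    rng = random.Random(seed)
    n_cases = len(cases)

    def mean_difference(values):
        group_a, group_b = values[:n_cases], values[n_cases:]
        return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

    pooled = list(cases) + list(controls)
    observed = mean_difference(pooled)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign case/control labels
        if mean_difference(pooled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical ASE ratios, for illustration only.
toy_cases = [2.0, 2.1, 2.2, 2.3, 2.4, 2.5]
toy_controls = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0]
p_value = permutation_p_value(toy_cases, toy_controls, n_permutations=2000)
print(p_value)
```

A permutation test makes no distributional assumption about the ratios, which may explain why it and the rank-based Wilcoxon test can give different P-values on the same data.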

[00117] While it is not possible to determine whether the degree of predisposition to colorectal cancer is proportional to the degree of ASE or whether there is a threshold value that separates "abnormal" values that predispose to CRC from "normal" values that do not, a ratio of 1 means that both alleles are equally expressed whereas a ratio of 1.5 means a 33% difference, as does a ratio of 0.67. To define a cut-off point we applied the Receiver Operating Characteristic (ROC) analysis that estimates the sensitivity and specificity of cutoff points.
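The arithmetic behind the statement that ratios of 1.5 and 0.67 both correspond to a 33% difference can be checked with a short calculation. The function name is hypothetical; the calculation expresses the lower-expressed allele relative to the higher-expressed one.

```python
def fractional_reduction(ratio):
    """Fraction by which the lower-expressed allele falls below the higher one.

    For a ratio r of allele a to allele b, the lower allele is expressed at
    min(r, 1/r) of the higher allele, so the reduction is 1 - min(r, 1/r).
    """
    return 1.0 - min(ratio, 1.0 / ratio)


for ratio in (1.0, 1.5, 0.67):
    print(ratio, round(100 * fractional_reduction(ratio)))
```

A ratio of 1.5 gives 1 - 1/1.5, and a ratio of 0.67 gives 1 - 0.67, so both thresholds describe roughly the same one-third reduction seen from opposite alleles.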

[00118] As shown in Table 2 (Figure 8), the value of 1.5 maximizes both characteristics providing the highest Youden's index. Using a cut-off of 1.5 the P-value comparing cases and controls was 7.655 x 10^-5. While there is no overall need to define a firm cut-off point, we used the value of 1.5 to categorize the cases. In order to determine whether the observed ratios falling outside this range represent an increase or decrease in transcript of one allele, a RT-PCR experiment was performed taking advantage of hybrid clones monoallelic for chromosome 9 created from two individuals with ASE (Patients 1 and 26, Table 1 (Figure 7)).
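Youden's index for a given cut-off is sensitivity plus specificity minus one. The sketch below computes it for the 1.5 cut-off using counts stated in the text (29 of 138 informative cases and 3 of 105 analyzed controls above 1.5); whether these exact counts underlie Table 2 is an assumption, and the function name is hypothetical.

```python
def youden_index(true_pos, false_neg, true_neg, false_pos):
    """Return (J, sensitivity, specificity), where J = sensitivity + specificity - 1."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity + specificity - 1.0, sensitivity, specificity


# Counts taken from the text for a cut-off of 1.5 (assumed to match Table 2).
cases_above, cases_total = 29, 138
controls_above, controls_total = 3, 105

j, sensitivity, specificity = youden_index(
    true_pos=cases_above,
    false_neg=cases_total - cases_above,
    true_neg=controls_total - controls_above,
    false_pos=controls_above,
)
print(round(sensitivity, 3), round(specificity, 3), round(j, 3))
```

In a full ROC analysis the same calculation would be repeated over a range of candidate cut-offs, and the cut-off with the largest index would be selected.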

[00119] Each of the 4 hybrid clones contained either the maternal or paternal copy of chromosome 9, plus the mouse genome (10). As shown in Figure 2A, ASE determination in the diploid samples indicated that the expression of one allele (a) was reduced compared to the other allele (b). In the four monoallelic hybrid clones, the densitometric values of the RT-PCR of human TGFβR1 were compared with the corresponding values for mouse Gpi (10). One allele (a) showed reduced expression in both patients. These experiments support the notion of a lowered expression of one allele, and that in both patients the same allele was affected (Figure 2A and Figure 2B).

[00120] To assess the effect of ASE on TGF-β signaling, lymphoblastoid cell lines from four ASE patients and four non-ASE healthy controls were exposed to TGF-β (10), which binds TGFβR2 and leads to the formation of the TGFβR2/TGFβR1/TGF-β heteromeric complex. We observed differences in levels of phosphorylated SMAD2 (pSMAD2), an important downstream effector and surrogate marker of TGF-β signaling (11, 12). There were constitutive differences in pSMAD2 expression between ASE cases and non-ASE controls in the absence of exogenously added TGF-β (Time 0; Figure 3A). Differences in pSMAD2 levels became more pronounced upon exposure to TGF-β. These differences were observed at low TGF-β concentrations (<5 pM) (Figure 3B), and occurred in four out of four ASE cases compared to non-ASE controls.

[00121] The phosphorylation of SMAD3 is an essential step in signal transduction by TGF-β for inhibition of cell proliferation (13). Furthermore, SMAD3-deficient mice are prone to colon cancer development (14, 15). To assess the impact of TGFβR1 ASE on the phosphorylation of SMAD3 we used an antibody targeting the Ser423/425 site in SMAD3 (10, 16). Constitutive levels of pSMAD3 were detectable in the lymphoblastoid cell lines of three non-ASE controls while pSMAD3 was barely detectable in one ASE case (Figure 3C).

[00122] Exposure to TGF-β did not result in any detectable increase in pSMAD3 in the lymphoblastoid cell lines of the ASE patients. The pSMAD2 and pSMAD3 results indicate that patients with ASE exhibit decreased SMAD-mediated signaling when compared with non-ASE controls.

[00123] A GCG trinucleotide variable number of tandem repeat polymorphism occurs in exon 1 of TGFβR1. The most common allele contains 9 repeats leading to a stretch of 9 alanines (9A) in the signal peptide of the receptor protein. The second most common allele has 6 repeats (6A) and occurs in approximately 14% of all individuals in most Caucasian populations (6). The 6A allele has been associated with a low-level but statistically significant predisposition to several forms of cancer (17-20). Recent studies suggest that the association of 6A with colon cancer is either weak (O.R. 1.2, 95% CI: 1.01-1.43) (17) or borderline significant (O.R. 1.13, CI: 0.98-1.30) (21). We typed this polymorphism in all 242 CRC cases studied by us and found 9A/9A in 197, 9A/6A in 40, 6A/6A in 4, and one failed (Table 3 (Figure 9)). There were clearly more 9A/6A heterozygotes among the patients with ASE (14/29) than in those without ASE (22/108) (P=0.0052, Chi-square test). While not wishing to be bound by theory, the inventors herein now believe that the 6A allele is likely in linkage disequilibrium with one of the putative mutations that causes ASE, but 6A is not in itself causative of ASE.
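The chi-square comparison of 9A/6A heterozygote counts (14/29 with ASE versus 22/108 without) can be checked with a minimal sketch. The source does not say whether a continuity correction was applied; the sketch assumes Yates' correction, which gives a P-value close to the reported 0.0052. The function name is illustrative.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test (1 degree of freedom) for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p). With yates=True the continuity correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p

# Rows: ASE / non-ASE patients; columns: 9A/6A heterozygous / other genotypes.
chi2, p = chi_square_2x2(14, 29 - 14, 22, 108 - 22)
```

Without the correction (yates=False) the same table gives a smaller P-value, so the choice of correction should be stated when such a figure is reported.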

[00124] All 29 patients showing ASE and three patients with borderline ASE values (1.49, 1.49 and 1.46, respectively) (n=32) were studied for genetic changes occurring in the germline. By sequencing of all nine exons, 2 kb upstream of exon 1, and the entire 3'-UTR (10) a single sequence change in the coding exons was identified in patient 30 consisting of a c.1204T>A missense change in exon 7 that changes a tyrosine to asparagine (p.Tyr401Asn). Its pathogenicity is currently being assessed.

[00125] Several changes, all previously reported as polymorphic, were identified in the 3'-UTR and promoter regions. In three patients, a deletion of two bases (c.1-1782_1783delCA) at 1783 bp upstream of exon 1 was identified in a SINE repetitive sequence. Multiplex ligation-dependent probe amplification (MLPA) (10) did not suggest any large rearrangements, deletions, or duplications of exons. In a study of promoter methylation none of the comparisons of germline methylation status between ASE and non-ASE cases and ASE cases versus controls were significant (see Table 4 (Figure 10)). Thus, germline promoter methylation is unlikely to play a role in ASE.

[00126] Changes occurring in non-coding regions of the gene may be responsible for the reduction in expression. To fully study this possibility, overlapping fragments of 1.7 to 10 kb were long-range PCR amplified, cloned, and sequenced. In all, approximately 96.5 kb covering the whole gene and 3'-UTR region (49 kb), 35 kb upstream of exon 1 (up to the next gene COL15A1), and 12.5 kb downstream of the 3'-UTR (Figures 4A-4C), were fully sequenced in the four monochromosomal hybrids (patients 1 and 26), and in diploid DNA from four other ASE patients (patients 5, 11, 14 and 21) (10). Our sequencing strategy allowed us to determine the phase of every change within each amplicon, and over larger regions when at least one change occurred in the overlapping fragments. In all, 25 and 104 changes were identified in the down-regulated alleles of patients 1 and 26, respectively, while 31 and 6 changes were detected in their wild type counterparts. Diploid DNA from the four patients harbored 61, 37, 33 and 135 changes, respectively.

[00127] Excluding changes known to be present in the wild type alleles, 140 changes were identified in the down-regulated alleles. Only the c.1-1782_1783delCA change stood out as a candidate mutation. It occurred in 3/29 (10.3%) ASE patients, in 0/3 ASE controls, in 1/51 (2%) non-ASE CRC patients, and 1/81 (1.2%) non-ASE controls. In summary, these investigations did not uncover the genetic changes causing ASE.

[00128] Genotyping of most changes identified by sequencing was carried out in all available ASE CRC patients including borderline cases (n=31), and in 55 non-ASE CRC patients. Construction of haplotypes from the available genotype and haplotype data was performed with PHASE v.2.1.1 (10). In all, 60 polymorphisms covering 73.5 kb (from 12 kb upstream of exon 1 to 12.5 kb downstream of the 3'-UTR region) were used for haplotype inference (Table 5 (Figure 11)). For all ASE and non-ASE patients, the program was run with 1000 permutations with overlapping 10-SNP sliding windows. Haplotype frequency distributions in ASE and non-ASE populations showed significant differences in a genomic region covering the area between the 3'-end of intron 3 to ~5 kb downstream of the 3'-end of the UTR region (Figures 4A-4C).

[00129] The group of patients carrying the minor allele for the three 3'-UTR SNPs in linkage disequilibrium (Group 1) was very different from the other group derived from the study of SNP rs7871490 (Group 2). Haplotype analysis was performed separately in the two groups using 50 and 21 SNPs respectively.

[00130] In Group 1 (n=53), one major haplotype for the affected alleles, was present in 11/14 (78.6%) ASE but also in 22/39 (56.4%) non-ASE patients (Figures 4A-4C).

[00131] For Group 2 (n=33), another major haplotype for the affected allele was present in 14/17 (82.4%) of ASE and in 1/16 (6.3%) of non-ASE patients (Figures 4A-4C). Fisher's exact test to compare haplotype proportions showed P-values of 0.2031 and 1.260 x 10^-5 for groups 1 and 2, respectively. Importantly, the 6A allele of the 9A/6A polymorphism occurred in the ASE haplotype in all 14 cases of group 2, but not in group 1 where all ASE cases except one were homozygous for the 9A allele.
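The Fisher's exact tests on haplotype proportions can be reproduced with a short sketch using only the counts given above (Group 1: 11/14 ASE versus 22/39 non-ASE; Group 2: 14/17 ASE versus 1/16 non-ASE). The function name is illustrative, and the two-sided definition used here (summing all tables no more probable than the observed one) is an assumption about how the reported values were obtained.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Exact integer comparison avoids floating-point ties.
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / total

# Rows: ASE / non-ASE patients; columns: carries major haplotype / does not.
p_group1 = fisher_exact_two_sided(11, 14 - 11, 22, 39 - 22)
p_group2 = fisher_exact_two_sided(14, 17 - 14, 1, 16 - 1)
```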

[00132] In search of somatic changes in line with Knudson's two-hit hypothesis, LOH analyses as well as a search for somatic mutations in the coding sequences of the gene were performed in DNA from the tumors of 26 ASE patients. Using the described threshold (10), 6 cases out of 26 showed LOH. In 3 of these 6 cases the wild-type allele, the one with normal expression in blood, was lost or reduced while in the other 3 cases the allele showing germline ASE was lost. Exon-by-exon sequencing of the entire gene in tumors from 26 ASE patients revealed somatic changes not found in blood DNA in 3 tumors. The mutations were: c.634G>A (p.Gly212Asp) in one tumor, and c.682_685delAAG (p.delGlu228) in two tumors. These mutations occurred in exon 4 which encodes the kinase domain of the protein. LOH analyses and exon 4 sequencing in 49 tumors of CRC patients without ASE showed that none of these tumors had evidence of somatically acquired mutations, and 5 showed LOH (Table 3 (Figure 9)).

[00133] Fisher's exact test comparing proportions of LOH and mutations between ASE and non-ASE cases showed P-values of 0.1708 and 0.0355, respectively. The occurrence of somatic mutations in ASE cases, but not in controls, supports the role of TGFβR1 as a tumor suppressor gene. On the other hand, the fact that LOH affected the ASE allele as often as the wild type allele could indicate random losses.

[00134] The cohort of MSI-negative CRC patients had been deliberately enriched for familial cases (10). In the cohort of 138 patients with available ASE values, 59 out of 136 (43.4%) were familial according to the criteria indicated above, and family information was not available in two cases. Among the cases showing ASE, 53.6% were familial (Table 3 (Figure 9)). The proportion of ASE was higher among familial than among non-familial cases: 15 out of 59 (25.4%) familial cases versus 13 out of 77 non-familial cases (16.9%). Chi-square test to compare proportions showed that this difference was not statistically significant (p=0.314).

[00135] The above data suggest that ASE contributes somewhat more to familial than to sporadic CRC, but do not allow its inheritance to be assessed. If ASE is regularly inherited as a dominant trait the expectation is that 50% of first degree relatives (FDR) also have ASE. Data from 4 families that are informative in this regard are shown in Figure 6. In all, among 11 FDR, ASE was greater than 1.5 in 4, borderline in 2 (ASE values 1.40 and 1.44, respectively), and low in 5. There was no instance of ASE being incompatible with Mendelian dominant inheritance. Importantly, in all four families co-segregation of ASE with the inferred risk haplotype, representing the downregulated allele, occurred. The highest Kong and Cox non-parametric LOD-score was 1.25 with a P-value of 0.008 (non-parametric Z-score=4.12; P-value 0.00002). Among the 4-6 ASE positive FDRs, two had CRC, one had endometrial cancer and a tubular colonic adenoma, one had prostate cancer, and another one multiple polyps in the colon and rectum (Table 6 (Figure 12)). Although fragmentary, these data suggest dominant inheritance of ASE with incomplete penetrance of CRC in ASE carriers.

[00136] There is indirect evidence to support the notion that ASE of TGFβR1 contributes to CRC development. The TGF-β pathway is strongly involved in the carcinogenesis of colon and other cancers and its signaling is dependent on the integrity of both of its receptors (TGFβR1 and TGFβR2) (22, 23). In a comprehensive study of CRC tumors, somatic mutations occurred with high frequency in 69/13,023 genes. Among these 69 genes were TGFβR2, SMAD4, SMAD2 and SMAD3, attesting to the importance of the TGF-β pathway in CRC (24). There is rapidly increasing evidence that subtle variations in gene expression play central roles not only in development in various organisms, but also in human disease (8, 9, 25). Linkage analysis of a cohort of sib pairs concordant or discordant for colorectal carcinoma or adenoma highlighted a region in chromosome 9q22-31 (26). Subsequently, borderline significant linkage to the same region was observed in families segregating colorectal cancer or adenoma without microsatellite instability (27, 28). This evidence is compatible with, but in no way proves, a role for TGFβR1.

[00137] While it remains to be determined what mechanism causes ASE, the haplotype data support the implication of ancestral mutations for most ASE patients. Moreover, the elusive genomic change causing ASE is likely in cis but the data do not exclude the possibility that ASE arises as a result of trans-acting genes that preferentially affect the risk haplotypes. Such genes could well be RNA genes as predicted earlier (29). Very recently the existence of extensive quantitative trait loci for gene expression was documented in two large studies (30, 31).

[00138] The inventors also investigated how common ASE of TGFβR1 is. Using our definition it occurred in 29/138 tested CRC patients (21%) and in 3/105 tested controls (3%). In the extreme, if none of the non-informative CRC cases had ASE, the frequency would be 29/242 (12%), and for the controls, 3/195 (1.5%). Because not all individuals are informative (heterozygous for a transcribed SNP) the true frequency in cases and controls cannot be precisely assessed at present. Using the above alternative numbers we can calculate the odds ratio (OR) of CRC in carriers of ASE. In the first scenario the OR is 9.0 (CI: 2.7-30.6), and in the conservative one, OR is 8.7 (CI: 2.6-29.1).
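The odds ratios above can be reproduced from the stated counts with a minimal sketch. The source does not name the method used for the confidence interval; the sketch assumes the standard log-odds (Woolf) approximation at the 95% level, which yields the reported intervals. The function name is illustrative.

```python
from math import exp, log, sqrt

def odds_ratio_ci(case_pos, case_total, ctrl_pos, ctrl_total, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval."""
    a, b = case_pos, case_total - case_pos   # cases with / without ASE
    c, d = ctrl_pos, ctrl_total - ctrl_pos   # controls with / without ASE
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Informative individuals only: 29/138 cases, 3/105 controls.
first = odds_ratio_ci(29, 138, 3, 105)
# Conservative scenario: all non-informative individuals counted as non-ASE.
conservative = odds_ratio_ci(29, 242, 3, 195)
```

The wide intervals follow mainly from the small number of ASE-positive controls (3), whose reciprocal dominates the standard error.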

[00139] The inventors also investigated what proportion of all CRC is attributable to ASE of TGFβR1. From the available data of the present case-control study we estimated the population attributable risk (PAR). If ASE occurs in 21% of cases and 3% of controls, the estimated PAR is 18.7% (CI: 10.8-25.8). If ASE occurs in 12% of cases and 1.5% of controls the estimated PAR is 10.6% (CI: 6.0-14.9). These numbers are estimates, representing the Caucasian-dominated population of Central Ohio, and are heavily dependent on the relevant allele frequencies which may show strong inter-ethnic variation. We nevertheless conclude that ASE of TGFβR1 is a major contributor to the genetic predisposition to CRC.
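The PAR point estimates can be checked with a short sketch. The source does not state the formula used; the sketch assumes the case-control form PAR = p_cases x (OR - 1) / OR, with the odds ratio standing in for the relative risk, which reproduces the reported point estimates from the underlying counts. Confidence intervals are not computed here, and the function name is illustrative.

```python
def population_attributable_risk(case_pos, case_total, ctrl_pos, ctrl_total):
    """PAR from case-control counts, using the odds ratio as the relative risk."""
    a, b = case_pos, case_total - case_pos
    c, d = ctrl_pos, ctrl_total - ctrl_pos
    odds_ratio = (a * d) / (b * c)
    exposed_fraction_in_cases = a / case_total
    return exposed_fraction_in_cases * (odds_ratio - 1) / odds_ratio

# ASE in 29/138 cases and 3/105 controls (about 21% and 3%).
par_first = population_attributable_risk(29, 138, 3, 105)
# ASE in 29/242 cases and 3/195 controls (about 12% and 1.5%).
par_conservative = population_attributable_risk(29, 242, 3, 195)
```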

[00140] Materials and Methods

[00141] Patients and controls

[00142] The patients belonged to a series of 1566 consecutively ascertained, unselected, consenting colorectal cancer (CRC) cases diagnosed in 1999-2004 in the 6 main hospitals in metropolitan Columbus, Ohio (32,33). Cases of familial polyposis or the rare hamartomatous polyposis syndromes were not accrued. All cases were screened for microsatellite instability of the tumor. Mutational analyses of the mismatch repair genes were performed in all cases showing microsatellite instability (MSI). MSI positive cases (n=306) were excluded from the study, including Lynch syndrome patients (n=44) leaving 1260 MSI negative cases.

[00143] A random sample of 144 MSI negative cases was chosen. To enrich the sample with familial cases, a further 79 MSI-negative cases were chosen who were diagnosed at age 65 or younger and had at least one first-degree relative with CRC diagnosed at any age. Furthermore 19 MSI negative patients seen in the Clinical Cancer Genetics Program at the Ohio State University were included. These patients were counseled owing to their early age of onset (under age 55) or their family history of colon cancer (CRC patients diagnosed at any age with at least 1 FDR with CRC). Thus the series consisted of 242 MSI-negative cases deliberately chosen so that familial cases (n=98) were overrepresented. After genetic counseling, clinical data and biological samples were obtained from consenting family members of some of the probands.

[00144] Controls (n=195) were obtained from the Ohio State University Division of Human Genetics collection of samples. The controls used here were cancer-free individuals accrued in the waiting rooms of the OSU Primary Care Network physician offices. As a result, the controls were derived from the same Columbus-area population as the CRC patients. Furthermore, cases (n=242) and controls (n=195) were demographically matched according to ethnicity (92.9 and 87.7% Caucasians, respectively), gender (47.5 and 48.7% females) and age (average age: 57.9 and 54.0, respectively).

[00145] DNA/RNA extraction and cDNA synthesis

[00146] Extraction of genomic DNA from peripheral blood or lymphoblastoid cells was performed by a standard phenol-chloroform procedure. For DNA extraction from formalin-fixed paraffin-embedded tissue, tumor and normal areas were microdissected, and DNA extracted using a proteinase K and phenol-chloroform protocol. For total RNA extraction, cells were processed with TRIzol reagent (Invitrogen, Carlsbad, CA). Total RNA was treated with DNAse (DNAfree™, Ambion, Austin, TX) prior to reverse transcription with Superscript™ First-Strand Synthesis System for RT-PCR (Invitrogen, Carlsbad, CA).

[00147] Allele-specific expression (ASE) and loss of heterozygosity (LOH) assessment

Polymorphisms in the cDNA of the gene were used as markers to distinguish and measure the expression of the two alleles, using the SNaPshot (PE Applied Biosystems, Foster City, CA) technique as described (34). Four polymorphisms located in the 3'-UTR region with appreciable allele frequencies, rs334348, rs334349 and rs1590 (in total linkage disequilibrium with each other) and a fourth marker, rs7871490, were used for analyses of all informative patients.

[00148] ASE ratio calculations were performed as described (34). Briefly, the ratio of the two alleles in the cDNA of the transcript was normalized with the ratio of the two alleles in genomic DNA, applying the following formula: cDNA (peak area common allele / peak area rare allele) divided by gDNA (peak area common allele / peak area rare allele). Each SNP was assayed with two independent cDNA preparations, thus the SNaPshot ASE variation value for each individual is given as the average of six different analyses when using rs334348, rs334349 and rs1590. For the rs7871490 heterozygous individuals, SNaPshot analysis was performed with two independent cDNA preparations each in duplicate so that the ASE was calculated as the average of 4 different ratios.

[00149] The arbitrary cut-off points defined (>1.5 or <0.67) were based on the optimum value of Youden's index (Table 2 (Figure 8)).
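The ASE ratio formula and cut-off points described above can be expressed as a minimal sketch. The function names and the peak areas in the usage note are hypothetical; only the formula and the thresholds (>1.5 or <0.67) come from the text.

```python
from statistics import mean

def ase_ratio(cdna_common, cdna_rare, gdna_common, gdna_rare):
    """cDNA allele ratio normalized by the genomic DNA allele ratio.

    Arguments are SNaPshot peak areas for the common and rare alleles.
    """
    return (cdna_common / cdna_rare) / (gdna_common / gdna_rare)

def classify_ase(replicate_ratios, upper=1.5, lower=0.67):
    """Average the replicate ratios and flag values outside the cut-offs."""
    average = mean(replicate_ratios)
    return average, (average > upper or average < lower)

# Hypothetical peak areas for two replicate analyses of one individual.
replicates = [
    ase_ratio(cdna_common=3200, cdna_rare=1900, gdna_common=2500, gdna_rare=2450),
    ase_ratio(cdna_common=3100, cdna_rare=1850, gdna_common=2480, gdna_rare=2500),
]
average_ratio, has_ase = classify_ase(replicates)
```

Normalizing by the genomic DNA ratio corrects for allele-specific differences in primer extension efficiency, so that a ratio of 1 corresponds to equal expression of both alleles.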

[00150] LOH was determined by the same SNaPshot methodology, markers and cut-off values as the ASE evaluation. The normalized ratio was calculated as: tumor DNA (peak area common allele / peak area rare allele) divided by germline DNA (peak area common allele / peak area rare allele). All primers and PCR conditions are available upon request.

[00151] Generation of haploid hybrids and expression analysis

[00152] The generation of monochromosomal hybrids was carried out by somatic cell fusion (Cytogenetics Laboratory, Mayo Medical Laboratories, Rochester, MN) as described (35). Briefly, cells from Epstein-Barr virus transformed lymphoblastoid cultures were electrofused with the E2 murine cell line. Hybrid clones were tested and selected for chromosome 9 by fluorescence in situ hybridization (FISH) and confirmatory genotyping was performed. Hybrid cell lines were cultured under selective pressure using Dulbecco's Modified Eagle's Medium (DMEM) containing 10% Fetal Bovine Serum (FBS), 1% HAT supplement, 1% Geneticin, 1% penicillin-streptomycin and 1% Minimum Essential Medium - Non Essential Amino Acids solution (MEM-NEAA), all from Gibco (Invitrogen).

[00153] To evaluate the amount of TGFβR1 transcript, semi-quantitative RT-PCR was performed for each clone. Human TGFβR1 forward and reverse primers were designed in different exons (exons 7 and 8, respectively), creating a 135 bp amplicon. Murine Gpi was used as amplification control (size 176 bp), taking advantage of the fact that the hybrid cell lines carried all mouse chromosomes. After separation in a 2.5% agarose gel the relative amount of human product was quantified by comparing the densitometrically determined TGFβR1-human/Gpi-mouse ratios.

[00154] Mutation detection

[00155] As a first screening approach, direct genomic DNA sequencing was carried out using genomic DNA extracted from blood. One PCR fragment was amplified for each exon, including 50-100 bp of each flanking intron. Also, sequencing was extended to 2 kb upstream of exon 1 and the entire 3'-UTR region, divided into overlapping PCR fragments of approximately 500 bp. PCR products were incubated with ExoSAP-IT (USB Corp., Cleveland, OH) for 1 hour at 37 °C and for 15 minutes at 75 °C. All products were sequenced in both directions using the ABI Prism BigDye Terminator Cycle Sequencing Kit version 3.1 and the Applied Biosystems 3730 DNA Analyzer (PE Applied Biosystems, Foster City, CA).

[00156] As a second approach, a 96.5 kb genomic region was studied for mutations by sequencing. The analyzed region extends from the end of the COL15A1 gene (35 kb upstream of the first exon of TGFβR1) to 12.5 kb downstream of the TGFβR1 3'-UTR. For long-range amplification and cloning purposes, the entire region was divided into 18 overlapping amplicons of 1.7 to 10 kb. Each fragment was PCR amplified using the Expand Long Template PCR System (Roche Applied Science, Mannheim, Germany). The long-range PCR amplification products were cloned into chemically competent TOP10 cells (Invitrogen, Carlsbad, CA) following a standard cloning protocol and using a pBluescript-modified vector. Clones were analyzed by restriction enzyme digestion and positive clones were sequenced. Forward and reverse sequencing was performed with primers separated by 400 bp creating overlapping amplicons. For each of the 18 large amplicons at least three clones were sequenced for the monochromosomal hybrids and at least six clones for the diploid samples. Only the changes common to all clones were considered, so that artifacts produced by the long-range PCR amplification were eliminated from the analysis.

[00157] The multiplex ligation-dependent probe amplification methodology (SALSA MLPA kit P148 TGFβR1+2, MRC-Holland, Amsterdam, Holland) was used to search for large deletions in TGFβR1 as recommended by the manufacturer.

[00158] Promoter methylation analysis

[00159] To study the methylation status of the CpG islands located in the promoter region, bisulphite modification of germline DNA was used to detect methylated cytosines by sequencing. Germline DNA was modified using the CpGenome™ Fast DNA Modification Kit (Chemicon International, Inc., Temecula, CA) following the manufacturer's instructions. Methylation patterns within the TGFβR1 promoter (sequence -1178 bp to -102 bp upstream of the ATG start codon, GenBank Accession Number U51139) were determined using a semi-nested PCR approach which amplified 5 overlapping fragments (BSP1-5) covering the entire TGFβR1 promoter region. The expected PCR bands were excised from 2.0% agarose gels, purified using the QIAquick Gel Extraction Kit (Qiagen, Germantown, MD), and then sequenced with a dGTP BigDye™ Terminator Version 3.0 Ready Reaction Cycle Sequencing Kit on an ABI 3100 DNA sequencer (Applied Biosystems, Foster City, CA).

[00160] Analysis of SMAD2, pSMAD2 and pSMAD3 expression

[00161] Lymphoblastoid cell lines were grown in RPMI with 20% FBS, 2% antibiotic-antimycotic solution (all from Gibco, Invitrogen), and 0.33% Tylosin (Sigma) at 37 °C, 5% CO2 and ~85% humidity. After overnight serum starvation (0.1% FBS) TGF-β1 (R&D Systems, Minneapolis, MN) was added to the medium to a final concentration of 100 pM. Cells were harvested at different time points: before the addition of TGF-β (time 0) and after 1, 4, 8 and 16 hours.

[00162] For the dose effect experiment, cells were exposed to different TGF-β concentrations (5, 25, 50 and 100 pM) and harvested after 1 hour. Western blotting was performed as described (36) using the following antibodies: rabbit anti-pSMAD2 (Cell Signaling, Danvers, MA), rabbit anti-SMAD2 (Cell Signaling), mouse anti-β-actin (Sigma, St Louis, MO), anti-rabbit and anti-mouse peroxidase conjugated secondary antibodies (Rockland, Gilbertsville, PA).

[00163] For SMAD3 expression experiments, lymphoblastoid cell nuclear extracts were isolated by using the NE-PER nuclear and cytoplasmic extraction kit (cat # 78833) (Pierce, Rockford, IL 61105). Fifteen micrograms of protein from each sample were separated on 10% SDS-PAGE gels and transferred to a PVDF membrane. Rabbit anti-pSmad3 (Ser423/425) (pSMAD3) was a gift from Dr. Koichi Matsuzaki, Kansai Medical University, Osaka, Japan. Mouse anti-Histone 1 (sc-8030) was used as a loading control and purchased from Santa Cruz Biotechnology (Santa Cruz, CA).

[00164] Promoter methylation analysis

[00165] The study of the methylation status of the TGFβR1 promoter in germline DNA obtained from 69 consecutive patients with colorectal cancer and 69 healthy controls matched for age, gender, ethnic status and geographic location, did not show any evidence of methylation in the BSP2, BSP3, BSP4 and BSP5 regions (the closest to exon 1). The only methylated sites were located at positions -1108, -1105, -1083, -1078, -1049, -958, -953 and -951, which belonged to region BSP1, although no differences in the methylation pattern were observed between CRC cases and controls. Similarly, we did not find any evidence of methylation within BSP2, BSP3, BSP4 and BSP5 in germline DNA from ASE (N=16) and non-ASE (N=9) CRC patients (Table 4 (Figure 10)).

[00166] Examples of Uses

[00167] In one aspect, the present invention provides methods for assessing the genetic predisposition of a subject to develop colorectal cancer and potentially other cancers. The detection method is based upon the differential expression of alleles (or presence of their underlying haplotypes) of TGFβR1 in normal somatic cells (such as white blood cells). The differential allelic expression was termed allele specific expression (ASE). The ASE of TGFβR1 is indicative of a higher risk to develop CRC and potentially other cancers.

[00168] It was discovered that some TGFβR1 alleles (or their underlying haplotypes) are expressed at lower levels in people with CRC and that this lowered expression is heritable. These risk alleles (or their underlying haplotypes) are more common in CRC patients than in normal controls.

[00169] ASE in a sample of cells from a subject may be used to predict relative risk to develop CRC and potentially other cancers, and ultimately the prognosis, for that subject. Thus, as a biomarker, ASE of TGFβR1 provides information on the likelihood that a subject will develop CRC and potentially other cancers.

[00170] Measuring Expression Of TGFβR1

[00171] One embodiment includes measuring ASE in a sample of cells from an unaffected subject. The ASE values may then be used to generate a risk score that is predictive of predisposition to cancer.

[00172] The ASE may be measured by a variety of techniques that are well known in the art. Quantifying the total or allelic levels of the messenger RNA (mRNA) of TGFβR1 may be used to define the level of total mRNA or the level of ASE. Alternatively, quantifying the levels of the protein product of TGFβR1 may be used to measure the expression of the biomarker. Additional information regarding the methods discussed below may be found in Ausubel et al. (2003) Current Protocols in Molecular Biology, John Wiley & Sons, New York, NY, or Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY. One skilled in the art will know which parameters may be manipulated to optimize detection of the mRNA or protein of interest.

[00173] ASE ratio calculations may be performed as described (S3). Briefly, the ratio of the two alleles in the cDNA of the transcript was normalized with the ratio of the two alleles in genomic DNA, applying the following formula: cDNA (peak area common allele / peak area rare allele) divided by gDNA (peak area common allele / peak area rare allele). Each SNP was assayed with two independent cDNA preparations, thus the SNaPshot ASE variation value for each individual is given as the average of several different analyses when using various SNPs. The arbitrary cut-off points defined (>1.5 or <0.67) were based on the optimum value of Youden's index (Table 2 (Figure 8)).

[00174] A nucleic acid microarray may also be used to quantify the differential expression of the biomarker. Microarray analysis may be performed using commercially available equipment, following manufacturer's protocols, such as by using the Affymetrix GeneChip® technology (Santa Clara, CA) or the Microarray System from Incyte (Fremont, CA). Typically, single-stranded nucleic acids (e.g., cDNAs or oligonucleotides) are plated, or arrayed, on a microchip substrate. The arrayed sequences are then hybridized with specific nucleic acid probes from the cells of interest. Fluorescently labeled cDNA probes may be generated through incorporation of fluorescently labeled deoxynucleotides by reverse transcription of RNA extracted from the cells of interest. Alternatively, the RNA may be amplified by in vitro transcription and labeled with a marker, such as biotin. The labeled probes are then hybridized to the immobilized nucleic acids on the microchip under highly stringent conditions. After stringent washing to remove the non-specifically bound probes, the chip is scanned by confocal laser microscopy or by another detection method, such as a CCD camera. The raw fluorescence intensity data in the hybridization files are generally preprocessed with the robust multichip average (RMA) algorithm to generate expression values.

[00175] Quantitative real-time PCR (QRT-PCR) may also be used to measure the differential expression of the biomarker. In QRT-PCR, the RNA template is generally reverse transcribed into cDNA, which is then amplified via a PCR reaction. The amount of PCR product is followed cycle-by-cycle in real time, which allows for determination of the initial concentrations of mRNA. To measure the amount of PCR product, the reaction may be performed in the presence of a fluorescent dye, such as SYBR Green, which binds to double-stranded DNA. The reaction may also be performed with a fluorescent reporter probe that is specific for the DNA being amplified. A non-limiting example of a fluorescent reporter probe is a TaqMan® probe (Applied Biosystems, Foster City, CA). The fluorescent reporter probe fluoresces when the quencher is removed during the PCR extension cycle. Multiplex QRT-PCR may be performed by using multiple gene-specific reporter probes, each of which contains a different fluorophore. Fluorescence values are recorded during each cycle and represent the amount of product amplified to that point in the amplification reaction. To minimize errors and reduce any sample-to-sample variation, QRT-PCR is typically performed using a reference standard. The ideal reference standard is expressed at a constant level among different tissues, and is unaffected by the experimental treatment. Suitable reference standards include, but are not limited to, mRNAs for the housekeeping genes glyceraldehyde-3-phosphate-dehydrogenase (GAPDH) and beta-actin. The level of mRNA in the original sample or the fold change in expression of each biomarker may be determined using calculations well known in the art.
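One commonly used calculation for the fold change relative to a reference standard is the comparative threshold-cycle (2^-ΔΔCt) method. The source does not specify which calculation is intended, so the sketch below is offered only as one possibility; it assumes roughly 100% amplification efficiency for both target and reference, and the function name and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Each Ct is the cycle at which fluorescence crosses the threshold.
    The reference gene (e.g., GAPDH or beta-actin) normalizes for input amount.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical Ct values: the target crosses the threshold one cycle later
# in the sample than in the control, after normalization to the reference.
relative_expression = fold_change_ddct(25.0, 20.0, 24.0, 20.0)
```

A result below 1 indicates lower expression in the sample than in the control; in this hypothetical case the one-cycle delay corresponds to half the expression.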

[00176] Immunohistochemical staining may also be used to measure the differential expression of the biomarker. This method enables the localization of a protein in the cells of a tissue section by interaction of the protein with a specific antibody. For this, the tissue may be fixed in formaldehyde or another suitable fixative, embedded in wax or plastic, and cut into thin sections (from about 0.1 mm to several mm thick) using a microtome. Alternatively, the tissue may be frozen and cut into thin sections using a cryostat. The sections of tissue may be arrayed onto and affixed to a solid surface (i.e., a tissue microarray). The sections of tissue are incubated with a primary antibody against the antigen of interest, followed by washes to remove the unbound antibodies. The primary antibody may be coupled to a detection system, or the primary antibody may be detected with a secondary antibody that is coupled to a detection system. The detection system may be a fluorophore or it may be an enzyme, such as horseradish peroxidase or alkaline phosphatase, which can convert a substrate into a colorimetric, fluorescent, or chemiluminescent product. The stained tissue sections are generally scanned under a microscope. Because a sample of tissue from a subject with cancer may be heterogeneous, i.e., some cells may be normal and other cells may be cancerous, the percentage of positively stained cells in the tissue may be determined. This measurement, along with a quantification of the intensity of staining, may be used to generate an expression value for the biomarker.

[00177] An enzyme-linked immunosorbent assay, or ELISA, may be used to measure the differential expression of the biomarker. There are many variations of the ELISA. All are based on the immobilization of an antigen or antibody on a solid surface, generally a microtiter plate. The original ELISA method comprises preparing a sample containing the biomarker proteins of interest, coating the wells of a microtiter plate with the sample, incubating each well with a primary antibody that recognizes a specific antigen, washing away the unbound antibody, and then detecting the antibody-antigen complexes. The antibody-antigen complexes may be detected directly. For this, the primary antibodies are conjugated to a detection system, such as an enzyme that produces a detectable product. The antibody-antigen complexes may also be detected indirectly. For this, the primary antibody is detected by a secondary antibody that is conjugated to a detection system, as described above. The microtiter plate is then scanned and the raw intensity data may be converted into expression values using means known in the art.

[00178] An antibody microarray may also be used to measure the differential expression of the biomarker. For this, a plurality of antibodies is arrayed and covalently attached to the surface of the microarray or biochip. A protein extract containing the biomarker proteins of interest is generally labeled with a fluorescent dye. The labeled biomarker proteins are incubated with the antibody microarray. After washes to remove the unbound proteins, the microarray is scanned. The raw fluorescent intensity data may be converted into expression values using means known in the art.

[00179] Luminex multiplexing microspheres may also be used to measure the differential expression of the biomarker. These microscopic polystyrene beads are internally color-coded with fluorescent dyes, such that each bead has a unique spectral signature (of which there are up to 100). Beads with the same signature are tagged with a specific oligonucleotide or specific antibody that will bind the target of interest (i.e., biomarker mRNA or protein, respectively). The target, in turn, is also tagged with a fluorescent reporter. Hence, there are two sources of color, one from the bead and the other from the reporter molecule on the target. The beads are then incubated with the sample containing the targets, of which up to 100 may be detected in one well. The small size/surface area of the beads and the three-dimensional exposure of the beads to the targets allow for nearly solution-phase kinetics during the binding reaction. The captured targets are detected by high-tech fluidics based upon flow cytometry in which lasers excite the internal dyes that identify each bead and also any reporter dye captured during the assay. The data from the acquisition files may be converted into expression values using means known in the art.

[00180] In situ hybridization may also be used to measure the differential expression of the biomarker. This method permits the localization of mRNAs of interest in the cells of a tissue section. For this method, the tissue may be frozen, or fixed and embedded, and then cut into thin sections, which are arrayed and affixed on a solid surface. The tissue sections are incubated with a labeled antisense probe that will hybridize with an mRNA of interest. The hybridization and washing steps are generally performed under highly stringent conditions. The probe may be labeled with a fluorophore or a small tag (such as biotin or digoxigenin) that may be detected by another protein or antibody, such that the labeled hybrid may be detected and visualized under a microscope. Multiple mRNAs may be detected simultaneously, provided each antisense probe has a distinguishable label. The hybridized tissue array is generally scanned under a microscope. Because a sample of tissue from a subject with cancer may be heterogeneous, i.e., some cells may be normal and other cells may be cancerous, the percentage of positively stained cells in the tissue may be determined. This measurement, along with a quantification of the intensity of staining, may be used to generate an expression value for each biomarker.

[00181] Obtaining A Sample Of Cells

[00182] The expression of risk-alleles or the presence of underlying haplotypes will be measured in a sample of cells from a human subject.

[00183] ASE of TGFβR1 or the presence of underlying haplotypes is measured by SNaPshot technology in RNA or DNA isolated from a peripheral blood sample of an individual; no biopsy material or cancer cells are necessary.

[00184] It is not necessary that the human subject already has cancer for the ASE analysis or the genotyping for risk-alleles or the presence of underlying haplotypes to be informative.

[00185] Once a sample of cells is removed from the subject, it may be processed for the isolation of RNA or protein using techniques well known in the art and disclosed in standard molecular biology reference books, such as Ausubel et al., (2003) Current Protocols in Molecular Biology, John Wiley & Sons, New York, NY. A sample may also be stored or flash frozen and stored at -80 °C for later use. The biopsied cell sample may also be fixed with a fixative, such as formaldehyde, paraformaldehyde, or acetic acid/ethanol. The fixed sample may be embedded in wax (paraffin) or a plastic resin. The embedded sample (or frozen sample) may be cut into thin sections. RNA or protein may also be extracted from a fixed or wax-embedded sample.

[00186] The subject will generally be a mammalian subject. Mammals may include primates, livestock animals, and companion animals. As non-limiting examples, primates may include humans, apes, monkeys, and gibbons; livestock animals may include horses, cows, goats, sheep, deer, and pigs; and companion animals may include dogs, cats, rabbits, and rodents (including mice, rats, and guinea pigs). In an exemplary embodiment, the subject is a human.

[00187] Statistical analysis

[00188] Comparisons of proportions between groups were tested by Chi-square test or by Fisher's exact test where expected cell count was < 5, using R version 2.5. Non-parametric Wilcoxon rank sum test was used to compare ASE values between cases and controls. Moreover, means of the two groups were compared using a permutation test with 100,000 permutations. All tests were two-sided and a P-value of less than 0.05 was considered significant.
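As a non-limiting sketch of the permutation comparison of group means, the group labels may be shuffled repeatedly and the fraction of shuffles yielding a mean difference at least as extreme as the observed one taken as a two-sided P-value. The function name, the random seed, the add-one correction, and the ASE values shown are assumptions made for illustration; the analysis described above was performed in R.

```python
import random


def permutation_test_means(group_a, group_b, n_permutations=100_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Returns an estimated P-value using the (k + 1) / (n + 1) correction so
    that the estimate is never exactly zero. Names and defaults are
    illustrative assumptions.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = abs(mean_diff(pooled))
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign case/control labels at random
        if abs(mean_diff(pooled)) >= observed - 1e-12:
            extreme += 1
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical ASE values for cases and controls.
cases = [1.62, 1.10, 1.75, 1.31, 1.90, 1.05]
controls = [1.02, 1.08, 1.15, 0.97, 1.21, 1.04]
p_value = permutation_test_means(cases, controls, n_permutations=10_000)
print(f"two-sided permutation P-value: {p_value:.4f}")
```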

[00189] The reconstruction of haplotypes from the genotype and known haplotype data was performed with the PHASE V2.1.1 program (37,38). For this purpose, only SNPs with a minimum minor allele frequency of 0.05 and a maximum of 10% missing values were chosen. The permutation-based test available through the PHASE program was used to test for significant differences in haplotype frequency distributions between cases and controls, with 1000 permutations. This tests the null hypothesis that the case and control haplotypes are a random sample from a common set of haplotype frequencies, versus the alternative that cases are more similar to each other than to controls. Frequencies of inferred haplotypes were compared between cases and controls by testing each haplotype against all the others grouped together, using Fisher's exact test. Haplotype inference and LOD-score calculation in families were performed with MERLIN (39).
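The comparison of each haplotype against all the others grouped together can be sketched as a series of 2x2 Fisher's exact tests. The implementation below is a minimal standard-library sketch that sums the hypergeometric probabilities of all tables no more likely than the observed one; the function names, haplotype labels, and counts are hypothetical and are not data from the present study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed table.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def haplotype_vs_rest(case_counts, control_counts):
    """Test each haplotype against all other haplotypes pooled together."""
    n_cases = sum(case_counts.values())
    n_controls = sum(control_counts.values())
    results = {}
    for hap in sorted(set(case_counts) | set(control_counts)):
        a = case_counts.get(hap, 0)
        c = control_counts.get(hap, 0)
        results[hap] = fisher_exact_two_sided(a, n_cases - a, c, n_controls - c)
    return results


# Hypothetical chromosome counts per inferred haplotype.
case_counts = {"H1": 30, "H2": 12, "H3": 8}
control_counts = {"H1": 38, "H2": 10, "H3": 2}
for hap, p in haplotype_vs_rest(case_counts, control_counts).items():
    print(hap, round(p, 4))
```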

[00190] Receiver operating characteristic (ROC) analysis was performed by estimating sensitivity and specificity for varied ASE cut-off points. Youden's index (40) was calculated for several cut-off values (Table 2 (Figure 8)).

[00191] Odds ratios (OR) and 95% confidence intervals were estimated using unconditional maximum likelihood estimation (Wald) method and with normal approximation. Estimated population attributable risks (PAR) were obtained from a case-control study by using the method described by Armitage et al. (40).
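For illustration only, Youden's index at a given cut-off is sensitivity plus specificity minus one, and a Wald-type 95% confidence interval for an odds ratio is obtained on the log scale from the standard error sqrt(1/a + 1/b + 1/c + 1/d). The sketch below assumes that an ASE value at or above the cut-off is scored as positive; that convention, the function names, and all numbers shown are hypothetical.

```python
import math


def youden_index(case_values, control_values, cutoff):
    """Youden's J = sensitivity + specificity - 1 at one cut-off.

    Assumes values >= cutoff are called positive (an illustrative choice).
    """
    sensitivity = sum(v >= cutoff for v in case_values) / len(case_values)
    specificity = sum(v < cutoff for v in control_values) / len(control_values)
    return sensitivity + specificity - 1.0


def odds_ratio_wald(a, b, c, d, z=1.959964):
    """Odds ratio with a Wald (normal-approximation) confidence interval.

    a, b: cases with / without the risk factor.
    c, d: controls with / without the risk factor.
    All four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical ASE values and 2x2 counts.
cases = [1.6, 1.8, 1.2, 2.0]
controls = [1.0, 1.1, 1.3, 1.6]
for cutoff in (1.25, 1.5, 1.75):
    print(cutoff, round(youden_index(cases, controls, cutoff), 3))

or_, lo, hi = odds_ratio_wald(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

The cut-off with the largest Youden's index gives the best combined sensitivity and specificity among the values examined.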

[00192] Kit for predicting the risk for developing CRC

[00193] A further aspect of the invention provides kits for predicting the risk for developing CRC. A kit comprises a plurality of agents for measuring differential expression of alleles (or presence of their underlying haplotypes) of TGFβR1, means for converting the expression data into ASE values or classifying genotypes into haplotypes, and means for generating risk scores that indicate relative risk to develop CRC and potentially other cancers. The agents in the kit for measuring ASE or generating genotypes may comprise a plurality of PCR probes and primers.

[00194] The invention is also directed to kits for detecting CRC in an individual comprising one or more reagents for detecting 1) one or more microRNAs; 2) one or more target genes of one or more microRNAs; 3) one or more polypeptides expressed by the target genes; or 4) a combination thereof. For example, the kit can comprise hybridization probes, restriction enzymes (e.g., for RFLP analysis), allele-specific oligonucleotides, and antibodies that bind to the polypeptide expressed by the target gene.

[00195] In a particular embodiment, the kit comprises at least one contiguous nucleotide sequence that is substantially or completely complementary to a region of one or more of the microRNAs. In one embodiment, one or more reagents in the kit are labeled, and thus, the kits can further comprise agents capable of detecting the label.

[00196] In another aspect, there is provided herein a method of prevention of colorectal cancer (CRC) morbidity and mortality in the population, comprising administering a diagnostic screening to individuals in the population, and if an individual has at least one risk factor selected from the group: an at-risk haplotype for CRC; an at-risk haplotype in the TGFβR1 gene; an at-risk polymorphism in TGFβR1; dysregulation of TGFβR1 mRNA expression; dysregulation of a TGFβR1 mRNA isoform; or decreased TGFβR1 protein expression, the individual then can undergo routine colonoscopy and potentially therapy to prevent CRC from developing or spreading, thereby lowering CRC morbidity and mortality.

[00197] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described herein. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

[00198] While the invention has been described with reference to various and preferred embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the essential scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof.

[00199] Therefore, it is intended that the invention not be limited to the particular embodiment disclosed herein contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the claims.

[00200] REFERENCES

[00201] The publications and other materials used herein to illuminate the invention or provide additional details respecting the practice of the invention are incorporated by reference herein, and for convenience are provided in the following bibliography.

[00202] Citation of any of the documents recited herein is not intended as an admission that any of the foregoing is pertinent prior art. All statements as to the date or representations as to the contents of these documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of these documents.

1. D. M. Parkin, F. Bray, J. Ferlay, P. Pisani, CA Cancer J Clin 55, 74 (2005).

2. A. de la Chapelle, Nat Rev Cancer 4, 769 (2004).

3. N. M. Lindor et al., JAMA 293, 1979 (2005).

4. P. M. Siegel, J. Massague, Nat Rev Cancer 3, 807 (2003).

5. S. Markowitz et al., Science 268, 1336 (1995).

6. B. Pasche et al., Cancer Res 58, 2727 (1998).

7. Y. Xu, B. Pasche, Hum Mol Genet 16 Spec No 1, R14 (2007).

8. H. Yan et al., Nat Genet 30, 25 (2002).

9. Raval et al., Cell 129, 879 (2007).

10. Materials and methods are available as supporting material on Science Online

11. J. Massague, R. R. Gomis, FEBS Lett 580, 2811 (2006).

12. J. Massague, Mol Cell 29, 149 (2008).

13. X. Liu et al., Proc Natl Acad Sci U S A 94, 10669 (1997).

14. Y. Zhu, J. A. Richardson, L. F. Parada, J. M. Graff, Cell 94, 703 (1998).

15. N. M. Sodir et al., Cancer Res 66, 8430 (2006).

16. G. Sekimoto et al., Cancer Res 67, 5090 (2007).

17. B. Pasche et al., J Clin Oncol 22, 756 (2004).

18. B. Pasche et al., Cancer Res 59, 5678 (1999).

19. B. Pasche et al., JAMA 294, 1634 (2005).

20. Y. Bian et al., J Clin Oncol 23, 3074 (2005).

21. J. Skoglund et al., Clin Cancer Res 13, 3748 (2007).

22. B. Bierie, H. L. Moses, Cytokine Growth Factor Rev 17, 29 (2006).

23. J. Massague, S. W. Blain, R. S. Lo, Cell 103, 295 (2000).

24. T. Sjoblom et al., Science 314, 268 (2006).

25. H. Yan, W. Zhou, Curr Opin Oncol 16, 39 (2004).

26. G. L. Wiesner et al., Proc Natl Acad Sci U S A 100, 12961 (2003).

27. Z. E. Kemp et al., Cancer Res 66, 5003 (2006).

28. J. Skoglund et al., J Med Genet 43, e7 (2006).

29. M. Morley et al., Nature 430, 743 (2004).

30. H. H. Goring et al., Nat Genet 39, 1208 (2007).

31. B. E. Stranger et al., Nat Genet 39, 1217 (2007).

32. H. Hampel et al., N Engl J Med 352, 1851 (2005).

33. H. Hampel et al., Cancer Res 67, 9603 (2007).

34. H. He et al., Thyroid 15, 660 (2005).

35. H. Yan et al., Nature 403, 723 (2000).

36. Q. Zeng et al., Cancer Cell 8, 13 (2005).

37. M. Stephens, N. J. Smith, P. Donnelly, Am J Hum Genet 68, 978 (2001).

38. M. Stephens, P. Donnelly, Am J Hum Genet 73, 1162 (2003).

39. G. R. Abecasis, S. S. Cherny, W. O. Cookson, L. R. Cardon, Nat Genet 30, 97 (2002).

40. P. Armitage, G. Berry, J. N. S. Matthews, Statistical Methods in Medical Research (Blackwell Science, ed. 4th, 2002), pp. 682-697.