

Title:
IMPRINTED GENES AND DISEASE
Document Type and Number:
WIPO Patent Application WO/2008/070144
Kind Code:
A2
Abstract:
Methods for identifying imprinted genes. In some embodiments, the methods comprise (a) providing a first data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known to be imprinted in the subject; (b) providing a second data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known not to be imprinted in the subject; (c) identifying one or more features that by themselves or in combination are differentially present or absent from the first data set as compared to the second data set; and (d) applying the one or more features to a test data set comprising a plurality of genomic DNA sequences which correspond to one or more genes for which the imprinting status is unknown to thereby identify an imprinted gene in a subject. The presently disclosed subject matter also provides methods for identifying a feature in a subject with respect to an imprinted gene and methods for detecting a presence of or a susceptibility to a medical condition associated with parent-of-origin dependent monoallelic expression in a subject.

Inventors:
JIRTLE RANDY L (US)
HARTEMINK ALEXANDER J (US)
LUEDI PHILIPPE P (CH)
Application Number:
PCT/US2007/024973
Publication Date:
June 12, 2008
Filing Date:
December 06, 2007
Assignee:
UNIV DUKE (US)
JIRTLE RANDY L (US)
HARTEMINK ALEXANDER J (US)
LUEDI PHILIPPE P (CH)
International Classes:
C12Q1/68; G16B30/00
Other References:
ESTELLO ET AL.: 'Cancer as an epigenetic disease: DNA methylation and chromatin alterations in human tumours.' JOURNAL OF PATHOLOGY. vol. 196, 2002, pages 1 - 7
KIM ET AL.: 'Altered expression of KCNK9 in colorectal cancers.' APMIS. vol. 112, 2004, pages 588 - 594
Attorney, Agent or Firm:
TAYLOR JR., Arles, A. (Wilson Taylor & Hunt, P.A., Suite 1200, University Tower, 3100 Tower Boulevard, Durham NC, US)
Claims:

CLAIMS

What is claimed is:

1. A method for identifying an imprinted gene in a subject, the method comprising:

(a) providing a first data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known to be imprinted in the subject;

(b) providing a second data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known not to be imprinted in the subject;

(c) identifying one or more features that by themselves or in combination are differentially present or absent from the first data set as compared to the second data set; and

(d) applying the one or more features to a test data set comprising a plurality of genomic DNA sequences which correspond to one or more genes for which the imprinting status is unknown to thereby identify an imprinted gene in a subject.

2. The method of claim 1, wherein the subject is a human.

3. The method of claim 1, wherein the genomic DNA sequences include untranslated sequences of at least 1 kilobase, 2 kilobases, 5 kilobases, 10 kilobases, 25 kilobases, 50 kilobases, 100 kilobases, or greater than 100 kilobases for one or more of the plurality of genes known to be imprinted in the subject, one or more of the plurality of genes known not to be imprinted in the subject, and combinations thereof.

4. The method of claim 3, wherein the genomic DNA sequences comprise 5' untranslated sequences, 3' untranslated sequences, or both 5' and 3' untranslated sequences.

5. The method of claim 1, wherein the features are selected from those set forth in Table 4.

6. The method of claim 1, wherein the identifying comprises training an algorithm using the first data set as a first training data set and the second data set as a second training data set to thereby identify one or more features in the first and second data sets that are predictive of imprinting status.

7. A method for identifying a feature in a subject with respect to an imprinted gene, the method comprising:

(a) obtaining a biological sample from the subject, wherein the biological sample comprises one or more nucleic acid molecules derived from one or more of the genes listed in Table 1; and

(b) analyzing the one or more nucleic acid molecules, whereby a feature is identified in the subject with respect to the imprinted gene.

8. The method of claim 7, wherein the feature is selected from the group consisting of a genetic feature, an epigenomic feature, and combinations thereof.

9. The method of claim 8, wherein the genetic feature comprises a genotype of the subject with respect to at least one gene listed in Table 1.

10. The method of claim 8, wherein the epigenomic feature is selected from the group consisting of a DNA sequence modification (such as methylation), a nucleosome positioning feature, a chromatin state, and a histone modification (such as methylation or acetylation or similar).

11. The method of claim 7, wherein the biological sample comprises genomic DNA from the subject.

12. The method of claim 7, wherein the analyzing comprises sequencing at least a portion of the one or more nucleic acid molecules derived from one or more of the genes listed in Table 1.

13. The method of claim 12, wherein the subject is heterozygous for one or more polymorphisms located in the portion of the one or more nucleic acid molecules derived from one or more of the genes listed in Table 1, and the sequencing identifies the one or more polymorphisms.

14. The method of claim 7, wherein the method further comprises screening a biological sample from one or both biological parents of the subject to identify which parent transmitted each allele to the subject.

15. The method of claim 14, further comprising predicting whether or not one or more of the alleles is likely to be expressed in the subject.

16. The method of claim 15, wherein the predicting comprises correlating maternal or paternal inheritance of the one or more alleles with an assessment of whether the one or more alleles is expressed when inherited maternally or paternally.

17. A method for detecting a presence of or a susceptibility to a medical condition associated with parent-of-origin dependent monoallelic expression in a subject, the method comprising:

(a) obtaining a biological sample from the subject, wherein the biological sample comprises one or more nucleic acid molecules;

(b) analyzing the one or more nucleic acid molecules for a feature with respect to parent-of-origin for one or both alleles of at least one imprinted gene; and

(c) determining whether the feature correlates with a presence of or a susceptibility to a medical condition associated with monoallelic expression, whereby a presence of or a susceptibility to a medical condition associated with parent-of-origin dependent monoallelic expression in the subject is detected.

18. The method of claim 17, wherein the feature is selected from the group consisting of a genetic feature, an epigenomic feature, and combinations thereof.

19. The method of claim 18, wherein the genetic feature comprises a genotype of the subject with respect to at least one gene listed in Table 1.

20. The method of claim 18, wherein the epigenomic feature is selected from the group consisting of a DNA sequence methylation state, a nucleosome positioning feature, and a histone modification.

21. The method of claim 17, wherein the feature relates to a gene listed in Table 1 the expression or lack of expression of which is associated with a medical condition.

22. The method of claim 17, wherein the medical condition is selected from the group consisting of alcoholism, Alzheimer's disease, asthma/atopy, autism, bipolar disorder, obesity, diabetes, Parental Uniparental Disomy (UPD), cancer, epilepsy, DiGeorge syndrome, and schizophrenia.

23. The method of claim 17, wherein the at least one imprinted gene is selected from DLGAP2 and KCNK9.

Description:

DESCRIPTION IMPRINTED GENES AND DISEASE

CROSS REFERENCE TO RELATED APPLICATION

The presently disclosed subject matter claims the benefit of U.S. Provisional Patent Application Serial No. 60/873,151, filed December 6, 2006, the disclosure of which is incorporated herein by reference in its entirety.

GOVERNMENT INTEREST

This presently disclosed subject matter was made with U.S. Government support under Grant Nos. R01-ES008823 and R01-ES015165 awarded by the National Institutes of Health and Grant No. DE-FG02-05ER64101 from the Department of Energy. Thus, the U.S. Government has certain rights in the presently disclosed subject matter.

TECHNICAL FIELD

The presently disclosed subject matter relates to the field of imprinted genes. More particularly, the presently disclosed subject matter relates to methods and compositions for identifying imprinted genes, for genotyping subjects with respect to one or more imprinted genes, for diagnosing and/or determining a susceptibility of a subject to a disease process associated with expression or lack of expression of an imprinted gene, and for determining those subjects predicted to benefit from therapies that target the epigenome.

BACKGROUND

The untranslated mRNA H19 was the first gene shown to be imprinted in humans (Zhang & Tycko, 1992), and since its discovery in 1992, about 40 additional imprinted genes have been identified in the human genome (Morison et al., 2005). A gene is imprinted if one of its alleles is silenced or significantly reduced in expression depending on the parent from whom that allele was inherited (Reik & Walter, 2001). This functionally haploid state eliminates the protection that diploidy normally confers against the deleterious effects of recessive mutations. The expression of imprinted genes can also be deregulated epigenetically. Identifying genes that are imprinted in the human genome, and determining the factors responsible for epigenetic establishment and maintenance of imprinting control, remain as goals in the art.

Experimental identification of imprinted genes has typically focused on small genomic regions. These efforts are usually motivated by phenotypical observations, such as differences when a gene knock-out was inherited maternally versus paternally. The advent of cDNA microarrays to study differential expression between parthenogenetic and androgenetic embryos has allowed for a more high-throughput approach (Mizuno et al., 2002; Nikaido et al., 2003). Though this general technique has led to the discovery of three apparently imprinted genes (Mizuno et al., 2002), it has recently been criticized for failing to enrich for truly imprinted genes because of the inherent expression differences associated with the abnormal development of parthenogenotes (Morison et al., 2005; Ruf et al., 2006).

Computational analyses have demonstrated that the concentration of certain types of repeated elements and other sequence characteristics can differ between monoallelically and biallelically expressed genes (Greally, 2002; Ke et al., 2002; Allen et al., 2003), yet there are no unique sequence motifs known to be common to imprinted genes. A machine learning approach was recently used to predict novel imprinted genes across the entire mouse genome using a variety of sequence-derived statistics (Luedi et al., 2005).

However, comparative models between mouse and human are complicated by discrepancies in imprinting status. For example, while some genes are imprinted in both mouse and human, others, including Igf2r, Ascl2, Phemx, Cd81, Tssc4, Nap1l4, Gatm, Dcn, and Impact, are imprinted in mouse but not human (Morison et al., 2005; Monk et al., 2006). Conversely, the homeobox gene DLX5 is imprinted in human (Okita et al., 2003) but not mouse (Kimura et al., 2004), although a subtle maternal preference was reported in the mouse brain (Horike et al., 2005). This discordance makes the mouse an unreliable model for identifying imprinted genes in humans.

Therefore, there exists a long-felt need in the art for methods and compositions for identifying imprinted genes in humans and for correlating the same with disease processes.

To address this need at least in part, the presently disclosed subject matter provides methods and compositions for identifying imprinted genes. The genes so identified are useful for genotyping subjects to identify and/or detect disease processes that are associated with expression or lack of expression of an imprinted gene and/or for identifying a susceptibility of a subject to a disease process associated with expression or lack of expression of an imprinted gene, and for determining those subjects predicted to benefit from therapies that target the epigenome.

SUMMARY

This Summary lists several embodiments of the presently disclosed subject matter, and in many cases lists variations and permutations of these embodiments. This Summary is merely exemplary of the numerous and varied embodiments. Mention of one or more representative features of a given embodiment is likewise exemplary. Such an embodiment can typically exist with or without the feature(s) mentioned; likewise, those features can be applied to other embodiments of the presently disclosed subject matter, whether listed in this Summary or not. To avoid excessive repetition, this Summary does not list or suggest all possible combinations of such features.

The presently disclosed subject matter provides methods for identifying an imprinted gene in a subject. In some embodiments, the methods comprise (a) providing a first data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known to be imprinted in the subject; (b) providing a second data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known not to be imprinted in the subject; (c) identifying one or more features that by themselves or in combination are differentially present or absent from the first data set as compared to the second data set; and (d) applying the one or more features to a test data set comprising a plurality of genomic DNA sequences which correspond to one or more genes for which the imprinting status is unknown to thereby identify an imprinted gene in a subject. The genomic DNA sequences can include untranslated sequences of in some embodiments at least 1 kilobase, in some embodiments at least 2 kilobases, in some embodiments at least 5 kilobases, in some embodiments at least 10 kilobases, in some embodiments at least 25 kilobases, in some embodiments at least 50 kilobases, in some embodiments at least 100 kilobases, and in some embodiments greater than 100 kilobases for one or more of the plurality of genes known to be imprinted in the subject, one or more of the plurality of genes known not to be imprinted in the subject, and combinations thereof. In some embodiments, the genomic DNA sequences comprise 5' untranslated sequences, 3' untranslated sequences, or both 5' and 3' untranslated sequences. In some embodiments, the features are selected from those set forth in Table 4 hereinbelow. In some embodiments, the identifying comprises training an algorithm using the first data set as a first training data set and the second data set as a second training data set to thereby identify one or more features in the first and second data sets that are predictive of imprinting status.
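Steps (a) through (d) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the classifiers of the present disclosure: the toy sequences carry no biological meaning, the two features (GC fraction and CpG dinucleotide density) are hypothetical stand-ins for the features of Table 4, and a nearest-centroid rule is substituted for a trained classifier purely for brevity.

```python
from statistics import mean


def features(seq):
    """Hypothetical feature vector: (GC fraction, CpG dinucleotides per base)."""
    seq = seq.upper()
    n = len(seq)
    gc = (seq.count("G") + seq.count("C")) / n
    cpg = seq.count("CG") / n
    return (gc, cpg)


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return tuple(mean(column) for column in zip(*vectors))


def train(imprinted_seqs, non_imprinted_seqs):
    """Steps (a)-(c): summarize each training set by its feature centroid."""
    return {
        "imprinted": centroid([features(s) for s in imprinted_seqs]),
        "not_imprinted": centroid([features(s) for s in non_imprinted_seqs]),
    }


def predict(model, seq):
    """Step (d): label a test sequence by the nearest class centroid."""
    x = features(seq)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(x, model[label]))

    return min(model, key=squared_distance)


# Toy training data (assumption: invented strings, not real genomic sequence).
IMPRINTED = ["CGCGCGATCG", "GCGCGCGCAT"]
NOT_IMPRINTED = ["ATATATTTAA", "TTTAAATATC"]

model = train(IMPRINTED, NOT_IMPRINTED)
print(predict(model, "CGCGGCGCTA"))
print(predict(model, "AATTATATAT"))
```

In practice the feature vector would be far larger and the decision rule would be learned rather than fixed; the sketch only shows how two labeled training sets yield a rule that is then applied to sequences of unknown imprinting status.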

The presently disclosed subject matter also provides methods for identifying a feature in a subject with respect to an imprinted gene. In some embodiments, the methods comprise (a) obtaining a biological sample from the subject, wherein the biological sample comprises one or more nucleic acid molecules derived from one or more of the genes present within the genome of the subject (including, but not limited to those genes listed in Tables 1 and/or 7 hereinbelow); and (b) analyzing the one or more nucleic acid molecules, whereby a feature is identified in the subject with respect to the imprinted gene. In some embodiments, the feature is selected from the group consisting of a genetic feature, an epigenomic feature, and combinations thereof. In some embodiments, the genetic feature comprises a genotype of the subject with respect to at least one gene (e.g., one of the genes listed in Tables 1 and/or 7 hereinbelow). In some embodiments, the epigenomic feature is selected from the group consisting of a DNA sequence modification (e.g., methylation), a nucleosome positioning feature, a chromatin state, and a histone modification (e.g., methylation, acetylation, etc.). In some embodiments, the biological sample comprises genomic DNA from the subject. In some embodiments, the analyzing comprises sequencing at least a portion of the one or more nucleic acid molecules derived from one or more of the genes present within the genome of the subject (e.g., one or more of the genes listed in Tables 1 and/or 7 hereinbelow). In some embodiments, the subject is heterozygous for one or more polymorphisms located in the portion of the one or more nucleic acid molecules derived from one or more of the genes present within the genome of the subject (including, but not limited to the genes listed in Tables 1 and/or 7 hereinbelow), and the sequencing identifies the one or more polymorphisms.

In some embodiments, the methods further comprise screening a biological sample from one or both biological parents of the subject to identify which parent transmitted each allele to the subject. In some embodiments, the methods further comprise predicting whether or not one or more of the alleles is likely to be expressed in the subject. In some embodiments, the predicting comprises correlating maternal or paternal inheritance of the one or more alleles with an assessment of whether the one or more alleles is expressed when inherited maternally or paternally.
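The parent-of-origin assignment and expression prediction described above can be illustrated with a short sketch. It assumes a single biallelic polymorphism and unphased genotypes written as two-character strings; the function names and the `EXPRESSED_FROM` lookup are hypothetical, with the two entries taken from the conclusions stated for Figures 4A and 4B.

```python
# Parental allele reported as expressed (per the descriptions of Figures 4A and 4B).
EXPRESSED_FROM = {"DLGAP2": "paternal", "KCNK9": "maternal"}


def assign_parent_of_origin(child, mother, father=None):
    """Return {'maternal': allele, 'paternal': allele}, or None if ambiguous.

    Genotypes are unphased two-character strings such as "GA".
    The father's genotype is optional, since only one parent may be screened.
    """
    first, second = child
    consistent = set()
    for maternal, paternal in ((first, second), (second, first)):
        mother_ok = maternal in mother
        father_ok = father is None or paternal in father
        if mother_ok and father_ok:
            consistent.add((maternal, paternal))
    if len(consistent) != 1:
        return None  # ambiguous transmission or Mendelian inconsistency
    maternal, paternal = consistent.pop()
    return {"maternal": maternal, "paternal": paternal}


def predict_expressed_allele(gene, origin):
    """Return the allele predicted to be expressed, or None if unknown."""
    if origin is None or gene not in EXPRESSED_FROM:
        return None
    return origin[EXPRESSED_FROM[gene]]


origin = assign_parent_of_origin(child="GA", mother="AA")
print(origin)
print(predict_expressed_allele("DLGAP2", origin))
```

When both the subject and the screened parent are heterozygous for the same alleles, transmission cannot be resolved from that parent alone, which is why the sketch returns `None` rather than guessing.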

The presently disclosed subject matter also provides methods for detecting a presence of or a susceptibility to a medical condition associated with parent-of-origin dependent monoallelic expression in a subject. In some embodiments, the methods comprise (a) obtaining a biological sample from the subject, wherein the biological sample comprises one or more nucleic acid molecules; (b) analyzing the one or more nucleic acid molecules for a feature with respect to parent-of-origin for one or both alleles of at least one imprinted gene; and (c) determining whether the feature correlates with a presence of or a susceptibility to a medical condition associated with monoallelic expression, whereby a presence of or a susceptibility to a medical condition associated with parent-of-origin dependent monoallelic expression in the subject is detected. In some embodiments, the feature is selected from the group consisting of a genetic feature, an epigenomic feature, and combinations thereof. In some embodiments, the genetic feature comprises a genotype of the subject with respect to at least one gene (e.g., a gene listed in Tables 1 and/or 7 hereinbelow). In some embodiments, the epigenomic feature is selected from the group consisting of a DNA sequence methylation state, a nucleosome positioning feature, and a histone modification. In some embodiments, the feature relates to a gene (e.g., a gene listed in Tables 1 and/or 7) the expression or lack of expression of which is associated with a medical condition. In some embodiments, the medical condition is selected from the group consisting of alcoholism, Alzheimer's disease, asthma/atopy, autism, bipolar disorder, obesity, diabetes, Parental Uniparental Disomy (UPD), cancer, epilepsy, DiGeorge syndrome, and schizophrenia. In some embodiments, the at least one imprinted gene is selected from DLGAP2 and KCNK9.

In some embodiments of the presently disclosed methods, the subject is a mammal, and in some embodiments the subject is a human.

It is an object of the presently disclosed subject matter to provide a method for identifying imprinted genes.

An object of the presently disclosed subject matter having been stated hereinabove, and which is achieved in whole or in part by the presently disclosed subject matter, other objects will become evident as the description proceeds when taken in connection with the accompanying examples and drawings as best described hereinbelow.

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A-1C are schematic diagrams depicting the genome-wide distribution of genes proved (filled triangles) or predicted with high confidence (unfilled triangles) to be imprinted. Downward triangles, upward triangles, and circles indicate genes predicted to be maternally, paternally, or biallelically expressed, respectively. Gray bars highlight a 3 Mb region centered on the linkage regions presented in Table 7 hereinbelow.

Figures 2A-2E and 3A-3E present a series of bar graphs depicting distributions of the weights of features characteristic of imprinted genes, as determined by two feature selection methods, those of Equbits (Figures 2A-2E) and SMLR (Figures 3A-3E). Absolute weights are shown as box plots; the dotted line represents the overall mean of all selected features. Figures 2A and 3A are bar graphs depicting distributions of feature type. Figures 2B and 3B are bar graphs depicting distributions of different ways of quantifying repetitive elements. Ratios of ± counts carried the greatest weight (P < 6 x 10^-11). Figures 2C and 3C are bar graphs depicting distributions of different repetitive element locations. The 1 kb downstream window was of least importance (P < 1 x 10^-3). Figures 2D and 3D are bar graphs depicting distributions of different families of repetitive elements. Alus carried the lowest weight (P < 4 x 10^-3), whereas endogenous retroviruses were of greatest importance (P < 3 x 10^-3). Figures 2E and 3E are bar graphs depicting distributions of counts of the highest scoring transcription factor binding sites.

Figures 4A and 4B are plots depicting sequence comparisons of conceptus and maternal genomic DNA versus conceptus cDNA. In each plot, the arrow denotes the polymorphic nucleotide position.

Figure 4A depicts results showing a conceptus as polymorphic (G/A, GENBANK® Accession No. rs17829155, now merged with SNP ID rs2235112; SEQ ID NO: 1) in DLGAP2, whereas the mother (maternal decidua) is homozygous (A/A). Thus, DLGAP2 isoforms 24, 25, 26, and 27 are expressed monoallelically in the testis from the paternal allele.

Figure 4B depicts results showing a conceptus as polymorphic (C/T, GENBANK® Accession No. rs2615374; SEQ ID NO: 2) in KCNK9, whereas the mother (maternal decidua) is homozygous (C/C) at the polymorphic nucleotide position. Thus, KCNK9 is expressed monoallelically in the brain from the maternal allele.
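The reasoning behind Figures 4A and 4B, comparing conceptus and maternal genomic DNA with conceptus cDNA at one polymorphic position, can be written out as a small sketch. The function name and return labels are hypothetical, and the cDNA alleles used in the example calls are assumptions inferred from the stated conclusions rather than values given in the text.

```python
def call_expression(conceptus_gdna, maternal_gdna, cdna_alleles):
    """Classify expression at one polymorphic position.

    conceptus_gdna, maternal_gdna: unphased genotypes such as "GA" or "AA".
    cdna_alleles: iterable of alleles observed in conceptus cDNA.
    Returns "maternal", "paternal", "biallelic", or "uninformative".
    """
    conceptus = set(conceptus_gdna)
    maternal = set(maternal_gdna)
    if len(conceptus) != 2:
        return "uninformative"  # conceptus is not heterozygous
    if len(maternal) != 1 or not maternal <= conceptus:
        return "uninformative"  # mother heterozygous or inconsistent with conceptus
    expressed = set(cdna_alleles) & conceptus
    if expressed == conceptus:
        return "biallelic"
    if len(expressed) != 1:
        return "uninformative"  # no recognizable allele in the cDNA
    allele = next(iter(expressed))
    # The mother is homozygous, so her allele marks the maternal copy.
    return "maternal" if allele in maternal else "paternal"


# Genotypes as in Figures 4A and 4B; cDNA alleles are assumed for illustration.
print(call_expression("GA", "AA", {"G"}))  # DLGAP2-like case
print(call_expression("CT", "CC", {"C"}))  # KCNK9-like case
```

The call is informative only when the conceptus is heterozygous and the mother is homozygous, because that combination alone identifies which conceptus allele must have come from the father.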

Figure 5 is a flow chart illustrating schematically the processes of cross-validation, training, testing, and prediction under two different kernels and employing Equbits and SMLR classifier learning strategies.

BRIEF DESCRIPTION OF THE SEQUENCE LISTING

SEQ ID NO: 1 is a nucleic acid sequence of GENBANK® Accession No. rs17829155 (now merged with SNP ID rs2235112, which lists the SNP from the opposite strand as set forth herein), a polymorphism associated with the DLGAP2 locus.

SEQ ID NO: 2 is a nucleic acid sequence of GENBANK® Accession No. rs2615374, a polymorphism associated with the KCNK9 locus.

SEQ ID NOs: 3-13 are the nucleotide sequences of various primers that can be employed in the analysis of the DLGAP2 and KCNK9 loci and gene products thereof.

DETAILED DESCRIPTION

I. General Considerations

Imprinted genes can be essential in embryonic development, and imprinting dysregulation can contribute to human disease (Murphy & Jirtle, 2003). Disclosed herein are 156 human genes predicted to be imprinted by multiple classification algorithms using DNA sequence characteristics as features. Two of these genes have been verified experimentally to indeed be imprinted in humans. KCNK9, which is predominantly expressed in the brain, might be involved in bipolar disorder and epilepsy (Kananura et al., 2002), and is a known oncogene (Patel & Lazdunski, 2004), while DLGAP2 is a candidate bladder cancer tumor suppressor (Muscheck et al., 2000). The findings disclosed herein demonstrate that DNA sequence characteristics, including recombination hot spots, are sufficient to accurately predict the imprinting status of individual genes in the human genome. Moreover, mapping the imprinted gene candidates onto the chromosomal landscape defined by linkage analysis revealed many to be in loci that are linked to human health conditions as diverse as alcoholism, Alzheimer's, asthma, autism, bipolar disorder, cancer, diabetes, obesity, and schizophrenia.

Genes involved in human disease are commonly identified by disease-oriented experimental approaches. Disclosed herein is the discovery that potential susceptibility genes for a wide range of conditions can be identified by defining the subset of genes that are functionally haploid because of imprinting. Mapping these imprinted genes to disease susceptibility loci with parent-of-origin inheritance provides novel insights into how complex human diseases can arise from environmental alteration of the epigenome.

Thus, in some embodiments the presently disclosed subject matter provides a model to perform genome-wide predictions of imprinted genes directly in the human. These predictions are then employed to guide experimental identifications of new imprinted human genes.

II. Definitions

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the presently disclosed subject matter pertains. For clarity of the present specification, certain definitions are presented hereinbelow.

Following long-standing patent law convention, the articles "a", "an", and "the" refer to "one or more" when used in this application, including in the claims. For example, the phrase "a polymorphism" refers to one or more polymorphisms. Similarly, the phrase "at least one", when employed herein to refer to an oligonucleotide, a gene, or any other entity, refers to, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more of that entity. Thus, the phrase "at least one gene" used in the context of the genes and gene products disclosed herein refers to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, up to every gene disclosed herein, including every value in between.

As used herein, the phrase "biological sample" refers to a sample isolated from a subject (e.g., a biopsy) or from a cell or tissue from a subject (e.g., RNA and/or DNA isolated therefrom). Biological samples can be of any biological tissue or fluid or cells from any organism as well as cells cultured in vitro, such as cell lines and tissue culture cells. Frequently the sample will be a "clinical sample" which is a sample derived from a patient (i.e., a subject undergoing a diagnostic procedure and/or a treatment). Typical clinical samples include, but are not limited to, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, and cells therefrom. Biological samples can also include sections of tissues, such as frozen sections or formalin fixed sections taken for histological purposes. In some embodiments, a biological sample isolated from a subject comprises a number of cells to provide a sufficient amount of genomic DNA and/or RNA to practice one or more of the presently disclosed methods.

As used herein, the term "complementary" refers to two nucleotide sequences that comprise antiparallel nucleotide sequences capable of pairing with one another upon formation of hydrogen bonds between the complementary base residues in the antiparallel nucleotide sequences. As is known in the art, the nucleic acid sequences of two complementary strands are the reverse complement of each other when each is viewed in the 5' to 3' direction. Unless specifically indicated to the contrary, the term "complementary" as used herein refers to 100% complementarity throughout the length of at least one of the two antiparallel nucleotide sequences.
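The definition of "complementary" given above can be made concrete with a brief sketch. It assumes DNA sequences written 5' to 3' using only the letters A, C, G, and T; the function names are hypothetical. Consistent with the definition, the check requires 100% complementarity over the full length of the shorter of the two sequences.

```python
# Translation table mapping each DNA base to its pairing partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_fully_complementary(a, b):
    """True if the shorter sequence pairs perfectly, over its entire
    length, with some stretch of the longer sequence."""
    shorter, longer = sorted((a.upper(), b.upper()), key=len)
    return reverse_complement(shorter) in longer


print(reverse_complement("ATGC"))
print(is_fully_complementary("ATGC", "GCAT"))
```

Because both strands are written 5' to 3', the test reduces to asking whether the reverse complement of one sequence occurs as a substring of the other.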

As used herein, the phrase "derived from" refers to an entity that is present either in another entity and/or in some embodiments in the same entity but in a different context. In terms of biological samples and nucleic acids, the phrase "derived from" can be synonymous with "isolated from". However, especially in the case of a biological molecule, the phrase "derived from" can also refer to the fact that the biological molecule is present in a different context or form in one situation versus another. For example, in some embodiments, the presently disclosed methods employ nucleic acid molecules "derived from" a gene (e.g., a gene listed in any of the Tables disclosed herein). In this context, it is understood that a nucleic acid molecule is "derived from" a gene if the nucleic acid molecule can be generated naturally or artificially by employing genetic and/or epigenomic information that is associated with the gene in the subject. In some embodiments, a nucleic acid molecule is "derived from" a gene if it is encoded by the gene, is a transcription product of the gene, or otherwise is generated based on genetic or non-genetic information that is provided by the gene.

As used herein, the term "fragment" refers to a sequence that comprises a subset of another sequence. When used in the context of a nucleic acid or amino acid sequence, the terms "fragment" and "subsequence" are used interchangeably. A fragment of a nucleic acid sequence can be any number of nucleotides that is less than that found in another nucleic acid sequence, and thus includes, but is not limited to, the sequences of an exon or intron, a promoter, an imprint regulatory element, an enhancer, an origin of replication, a 5' or 3' untranslated region, a coding region, and/or a polypeptide binding domain. It is understood that a fragment or subsequence can also comprise less than the entirety of a nucleic acid sequence, for example, a portion of an exon or intron, promoter, enhancer, etc. Similarly, a fragment or subsequence of an amino acid sequence can be any number of residues that is less than that found in a naturally occurring polypeptide, and thus includes, but is not limited to, domains, features, repeats, etc. Also similarly, it is understood that a fragment or subsequence of an amino acid sequence need not comprise the entirety of the amino acid sequence of the domain, feature, repeat, etc.

As used herein, the term "gene" is used broadly to refer to any segment of DNA associated with a biological function. Thus, genes include, but are not limited to, coding sequences, the regulatory sequences required for their expression (e.g., 5' regulatory sequences, 3' regulatory sequences, and combinations thereof), intron sequences associated with the coding sequences, and combinations thereof. Genes can also include non-expressed DNA segments that, for example, form recognition sequences for a polypeptide.

Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and can include sequences designed to have desired parameters.

The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) of DNA and/or RNA. The phrase "bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence.

As used herein, the term "isolated", when used in the context of an isolated nucleic acid or an isolated polypeptide, is a nucleic acid or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid molecule or polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transformed host cell.

As used herein, the term "native" refers to a gene that is naturally present in the genome of an untransformed cell. Similarly, when used in the context of a polypeptide, a "native polypeptide" is a polypeptide that is encoded by a native gene of an untransformed cell's genome. Thus, the terms "native" and "endogenous" are synonymous.

As used herein, the term "naturally occurring" refers to an object that is found in nature as distinct from being artificially produced or manipulated by man. For example, a polypeptide or nucleotide sequence that is present in an organism (including a virus) in its natural state, which has not been intentionally modified or isolated by man in the laboratory, is naturally occurring. As such, a polypeptide or nucleotide sequence is considered "non-naturally occurring" if it is encoded by or present within a recombinant molecule, even if the amino acid or nucleic acid sequence is identical to an amino acid or nucleic acid sequence found in nature.

As used herein, the term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form.

Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994). The terms "nucleic acid" or "nucleic acid sequence" can also be used interchangeably with gene, cDNA, and mRNA encoded by a gene.

As used herein, the phrase "oligonucleotide" refers to a polymer of nucleotides of any length. In some embodiments, an oligonucleotide is a primer that is used in a polymerase chain reaction (PCR) and/or reverse transcription-polymerase chain reaction (RT-PCR), and the length of the oligonucleotide is typically between about 15 and 30 nucleotides. In some embodiments, the oligonucleotide is present on an array and is specific for a gene of interest. In whatever embodiment that an oligonucleotide is employed, one of ordinary skill in the art is capable of designing the oligonucleotide to be of sufficient length and sequence to be specific for the gene of interest (i.e., that would be expected to specifically bind only to a product of the gene of interest under a given hybridization condition).

As used herein, the phrase "percent identical," in the context of two nucleic acid or polypeptide sequences, refers to two or more sequences or subsequences that have in some embodiments 60%, in some embodiments 70%, in some embodiments 75%, in some embodiments 80%, in some embodiments 85%, in some embodiments 90%, in some embodiments 92%, in some embodiments 94%, in some embodiments 95%, in some embodiments 96%, in some embodiments 97%, in some embodiments 98%, in some embodiments 99%, and in some embodiments 100% nucleotide or amino acid residue identity, respectively, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. The percent identity exists in some embodiments over a region of the sequences that is at least about 50 residues in length, in some embodiments over a region of at least about 100 residues, and in some embodiments, the percent identity exists over at least about 150 residues. In some embodiments, the percent identity exists over the entire length of the sequences.

For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
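By way of non-limiting illustration, the percent identity of a test sequence relative to a reference sequence can be sketched as follows once the two sequences have been aligned. The function name `percent_identity`, the use of `-` as the gap character, and the choice of the number of aligned columns as the denominator are assumptions made for this sketch only; other conventions for the denominator exist.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Assumption for this sketch: the denominator is the number of aligned
    columns, excluding columns in which both sequences carry a gap.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are gap-vs-gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_ref.upper(), aligned_test.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    # A column is a match only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTACGA"))
    print(percent_identity("AC-T", "ACGT"))
```

In this sketch a gap opposite a residue counts as a non-identical column, so an alignment with many gaps yields a lower percent identity than the same matches in an ungapped alignment.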

Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm disclosed in Smith & Waterman, 1981; by the homology alignment algorithm disclosed in Needleman & Wunsch, 1970; by the search for similarity method disclosed in Pearson & Lipman, 1988; by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG® WISCONSIN PACKAGE®, available from Accelrys, Inc., San Diego, California, United States of America), or by visual inspection. See generally, Altschul et al., 1990; Ausubel et al., 2002; and Ausubel et al., 2003.

One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990. Software for performing BLAST analysis is publicly available through the website of the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. See generally, Altschul et al., 1990. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased.
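By way of non-limiting illustration, the extension of a word hit in one direction can be sketched as an ungapped walk that tracks the running score and its maximum. This is a simplified sketch and not the BLAST implementation: the function name `extend_right`, the assumption that the seed is an exact match, and the drop-off value `x_drop=20` are hypothetical choices for this example. The match reward of 5 and mismatch penalty of -4 correspond to the BLASTN default values of M and N recited herein.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the seed to the end of the best-scoring extension.
    x_drop is an illustrative value, not a documented default.
    """
    score = match * word_len          # seed assumed to match exactly
    best_score, best_len = score, word_len
    i, j = q_start + word_len, s_start + word_len
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        if score > best_score:
            best_score, best_len = score, i - q_start + 1
        # Halt when the score falls X below its maximum or reaches zero.
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, 4))
```

Extension to the left is symmetric, walking toward lower coordinates from the start of the seed, and the two directions can be combined to give the full ungapped segment.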

Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M = 5, N = -4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See Henikoff & Henikoff, 1992.

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see e.g., Karlin & Altschul, 1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is in some embodiments less than about 0.1, in some embodiments less than about 0.01, and in some embodiments less than about 0.001.

As used herein, the term "subject" refers to any organism for which analysis of gene expression would be desirable.
Thus, the term "subject" is desirably a human subject, although it is to be understood that the principles of the presently disclosed subject matter indicate that the presently disclosed subject matter is effective with respect to invertebrate and to all vertebrate species, including Therian mammals (e.g., Marsupials and Eutherians), which are intended to be included in the term "subject". Moreover, a mammal is understood to include any mammalian species in which detection of differential gene expression is desirable, particularly agricultural and domestic mammalian species. The methods of the presently disclosed subject matter are particularly useful in the analysis of gene expression in warm-blooded vertebrates, e.g., mammals.

More particularly, the presently disclosed subject matter can be used for assessing imprinting and its consequences in a mammal such as a human. Also provided is the analysis of gene expression in mammals of importance due to being endangered (such as Siberian tigers), of economic importance (animals raised on farms for consumption by humans) and/or social importance (animals kept as pets or in zoos) to humans, for instance, carnivores other than humans (such as cats and dogs), swine (pigs, hogs, and wild boars), ruminants (such as cattle, oxen, sheep, giraffes, deer, goats, bison, and camels), and horses (e.g., thoroughbreds and race horses).

Additionally, in some embodiments the term "subject" refers to a biological sample as defined herein, which includes but is not limited to a cell, tissue, or organ that is isolated from an organism. Thus, it is understood that the methods and compositions disclosed herein can be employed for assessing imprinting and its consequences in a subject that is an organism but can also be employed for assessing imprinting and its consequences in a subject that is a biological sample isolated from an organism. Accordingly, the methods and compositions disclosed herein are intended to be applicable to assessing imprinting and its consequences in vivo as well as in vitro.

III. Methods for Identifying an Imprinted Gene

The presently disclosed subject matter provides in some embodiments methods for identifying an imprinted gene in a subject. In some embodiments, the methods comprise a computer-assisted comparison of various features of genetic loci that are known to be imprinted to various features of genetic loci that are known not to be imprinted, and extrapolating from the comparison a plurality of features that are indicative of imprinting status.

As used herein, the term "identifying an imprinted gene" refers to predicting whether or not the gene is imprinted and/or if it is, predicting whether the gene is likely to be maternally or paternally expressed. In some embodiments, the identifying is accomplished by feature selection and classifier learning as described herein. In some embodiments, once features are selected and classifiers are learned, the learned classifiers, which are equations that output a value indicating the probability of being imprinted, are applied to the features of the genes in the genome.

As used herein, the term "imprinted" and grammatical variants thereof refers to a genetic locus for which one of the parental alleles is repressed and the other one is transcribed and expressed, and the repression or expression of the allele depends on whether the genetic locus was maternally or paternally inherited. Thus, an imprinted genetic locus is characterized by parent-of-origin dependent monoallelic expression: the two alleles present in an individual are subject to a mechanism of transcriptional regulation that is dependent on which parent transmitted the allele. Imprinting has been shown to be a species-specific and tissue-specific as well as a developmental-stage-specific phenomenon (see e.g., Weber et al., 2001; Murphy & Jirtle, 2003).

Several mechanisms by which genetic loci are imprinted have been identified, the most common of which appears to be differences in the methylation status of maternal and paternal alleles. However, and as disclosed herein, additional representative sequence features present within the genome have also been identified as being highly predictive of imprinting. These features are summarized in Table 4 hereinbelow. Figure 1 depicts the distributions and weights of various features characteristic of imprinted genes as determined using two different algorithmic approaches. These features include, but are not limited to, the presence and relative locations of various repetitive elements (e.g., Alu, CR1, FAM, FLAM, FRAM, HAL1, L1, L2, LTR, ERV, ERV1, ERVK, ERVL, MaLR, and MIR elements), their orientations relative to each other and to the direction of transcription, etc.

Thus, in some embodiments the presently disclosed methods comprise training algorithms to recognize the presence or absence of various genomic sequence features in known imprinted versus known non-imprinted genes, and using the trained algorithms to identify whether a genetic locus that might or might not be imprinted is in fact imprinted. In some embodiments, the methods comprise (a) providing a first data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known to be imprinted in the subject; (b) providing a second data set comprising a plurality of nucleic acid sequences, wherein the nucleic acid sequences comprise genomic DNA sequences corresponding to a plurality of genes known not to be imprinted in the subject; (c) identifying one or more features that by themselves or in combination are differentially present or absent from the first data set as compared to the second data set; and (d) applying the one or more features to a test data set comprising a plurality of genomic DNA sequences which correspond to one or more genes for which the imprinting status is unknown to thereby identify an imprinted gene in a subject. Representative human genes that are known to be imprinted or non-imprinted and that can be used to train the algorithms are presented in Tables 8 and 9.

IV. Methods for Identifying Genetic and Epigenomic Features in a Subject with Respect to an Imprinted Gene

The presently disclosed subject matter also provides methods for identifying a feature in a subject with respect to an imprinted gene. In some embodiments, the methods comprise (a) obtaining a biological sample from the subject, wherein the biological sample comprises one or more nucleic acid molecules isolated from the subject (e.g., a nucleic acid molecule derived from and/or encoding one or more of the genes listed in Tables 1 and/or 7 hereinbelow); and (b) analyzing the one or more nucleic acid molecules, whereby a feature is identified in the subject with respect to the imprinted gene.

As used herein, the term "feature" refers to any assayable and/or identifiable characteristic of a genome or epigenome of the subject. Exemplary, non-limiting features include genetic features such as DNA sequence differences (e.g., genotypes).

As such, in some embodiments the presently disclosed methods relate to genotyping a subject with respect to an imprinted gene. As used herein, the phrase "genotyping a subject with respect to an imprinted gene" refers to determining what alleles the subject has with respect to an imprinted gene, and further whether the individual alleles were inherited maternally or paternally. After this has been determined, it can be possible to predict a phenotype that is associated with the genotype.

Any method can be used to determine a genotype with respect to an imprinted gene. In some embodiments, the methods rely on there being an assayable difference between the alleles. Exemplary assayable differences include sequence differences (for example, nucleotide sequence differences in the open reading frame of an imprinted gene, including but not limited to those that result in amino acid differences in the encoded polypeptide). The sequence differences can be determined directly (for example, by sequencing and/or by using amplification primers that are specific for different alleles) or can be determined indirectly (for example, by assaying a biological activity or a biochemical characteristic of a nucleic acid sequence and/or a polypeptide encoded thereby).

Once an assayable characteristic of each allele is determined, it is also possible to determine from which parent each allele is inherited. For example, a sequence difference identified in an imprinted gene in a subject can be used to assay one or both parents to determine what alleles the parents have, and by deduction which alleles in the subject came from which parents.
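By way of non-limiting illustration, the deduction of parent-of-origin from the genotypes of a subject and both parents at a single polymorphic site can be sketched as follows. The function name `assign_parent_of_origin` and the representation of a genotype as a pair of allele labels are assumptions made for this sketch; the sketch also assumes that both parental genotypes are available and correctly assigned.

```python
def assign_parent_of_origin(child, mother, father):
    """Deduce which of the child's two alleles came from which parent.

    Each genotype is a 2-tuple of allele labels, e.g. ("C", "T").
    Returns {"maternal": allele, "paternal": allele}, or None when the
    site is uninformative or inconsistent with the stated parents.
    """
    a, b = child
    candidates = set()
    # Try both ways of splitting the child's alleles between the parents.
    for maternal, paternal in ((a, b), (b, a)):
        if maternal in mother and paternal in father:
            candidates.add((maternal, paternal))
    if len(candidates) != 1:
        return None
    maternal, paternal = candidates.pop()
    return {"maternal": maternal, "paternal": paternal}


if __name__ == "__main__":
    # Informative site: the parents are homozygous for different alleles.
    print(assign_parent_of_origin(("C", "T"), ("C", "C"), ("T", "T")))
    # Uninformative site: all three individuals are heterozygous.
    print(assign_parent_of_origin(("C", "T"), ("C", "T"), ("C", "T")))
```

A site at which the subject and both parents are heterozygous for the same two alleles is uninformative in this sketch, so that a different polymorphism would be needed to resolve parent-of-origin.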

For example, with imprinted genes it is possible to disregard any contribution to a phenotype from an allele that is expected not to be expressed as a result of the imprinting. In some embodiments, including for example where the imprinting results in monoallelic expression of an imprinted gene only in a tissue-specific and/or developmental-stage-specific manner, this can result in a phenotype in the subject (for example, in a specific cell type or tissue or at a specific developmental stage) that can be predicted once a genotype including parent-of-origin is known. This approach can also benefit from knowing whether the maternal or paternal allele is expected to be expressed in the cell or tissue type of interest or at the developmental stage of interest. A method for predicting parental preference is disclosed herein (see e.g., EXAMPLE 7).

Additionally, a feature that is identified can be an epigenomic feature. Representative, non-limiting epigenomic features include DNA sequence modifications other than nucleotide changes (e.g., methylation status), nucleosome positioning features, chromatin states, and histone modifications (e.g., methylation or acetylation status or similar). Techniques for assaying for the presence of these epigenomic features would be known to one of ordinary skill in the art after consideration of the present disclosure.

V. Methods for Detecting the Presence of, or Predicting a Susceptibility to, a Medical Condition Associated with Parent-of-origin Dependent Monoallelic Expression

The presently disclosed subject matter provides in some embodiments methods for detecting a presence of, or predicting a susceptibility to, a medical condition associated with parent-of-origin dependent monoallelic expression in a subject. In some embodiments, the methods comprise (a) obtaining a biological sample from the subject, wherein the biological sample comprises one or more nucleic acid molecules; (b) analyzing the one or more nucleic acid molecules for a feature with respect to parent-of-origin for one or both alleles of at least one imprinted gene; and (c) determining whether the feature correlates with a presence of or a susceptibility to a medical condition associated with monoallelic expression, whereby a presence of or a susceptibility to a medical condition associated with parent-of-origin dependent monoallelic expression in the subject is detected.

Stated another way, the presently disclosed subject matter provides in some embodiments methods for correlating a subject's genotype with respect to one or more imprinted genes with a disease phenotype based on which alleles for the one or more imprinted genes are inherited maternally and which are inherited paternally.

It is possible for subjects to have and/or be susceptible to medical conditions that are associated with imprinted genes. For example, because imprinted genes are expressed in a parent-of-origin dependent monoallelic fashion (in some embodiments the monoallelic expression being tissue- and/or developmental stage-specific), it is possible for a subject to inherit a deleterious allele of an imprinted gene from one parent that is not compensated for by the allele inherited from the other parent. In these cases, it is useful to know not only the nature of the two alleles that a subject has, but also the parent from whom the subject has inherited each allele. Examples of medical conditions that might be associated with imprinted genes include, but are not limited to, alcoholism, Alzheimer's disease, asthma/atopy, autism, bipolar disorder, obesity, diabetes, Parental Uniparental Disomy (UPD), cancer, epilepsy, DiGeorge syndrome, and schizophrenia (see e.g., Table 7 hereinbelow). In some embodiments, the imprinted gene is DLGAP2, DLGAP2L, KCNK9, RTL1,

In some embodiments, the presently disclosed methods can be employed for determining those subjects predicted to benefit from therapies that target the epigenome. As used herein, the term "epigenome" refers to the overall epigenetic state of a subject and/or of a particular cell, tissue, or organ thereof.

Thus, in some embodiments the epigenome relates to the sum total of all genetic effects as well as epigenetic effects, the latter of which result in some embodiments from differences in expression of loci that are subject to parent-of-origin dependent monoallelic expression. In some embodiments, a subject that is predicted to be likely to benefit from therapies that target the epigenome is a subject in which a cell, tissue, or organ functions inappropriately as a result of the dysregulation of parent-of-origin dependent monoallelic expression of one or more loci. In some embodiments, the one or more genetic loci are selected from among those loci set forth in Table 1 or Table 2 hereinbelow. In some embodiments, the inappropriate function in the cell, tissue, or organ results in the subject having one or more of the conditions set forth in Table 7 hereinbelow. In some embodiments, the condition comprises cancer (see Yoo & Jones, 2006; Feinberg et al., 2006).

Additionally, the phrase "therapies that target the epigenome" refers to therapies that are designed to influence at least one effect of the epigenome on a phenotype in a subject (e.g., a phenotype related to a disorder or other undesirable medical condition). In some embodiments, a therapy that targets the epigenome can comprise administering to a subject in need thereof a composition that can modify the methylation and/or acetylation of an imprint regulatory element of an imprinted locus. Non-limiting examples of such compositions include methyl donors, modulators of methyl transferases, acetyl donors, and modulators of acetylases.

EXAMPLES

The following Examples provide illustrative embodiments. In light of the present disclosure and the general level of skill in the art, those of skill will appreciate that the following Examples are intended to be exemplary only and that numerous changes, modifications, and alterations can be employed without departing from the scope of the presently disclosed subject matter.

EXAMPLE 1 Human Genome Data

DNA sequence and annotation data were obtained from the Ensembl database, jointly managed by the European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL - EBI; Cambridge, United Kingdom) and the Sanger Institute (Cambridge, United Kingdom). It is publicly available on the World Wide Web. A positive training set of 40 imprinted genes, compiled from the Imprinted Gene Catalog (publicly available from the website of the University of Otago, Dunedin, New Zealand) and recent literature, and a negative training set of 52 genes for which experimental evidence suggests biallelic expression were employed. Additionally, random sets of 500 control genes presumed to be non-imprinted were employed for a number of tasks. These random control genes were sampled from autosomal chromosomal bands not known or suspected to contain imprinted genes, and were intended to represent the overall characteristics of biallelically expressed genes. Random control genes were used to compute top pairwise interaction terms, to carry out feature selection with the Equbits classifier (Equbits Inc., Livermore, California, United States of America), and to supplement the final training set that was used to learn the classifiers. To minimize bias, the set of 500 random control genes was resampled for each of these three tasks.

EXAMPLE 2 Feature Measurements

DNA sequence feature measurements were acquired from an examination of human genomic sequences present in the Ensembl database and included data derived from recombination hotspots, nucleosome formation potential, and repeat phase changes, as explained below.

Another statistic regarding the repetitive elements flanking a gene was introduced, which is referred to as "phase change" and is defined as an instance of a repetitive element changing its orientation compared to a neighboring element of the same family. The number of such phase changes was counted among retrotransposon classes such as Alus, MIRs, and LTRs within the 100 kb up- and downstream. In doing this, it was noticed that within the downstream region of imprinted genes, compared to a random sample, a phase change occurred more frequently in one of the following LTRs: MLT1A0, MLT1B, MSTA, MSTB1, MLT1D, MLT2B4, or MLT1G1. Conversely, phase changes in an MLT1C LTR were underrepresented in the flanking regions of imprinted genes.
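By way of non-limiting illustration, the counting of phase changes can be sketched as follows. The function name `count_phase_changes` and the input layout of `(start, family, strand)` tuples are assumptions made for this sketch. The sketch also assumes that a "neighboring element of the same family" is the next element of that family in coordinate order, regardless of intervening elements of other families; other readings of the definition are possible.

```python
from collections import defaultdict


def count_phase_changes(repeats):
    """Count orientation flips between consecutive same-family repeats.

    `repeats` is an iterable of (start, family, strand) tuples, where
    strand is '+' or '-'. Returns {family: number_of_phase_changes};
    families with no phase change are omitted from the result.
    """
    last_strand = {}
    changes = defaultdict(int)
    # Walk the elements in coordinate order along the flanking region.
    for _start, family, strand in sorted(repeats):
        previous = last_strand.get(family)
        if previous is not None and previous != strand:
            changes[family] += 1
        last_strand[family] = strand
    return dict(changes)


if __name__ == "__main__":
    flank = [
        (100, "Alu", "+"), (200, "MIR", "+"), (300, "Alu", "-"),
        (400, "Alu", "-"), (500, "MIR", "+"), (600, "Alu", "+"),
    ]
    print(count_phase_changes(flank))
```

The counts for the upstream and downstream flanks can be obtained by calling the function separately on the elements falling within each 100 kb window.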

Whether data on recombination could be used to discern imprinted genes was also investigated. Coordinates of recombination hotspots (Myers et al., 2005) were downloaded from the International HapMap Project website. The recombination hotspots were mapped to the data set, and for each gene the number of hotspots within 350 kb up- and downstream, as well as the minimum distance to the closest recombination hotspot up- and downstream were computed. Interestingly, the retrovirus-like retrotransposon THE1B is reported to be among certain sequence features that are overrepresented in hotspots (Myers et al., 2005). In particular, Myers et al. found the 8-nucleotide motif CCACGTGG to be significantly more frequent in hotspot THE1Bs compared to THE1Bs elsewhere in the genome. The same oligonucleotide motif is also involved in serum-induced transcription at the G1/S-phase boundary in the hamster (Miltenberger et al., 1995), and is known as the G-box binding motif for plant basic leucine zipper (bZIP) proteins (Niu et al., 1999). The occurrence of this oligomer within all THE1B elements in the 100 kb flanking each gene was counted.
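By way of non-limiting illustration, the hotspot counts, the minimum hotspot distances, and the motif count can be sketched as follows. The function names `hotspot_features` and `count_motif`, the representation of each hotspot as a single coordinate, and the assumption that the gene lies on the forward strand (so that "upstream" means lower coordinates) are hypothetical choices for this sketch.

```python
def hotspot_features(gene_start, gene_end, hotspots, window=350_000):
    """Summarize recombination hotspots flanking one gene.

    `hotspots` holds hotspot coordinates on the gene's chromosome.
    Assumes a forward-strand gene; hotspots inside the gene are ignored.
    """
    upstream = [gene_start - h for h in hotspots if h < gene_start]
    downstream = [h - gene_end for h in hotspots if h > gene_end]
    return {
        "n_up": sum(1 for d in upstream if d <= window),
        "n_down": sum(1 for d in downstream if d <= window),
        "min_dist_up": min(upstream, default=None),
        "min_dist_down": min(downstream, default=None),
    }


def count_motif(sequence, motif="CCACGTGG"):
    """Count occurrences of `motif`, including overlapping ones."""
    sequence = sequence.upper()
    return sum(
        1
        for i in range(len(sequence) - len(motif) + 1)
        if sequence.startswith(motif, i)
    )


if __name__ == "__main__":
    spots = [600_000, 900_000, 1_020_000, 1_100_000, 1_500_000]
    print(hotspot_features(1_000_000, 1_050_000, spots))
    print(count_motif("ttCCACGTGGaaCCACGTGG"))
```

Because CCACGTGG is identical to its own reverse complement, counting it on one strand of a THE1B element gives the same result as counting it on the other.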

The last additional class of feature measurements involved nucleosome formation potential profiles. Such in silico estimates of nucleosome packaging density in the promoter region have previously been used to distinguish tissue-specific genes from housekeeping genes and widely expressed genes (Levitsky et al., 2001). Nucleosome formation potential estimates were acquired and summarized as follows. The sum within the 0.82-0.61 kb upstream, the standard deviation 5.86-5.81 kb upstream, the mean 0-1 and 0.31-0.49 kb within the concatenated exons, and the standard deviation 6.7-6.75 and 7.02-7.07 kb downstream were computed. These particular windows were picked following visual inspection of plotted potentials.

EXAMPLE 3 Statistical Methods

To make the imprinted gene predictions more robust, two distinct strategies for feature selection and classifier learning were employed: Equbits FORESIGHT™ (Equbits Inc., Livermore, California, United States of America), which employs support vector machines, and Sparse Multinomial Logistic Regression (SMLR; Krishnapuram et al., 2005), which adopts a Bayesian approach to sparse multinomial logistic regression. In each case, two separate classifiers were learned: one with a linear kernel and one with a radial basis function (RBF) kernel. The operating point on the ROC for each classifier was chosen so as to minimize the number of false positives while retaining all true positives. To be more conservative in the final predictions, joint agreement among all four classifiers was required before predicting a gene to be imprinted. These are referred to herein as the "high-confidence" predictions.

When using Equbits to predict imprinted genes, a 40-fold cross-validation (CV) procedure was used; at each step feature selection was performed using a linear kernel and then classifiers for imprint status with linear and RBF kernels were learned. Based on the results of this CV, final parameters were selected and linear and RBF classifiers trained on the full training set were applied both to the independent test set and to the whole human genome. During CV, the number of retained features ranged from 613 to 638, while 626 features were retained in the final classifier.
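By way of non-limiting illustration, the choice of operating point and the joint-agreement rule described in this Example can be sketched as follows. The function names, the classifier labels, the gene labels, and the scores are hypothetical and are not taken from the Equbits or SMLR software. The sketch assumes that each classifier outputs a score for which higher values indicate imprinting, in which case the highest threshold that still retains every known imprinted gene is the one that admits the fewest false positives.

```python
def operating_threshold(positive_scores):
    """Highest threshold that still calls every known positive imprinted."""
    return min(positive_scores)


def high_confidence(scores_by_classifier, thresholds):
    """Genes called imprinted by every classifier (joint agreement).

    scores_by_classifier: {classifier_name: {gene: score}}
    thresholds:           {classifier_name: threshold}
    """
    calls = [
        {gene for gene, score in scores.items() if score >= thresholds[name]}
        for name, scores in scores_by_classifier.items()
    ]
    return set.intersection(*calls) if calls else set()


if __name__ == "__main__":
    # Hypothetical scores for three genes from two of the classifiers.
    scores = {
        "svm_linear": {"geneA": 0.9, "geneB": 0.4, "geneC": 0.7},
        "svm_rbf": {"geneA": 0.8, "geneB": 0.6, "geneC": 0.2},
    }
    thresholds = {"svm_linear": 0.5, "svm_rbf": 0.5}
    print(high_confidence(scores, thresholds))
```

Requiring agreement among all classifiers can only shrink the set of predictions relative to any single classifier, which is consistent with the conservative intent stated above.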

When using SMLR to predict imprinted genes, a similar scheme was adopted. At each step of a 40-fold CV, feature selection was performed by applying a sparsity-promoting prior directly on the weights of the features (no kernel) and then classifiers for imprint status with linear and RBF kernels were learned. Based on the results of this CV, final parameters were selected and linear and RBF classifiers trained on the full training set were applied both to the independent test set and to the whole human genome. During CV, the number of retained features averaged 875, while 820 features were retained in the final classifier.

SMLR is written in portable Java, with a GUI, and is available with complete source code under a non-commercial use license from Duke University (Durham, North Carolina, United States of America). In addition, all data, and all scripts used to produce the SMLR results, are also available.

To ensure that no straightforward relationships within the training data were obscured by sophisticated learning methods, CV was also performed using three simple classifiers (as implemented in Weka 3.4; Witten & Frank, 2005). A naïve Bayes classifier showed a sensitivity of 40% (16 out of 40 imprinted genes correctly recognized) and a specificity of 97% (535 out of 552 non-imprinted genes correctly classified). A decision stump simply classified all genes as non-imprinted. A random forest classifier showed a sensitivity of 20% (eight out of 40 correct) and a specificity of 95% (522 out of 552 correct). These experiments suggested that simple alternative classification approaches were not likely to result in comparable classification accuracy.
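By way of non-limiting illustration, the sensitivity and specificity values reported above follow directly from the stated counts, as the sketch below shows. The function names are hypothetical. The total of 552 non-imprinted genes equals the 52 genes of the negative training set plus 500 random control genes.

```python
def sensitivity(true_positives, positives):
    """Fraction of known imprinted genes that were correctly recognized."""
    return true_positives / positives


def specificity(true_negatives, negatives):
    """Fraction of non-imprinted genes that were correctly classified."""
    return true_negatives / negatives


if __name__ == "__main__":
    # Counts as reported for the simple classifiers.
    print(f"naive Bayes:   {sensitivity(16, 40):.1%}, {specificity(535, 552):.1%}")
    print(f"random forest: {sensitivity(8, 40):.1%}, {specificity(522, 552):.1%}")
```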

To simplify the prediction of parental expression preference, Equbits was employed only with a linear kernel and the top 30 features. This procedure is analogous to that used to predict parental preference in the mouse (Luedi et al., 2005).

χ²-tests were used to compare proportions and two-sided Student's t-tests to compare means. To be conservative, Bonferroni's method was used when correcting for multiple testing (α = 0.05).

EXAMPLE 4 Experimental Procedures

From human conceptuses and matched maternal deciduas, DNA was isolated in Qiagen buffer ATL and proteinase K (Qiagen Inc., Valencia, California, United States of America) followed by phenol-chloroform-isoamyl alcohol extraction and ethanol precipitation. Each individual was screened for polymorphisms in KCNK9 (C/T, dbSNP Accession No. rs2615374; SEQ ID NO: 2) and DLGAP2 (G/A, dbSNP Accession No. rs17829155 (now merged with SNP ID rs2235112); SEQ ID NO: 1) by genomic DNA PCR with Qiagen HOTSTARTAQ® polymerase (Qiagen Inc., Valencia, California, United States of America) as per the manufacturer's instructions. Following identification of heterozygous polymorphic individuals, total RNA was isolated from brain and testis by homogenization in RNA-Stat 60 (Tel-Test, Friendswood, Texas, United States of America); subsequent processing was performed as recommended by the manufacturer.

First strand cDNA was primed with gene-specific primers (see below), and synthesized from DNase I-treated RNA using SUPERSCRIPT® II as recommended by the manufacturer (Invitrogen, Carlsbad, California, United States of America). The cDNA was amplified with Qiagen HOTSTARTAQ® polymerase (Qiagen Inc., Valencia, California, United States of America) in a 25 μl RT-PCR reaction volume, as per the manufacturer's instructions. RT-PCR products were separated by electrophoresis on a 1.5% agarose gel, and appropriately sized fragments of cDNA were excised and gel-extracted (GENELUTE™, Sigma Chemical Co., St. Louis, Missouri, United States of America). Products were sequenced (ABI 377 sequencer, PE Biosystems, Foster City, California, United States of America), and analyzed for expression using FinchTV (Geospiza, Inc., Seattle, Washington, United States of America). In order to rule out any stochastic effects, the PCR and the sequencing reactions were repeated at least three times in all cases where exclusive monoallelic expression was observed. All sequencing reactions were also performed in both directions.

DLGAP2 (Disks large-associated protein 2), also known as DAP-2, is annotated to have four splice variants (see the University of California at Santa Cruz Genome Website, May 2004 assembly, Santa Cruz, California, United States of America; Karolchik et al., 2003). The four splice variants - chr8.27.24, chr8.27.25, chr8.27.26, and chr8.27.27 - are referred to as DLGAP2-24, -25, -26, and -27, respectively. Isoforms DLGAP2-24 and DLGAP2-25 were reverse transcribed using primer DLGAP2-RT1 (SEQ ID NO: 3), while DLGAP2-RT2 (SEQ ID NO: 4) was used to reverse transcribe DLGAP2-26 and DLGAP2-27. cDNA from DLGAP2-24 and DLGAP2-27 was specifically amplified using reverse primer DLGAP2-M1R (SEQ ID NO: 5), while DLGAP2-M2R (SEQ ID NO: 6) was used to amplify DLGAP2-25 and DLGAP2-26. DLGAP2-M1F (SEQ ID NO: 7) was used as a common forward primer to amplify cDNA. When amplifying cDNA, the primers bridged two long introns, ruling out any potential influence of undigested genomic DNA. Genomic DNA was amplified and sequenced using DLGAP2-1F (SEQ ID NO: 8) and DLGAP2-1R (SEQ ID NO: 9).
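The primer assignments described above can be summarized as a lookup table. The following is a minimal sketch in Python, reading the four isoform labels as DLGAP2-24 through DLGAP2-27; the names `PrimerPlan`, `SEQ_ID_NO`, and `PRIMERS_BY_ISOFORM` are illustrative only, and the table records primer names and SEQ ID NOs, not the primer sequences themselves.

```python
from typing import NamedTuple


class PrimerPlan(NamedTuple):
    """Primers used to reverse transcribe and amplify one isoform."""
    rt_primer: str  # gene-specific primer for first strand synthesis
    forward: str    # common forward primer for cDNA amplification
    reverse: str    # isoform-selective reverse primer


# SEQ ID NOs as given in the text.
SEQ_ID_NO = {
    "DLGAP2-RT1": 3,
    "DLGAP2-RT2": 4,
    "DLGAP2-M1R": 5,
    "DLGAP2-M2R": 6,
    "DLGAP2-M1F": 7,
}

PRIMERS_BY_ISOFORM = {
    "DLGAP2-24": PrimerPlan("DLGAP2-RT1", "DLGAP2-M1F", "DLGAP2-M1R"),
    "DLGAP2-25": PrimerPlan("DLGAP2-RT1", "DLGAP2-M1F", "DLGAP2-M2R"),
    "DLGAP2-26": PrimerPlan("DLGAP2-RT2", "DLGAP2-M1F", "DLGAP2-M2R"),
    "DLGAP2-27": PrimerPlan("DLGAP2-RT2", "DLGAP2-M1F", "DLGAP2-M1R"),
}

for isoform, plan in PRIMERS_BY_ISOFORM.items():
    labelled = [f"{p} (SEQ ID NO: {SEQ_ID_NO[p]})" for p in plan]
    print(f"{isoform}: RT {labelled[0]}, forward {labelled[1]}, "
          f"reverse {labelled[2]}")
```

Under this reading, each isoform corresponds to a distinct pairing of reverse transcription primer and reverse amplification primer, which appears to be how the four splice variants could be assayed separately.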

KCNK9 (potassium channel, subfamily K, member 9), also known as TASK-3, is annotated to have one isoform. Primers KCNK9-1F (SEQ ID NO: 10) and KCNK9-1R (SEQ ID NO: 11) were used for the amplification of genomic DNA. cDNA was amplified using KCNK9-M1F (SEQ ID NO: 12) and -M1R (SEQ ID NO: 13), which bridge an 84 kb intron. Primer sequences are given in Table 11 hereinbelow. In order to rule out any stochastic effects, the PCR and the sequencing reactions were repeated multiple times whenever monoallelic expression was observed. All sequencing reactions were performed in both directions.

EXAMPLE 5

Conceptual Approach

A conservative approach was adopted in identifying human imprinted genes because of their important role in disease etiology. Specifically, two separate classifier learning strategies - one based on support vector machines and the other on sparse logistic regression - each with a different feature selection process, were adopted. With each strategy, classifiers with two different similarity kernels were trained: linear and radial basis function (RBF). Only genes predicted to be imprinted by all four classifiers were considered "high-confidence" predictions. Although all four classifiers use the same initial training set of known imprinted genes, the combined classifier approach helps to control for biases that might arise from different choices for feature selection, classifier learning, or similarity kernel.
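The combination rule described above amounts to taking the intersection of the four sets of positive predictions. The following is a minimal sketch in Python; the function name `high_confidence`, the classifier labels, and the gene identifiers are hypothetical placeholders and are not taken from the actual prediction runs.

```python
from typing import Dict, Set


def high_confidence(predictions: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes that every classifier predicted to be imprinted."""
    gene_sets = list(predictions.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical predictions: two learning strategies, two kernels each.
example_predictions = {
    "SVM, linear kernel": {"geneA", "geneB", "geneC"},
    "SVM, RBF kernel": {"geneA", "geneC"},
    "sparse logistic regression, linear kernel": {"geneA", "geneB"},
    "sparse logistic regression, RBF kernel": {"geneA", "geneD"},
}

print(sorted(high_confidence(example_predictions)))
```

Requiring unanimous agreement can only shrink the candidate set, so this rule trades sensitivity for a lower false positive rate, in keeping with the conservative approach described.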

All four classifiers were trained on DNA sequence features collected from 40 genes known to be imprinted in human and 52 genes known not to be imprinted in human (see Table 9 hereinbelow), plus 500 randomly selected genes suspected not to be imprinted in human (see Table 10 hereinbelow). The prediction accuracy of the combined classifier was assessed both by cross-validation and with an independent negative test set (see Table 8 hereinbelow). In a 40-fold cross-validation, a sensitivity of 100% (40/40 imprinted genes correctly identified) and a specificity of 99% (545/552 presumably non-imprinted genes correctly identified) were obtained. The independent negative test set consisted of 13 genes with random monoallelic expression and 88 genes with biallelic expression or synchronous replication, including four genes imprinted in mouse but not human. All 101 genes were correctly predicted to not be imprinted (see Table 8 hereinbelow; see also Figure 5 for a schematic depiction of the workflow).

EXAMPLE 6

Genome-wide Prediction of Candidate Imprinted Genes

Applying the combined classifier to the entire human genome, 156 of 20,770 (0.75%) annotated autosomal genes not previously known to be imprinted (Ensembl v20) were predicted to be imprinted with high confidence (see Table 1 and Table 2 hereinbelow). Only chromosomes 7 and 11 showed a higher density of predicted and known imprinted genes compared to the rest of the autosome (P = 0.0014 and P = 0.0026, respectively, χ² test with 1 df; see also Figure 1).
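A per-chromosome density comparison of this kind can be framed as a χ² test with 1 df on a 2 × 2 table (imprinted versus other genes, on versus off the chromosome of interest). The following is a minimal sketch in Python using only the standard library. It assumes a Pearson χ² statistic without continuity correction, which the text does not specify, and the counts are purely hypothetical; they are not the per-chromosome gene counts underlying the reported P values.

```python
import math


def chi2_test_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (1 df) on the table [[a, b], [c, d]].

    Returns (chi2, p_value). No continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df the chi-squared survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only.
imprinted_on_chr, other_on_chr = 20, 80
imprinted_elsewhere, other_elsewhere = 10, 190

chi2, p = chi2_test_2x2(imprinted_on_chr, other_on_chr,
                        imprinted_elsewhere, other_elsewhere)
print(f"chi2 = {chi2:.2f}, P = {p:.2g}")
```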

Seven chromosomal bands contained a significantly higher density of imprinted gene candidates, including novel candidates related to various cancers (P < 2 × 10⁻⁸, χ² test with 1 df; see Table 3 hereinbelow). The clusters on 15q12 and 7q21.3 include known imprinted genes. Included in the 11p15.5 region were well-known imprinted genes such as H19 and IGF2, and five novel candidates, located further distal, including PKP3, an oncogene involved in lung cancer (Furukawa et al., 2005). The cluster on 1p36.32 included the known imprinted gene TP73 along with the novel candidate PRDM16, which is associated with leukemia (Du et al., 2005). The ortholog of this gene was also predicted to be imprinted in mouse (Luedi et al., 2005). Chromosomal band 14q32.31 contained the known imprinted gene MEG3 along with the novel candidate RTL1, which is imprinted in the mouse (Seitz et al., 2003) and sheep (Charlier et al., 2001). The cluster of candidate genes on 10q26.3 included the novel candidate NKX6-2, which is preferentially expressed in the brain (Lee et al., 2001), and was predicted to be imprinted in the mouse (Luedi et al., 2005). NKX6-2, along with four neighboring candidate genes, was predicted to be maternally expressed. This region on 10q26 is 4.7-5.7 Mb from the marker D10S217, which is maternally linked to male sexual orientation (Mustanski et al., 2005). A germline differentially methylated region was found within this interval (coordinate 135.1 Mb; see Strichman-Almashanu et al., 2002), lending further support to the prediction of imprinted genes within the immediate vicinity of this region.

Figures 2 and 3 present a series of bar graphs depicting distributions of the weights of features characteristic of imprinted genes as determined by two feature selection methods: those of Equbits (Figure 2) and SMLR (Figure 3). Absolute weights are shown as box plots; the dotted line represents the overall mean of all selected features. Figures 2A and 3A depict the distribution of feature type. Figures 2B and 3B depict the distribution of different ways of quantifying repetitive elements. The ratios of ± counts carried the greatest weight (P < 6 × 10⁻¹¹; see also Table 4 hereinbelow). Figures 2C and 3C depict the distribution of different repetitive element locations. The 1 kb downstream window was of least importance (P < 1 × 10⁻³). Figures 2D and 3D depict the distribution of different families of repetitive elements. Alus carried the lowest weight (P < 4 × 10⁻³), whereas endogenous retroviruses (ERV) were of greatest importance (P < 3 × 10⁻³). Figures 2E and 3E depict the distribution of counts of the highest scoring transcription factor binding sites. Among transcription factor binding sites, those of greatest importance in both feature selection strategies were CEBP, E2F, ICP4, IgPE2, NFuE1, NFuE3, PEA1, PEA2, Sp1, and SRF (see Figures 2E and 3E). E2F family transcription factors are involved with cell proliferation, Sp1 elements have been shown to protect CpG islands from de novo methylation in the embryo (Brandeis et al., 1994), and SRF (serum response factor) is involved in the activation of "immediate early" genes (Schratt et al., 2001), in muscle differentiation (Vandromme et al., 1992; Soulez et al., 1996), and in mesoderm formation (Arsenian et al., 1998).

EXAMPLE 7

Prediction of Parental Preference

A separate classifier was trained to determine if the maternal or paternal allele of an imprinted gene is expressed. The training set included 19 maternally expressed genes and 20 paternally expressed genes (GRB10 was omitted due to its complex expression patterns (Blagitko et al., 2000)). In a 19-fold cross-validation, a sensitivity of 85% (17/20 paternally expressed genes correctly identified) and a specificity of 79% (15/19 maternally expressed genes correctly identified) was achieved. The ability to accurately predict the expressed parental allele of known imprinted genes in both human and mouse (Luedi et al., 2005) lent support to the suggestion that different mechanisms might be responsible for regulating paternal versus maternal imprinting (Mancini-Dinardo et al., 2006).

Maternal expression was predicted for 56% (88/156) of the candidate imprinted genes, comparable to the 64% frequency found for mouse imprinted genes (Luedi et al., 2005). Among the features of greatest significance for the prediction of parental expression preference were the ratios of the relative orientation of AluJ and ERVL elements downstream (see Table 5 hereinbelow).

E4F1 transcription factor binding sites were also significantly more prevalent in the 3-4 kb upstream region of maternally expressed genes than in paternally expressed genes.

EXAMPLE 8

Experimental Identification of New Imprinted Genes

Guided by the high-confidence predictions of the combined classifier, two new imprinted human genes were experimentally verified. DLGAP2 (Disks Large-Associated Protein 2) and KCNK9 (Potassium Channel, Subfamily K, Member 9) were chosen for experimental validation. A number of criteria were employed to prioritize the 156 predictions for experimental validation: large posterior probabilities of being imprinted (in the case of SMLR), large signed hyperplane distances (in the case of SVM), potential involvement in an important condition (such as a cancer or one of the conditions listed in Table 7), and location in a chromosome not known to contain imprinted genes (e.g., DLGAP2 and KCNK9 reside at opposite telomeric regions of chromosome 8, a human chromosome not previously shown to contain imprinted genes; Morison et al., 2005). The last criterion was applied because many imprinted genes have to date been identified by searching near known imprinted genes, so finding some on a completely different chromosome would be compelling; it also ensured that confounding effects related to known imprinted genes nearby were minimized. It was further decided that having one candidate with an ortholog predicted to be imprinted in the mouse but the other not was desirable to emphasize that the two sets of predictions did not overlap significantly and that novel human imprinted genes could be discovered even without relying on any conservation of imprinting status between human and mouse.

This approach resulted in a high-priority list of five genes. Conceptuses were screened to determine whether for each gene a sufficient number possessed an informative genotype that would permit experimental detection of monoallelic expression. The list was further narrowed to DLGAP2 and KCNK9, for which a detailed validation of imprinting status was undertaken.

DLGAP2 is highly expressed and alternatively spliced in brain and testis (Ranta et al., 2000). It is contained within a 1.1 Mb interval on chromosome 8p23.3 that is frequently deleted in bladder cancer (Muscheck et al., 2000), making it a candidate tumor suppressor. cDNA containing polymorphic sites was generated by reverse transcription of total RNA isolated from brain and testis in heterozygous human conceptuses (N = 8; gestational age: 63-105 days). The four isoforms of DLGAP2 (splice variants 24, 25, 26, and 27) (Karolchik et al., 2003) were paternally expressed in the testis of all samples (Figure 4A) with some evidence of imprinting relaxation in isoforms 24 and 26. In contrast, expression from both alleles was observed for all four isoforms of DLGAP2 in whole brain. PEG1-AS is another imprinted gene predominantly expressed in the testis, and like DLGAP2 is expressed only from the paternal allele (Li et al., 2002).

KCNK9 resides at chromosomal location 8q24.3. It encodes the TASK3 (Twik-like acid-sensitive K+) channel and is predominantly expressed in the cerebellum (Medhurst et al., 2001). Therefore, RNA was isolated from the brains of conceptuses that were polymorphic at this locus (N = 9; gestational age: 63-98 days). KCNK9 was exclusively expressed from the maternal allele in all samples (Figure 4B). Thus, both genes chosen for experimental verification of their predicted imprint status were shown to be monoallelically expressed from the predicted parental allele (see Table 1 hereinbelow).

Discussion of the EXAMPLES

Comparison to mouse. When making predictions with a classifier, it is preferable to weigh the trade-off between sensitivity and specificity, or analogously, between false positive rate and false negative rate. In the co-inventors' previous mouse study (Luedi et al., 2005), a greater focus was placed on keeping the false negative rate low. In the present human study, however, it was sought to keep the false positive rate low, defining the set of high-confidence imprinted gene candidates as the intersection of four different classifiers. At least in part because of these different methodological choices, the number of imprinted genes predicted in the mouse and the number of high-confidence imprinted genes predicted in the human are not directly comparable. If a similar statistical methodology is adopted in the human as was used in the mouse, the number of human imprinted gene candidates increases, but is still only a little more than half as large as the mouse set. While these numbers are still not directly comparable since the sequence features in the human data are slightly richer than those in mouse, they are suggestive that the overall prevalence of imprinted genes is lower in human than in mouse.

The concordance between the high-confidence human imprinted candidates and the predictions for their orthologs in mouse was also investigated. A murine ortholog was identified for 119 of the genes proved or predicted with high confidence to be imprinted in human. Only 39 (33%) of these genes are known or predicted to be imprinted in both species (see Table 6 hereinbelow). This fraction does not change significantly if the same prediction method that was used for the mouse is also applied to the human data. Hence, the lack of greater overlap is not solely due to differences in the statistical approach. That there are high levels of discordance of imprinting status between mouse and human has been recognized previously (Morison et al., 2005; Monk et al., 2006). It has been speculated that mice might have expanded genomic imprinting in order for the placenta to accommodate a large litter size and shorter gestational period, which might require an increased conservation of maternal resources (Monk et al., 2006). In contrast, human pregnancies tend to be singletons and of longer gestational time, which alleviates evolutionary pressure on imprinted genes to preserve maternal resources. Hence, it seems plausible that relatively fewer genes would be imprinted and maternally expressed in human (predicted proportion of 56% versus 64% in mouse); this is also consistent with the lower prevalence predicted overall. Of course, it is not the desire of the present co-inventors to be bound by any particular theory of operation in this regard.

The observed difference in the imprint status of genes in mouse and human raises the possibility that despite their immense popularity as models of human disease, mice might not be an ideal choice for studying diseases resulting principally from the epigenetic deregulation of imprinted genes, or for assessing human risk from environmental factors that alter the epigenome.

Imprinting and development. Of the 146 genes with a systematic name that are proved or predicted with high confidence to be imprinted, 38% are associated with embryonic development (based on PubMed abstracts); this compares to 18% among a random set of 5000 autosomal genes predicted not to be imprinted (P < 1.7 × 10⁻⁹). As one interesting example, the homeobox (HOX) genes play a key role in pre- and post-implantation development (Eun Kwon & Taylor, 2004; Moens & Selleri, 2006). 23% of the HOX genes were predicted to be imprinted (9 out of 39; P < 2 × 10⁻¹⁶). Five of the high-confidence candidates are located in the HOXA cluster, two in each of the HOXB and HOXC clusters, and none in the HOXD cluster. Several imprinted genes are known to be regulated in mouse by the same Polycomb group proteins (Mager et al., 2003; Umlauf et al., 2004) that also regulate HOX expression (Bantignies & Cavalli, 2006). Thus, there could be sequence characteristics shared in common between these two families of genes; however, no Hox genes were predicted to be imprinted in the mouse (Luedi et al., 2005). This indicates that the high prevalence of HOX imprinted gene candidates in human does not result simply from any shared sequence characteristics. Instead, it raises the possibility that monoallelic expression of HOX genes may have influenced human evolution, particularly the evolution of the brain.

Insights into the evolution of imprinting. Interestingly, recombination data was found to be of considerable importance for discriminating imprinted from non-imprinted genes. For example, an 8 basepair (bp) motif within THE1B elements that is overrepresented near recombination hotspots (Myers et al., 2005) is positively correlated with the presence of imprinted genes. In addition, the average distance between recombination hotspots and known imprinted genes is found to be about one third of that for all annotated genes. These observations lend support to the hypothesis that imprinted genes were originally linked in a few chromosomal regions, and were dispersed throughout the genome by recombination events during mammalian evolution (Walter & Paulsen, 2003). Of course, it is not the desire of the present co-inventors to be bound by any particular theory of operation in this regard.
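The hotspot-distance comparison mentioned above can be illustrated with a nearest-neighbor lookup on sorted coordinates. The following is a minimal sketch in Python; it assumes that each gene is represented by a single coordinate, that all positions lie on the same chromosome, and that distance is measured to the nearest hotspot in either direction. None of these details is stated in the text, and the coordinates below are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence


def nearest_distance(position: int, sorted_hotspots: Sequence[int]) -> int:
    """Distance from a position to the closest hotspot on the same chromosome."""
    if not sorted_hotspots:
        raise ValueError("at least one hotspot is required")
    i = bisect_left(sorted_hotspots, position)
    candidates: List[int] = []
    if i < len(sorted_hotspots):      # nearest hotspot at or after the position
        candidates.append(sorted_hotspots[i] - position)
    if i > 0:                         # nearest hotspot before the position
        candidates.append(position - sorted_hotspots[i - 1])
    return min(candidates)


def mean_nearest_distance(positions: Sequence[int],
                          hotspots: Sequence[int]) -> float:
    """Average nearest-hotspot distance over a set of gene positions."""
    ordered = sorted(hotspots)
    return sum(nearest_distance(p, ordered) for p in positions) / len(positions)


# Hypothetical coordinates (bp) on a single chromosome.
hotspots = [1_000, 5_000, 12_000]
imprinted_genes = [1_200, 4_900]
other_genes = [3_000, 8_500, 20_000]

ratio = (mean_nearest_distance(imprinted_genes, hotspots)
         / mean_nearest_distance(other_genes, hotspots))
print(f"mean distance ratio (imprinted / other): {ratio:.2f}")
```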

In a cross-species comparison of imprinted regions between mouse and human, it has also been hypothesized that genomic imprinting might have evolved on the basis of dosage compensation following large-scale duplication events (Walter & Paulsen, 2003). To investigate this, it was asked whether the imprinted gene candidates were more likely to have been duplicated than the rest of the autosome. When using FASTA (Pearson & Lipman, 1988) to query each protein sequence against all other human proteins in our set, the distribution of the significance value for the second best hit was not different among imprinted gene candidates compared to the rest of the autosomal genes. Also, the proportion of paralogs that are located on the same chromosome was found not to differ between the two classes of genes, nor was there a significant difference in distance to that paralog. In conclusion, these findings fail to corroborate the hypothesis of large-scale gene duplication as the driving force of imprinting evolution. Of course, it is not the desire of the present co-inventors to be bound by any particular theory of operation in this regard.
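The duplication analysis above rests on picking, for each protein, its best match other than itself. The following is a minimal sketch in Python of that selection step. It assumes that the best hit of each query is the query itself, so that the "second best hit" is the best non-self hit, and that a smaller significance value means a stronger match. The function name, the (subject, value) tuple layout, and the identifiers are hypothetical and do not reflect the actual FASTA output format.

```python
from typing import List, Optional, Tuple

Hit = Tuple[str, float]  # (subject identifier, significance value)


def second_best_hit(query_id: str, hits: List[Hit]) -> Optional[Hit]:
    """Return the most significant hit that is not the query itself."""
    non_self = [hit for hit in hits if hit[0] != query_id]
    if not non_self:
        return None
    return min(non_self, key=lambda hit: hit[1])


# Hypothetical search results for one query protein.
example_hits = [
    ("proteinA", 0.0),     # self hit
    ("proteinB", 3e-40),   # closest paralog candidate
    ("proteinC", 2e-5),
]

print(second_best_hit("proteinA", example_hits))
```

Collecting this value for every gene in each class would give the two distributions that the comparison above refers to.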

Other hypotheses for the evolution of genomic imprinting include the proposition that imprinting is a by-product of a host defense against foreign DNA (Barlow, 1993; Yoder et al., 1997), or that during retrotransposition of a gene some regulatory elements may have been carried along with it that confer imprinted expression (Walter & Paulsen, 2003). To investigate this, it was determined whether the set of imprinted gene candidates identified was enriched for single-exon genes that might have been derived from multiexonic precursor paralogs. No significant difference in the rate of imprinted gene candidates consisting of only a single exon was observed compared to the autosomal genes not predicted to be imprinted (18% versus about 16%). Contrary to the observation that almost all known imprinted genes derived from retrotransposition are paternally expressed (Walter & Paulsen, 2003; Morison et al., 2005), it was also found that there was no statistically significant difference in the rate of intron-less genes among imprinted gene candidates with predicted maternal versus paternal expression. Of course, it is not the desire of the present co-inventors to be bound by any particular theory of operation in this regard.

Relevance for disease etiology. Parent-of-origin inheritance is increasingly observed in complex human health conditions such as alcoholism, Alzheimer's, asthma, autism, bipolar disorder, cancer, and schizophrenia (Murphy & Jirtle, 2003), providing evidence that imprinted genes play a role in their etiology. Furthermore, evidence is mounting for an association of assisted reproductive technology with birth defects and diseases caused by epigenetic dysregulation (Niemitz & Feinberg, 2004), which mostly involve imprinted genes. Disclosed herein is the successful mapping of genes proved or predicted with high confidence to be imprinted into chromosomal regions linked to a number of these complex conditions (see Table 7 hereinbelow). Interestingly, when candidate imprinted genes were mapped onto the overall human disease landscape defined by linkage analysis, some imprinted genes appeared to be involved in the etiology of multiple human diseases.

For example, KCNK9 is associated with a variety of human cancers (Patel & Lazdunski, 2004). It also resides at chromosome location 8q24 within 6 Mb of the marker D8S256 that is linked with bipolar disorder (McInnis et al., 2003; see Table 7 hereinbelow). Furthermore, since KCNK9 encodes a potassium ion channel that mediates neuronal excitability, it is a strong candidate for idiopathic absence epilepsies (Zara et al., 1995; Kananura et al., 2002).

Table 1

High-confidence Imprinted Human Gene Candidates

Ensembl ID Band Pred.
184163 (Q5EBL5) 1p36.33 M
107404 (DVL1) 1p36.33 M
178821 (TMEM52) 1p36.33 P
157911 (PEX10) 1p36.32 M
177121 (Q8N6L5) 1p36.32 P
142611 (PRDM16) 1p36.32 P
116213 (WDR8) 1p36.32 M
179163 (FUCA1) 1p36.11 P
183682 (BMP8) 1p34.3 P
173935 (NM_182518) 1p34.2 M
178973 (NM_024547) 1p34.2 M
137944 (NM_019610) 1p22.2 M
162676 (GFI1) 1p22.1 P
186371 (NDUFA4) 1p13.3 P
173110 (HSPA6) 1q23.3 M
152104 (PTPN14) 1q32.3 M
124860 (OBSCN) 1q42.13 P
181203 (HIST3H2BB) 1q42.13 M
177356 (Q8NGX0) 1q44 P
138061 (CYP1B1) 2p22.2 P
152518 (ZFP36L2) 2p21 M
143921 (ABCG8) 2p21 M
055813 (Q96PX6) 2p16.1 P
115507 (OTX1) 2p15 M
116035 (VAX2) 2p13.3 M
169636 2q12.3 P
184764 (RPL22) 2q13 P
171567 (TIGD1) 2q37.1 P
186540 (Q9Y419) 2q37.3 M
172428 (MYEOV2) 2q37.3 P
144908 (FTHFD) 3q21.3 M
181882 3q22.3 P
152977 (ZIC1) 3q24 M
114315 (HES1) 3q29 P
127418 (FGFRL1) 4p16.3 M
159674 (SPON2) 4p16.3 P
163945 (NP_065945.1) 4p16.3 M
153851 (Q9NY19) 4q13.2 P
153852 (Q9NYJ6) 4q13.2 P
186158 4q35.2 M
186147 (DUX2) 4q35.2 P
145536 (ADAMTS16) 5p15.32 M
145526 (CDH18) 5p14.3 P
174132 (Q8TBP5) 5q21.1 P
164400 (CSF2) 5q23.3 M
145945 (FAM50B) 6p25.2 M
168426 (BTNL2) 6p21.32 M
135324 (C6orf117) 6q14.2 P
112499 (SLC22A2) 6q25.3 P
060762 (BRP44L) 6q27 P
105996 (HOXA2) 7p15.2 M
105997 (HOXA3) 7p15.2 M
106001 (HOXA4) 7p15.2 M
106004 (HOXA5) 7p15.2 M
005073 (HOXA11) 7p15.2 M
106038 (EVX1) 7p15.2 P
106571 (GLI3) 7p14.1 M
185037 7q11.21 M
185947 (Q81W5) 7q11.21 P
135211 (C7orf35) 7q11.23 P
187391 (MAGI2) 7q21.11 M
164889 (SLC4A2) 7q36.1 M
164896 (FASTK) 7q36.1 M
180204 (NM_181648) 8p23.3 P
104284 (DLGAP2) 8p23.3 P
185161 (Q8N914) 8p23.1 P
172733 (PURG) 8p12 P
167912 (Q96QE0) 8q12.1 M
185942 (FAM77D) 8q12.3 P
169427 (KCNK9) 8q24.3 M
167656 (LY6D) 8q24.3 P
167701 (GPT) 8q24.3 M
186758 (Q8N710) 9p21.1 M
107282 (APBA1) 9q21.11 P
155621 (NM_182505) 9q21.12 P
186788 (NP_001001670) 9q21.32 M
177945 (NM_016158) 9q33.3 P
136944 (LMX1B) 9q33.3 M
160345 (NM_144654) 9q34.3 P
172889 (EGFL7) 9q34.3 P
054148 (PHPT1) 9q34.3 M
186909 10p15.3 P
107485 (GATA3) 10p14 P
180740 (Q9H6Z8) 10q23.31 P
148820 (LDB1) 10q24.32 M
180066 (C10orf91) 10q26.3 M
148826 (NKX6-2) 10q26.3 M
171811 (C10orf93) 10q26.3 M
151650 (VENTX2) 10q26.3 M
178592 (Q8N377) 10q26.3 M
148832 (PAOX) 10q26.3 M
185885 (IFITM1) 11p15.5 M
182272 (B4GALNT4) 11p15.5 M
184363 (PKP3) 11p15.5 M
176828 (Q8N9U2) 11p15.5 M
184682 11p15.5 M
184193 (Q8N7V1) 11p14.3 M
174903 (RAB1B) 11q13.2 M
182359 (KBTBD3) 11q22.3 P
182657 11q24.3 M
182667 (NTR1) 11q25 P
139194 (RBP5) 12p13.31 P
069431 (ABCC9) 12p12.1 M
180806 (HOXC9) 12q13.13 M
186426 (HOXC4) 12q13.13 M
135502 (SLC26A10) 12q13.3 M
135446 (CDK4) 12q14.1 M
165891 (Q96AV8) 12q21.2 M
112787 (Q9HCM7) 12q24.33 M
178215 (Q8N7V5) 13q21.1 M
177527 (Q8N7F4) 13q21.31 P
185498 13q21.32 P
184497 (FAM70B) 13q34 M
176165 (FOXG1C) 14q12 P
073712 (PLEKHC1) 14q22.1 P
183992 14q31.1 M
185469 (RTL1) 14q32.31 M
126290 (HV2A) 14q32.33 P
151802 (Q9P068) 15q13.1 P
005513 (SOX8) 16p13.3 P
172268 (Q96S05) 16p13.3 P
103449 (SALLO) 16q12.1 M
103005 (C16orf57) 16q13 M
102977 (ACD) 16q22.1 M
103241 (FOXF1) 16q24.1 M
183788 (Q8N206) 16q24.3 M
183518 17p13.3 M
167874 (TMEM88) 17p13.1 M
181977 (PYY2) 17q11.2 P
173917 (HOXB2) 17q21.32 M
120093 (HOXB3) 17q21.32 M
141378 (YCE7) 17q23.2 M
181428 (Q8N8L1) 17q25.3 P
141441 (FAM59A) 18q12.1 P
101489 (BRUNOL4) 18q12.2 M
141934 (PPAP2C) 19p13.3 M
180866 (Q8NB05) 19p13.2 P
172684 (Q8NE65) 19p13.11 P
172666 19p13.11 P
121297 (TSH3) 19q12 P
124302 (CHST8) 19q13.11 M
180458 (Q8N3U1) 19q13.13 P
159904 (ZNF225) 19q13.31 P
167383 (ZNF229) 19q13.31 M
186818 (LILRB4) 19q13.42 M
105132 (ZN550) 19q13.43 M
130724 (CHMP2A) 19q13.43 M
099326 (ZNF42) 19q13.43 M
101230 (C20orf82) 20p12.1 P
101189 (C20orf20) 20q13.33 M
092758 (COL9A3) 20q13.33 M
159263 (SIM2) 21q22.13 P
183628 (DGCR6) 22q11.21 M
183099 22q11.21 M
184390 (Q61 CMO) 22q12.2 P
184687 (Q8ND38) 22q13.31 P

The table lists high-confidence novel predictions of the combined classifier. Genes predicted to be expressed from the maternal or paternal allele are denoted by M or P, respectively. To enhance legibility, the common prefix "ENSG00000" has been dropped from the Ensembl ID. Also listed are gene names and/or GENBANK® Accession Nos. where applicable.
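Because the common prefix has been dropped, the identifiers in Table 1 must be expanded before they can be used to query a database. The following is a minimal sketch in Python; the function name `full_ensembl_id` is illustrative, and it assumes that every abbreviated entry consists of exactly six digits, as is the case for the entries listed above.

```python
ENSEMBL_GENE_PREFIX = "ENSG00000"  # prefix dropped from the table entries


def full_ensembl_id(short_id: str) -> str:
    """Restore the full Ensembl gene ID from a six-digit table entry."""
    digits = short_id.strip()
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError(f"expected six digits, got {short_id!r}")
    return ENSEMBL_GENE_PREFIX + digits


# The two experimentally verified genes from the table.
for short_id, gene in [("104284", "DLGAP2"), ("169427", "KCNK9")]:
    print(f"{gene}: {full_ensembl_id(short_id)}")
```

Leading zeros are significant (for example, 005073 for HOXA11), so the entries should be handled as strings and not as integers.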

Table 2

High- and Lower-Confidence Imprinted Gene Candidates

Ensembl ID Band Pred ^ Ensembl ID Band Pred. t

173447 1 p36.33 S M 157911 1 p36.32 E 1 S M

184235 1 p36.33 S M (PEX10)

131591 1 p36.33 S M 157881 1 p36.32 S M

(NM_017891) (PANK4)

182839 1p36.33 E M 169797 1 p36.32 S M

184163 1 p36.33 E 1 S M 157870 1 p36.32 S M

(Q5EBL5) (NM_152371)

131584 1 p36.33 S M 177121 1 p36.32 E 1 S P

(CENTB5) (Q8N6L5)

127054 1p36.33 S M 142611 1 p36.32 E 1 S P

(NM_017871) (PRDM16)

169962 1 p36.33 S M 162591 1 p36.32 S P

(TAS/R3) (EGFL3)

107404 (DVL 1) 1 p36.33 E 1 S M 182956 1 p36.32 S M

162576 1p36.33 S M 116213 1 p36.32 E 1 S M

(NM_032348) (WDR8)

160075 1 p36.33 S M 183509 1 p36.32 S M

(NM_014488) (Q8IYL3)

178821 1 p36.33 E 1 S P 131697 1 p36.31 S M

(TMEM52) (Q9UFQ2)

157916 (RER1) 1 p36.33 S P 130940 1 p36.22 S M

(NM 017766)

117154 1p36.13 S P 184458 1q21.3 S P

(NM_032880) (Q86YZ3)

179002 1p36.13 S P 169474 1q21.3 S M

(TASIR2) (SPRR1A)

179163 1p36.11 E 1 S P 160691 (SHC1) 1q22 S M

(FUCA1) 143620 1q22 S M

142698 1p35.1 E P (EFNA4)

(NM_032884) 160856 1q23.1 S P

126070 1p34.3 E M (NM_052939)

(EIF2C3) 132704 1q23.1 E M

185668 1p34.3 S P (FCRL2)

(POU3F1) 132703(APCS) 1q23.2 S P

183682 (BMP8) 1p34.3 E 1 S P 173110 1q23.3 E 1 S M

173935 1p34.2 E 1 S M (HSPA6)

(NMJ82518) 143152 1q24.1 S M

178973 1p34.2 E 1 S M (Q9C074)

(NM_024547) 117501 1q24.3 S P

117410 1p34.1 S M (NM_025063)

(ATP6V0B) 116147(TNR) 1q25.1 S P

118473 1p31.2 E P 116703(PDC) 1q31.1 S P

(SG1P1) 118194 1q32.1 S P

132489 1p31.2 E P (TNNT2)

(NM_020948) 152104 1q32.3 E 1 S M

117069 (S 17E) 1p31.1 E P (PTPN14)

137944 1p22.2 E 1 S M 152120 1q41 S P

(NM_019610) (Q9NQ13)

162676 (GF11) 1p22.1 E 1 S P 117791 1q41 S P

182166 1p21.2 S P (NM_017898)

186371 1p13.3 E 1 S P 185495 1q42.1 1S M

(NDUFA4) (Q9H5Q3)

121931 1p13.3 S P 173419 1q42.12 S P

(NM_018372) (Q8IVP0)

116455 (ME50) 1p13.2 S P 081692 1q42.13 S P

179735 1q21.1 S P (NM_023007) •

(Q8NE92) 124860 1q42.13 E 1 S P

(OBSCN)

181203 1q42.13 E 1 S M 144040 2p13.2 S P

(HIST3H2BB) (SFXN5)

168159 1q42.13 E M 135637 2p13.1 S M

(Q5TA31) (MRPL53)

182887 1q42.13 E P 115325 (DOK1) 2p13.1 S M

162946 (DISC1) 1q42.2 S M 116119 (KV2A) 2p11.2 S P

179397 1q44 S M 115085 2q11.2 S P

(NM_173807) (ZAP70)

177356 1q44 E 1 S P 135951 2q11.2 E P

(Q8NGX0) (TSGA10)

035115 2p25.3 S M 071082 2q11.2 S P

(N M_015677) (RPL31)

172554 2p25.3 S P 169636 2q12.3 E 1 S P

(SNTG2) 183998 2q13 E P

186170 2p25.3 S M (RPL22)

(TMSL2) 015568 2q13 S P

182551 2p25.2 S P (RANBP2L1)

(NMJ318269) 184764 2q13 E 1 S P

134321 2p25.2 S P (RPL22)

(NM_080657) 184538 2q13 S P

115738 (JD2) 2p25.1 E M (RANBP2L1)

138061 2p22.2 E 1 S P 153094 2q13 S P

(CYP1 B1) (BCL2L11)

152154 2p22.1 S P 125618 (PAX8) 2q13 S M

(NM_152390) 183300 2q14.3 S P

152518 2p21 E 1 S M 136720 2q14.3 S P

(ZFP36L2) (HS6ST1)

143921 2p21 E 1 S M 169822 2q14.3 S P

(ABCG8) (NM_030970)

138083 (SIX3) 2p21 S P 136698 2q21.1 S M

055813 2p16.1 E 1 S P (NM_032545)

(Q96PX6) 179843 2q21.1 S M

115507 (OTX 1) 2p15 E 1 S M (RAB6C)

116035 (VAX2) 2p13.3 E 1 S M 183840 2q21.2 E M

178455 2p13.2 S P (GPR39)

003137 (C26A) 2p13.2 S P

136539 2q24.2 S P 178055 3p21.31 S M

(NM_014880) (N M_182702)

174470 2q24.2 S M 068028 3p21.31 S M

(Q96M44) (RASSF1)

128714 2q31.1 S M 145050 3p21.31 S P

(HOXDJ3) (ARMET)

128713 2q31.1 S M 114841 3p21.1 S M

(HOXD11) (NM_015512)

128709 2q31.1 S M 010322 3p21.1 S M

(HOXD9) (NISCH)

170166 2q31.1 E M 168268 3p21.1 S M

(HOXD4) (NM_022908)

171567 (TIGDI) 2q37.1 E 1 S P 144741 3p14.1 S P

157985 2q37.2 E M (NM_173471)

(CENTG2) 183185 3q12.1 S P

144485 (HES6) 2q37.3 S M (Q9UIV9)

132326 (PER2) 2q37.3 S M 184804 3q13.12 E M

186540 2q37.3 E 1 S M 185565 3q13.31 S M

(Q9Y419) (LSAMP)

178580 2q37.3 S P 144908 3q21.3 E 1 S M

(Q81YXC7) (FTHFD)

172428 2q37.3 E 1 S P 114626 3q21.3 S P

(MYEOV2) (ABTB 1)

178602 2q37.3 S P 179348 3q21.3 S M

(NM_148961) (GATA2)

063660 (GPC1) 2q37.3 S M 004399 3q22.1 S P

142327 2q37.3 S M (PLXND1)

(RNPEPL1) 174640 3q22.1 S P

115687 (PASK) 2q37.3 E M (SLC21A2)

132170 3p25.2 S P 144872 3q22.2 S P

(PPARG) 181882 3q22.3 E 1 S P

131374 3p24.3 E M 168875 3q22.3 S M

(TBC1 D5) (SOX14)

060971 3p22.3 S P 114120 3q23 S M

(ACAA1) (NM_018155)

010282 (KB73) 3p22.1 S P

175685 3q24 S P 180769 4q21.23 S M

(Q9BZ57) (Q8N507)

174963 (ZIC4) 3q24 S M 138821 4q24 S P

152977 (ZIC1) 3q24 E 1 S M (NM_022154)

175726 3q25.1 S M 168743 4q24 E P

174948 3q25.2 S M (NP_001028219)

(Q86SP6) 164093 (PITX2) 4q25 S P

151967 3q25.32 S P 177826 4q28.1 S P

(SCH1 P1) 170153 4q31.21 S M

181501 3q26.33 E M (Q9ULK6)

163882 3q27.1 S M 180519 4q31.21 S P

(POLR2H) 151615 4q31.22 S M

114315 (HES1) 3q29 E 1 S P (POU4F2)

169020 4p16.3 S M 172799 4q31.3 S M

(ATP51) 145431 4q32.1 E P

145214 (DGKQ) 4p16.3 S M (PDGFC)

127418 4p16.3 E 1 S M 038295 (TLL1) 4q32.3 S P

(FGFRL1) 056050 4q33 S M

176836 4p16.3 S P (NM_017867)

159674 4p16.3 E 1 S P 168322 4q34.3 S P

(SPON2) (NM_030970)

163945 4p16.3 E, S M 177310 4q35.1 E M

(NP_065945.1) (NM_153008)

174141 4p16.3 S P 186158 4q35.2 E, S M

(Q15270) 186147 (DUX2) 4q35.2 E, S P

068078 4p16.3 S M 066230 5p15.33 S M

(FGFR3) (SLC9A3)

163956 4p16.3 S M 185486 5p15.33 S M

(LRPAP1) 125063 5p15.33 S M

183190 4p13 E M (NM_017808)

182739 4q13.2 S P 112877 5p15.33 S M

(GRINL1B) (NM_018140)

153851 4q13.2 E, S P 145506 (NKD2) 5p15.33 S M

(Q9NY19) 113504 5p15.33 S M

153852 4q13.2 E, S P (SLC12A7)

(Q9NY16) 174358 5p15.33 S M

153395 5p15.33 S M 080709 5q22.3 S M

(Q8NF37) (KCNN2)

113430 (IRX4) 5p15.33 S M 113396 5q23.3 S M

170561 (IRX2) 5p15.33 S P (SLC27A6)

170549 (IRX1) 5p15.33 S M 164400 (CSF2) 5q23.3 E, S M

145536 5p15.32 E, S M 069011 (PITX1) 5q31.1 S P

(ADAMTS16) 174313 5q31.1 E P

164236 5p15.2 E P 081818 5q31.3 S M

(XP_293937.5) (PCDHB4)

133357 5p15.2 E P 177895 5q31.3 S P

(NM_030970) (PCDHB16)

145526 5p14.3 E, S P 120327 5q31.3 E P

(CDH18) (PCDHB14)

132404 5p14.1 S P 081853 5q31.3 S M

113492 5p13.2 E P (PCDHGC5)

(AGXT2) 113580 5q31.3 E P

168621 (GDNF) 5p13.2 S P (NR3C1)

016082 (ISL1) 5q11.2 S P 169302 5q32 S P

164258 5q11.2 S P 113667 (Y555) 5q32 E M

(NDUFS4) 145888 5q33.1 E P

164283 (ESM1) 5q11.2 S P (GLRA1)

152929 5q12.1 S M 182344 5q35.2 S M

(Q9BXE3) 185548 5q35.3 S M

145645 5q12.1 S P 178392 5q35.3 S M

(Q9P193) 185784 5q35.3 E M

171540 (OTP) 5q14.1 S M (Q8TAJ0)

131730 5q14.1 S M 168903 5q35.3 S P

(CKMT2) (BTNL3)

131732 5q14.1 S M 137273 6p25.3 S P

(NM_032280) (FOXF2)

153922 (CHD1) 5q15 E M 184250 6p25.2 S M

174132 5q21.1 E, S P (Q86WA7)

(Q8TBP5) 145945 6p25.2 E, S M

181751 5q21.1 S M (FAM50B)

(NM_033211) 124785 (NRN1) 6p25.1 S M

176857 5q21.3 E M

137203 6p24.3 S M 175211 6q23.2 S P

(TFAP2A) (Q9BXE6)

176078 6p24.3 S M 135521 6q24.2 S M

(Q8NAN4) (C6orf93)

185694 6p22.1 E P 118508 6q24.3 S P

181573 6p22.1 S P (RAB32)

(Q96MM2) 112499 6q25.3 E, S P

112498 6p22.1 S M (SLC22A2)

(PPP1R11) 146477 6q25.3 S P

161877 6p21.32 S M (SLC22A3)

(C6orf10) 060762 6q27 E, S P

168426 6p21.32 E, S M (BRP44L)

(BTNL2) 153471 6q27 S P

168383 (HLA-DPB1) 6p21.32 E P (TCP10)

186340 6q27 S P

161896 (IHPK3) 6p21.31 S M (THBS2)

156582 6p21.2 S M 164493 6q27 S P

137252 6p12.1 S P (Q96N37)

(HCRTR2) 170767 6q27 E M

146151 6p12.1 S P (C6orf208)

(HMGCLL1) 177706 7p22.3 S P

179713 6q14.1 S P (FAM20C)

(Q8N481) 184773 7p22.3 E M

135324 6q14.2 E, S P (Q96GH9)

(C6orf17) 122691 7p21.1 S M

135315 6q14.2 S P (TWIST2)

(C6orf84) 105855 (ITGB8) 7p21.1 E M

184486 6q16.1 S M 105996 7p15.2 E, S M

(POU3F4) (HOXA2)

183075 6q21 S P 105997 7p15.2 E, S M

153989 6q22.1 S P (HOXA3)

(C6orf68) 164519 7p15.2 S M

146350 6q22.31 E P (Q96MZ3)

(C6orf170) 106001 7p15.2 E, S M

184362 6q22.31 S P (HOXA4)

(Q9BZ63)

106004 7p15.2 E, S M 106028 7q34 S M

(HOXA5) (SSBP1)

106006 7p15.2 S M 181551 7q34 S P

(HOXA6) 184412 7q34 S P

005073 7p15.2 E, S M 133624 7q36.1 S P

(HOXA11) (NMJ324910)

106038 (EVX1) 7p15.2 E, S P 164889 7q36.1 E, S M

106483 7p14.1 S P (SLC4A2)

(SFRP4) 164896 7q36.1 E, S M

106571 (GLI3) 7p14.1 E, S M (FASTK)

164543 7p13 E P 164690 (SHH) 7q36.3 S P

(STK17A) 187177 7q36.3 E M

058404 7p13 S P 146909 (C7orf3) 7q36.3 E P

(CAMK2B) 130675 7q36.3 S M

164742 7p13 S M (HLXB9)

(ADCY1) 178158 7q36.3 S M

185292 7p13 S M (Q8N7D3)

179869 7p12.3 S P 155093 7q36.3 S M

(NM_152701) (PTPRN2)

042813 (ZPBP) 7p12.2 S P 180204 8p23.3 E, S P

185037 7q11.21 E, S M (NM_181648)

185947 7q11.21 E, S P 104284 8p23.3 E, S P

(Q81W5) (DLGAP2)

135211 7q11.23 E, S P 036448 8p23.3 S M

(C7orf35) (MYOM2)

187391 7q21.11 E, S M 186550 8p23.1 E M

(MAGI2) 186553 8p23.1 E M

185191 7q21.12 S P 186555 8p23.1 E P

182348 7q21.13 S P 186558 8p23.1 E P

(NM_181646) 186560 8p23.1 E M

105810 (CDK6) 7q21.2 S M 186647 8p23.1 E M

006377 (DLX6) 7q21.3 S P 185161 8p23.1 E, S P

121716 7q22.1 S M (Q8N9J4)

(PILRB) 158815 8p21.3 S M

128594 7q32.1 S P (FGF17)

(NM_022143) 168487 (BMP1) 8p21.3 S P

120896 (VINE) 8p21.3 S M 179950 8q24.3 S P

179388 (EGR3) 8p21.3 S M (NM_078480)

172733 (PURG) 8p12 E, S P 185189 8q24.3 S M

167912 8q12.1 E, S M (NM_178564)

(Q96QE0) 186574 8q24.3 S M

183226 8q12.3 E P (Q8ND02)

185942 8q12.3 E, S P 178719 8q24.3 E P

(FAM77D) (GRINA)

165084 8q13.2 E M 167701 (GPT) 8q24.3 E, S M

(NM_052958) 160959 (YOJ4) 8q24.3 S P

184234 8q21.2 S M 177742 8q24.3 E M

(NM_172239) (NM_178535)

180694 8q21.3 S P 120215 9p24.1 S M

(Q8N3G6) (MLANA)

156486 8q22.2 S P 186758 9p21.1 E, S M

(KCNS2) (Q8N710)

164796 8q23.3 S M 174994 9p12 S P

(CSMD3) (Q96M55)

104406 8q24.22 E P 170152 9p11.2 S M

(NM_032205) 154537 9p11.2 S M

169427 8q24.3 E, S M (Q8NCQ8)

(KCNK9) 178784 9q12 S P

184489 8q24.3 S P (Q96F02)

(PTP4A3) 184879 9q13 S M

181790 (BAH) 8q24.3 S M 182368 9q13 S M

180838 8q24.3 E M (Q8NCQ8)

(Q8NAM3) 170215 9q13 S M

167656 (LY6D) 8q24.3 E, S P (Q8NCQ8)

179142 8q24.3 S M 170217 9q13 S M

(CYP11B2) 107282 9q21.11 E, S P

182851 8q24.3 E P (APBA1)

(NIVM78172) 155621 9q21.12 E, S P

158106 8q24.3 S M (NM_182505)

(RHPN1) 186788 9q21.32 E, S M

181528 8q24.3 S M (NP_001001670)

177992 9q22.1 S P 182569 9q34.3 S M

(NM_178828) (NM_053045)

186359 9q22.1 S P 186909 10p15.3 E, S P

(Q8NDSJ) 151632 10p15.1 S P

130222 9q22.2 S P (AKR1C2)

(GADD45G) 178462 10p15.1 S M

169027 9q22.31 S P (NM_024803)

(NM_030970) 178372 (CLSP) 10p15.1 S P

131662 (PHF2) 9q22.31 S P 176730 10p15.1 E M

119523 9q22.33 S P (Q8N218)

(NM_033087) 107485 10p14 E, S P

177945 9q33.3 E, S P (GATA3)

(NM_016158) 182077 10p12.1 E M

136944 (LMX1B) 9q33.3 E, S M (NP_001030014)

123454 (DBH) 9q34.2 S M 099250 (NRP1) 10p11.22 E P

186459 9q34.3 S P 175395 10p11.21 E M

160345 9q34.3 E, S P (ZNF25)

(NM_144654) 165511 10q11.21 S M

148411 9q34.3 S M (NM_145022)

(NM_144653) 165406 10q11.21 E P

160360 9q34.3 S M (MARCH8)

(Q9UFS8) 148611 10q11.22 S M

148400 9q34.3 S M (SYT15)

(NOTCH1) 165606 10q11.23 S M

172889 9q34.3 E, S P 107671 10q11.23 S M

(EGFL7) (NM_018245)

169692 9q34.3 S M 165443 10q21.1 S M

(AGPAT2) (NM_032439)

054148 9q34.3 E, S M 148575 10q21.2 S M

(PHPT1) (NM_178505)

184709 9q34.3 S M 182771 (GRID1) 10q23.2 E M

185863 9q34.3 S M 138135 10q23.31 S P

176248 9q34.3 S M (CH25H)

(NM_013366) 180740 10q23.31 E, S P

176058 9q34.3 S M (Q9H6Z8)

(NM_173691) 095585 (BLNK) 10q24.1 S P

148820 (LDB1) 10q24.32 E, S M 182272 11p15.5 E, S M

166275 10q24.32 S P (B4GALNT4)

(NM_144591) 184363 (PKP3) 11p15.5 E, S M

176584 10q26.13 S M 176828 11p15.5 E, S M

119965 10q26.13 E M (Q8N9U2)

(C10orf88) 177700 11p15.5 S M

108001 (EBF3) 10q26.3 S M (POLR2L)

165752 10q26.3 S P 184956 (MUC6) 11p15.5 S M

(NM_173575) 183116 11p15.5 S M

171813 10q26.3 S P 184545 11p15.5 S P

(Q96F43) (DUSP8)

180066 10q26.3 E, S M 130598 11p15.5 S P

(C10orf91) (TNNI2)

148826 10q26.3 E, S M 184682 11p15.5 E, S M

(NKX6-2) 183680 11p15.5 S P

171811 10q26.3 E, S M (Q8N2L8)

(C10orf93) 181963 11p15.4 S P

151646 10q26.3 S M (Q8NGK3)

(GPR123) 180785 11p15.4 S M

171798 10q26.3 S M (NM_152430)

(Q8TEE5) 176904 11p15.4 S P

165824 10q26.3 S M (Q8NH63)

(NM_152643) 180974 11p15.4 S P

171794 (UTF1) 10q26.3 S P (Q8NGH9)

151650 10q26.3 E, S M 051009 11p15.4 S M

(VENTX2) (NM_032127)

178592 10q26.3 E, S M 166337 (TAF10) 11p15.4 S P

(Q8N377) 170748 11p15.4 S P

148832 (PAOX) 10q26.3 E, S M (NIVM4469)

186730 (DUX4) 10q26.3 S P 170688 11p15.4 E P

184243 10q26.3 S P (OR5E1P)

179882 (DUX2) 10q26.3 S P 129152 11p15.1 S M

177947 (ODF3) 11p15.5 S M (MYOD1)

174885 (PYA5) 11p15.5 S M 184193 11p14.3 E, S M

185885 11p15.5 E, S M (Q8N7V1)

(IFITM1)

129151 11p14.2 E P 170257 11q25 S P

(BBOX1) (NM_030970)

007372 (PAX6) 11p13 S M 080854 11q25 S M

183242 (WIT1) 11p13 S M (Q9UPX0)

182565 11p11.12 S P 151503 (Y056) 11q25 S M

185927 11q11 S P 149328 11q25 S M

186660 (ZFP91) 11q12.1 S P (NM_138342)

172289 11q12.1 S P 109956 11q25 S M

(Q8NG17) (B3GAT1)

134824 11q12.2 S M 139194 (RBP5) 12p13.31 E, S P

(FADS2) 150045 12p13.31 S P

174903 11q13.2 E, S M (KLRF1)

(RAB1D) 121374 12p13.2 S P

174851 (YIF1) 11q13.2 S M (KLRC3)

173621 11q13.2 S P 171681 12p13.1 S M

(NM_024036) (ATF7IP)

172932 11q13.2 S P 111404 12p12.3 S P

162105 11q13.3 S M (NM_024730)

(SHANK2) 172572 12p12.2 S M

175534 11q13.4 S M (PDE3A)

(Q8TB74) 11700 12p12.2 S P

137474 11q13.5 S P (SLC21A8)

(MYO7A) 069431 12p12.1 E, S M

168959 (GRM5) 11q14.2 S P (ABCC9)

182359 11q22.3 E, S P 013573 12p11.21 S M

(KBTBD3) (DDX11)

150750 11q23.1 E M 177627 12q13.11 E P

(C11orf53) (NM_152319)

184824 11q23.3 S M 123364 12q13.13 S M

(C1QTNF5) (HOXC13)

154146 (NRGN) 11q24.2 S P 123388 12q13.13 S M

182657 11q24.3 E, S M (HOXC11)

120462 11q24.3 S M 180818 12q13.13 S M

(Q9P195) (HOXC10)

182667 (NTR1) 11q25 E, S P 180806 12q13.13 E, S M

(HOXC9)

186426 12q13.13 E, S M 178215 13q21.1 E, S M

(HOXC4) (Q8N7V5)

170338 12q13.13 S M 178205 13q21.1 S M

(HOXC6) (Q8N7V5)

172789 12q13.13 E M 178200 13q21.1 S P

(HOXC5) (Q8N7V5)

174604 12q13.2 S P 177527 13q21.31 E, S P

(Q9BXE6) (Q8N7F4)

135502 12q13.3 E, S M 185498 13q21.32 E, S P

(SLC26A10) 152192 13q31.1 S M

135446 (CDK4) 12q14.1 E, S M (POU4F1)

079081 12q14.2 S M 171650 13q31.1 S P

(SRGAP1) (PTA1A)

173401 12q21.1 E M 184052 13q31.1 E P

(NM_152779) 165300 (Y918) 13q31.2 S P

165891 12q21.2 E, S M 139800 (ZIC5) 13q32.3 S M

(Q96AV8) 102466 13q33.1 E M

111046 (MYF6) 12q21.31 S P (FGF14)

151572 12q23.1 S P 185950 (IRS2) 13q34 S M

(NM_178826) 153481 13q34 S M

089116 (LHX5) 12q24.13 S M (NM_018210)

175727 12q24.31 S P 126218 (F10) 13q34 S M

(NM_014938) 186009 13q34 S M

184967 12q24.33 S M (ATP4B)

(NM_024078) 184497 13q34 E, S M

112787 12q24.33 E, S M (FAM70B)

(Q9HCM7) 185989 13q34 S M

139495 13q12.12 S P (RASA3)

(NM_153023) 176294 14q11.2 E P

169840 (GSH1) 13q12.2 S M (OR4N2)

102760 13q14.11 S M 136367 14q11.2 S M

(NM_014059) (ZFHX2)

152207 13q14.2 S P 176165 14q12 E, S P

(CYSLTR2) (FOXG1C)

171945 13q14.3 S P 136352 (TITF1) 14q13.3 S M

(NM_030970)

186215 14q13.3 S P 126309 (HV1 A) 14q32.33 S P

(Q86SZ3) 126290 (HV2A) 14q32.33 E, S P

136327 14q13.3 S P 151802 15q13.1 E, S P

(NKX2-8) (Q9P168)

151338 14q13.3 E M 103832 15q13.2 E P

(MIPOL1) (060374)

151748 (SAV1) 14q22.1 E M 134146 15q14 S P

073712 14q22.1 E, S P (NM_080650)

(PLEKHC1) 179315 15q14 S P

125378 (BMP4) 14q22.2 S M 184263 15q21.2 S P

184302 (SIX6) 14q23.1 S P 169856 15q21.3 S M

177126 14q24.3 S P (ONECUT1)

(C14orf141) 069667 (RORA) 15q22.2 S M

183992 14q31.1 E, S M 138622 (HCN4) 15q24.1 S M

140093 14q32.13 E P 186690 15q24.3 S P

(SERPINA10) 140557 15q26.1 S P

036530 14q32.2 S P (SIAT8B)

(CYP46A1) 183643 15q26.1 E M

140107 14q32.2 E M (C15orf32)

(Q86U14) 184254 15q26.3 S M

185469 (RTL1) 14q32.31 E, S M (ALDH1A3)

066735 14q32.33 E M 140479 15q26.3 S M

(KIF26A) (PACE4)

184601 14q32.33 S M 103326 (SOLH) 16p13.1 S M

(Q8N912) 127585 16p13.3 S M

130235 14q32.33 S M (NM_153350)

(NM_032714) 127586 16p13.3 S M

184916 (JAG2) 14q32.33 S M (CHTF18)

184552 14q32.33 S M 005513 (SOX8) 16p13.3 E, S P

(Q8NAF8) 172268 16p13.3 E, S P

182351 14q32.33 S M (Q96S05)

(CRIP1) 172257 16p13.3 S P

177199 (IGHA2) 14q32.33 S M (Q96S03)

177154 (IGHE) 14q32.33 S P 184471 16p13.3 S M

177145 14q32.33 S M 073761 16p13.3 S M

(IGHG1) (CACNA1H)

140650 (PMM2) 16p13.2 S P 172716 17q12 S M

182375 16p11.2 S P (NM_152270)

185836 16p11.2 S P 171532 17q12 S P

102924 16q12.1 S M (NEUROD2)

(CBLN1) 173917 17q21.32 E, S M

103449 (SALL1) 16q12.1 E, S M (HOXB2)

183022 16q12.2 S M 120093 17q21.32 E, S M

103005 16q13 E, S M (HOXB3)

(C16orf57) 182742 17q21.32 S M

102890 16q22.1 S P (HOXB4)

(ELMO3) 108511 17q21.32 S M

102977 (ACD) 16q22.1 E, S M (HOXB6)

103056 16q22.1 S M 120068 17q21.32 S M

(SMPD3) (HOXB8)

103241 16q24.1 E, S M 141378 (YCE7) 17q23.2 E, S M

(FOXF1) 121068 (TBX2) 17q23.2 S M

179588 16q24.2 S M 187011 17q23.2 E M

(ZFPM1) (C17orf82)

051523 (CYBA) 16q24.2 S M 136492 17q23.2 S P

183788 16q24.3 E, S M (BRIP1)

(Q8N206) 125398 (SOX9) 17q24.3 S P

183518 17p13.3 E, S M 161547 17q25.1 E M

183688 17p13.3 S M (SFRS2)

(NM_182705) 167281 17q25.3 S P

167874 17p13.1 E, S M 141570 (CBX8) 17q25.3 S M

(TMEM88) 141582 (CBX4) 17q25.3 S M

109061 (MYH1) 17p13.1 S P 175901 17q25.3 S M

108448 17p12 E M (Q8NBT7)

(TRIM16) 181428 17q25.3 E, S P

160516 17p11.2 S M (Q8N8L1)

(RPS28) 181409 (AATK) 17q25.3 S M

181977 (PYY2) 17q11.2 E, S P 187207 17q25.3 S M

184142 (TIAF1) 17q11.2 E M 186765 17q25.3 S P

108587 17q11.2 S P (FSCN2)

(GOSR1) 184703 (SIRT7) 17q25.3 S M

184715 17q25.3 S M 105655 19p13.11 S M

(NM_032711) (NM_016368)

169750 (RAC3) 17q25.3 S P 172684 19p13.11 E, S P

169727 (GPS1) 17q25.3 S M (Q8NE65)

154655 18p11.31 S P 172666 19p13.11 E, S P

(NM_173464) 187135 19q12 S M

067900 18q11.1 S M 121297 (TSH3) 19q12 E, S P

(ROCK1) 130876 19q13.11 S M

141448 18q11.2 E M (SLC7A10)

(GATA6) 124302 19q13.11 E, S M

141441 18q12.1 E, S P (CHST8)

(FAM59A) 105698 (USF2) 19q13.12 S M

101746 (NOL4) 18q12.1 S M 126266 19q13.12 S M

101489 18q12.2 E, S M (GPR40)

(BRUNOL4) 105663 (TRX2) 19q13.12 E M

152217 18q12.3 E P 180458 19q13.13 E, S P

(SETBP1) (Q8N3U1)

183677 (ELA2) 18q21.1 S M 105737 (GRIK5) 19q13.2 S M

141644 (MBD1) 18q21.1 S M 159904 19q13.31 E, S P

041353 18q21.2 E P (ZNF225)

(RAB27B) 167383 19q13.31 E, S M

141668 18q22.3 S P (ZNF229)

(NM_182511) 176499 19q13.33 E M

141665 18q22.3 S P (Q9Y4U5)

(NM_152676) 175856 19q13.41 E M

101544 18q23 S M (Q8NB48)

(NM_104913) 186818 19q13.42 E, S M

178184 18q23 S P (LILRB4)

(PARD6G) 105132 (ZN550) 19q13.43 E, S M

141934 19p13.3 E, S M 130724 19q13.43 E, S M

(PPAP2C) (CHMP2A)

118050 19p13.3 S M 099326 19q13.43 E, S M

(NM_017914) (ZNF42)

180866 19p13.2 E, S P 175487 19q13.43 S P

(Q8NB05) (Q9BPX8)

178591 20p13 S M 060491 (OGFR) 20q13.33 S M

(DEFB125) 092758 20q13.33 E, S M

088782 20p13 S P (COL9A3)

(DEFB127) 101204 20q13.33 E M

125906 20p13 E P (CHRNA4)

(Q9H410) 075043 20q13.33 S M

125861 20p13 S P (KCNQ2)

(GFRA4) 130589 (P285) 20q13.33 E M

101230 20p12.1 E, S P 125520 20q13.33 S M

(C20orf82) (SLC2A4RG)

172264 20p12.1 S M 171700 20q13.33 S M

(C20orf133) (RGS19)

125798 20p11.21 S M 171695 20q13.33 S P

(FOXA2) (Q8TD35)

125810 20p11.21 S M 181872 20q13.33 S P

(CIQR1) 175302 21q11.2 S P

125831 20p11.21 S M (Q9NS19)

(CST11) 184856 21q21.1 E P

154930 20p11.21 E P (C21orf74)

(ACAS2L) 186930 21q22.11 S P

183029 20q11.21 S P (KRTAP6-2)

(Q8NCY9) 185569 (OLIG2) 21q22.11 S M

026559 20q13.13 S P 159263 (SIM2) 21q22.13 E, S P

(KCNG1) 183067 21q22.2 E P

124222 20q13.32 E P (Q9NS15)

(STX16) 141956 21q22.3 S M

179242 (CDH4) 20q13.33 S M (PRDM15)

101180 (HRH3) 20q13.33 S M 014442 21q22.3 E M

130702 20q13.33 S M (ADARB1)

(LAMA5) 182586 21q22.3 E P

174407 20q13.33 S M (C21orf89)

(C20orf166) 186866 21q22.3 S P

101188 20q13.33 S M (C21orf80)

(NTSR1) 187153 21q22.3 S M

101189 20q13.33 E, S M 142156 21q22.3 S M

(C20orf20) (COL6A1)

160294 21q22.3 E M 166897 22q13.1 S P

(MCM3AP) (Q96PY3)

160305 (DIP2) 21q22.3 S M 184687 22q13.31 E, S P

160307 (S100B) 21q22.3 P (Q8ND38)

160310 21q22.3 E P 075275 22q13.31 E M

(HRMT1L1) (CELSR1)

183628 22q11.21 E, S M 182858 22q13.33 S M

(DGCR6) (NM_024105)

100075 22q11.21 S M 128159 22q13.33 S M

(SLC25A1) (TUBGCP6)

183099 22q11.21 E, S M 185386 22q13.33 S M

100208 (IGLC1) 22q11.22 S P (MAPK12)

186746 22q11.22 E M 100239 (K685) 22q13.33 S P

178803 22q11.23 S P 025770 22q13.33 S M

(Q8NAW6) (NM_014551)

100104 (SRR1) 22q12.1 E M 182786 22q13.33 S P

169184 (MN1) 22q12.1 S P

184390 22q12.2 E, S P

(Q61CM0)

Genes predicted to be imprinted by both the linear and RBF kernel classifiers learned by Equbits are denoted by E, and those predicted by both the linear and RBF kernel classifiers learned by SMLR are denoted by S. Genes predicted to be imprinted by both programs are denoted by E, S and represent the 'high-confidence' set presented in Table 1 hereinabove. Genes predicted to be expressed from the maternal or paternal allele are denoted by M or P, respectively. To enhance legibility, the common prefix "ENSG00000" has been dropped from the Ensembl ID. Also listed are gene names and/or GENBANK® Accession Nos. where applicable.
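As an illustration of how the abbreviated entries can be read, the following minimal sketch re-attaches the dropped prefix and interprets the classifier codes. The function names are hypothetical, and the sketch assumes that each abbreviated identifier consists of exactly six digits, which holds for the well-formed entries in the table.

```python
ENSEMBL_GENE_PREFIX = "ENSG00000"  # prefix dropped from the table for legibility

# Allele codes as defined in the table legend.
ALLELE_CODES = {"M": "maternal", "P": "paternal"}


def full_ensembl_id(short_id: str) -> str:
    """Re-attach the dropped prefix to a six-digit entry (hypothetical helper)."""
    digits = short_id.strip()
    # Assumption: a well-formed abbreviated ID is exactly six ASCII digits.
    if len(digits) != 6 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected six digits, got {short_id!r}")
    return ENSEMBL_GENE_PREFIX + digits


def is_high_confidence(classifier_field: str) -> bool:
    """True when both programs (E for Equbits, S for SMLR) predicted imprinting."""
    codes = {code.strip() for code in classifier_field.split(",")}
    return codes == {"E", "S"}


if __name__ == "__main__":
    print(full_ensembl_id("055813"), is_high_confidence("E, S"), ALLELE_CODES["P"])
```

Entries that fail the six-digit check are rejected rather than padded, since the legend does not say how a damaged identifier should be completed.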

Table 3

Chromosomal Bands with High Frequencies of Genes Proved or Predicted with High Confidence to be Imprinted

Table 4 Relevant Features for Prediction of Imprinting by Equbits Classifiers

Window coordinates are given in kilobases and are measured from the beginning of the first exon (upstream windows) or the end of the last exon (downstream windows). For example, "downstream 10:100" refers to the 90 kb window from 10 kb to 100 kb downstream of the last exon. The corresponding table for SMLR is available upon request.

1 Number of this feature within the sequence window; ±1 Ratio of repeated elements in the "+" versus the "-" orientation with respect to the gene (the negative inverse if there are more elements in the "-" orientation than in the "+" orientation); 2 Percentage of the sequence window covered by this feature; ±2 Ratio of the percentage of the sequence window covered by repeated elements in ± orientation; 3 Indicator for presence of this feature within the sequence window; 4 Indicator for presence of upstream CTCF consensus-binding site; 5 Indicator for presence of TGTTTGCAG consensus site; 6 The phase change happened at one of the following LTR elements: MLT1A0, MLT1B, MSTA, MSTB1, MLT1D, MLT2B4, or MLT1G1; 7 Indicator for presence of CpG island overlapping the last exon; 8 Indicator for presence of CpG island overlapping the first exon; 9 Orientation of motif relative to gene; 10 Methylation prone; 11 Methylation resistant; * indicates pairwise interaction between two variables.
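The orientation ratio described in the ±1 footnote can be sketched as follows. This is only an illustration of the stated rule, not the classifiers' actual feature code: the function name is hypothetical, and the handling of a zero count is an assumption, since the footnote does not say what happens when one orientation is absent.

```python
from typing import Optional


def signed_orientation_ratio(plus_count: int, minus_count: int) -> Optional[float]:
    """Ratio of "+" to "-" oriented repeated elements, sign-flipped when "-" dominates.

    Follows the footnote: plus/minus when "+" is at least as frequent,
    otherwise the negative inverse, i.e. -(minus/plus).
    """
    if plus_count < 0 or minus_count < 0:
        raise ValueError("counts must be non-negative")
    # Assumption: the ratio is treated as undefined when either count is zero.
    if plus_count == 0 or minus_count == 0:
        return None
    if plus_count >= minus_count:
        return plus_count / minus_count
    return -(minus_count / plus_count)


if __name__ == "__main__":
    print(signed_orientation_ratio(6, 2))  # more "+" elements: positive ratio
    print(signed_orientation_ratio(2, 6))  # more "-" elements: negative inverse
```

With this encoding the magnitude is always at least 1, and the sign alone indicates which orientation dominates.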

Table 5 Relevant Features for Prediction of Parental Preference by Equbits Classifier

Window coordinates are given in kilobases and are measured from the beginning of the first exon (upstream windows) or the end of the last exon (downstream windows). For example, "downstream 10:100" refers to the 90 kb window from 10 kb to 100 kb downstream of the last exon. 1 Number of this feature within the sequence window; ±1 Ratio of repeated elements in the "+" versus the "-" orientation with respect to the gene (the negative inverse if there are more elements in the "-" orientation than in the "+" orientation); 2 Percentage of the sequence window covered by this feature; ±2 Ratio of the percentage of the sequence window covered by repeated elements in ± orientation; Indicator for presence of this feature within the sequence window; Methylation prone; * indicates pairwise interaction between two variables.
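The window notation used in these legends can be parsed mechanically. The sketch below is a hypothetical helper, assuming every label has the form "<side> <start>:<end>" with the side being either "upstream" or "downstream" and the bounds given in kilobases.

```python
from typing import NamedTuple


class Window(NamedTuple):
    side: str        # "upstream" (of the first exon) or "downstream" (of the last exon)
    start_kb: float  # near edge of the window, in kilobases
    end_kb: float    # far edge of the window, in kilobases

    @property
    def length_kb(self) -> float:
        return self.end_kb - self.start_kb


def parse_window(label: str) -> Window:
    """Parse a label such as "downstream 10:100" (assumed format, see lead-in)."""
    side, span = label.split()
    if side not in ("upstream", "downstream"):
        raise ValueError(f"unknown side in {label!r}")
    start_text, end_text = span.split(":")
    start_kb, end_kb = float(start_text), float(end_text)
    if end_kb < start_kb:
        raise ValueError(f"window end precedes start in {label!r}")
    return Window(side, start_kb, end_kb)


if __name__ == "__main__":
    window = parse_window("downstream 10:100")
    print(window, window.length_kb)
```

For the legend's own example, the parsed window spans 10 kb to 100 kb downstream of the last exon, giving the 90 kb length stated in the text.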

Table 6 High-confidence Imprinted Gene Candidates Predicted in Human and Mouse

Genes predicted to be expressed from the maternal or paternal allele are denoted by M or P, respectively. For brevity, genes previously known to be imprinted are not included.

Table 7

Genes Proved or Predicted with High Confidence to be Imprinted Map to Loci Linked to Various Human Conditions

The table lists loci that have previously been linked to various human conditions, and high-confidence imprinted gene candidates that map into or within 10 Mb (or less) of that locus. If a locus has been observed to have a parent-of-origin effect, this is denoted by a lowercase m or p, for maternal or paternal effects, respectively. Genes predicted to be expressed from the maternal or paternal allele are denoted by M or P, respectively. Genes also predicted to be imprinted in the mouse are marked by f. Alleles that have been proved to be exclusively expressed are underlined.
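The mapping criterion in this legend (a candidate gene lying inside a linked locus or within 10 Mb of it) can be expressed as a simple coordinate test. The sketch below is illustrative only: the function names and the use of single base-pair positions are assumptions, and the coordinates in the usage example are made up rather than taken from the table.

```python
MAX_DISTANCE_BP = 10_000_000  # "within 10 Mb" from the table legend


def distance_to_locus(gene_pos: int, locus_start: int, locus_end: int) -> int:
    """Distance in bp from a gene position to a locus; 0 if the gene lies inside it."""
    if locus_start <= gene_pos <= locus_end:
        return 0
    return min(abs(gene_pos - locus_start), abs(gene_pos - locus_end))


def maps_to_locus(
    gene_chrom: str,
    gene_pos: int,
    locus_chrom: str,
    locus_start: int,
    locus_end: int,
    max_distance: int = MAX_DISTANCE_BP,
) -> bool:
    """True when the gene is on the same chromosome and inside or near the locus."""
    if gene_chrom != locus_chrom:
        return False
    return distance_to_locus(gene_pos, locus_start, locus_end) <= max_distance


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(maps_to_locus("8", 140_600_000, "8", 128_000_000, 135_000_000))
```

A tighter threshold can be passed through `max_distance`, which matches the legend's "(or less)" qualifier.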

Table 8 Independent Negative Test Genes

Expression can be one of the following: P (imprinted and paternally expressed), M (imprinted and maternally expressed), or X (not imprinted). All 101 genes were correctly predicted not to be imprinted by the combined classifier.

Table 9 Training Genes of Known Imprint Status

Expression can be one of the following: P (imprinted and paternally expressed), M (imprinted and maternally expressed), or X (not imprinted). The GRB10 locus encodes oppositely imprinted transcripts and was excluded from the maternal/paternal model (denoted by I).

Table 10

Randomly Chosen Control Genes

Ensembl ID Band Ensembl ID Band

065183 (WDR3) 1p12 i 116793 (PHTF1) 1p13 f

092621 (PHGDH) 1p12 f 122481 (RWDD3) 1p21 t

134245 (WNT2B) 1p13 t 173146 (Q96CI4) 1p21 t

134253 (TRIM45) 1p13 t 117600 1p21 t

007341 (ST7L) 1p13 t (NM_014839)

162688 (AGL) 1p21 f, i 054116 (TRAPPC3) 1p34 f

142951 1p21 f 132773 (TOE1) 1p34 f

069702 (TGFBR3) 1p22 i 126091 (SIAT6) 1p34 i

122417 (Q9ULJ1) 1p22 t 131238 (PPT1) 1p34 t

097096 (Q8NDB8) 1p22 i 179178 1p34 t

(NM_144626)

171488 1p22 t 117400(MPL) 1p34 i

(NM_032270) 127129 (EDN2) 1p34 t

117174 1p22 f 171812 (COL8A2) 1p34 U

(NM_017953) 117407(ARTN) 1p34 i

153898 (MCOLN2) 1p22 f 185421 1p34 f

162624 (Q9BYB7) 1p31 t 186444 (TMSL1) 1p34 f

178965 (Q8ND41) 1p31 f 186973 1p34 f

079739 (PGM1) 1p31 i 084628 (TCBA1) 1p35 f

117114 (LPHN2) 1p31 t 175089 (Q9BXE6) 1p35 f

116641 (DOC7) 1p31 t 168528 1p35 f

162433 (AK3) 1p31 f (NM_178865)

185483 (ROR1) 1p31 t 134668 1p35 i

162402 (USP24) 1p32 t (NM_144569)

134744(012764) 1p32 i 121774 1p35 t

162398 1p32 i (KHDRBS1)

(NM_152607) 183615(YSEC) 1p35 f,t

121310 1p32 i 125945 (ZNF436) 1p36 f

(NM_018281) 133226 (SRRM1) 1p36 i

058804 1p32 i 173413 (Q9BXE6) 1p36 t

(NM_018087) 179589 (Q8NA34) 1p36 t

162384 1p32 i 008130(PPNK) 1p36 f

(NM_017887) 177799 (O4F3) 1p36 t

181150 1p32 i 175262 1p36 t

186857 (Q9HBS7) 1p32 f (NM_173507)

142973 (CYP4B1) 1p33 i 175087 1p36 t

186160 1p33 i (NM_152835)

(NM_178134)

160072 1p36 f 183558 1q21 t

(NM_031921) (HIST1H2AH)

117701 1p36 t 183598 1q21 i

(NM_022078) (NM_021059)

127054 1p36 f 187170 (SPRL4A) 1q21 t

(NM_017871) 187173 1q21 i

177000 (MTHFR) 1p36 t (NM_178428)

008125 (MMP23A) 1p36 t 187223 (SPRL1A) 1q21 f

158748 (HTR6) 1p36 i 187428 1q21 i

162426 (DNB5) 1p36 f (NM_178353)

117682 (DHDDS) 1p36 f 160753 (RUSC1) 1q22 i

162438 (CTRC) 1p36 i 163239 1q22 i

169504 (CLIC4) 1p36 f (NM_182499)

171735 (CAMTA1) 1p36 f 160752 (FDPS) 1q22 i

053371 (AKR7A2) 1p36 t 160716 (CHRNB2) 1q22 f

158803 1p36 f 143595 (AQP10) 1q22 i

186410 1p36 i 132704 (SPAP1) 1q23 f

160670 (S100A6) 1q21 f 162729 (IGSF8) 1q23 i

177954 (RPS27) 1q21 i 158481 (CD1C) 1q23 f

143545 (RAB13) 1q21 i 158485 (CD1B) 1q23 f

163155 (Q96S90) 1q21 t 186950 (Q96M18) 1q23 f, i, t

178527 (Q8N9C2) 1q21 t 120370 1q24 t

143615(014634) 1q21 i (NM_152281)

160741 1q21 t 000457 1q24 f

(NM_181715) (NM_020423)

143415 1q21 t 178454 1q24 t

(NM_020239) (NM_018578)

143621 (ILF2) 1q21 t 143167 (GPA33) 1q24 t

131781 (FMO5) 1q21 t 094975 (C1orf9) 1q24 t

143369 (ECM1) 1q21 t 162666 (Y040) 1q25 i

132043 1q21 i 116147(TNR) 1q25 i

163122 1q21 i 117586 (TNFSF4) 1q25 f

172762 (Q9P1E1) 1q25 t

135870 (Q8IVE6) 1q25 f 116918(TSNAX) 1q42 t

162779 1q25 f,t 116957(TBCE) 1q42 f

(NM_182766) 116991 (Q8NA38) 1q42 f

162787 1q25 f 162913 (Q8N372) 1q42 t

(NM_181572) 162885 1q42 i

135862 (LAMC1) 1q25 i (NM_152490)

183831 1q25 i 081692 1q42 i

143355 (LHX9) 1q31 f (NM_023007)

122185 (RPS27) 1q32 t 168148 (HIST3H3) 1q42 t

174307 (PHLDA3) 1q32 t 152904 (GGPS1) 1q42 f

174514 1q32 i 143669 (CHS1) 1q42 t

(NM_181644) 173576 1q42 f

162757 1q32 t 185495 (Q9H5Q3) 1q42 f

(NM_152485) 116984(MTR) 1q43 i

158715 1q32 t 117009(KMO) 1q43 i

(NM_033102) 143700 1q43 f, i

117153 1q32 i 182097 (Q96CB2) 1q43 t

(NM_021633) 185346 (Q96B84) 1q43 t

077152 1q32 f 179510 (Q9H5F0) 1q44 t

(NM_014176) 162727 (Q96R29) 1q44 f

117691 1q32 t 177564 (Q8TC70) 1q44 t

(NM_013349) 177535 (Q8NGW7) 1q44 f

117650 (NEK2) 1q32 i 171163 1q44 f, i

118193 (KIF14) 1q32 t (NM_017865)

162891 (IL20) 1q32 f 162711 (CIAS1) 1q44 t

162809 (Q9NQI1) 1q41 t 185420 (SMYD3) 1q44 f

162814 1q41 f 187117 (Q8NG85) 1q44 i

(NM_138796) 178486 (Q8N115) 2p11 i

154309 1q41 i 068654 (POLR1A) 2p11 i

(NM_032890) 173758 (KV2F) 2p11 f

186063 1q41 i 153586 (IGKV4-1) 2p11 t

(NM_022831) 172116 (CD8B1) 2p11 i

135763 (Y133) 1q42 t 183281 (PLGL) 2p11 t

184943 2p11 i 130508 (Q92626) 2p25 i

(NM_052871) 174685 2p25 i

186854 (Q86V40) 2p11 i (NM_153011)

115353 (TACR1) 2p12 i 175652 2p25 f

163219 (Y053) 2p13 i 182717 2p25 f

173027 (WBP1) 2p13 i 084090 (STARD7) 2q11 t

143977 (SNRPG) 2p13 f 163126 2q11 f

075292 2p13 f, t (NM_144994)

(NM_014497) 163699 2q11 i

135638 (EMX1) 2p13 i (NM_025024)

144048 (DUSP11) 2p13 f 158435 2q11 i

114956 (DGUOK) 2p13 i (NM_017546)

115980 (ANXA4) 2p13 i 115446 2q11 i

138072 (Q9NTE1) 2p14 t (NM_014044)

011523 2p14 i 071073 (MGAT4A) 2q11 t

(NM_015147) 168677 (HMGNI) 2q11 t

143995 (MEIS1) 2p14 t 144191 (CNGA3) 2q11 f

014641 (MDH1) 2p15 f 115669 (SULT1C1) 2q12 t

152518 (ZFP36L2) 2p21 i 170417 2q12 f, t

152527 (Q8IVE3) 2p21 f (NM_144632)

138095 (LRPPRC) 2p21 t 176120 2q12 i

172877 (Q9BXE6) 2p22 t (NM_032658)

055332 (PRKR) 2p22 t 015568 2q13 i

138068 2p22 i (RANBP2L1)

084733 (RAB10) 2p23 t 179757 (Q9P1E1) 2q13 i

163795 2p23 f 175772 2q13 i

(NM_144631) (NM_152518)

119777 2p23 t 153214 2q13 t

(NM_017727) (NM_032824)

138028 (CGREF1) 2p23 f 136688 (IL1F9) 2q13 f

186453 (Q96NH8) 2p23 i 136696 (IL1F8) 2q13 i

068697 (LAPTM4A) 2p24 f 184764 (RPL22) 2q13 i

171863 (RPS7) 2p25 f 074054 (CLASP1) 2q14 t

176119 (Q96N27) 2q21 i 155729 2q33 i

173302 (Q8TDV2) 2q21 f (NM_152387)

136698 2q21 i 178420 2q33 f

(NM_032545) (NM_030804)

178206 2q21 f 115520 2q33 i

(NM_032248) (NM_025147)

076003 (MCM6) 2q21 i 155760 (FZD7) 2q33 f

182316 2q21 i 155755 (ALS2CR4) 2q33 t

115919(KYNU) 2q22 t 186680 2q33 i

169432 (SCN9A) 2q24 i 168582(CRYGA) 2q34 i

144285 (SCN1A) 2q24 i 135912 (Y173) 2q35 f

174470 (Q96M44) 2q24 i 127831 (VIL1) 2q35 t

169507 2q24 f 135913 (USP37) 2q35 f

(NM_173512) 163526 (TUBA4) 2q35 i

172292 2q24 f 115592(PRKAG3) 2q35 t

155657(TTN) 2q31 i 115655 2q35 i

172845 (SP3) 2q31 i (NM_024085)

163364 (Q96D13) 2q31 f 163497 2q35 t

175892 (Q8NAT4) 2q31 i (NM_017521)

128655 (PDE11A) 2q31 f 066216 (TNRC15) 2q37 i

163492 2q31 t 176946 (THA4) 2q37 i, t

(NM_173648) 173100 (Q9P0V4) 2q37 i

163093 2q31 f 144488 (Q8IVU2) 2q37 i

(NM_152384) 066248 (NGEF) 2q37 i

144306 2q31 i 115488 (NEU2) 2q37 i

(NM_024583) 135916 (ITM2C) 2q37 i

115806 2q31 f 135930 (EIF4EL3) 2q37 i

(GORASP2) 163286 (ALPPL2) 2q37 i

079150 (FKBP7) 2q31 f 182177 2q37 f

018510(AGPS) 2q31 t 182202 2q37 f

115942 (ORC2L) 2q33 t 184945 (Q8IXF9) 2q37 t

155754 2q33 i 186235 2q37 t

(NM_152525) 064835 (POU1F1) 3p11 t

179021 3p11 f 185313 (SCN10A) 3p22 i

(NM_173824) 187091 (PLCD1) 3p22 i

179799 (Q8NHB5) 3p12 f 170266 (GLB1) 3p23 t

170837 (GPR27) 3p13 t 163513 (TGFBR2) 3p24 t

157467(015083) 3p14 f 131374 (TBC1D5) 3p24 i

177664 (DNAH12) 3p14 i 185690 (Q9NYD7) 3p24 f

168374 (ARF4) 3p14 f 186032 3p24 i

163638 3p14 f 171148 (TADA3L) 3p25 i

(ADAMTS9) 157103 (SLC6A1) 3p25 f, i

163825 (TMEM7) 3p21 t 171135 3p25 t

162244 (RPL29) 3p21 t (NM_032492)

168237 3p21 t 154743 3p25 t

(NM_145262) (NM_025265)

163817 3p21 i 088726 3p25 f

(NM_020208) (NM_018306)

145029 (NICN1) 3p21 t 163703 (CRELD1) 3p25 i

160808 (MYL3) 3p21 i 131375 (CAPN7) 3p25 f

180929 (GPR62) 3p21 i 178700 3q11 f

088538 (DOCK3) 3p21 i (NM_176815)

121797 (CCRL2) 3p21 f 178694 3q11 t

121807 (CCR2) 3p21 f (NM_022072)

164062 (APEH) 3p21 i 178660 3q11 i

184345 (Q8IXL9) 3p21 i 181065 (Q9P161) 3q13 f

186748 3p21 f 163428 (Q96CX6) 3q13 i

(NM_018651) 144848 3q13 f

163673 (Q9C098) 3p22 t (NM_022488)

168026 3p22 i 121570 3q13 t

(NM_145755) (NM_018189)

114853 3p22 f 091972 (MOX2) 3q13 f

(NM_145166) 082701 (GSK3B) 3q13 f

172936 (MYD88) 3p22 t 121594 (CD80) 3q13 i

157036 3p22 t 144811 3q13 f

(ENDOGL1) 182491 3q13 f

058262 (S612) 3q21 i 119231 (SEN5) 3q29 f

114544 3q21 i 178413 (Q8N266) 3q29 i

(NM_017836) 173950 3q29 i,t

065534 (MYLK) 3q21 i (NM_152531)

173702 (MUC13) 3q21 t 163975 (MFI2) 3q29 f

184621 (Q9HDC0) 3q21 i 075711 (DLG1) 3q29 i

163785 (RYK) 3q22 t 170455 (CNGA1) 4p12 i

170883 (Q9BXE5) 3q22 t 145246 (ATP10D) 4p12 f

163770 (Q86XG3) 3q22 f 170462 4p12 t

163864 (NMNAT3) 3q23 i 183380 4p12 f

175110 (MRPS22) 3q23 f 163281 4p13 f

114124 (GPRK7) 3q23 i (NM_138335)

152977 (ZIC1) 3q24 i 174123 (TLR10) 4p14 f

181467 (RAP2B) 3q25 t 154279 (Q8WZ27) 4p14 f

174928 3q25 t 121895 4p14 f

(NM_173657) (NM_024943)

144974 3q25 f, i 174343 (CHRNA9) 4p14 t

(NM_024621) 175524 (Q9UN81) 4p15 i

118855 3q25 f 163142 (Q8TE30) 4p15 f

(NM_022736) 053900 4p15 t

163659 3q25 i (NM_013367)

(NM_015508) 159788 (RGS12) 4p16 t

114771 (AADAC) 3q25 f 177631 4p16 t

169760 (NLGN1) 3q26 t (NM_182524)

136521 (NDUFB5) 3q26 i 130997 4p16 f

171109 (MFN1) 3q26 f,t (NM_181808)

176494 3q26 t 178988 4p16 t

177694 3q26 f (NM_152301)

073849 (SIAT1) 3q27 t 179010 4p16 i

145012 (LPP) 3q27 t (NM_033296)

090539 (CHRD) 3q27 f 159692 (CTBP1) 4p16 f, t

161204 (ABCF3) 3q27 t 087269 (C4orf9) 4p16 t

058705 (IL1RAP) 3q28 t 170891 (C17) 4p16 f

184855 4p16 t 109424 (UCP1) 4q31 i

109184 4q11 i 153130(SCOC) 4q31 f

(NM_015115) 137460 (Q9C0D6) 4q31 f

179378 4q13 i 151623 (NR3C2) 4q31 t

(NM_006692) 137463 4q31 t

124882 (EREG) 4q13 f (NM_032623)

087128 (DES1) 4q13 i 180484 4q31 f

124875 (CXCL6) 4q13 t (NM_024914)

135222 (CSN2) 4q13 f 109452 (INPP4B) 4q31 i

081051 (AFP) 4q13 i 164142 4q31 f

079557 (AFM) 4q13 i 151005 4q32 f

186942 (Q9BQR7) 4q13 i (NM_032136)

173130 (Q9P1E1) 4q21 t 171557(FGG) 4q32 f

138670 4q21 t 164117 (FBXO8) 4q34 i

(NM_152545) 185075 (Q9H399) 4q34 t

138759 (FRAS1) 4q21 i 173320 (Q9P2F5) 4q35 t

138769 (CDKL2) 4q21 t 180712 4q35 i

118785 (SPP1) 4q22 t (NM_173796)

174421 (Q9P1E1) 4q22 t 164303 4q35 i

163644 4q22 i (NM_153343)

(NM_152542) 109771 4q35 t

138641 (HERC3) 4q22 t (NM_018409)

052592 (DMP1) 4q22 i 168556 (ING1L) 4q35 f

164035 4q24 f 151726 (FACL6) 4q35 f,t

(NM_016242) 075705 (DUX2) 4q35 t

179078 4q24 i 182552 4q35 t

109534 (NOLA1) 4q25 t (NM_152682)

174720 4q25 f 186158 4q35 i

(NM_016648) 172239 5p12 t

145384 (FABP2) 4q26 t (NM_182789)

170917 (NUDT6) 4q27 i 178846 5p13 f

138686 (BBS7) 4q27 i (NM_175921)

164057 4q28 t 113361 (CDH6) 5p13 f

113460(BRIX) 5p13 i 133302 5q15 f

182564 5p13 t (NM_032290)

182977 (Q9P111) 5p13 t 185261 5q15 i

153416 (ZDHHC11) 5p15 t (NM_173665)

142319 (SLC6A3) 5p15 i 169736 (Q9NS32) 5q21 f

133398 (Q9BTT4) 5p15 i 184213 5q21 f

164363 5p15 i (NM_173488)

(NM_182632) 134982(APC) 5q22 i

173545 5p15 i 125341 (SLC22A5) 5q23 f

(NM_033414) 180831 (Q8N933) 5q23 t

125063 5p15 f 164406 (LEA2) 5q23 f

(NM_017808) 138829 (FBN2) 5q23 t

176788 (BASP1) 5p15 i 182549 5q23 f, i, t

184204 5p15 f 073905 (VDAC1) 5q31 t

164512 5q11 f 152700 (SARA2) 5q31 t

(NM_024669) 053108 (Q8TBU0) 5q31 t

153914 (SFRS12) 5q12 f 078795 (PKD2L2) 5q31 f

113598 5q12 i 113212 (PCDHB7) 5q31 f

(NM_019072) 113070(DTR) 5q31 i

164253 5q13 i 173250 (Q8TDV0) 5q32 f

(NM_018268) 185777 5q32 f

132835 5q13 t 155846 5q33 t

(NM_013303) (NM_133263)

113161 (HMGCR) 5q13 f 086570 (FAT2) 5q33 i

181104 (F2R) 5q13 f 055163 (CYFIP2) 5q33 t

164300 5q14 f 181884 5q33 i

(NM_178276) 183111 5q33 i

164299 5q14 i 113327 (GABRG2) 5q34 i

(NM_032567) 145864 (GABRB2) 5q34 t

174715 5q14 f 113328 (CCNG1) 5q34 t

176819 5q14 t 118322 (ATP10B) 5q34 i

183772 (CMYA5) 5q14 f 178187 (ZNF454) 5q35 t

170089 (THOC3) 5q35 i

131183 (SLC34A1) 5q35 t 175802 (Q9UGE0) 6p21 f

181538 (Q8N0T8) 5q35 f 168471 (Q9H3W0) 6p21 t

145912 (NOLA2) 5q35 t 178214 (Q96QB7) 6p21 t

170074 5q35 t 168379 (Q8WM95) 6p21 f

(NM_173663) 172764 (Q8TDV1) 6p21 t

168246 5q35 f 180911 (Q8N925) 6p21 i

(NM_152277) 176415 (Q8N1I6) 6p21 f

146067 5q35 t 172738 6p21 t

(NM_019057) (NM_145316)

040275 5q35 i 096080 (MRPS18A) 6p21 i

(NM_017785) 111971 (LY6G5C) 6p21 i

064747 5q35 t 096158(LTB) 6p21 f

(NM_015043) 112095(HLA-DOA) 6p21 t

169045 (HNRPH1) 5q35 i 137333 (DHX16) 6p21 t

131459 (GFPT2) 5q35 f 112195 (C6orf76) 6p21 t

160867 (FGFR4) 5q35 t 007816 (C6orf11) 6p21 i

113732 (ATP6V0E) 5q35 i 168426 (BTNL2) 6p21 f

183718 (TRIM52) 5q35 t 064999 (ANKS1) 6p21 i

184550 (Q9H7L9) 5q35 f 124655 (AIF1) 6p21 f

184714 5q35 i 137406 6p21 i

185005 5q35 t 161912 6p21 f,t

(NM_022471) 173580 6p21 t

112210 (RAB23) 6p11 f 184729 6p21 t

112200 (ZNF451) 6p12 t (NM_018540)

065308 (TRAM2) 6p12 i 137403(HLA-F) 6p22 t

112077(RHAG) 6p12 t 178458 6p22 t

174201 (Q9P1E1) 6p12 t (HIST1H3A)

168116 (Q9HCI6) 6p12 f 146047 6p22 f

124743 (Q9H511) 6p12 i (HIST1H2BA)

096087 (GSTA2) 6p12 f 161777 (HCG9) 6p22 i

146233 (CYP39A1) 6p12 i 112293 (GPLD1) 6p22 t

112715(VEGF) 6p21 i 137414 (FAM8A1) 6p22 i

137394 (TRIM10) 6p21 f 112242 (E2F3) 6p22 i, t

168405 (CMAH) 6p22 i 049618 6q25 f

124508 (BTN2A2) 6p22 t (NM_175863)

183679 (HIST1H4J) 6p22 f 120276 6q25 t

185193 (Q9BXE2) 6p22 f 174218 6q25 f

185694 6p22 i 185068 (Q9BST5) 6q25 i

112149 (CD83) 6p23 f 185345 (PARK2) 6q26 t

181590 (Q8NC12) 6p24 t 153471 (TCP10) 6q27 i

124827 (GCM2) 6p24 i 120436 (GPR31) 6q27 t

184431 6p24 i 146731 (CCT6A) 7p11 t

185689 6p25 f 154997 7p11 t

112280 (COL9A1) 6q13 t 180594 (Q96C79) 7p13 f

175596 (Q9P1E1) 6q14 t 106628 (POLD2) 7p13 t

085382 (HACE1) 6q16 f 015676 7p13 i

132429 (POPDC3) 6q21 i (NM_015332)

177214 6q21 t 106624 (AEBP1) 7p13 f

(NM_173559) 010270 7p14 t

123510 (BXDC1) 6q21 t (STARD3NL)

146374 (THSD2) 6q22 t 164542 (Q8NCT3) 7p14 i

111817 (SART2) 6q22 i 181211 (Q8NA17) 7p14 t

152894(PTPRK) 6q22 i 173862 7p14 i

111912 (NCOA7) 6q22 f (NM_017937)

146376 6q22 f 122641 (INHBA) 7p14 t

(ARHGAP18) 106105(GARS) 7p14 f

146411 (SLC2A12) 6q23 f 187258 (Q86SP4) 7p14 f

154269 (ENPP3) 6q23 i 176514 (Q9UDC8) 7p15 f

146386 (Q9P0A1) 6q24 t 174487 (Q9BXE6) 7p15 f,t

135577(NMBR) 6q24 f 105928 (DFNA5) 7p15 f

111962(UST) 6q25 t 153790 (C7orf31) 7p15 f

130338 (TULP4) 6q25 f 105889 7p15 t

122335 (SERAC1) 6q25 f 186179 7p15 i

180821 (RBM16) 6q25 t 186797 7p15 t

120253 (NUP43) 6q25 i 106443 (PHF14) 7p21 t

146530 7p21 t 021461 (CYP3A43) 7q22 t

(NM_182545) 173685 7q22 t

173467 7p21 i 184414 (IRS3L) 7q22 f

(NM_176813) 185055 7q22 t

106541 (AGR2) 7p21 f 128519 (TAS2R16) 7q31 f, i

146587 7p22 i 135272 (Q9P1T7) 7q31 f

(NM_021163) 106034 7q31 t

164818 7p22 t (NM_024913)

(NM_017802) 106041 (FAM3C) 7q31 i

106263 (EIF3S9) 7p22 f 180324 (CAPZA2) 7q31 f

169549 7p22 f 146809 (ASB15) 7q31 t

174959 7p22 i 128578 (Q9ULQ0) 7q32 t

175987 7p22 i 064419 7q32 i

179800 7p22 f (NM_012470)

187127 (POL1) 7p22 t 105875 (Q96AE5) 7q33 t

009950 7q11 i 122786 (CALD1) 7q33 f

(WBSCR14) 127364 (TAS2R4) 7q34 t

135174 (Q9Y4L9) 7q11 t 171082 (Q8N3Z8) 7q34 f

133380 7q11 f 184412 7q34 f

(NM_153363) 122063 (XRCC2) 7q36 t

177585 7q11 i 106615(RHEB) 7q36 f

184569 7q11 i 181652 7q36 f

146745 7q21 t (NM_173681)

(NM_032763) 133574 7q36 i

105781 (GRM3) 7q21 t (NM_018326)

157240 (FZD1) 7q21 i 126870 7q36 t

127962 7q21 i (NM_018051)

166526 (ZNF3) 7q22 t 106560 7q36 t

169899 (Q96MA9) 7q22 f (NM_015660)

167011 (Q8N8M0) 7q22 t 127360 (IAN4L1) 7q36 i

078319 (PMS2L1) 7q22 t 106648 (GALNTL5) 7q36 t

146834 7q22 f 105993 (DNAJB6) 7q36 i

(NM_019606) 164885 (CDK5) 7q36 i

170279 (C7orf33) 7q36 t 121039 (RDH10) 8q21 i

168172 8p11 i 176731 (Q8N0T1) 8q21 f

(NM_032410) 176206 8q21 i

104371 (DKK4) 8p11 f (NM_030970)

185900 8p11 i 155099 8q21 f

(NM_032237) (NM_018710)

181329(095724) 8p12 i 176623 8q21 i

133872 8p12 i (NM_016033)

(NM_016127) 156170 8q22 i

184844 8p12 f (NM_152416)

147454 8p21 f 104324 8q22 t

(NM_016612) (NM_016134)

147443 (DOK2) 8p21 i 164949(GEM) 8q22 f

069206 (ADAM7) 8p21 t 155097 8q22 f

182406 8p21 f (ATP6V1C1)

184661 (CDCA2) 8p21 i 147666 8q22 t

181897 8p22 t 147667 8q22 t

(NM_018422) 183299 8q22 f

156011 8p22 i 147654 (EBAG9) 8q23 i

(NM_015310) 104415 (WISP1) 8q24 t

154316(TDH) 8p23 f 147804 (SLC39A4) 8q24 i

177405 (Q8NAJ9) 8p23 f 161016 (RPL8) 8q24 t

154359 8p23 f 170616 (Q9BRH9) 8q24 i

(NM_152271) 180838 (Q8NAM3) 8q24 i

147364 (FBXO25) 8p23 i 132297 (OC90) 8q24 f

177023 (DEFB104) 8p23 f 167702 8q24 f

171060 8p23 t (NM_145754)

184608 (C8orf12) 8p23 f 153310 8q24 t

186600 (Q9UDD8) 8p23 f (NM_016623)

082556 (OPRK1) 8q11 f 147724 8q24 f

047249 (ATP6V1H) 8q11 t (NM_015912)

157556 (Q8NHT1) 8q13 f 147684 (NDUFB9) 8q24 i

165093 (BTF3L2) 8q13 i, t 104419 (NDRG1) 8q24 f

172172 (MRPL13) 8q24 f 107020 9p24 t

179527 (C8orf17) 8q24 t (NM_018465)

177205 8q24 f 120210 (INSL6) 9p24 t

185582 8q24 f 064218 (DMRT3) 9p24 f

035445 (UNC13) 9p13 t 183276 9p24 i

165006 (UBAP1) 9p13 f 183354 (Q8IVE5) 9p24 f

107371 (RR40) 9p13 i 178798 (Q8NGA9) 9q12 f

165012 (Q96GJ8) 9p13 t 170215 (Q8NCQ8) 9q13 f

175768 (Q8N4H5) 9p13 f 184879 9q13 i

180810 9p13 i, t 165059 (PRKACG) 9q21 f

(NM_014113) 135045 9q21 f

137104 (GALT) 9p13 i (NM_017998)

122735 (DNAI1) 9p13 f 106782 (CHAC) 9q21 f

122705 (CLTA) 9p13 t 135049 (AGTPBP1) 9q21 t

164972 (C9orf24) 9p13 f 186632 9q21 f

165269 (AQP7) 9p13 f 186747 (Q8N493) 9q21 f

159712 9p13 f 136936 (XPA) 9q22 f

182355 9p13 i 106809 (OGN) 9q22 f

(NM_015667) 021374 (IARS) 9q22 t

120247 (IFNA13) 9p21 i 158122 9q22 i

147885 (IFNA13) 9p21 f 185544 9q22 t

147889 (CDKN2A) 9p21 i 106692 (FCMD) 9q31 f

186758 (Q8N7I0) 9p21 f 186943 (Q8NGS7) 9q31 t

186802 (IFNA16) 9p21 f, i 187003 (ACTL7A) 9q31 f

147893 9p22 i 136868 (SLC31A1) 9q32 i

155156 9p22 f 173238 (Q9P1E1) 9q32 t

147852 (VLDLR) 9p24 f 173242 9q32 f

080503 9p24 t 136856 (SLC2A8) 9q33 t

(SMARCA2) 119446 9q33 t

120158 (RCL1) 9p24 t (NM_033117)

170777 9p24 i 136950 9q33 i

(NM_033516) (NM_030978)

136861 9q33 i 134465 (Q8TE30) 10p15 f

(CDK5RAP2) 180525 (Q8N8Z3) 10p15 i

160446 (ZDHHC12) 9q34 f 134453 10p15 f

165699 (TSC1) 9q34 t (NM_032905)

119363 (SPTAN1) 9q34 f 165568 10p15 i

160271 (RALGDS) 9q34 f (NM_031436)

107290 9q34 t 134460 (IL2RA) 10p15 t

(NM_015046) 172619 (Y514) 10q11 i

125484 (GTF3C4) 9q34 t 177457 10q11 t

136877(FPGS) 9q34 t (NM_173524)

169583 (CLIC3) 9q34 f 165388 10q11 t

148408 9q34 t (NM_153034)

(CACNA1B) 170324 10q11 i

160323 9q34 t (NM_152428)

(ADAMTS13) 152728 10q11 f

159247 9q34 t (NM_147156)

177185 9q34 f 148582 10q11 f

186350(RXRA) 9q34 f 122952(ZWINT) 10q21 t

187195 9q34 f 108064 (TCF6L1) 10q21 i

(NM_030898) 170312 (CDC2) 10q21 i

136737 (ZNF25) 10p11 f 079332 (SARA1) 10q22 t

173776 (Q96HT2) 10p11 f 107719 (Q9ULE6) 10q22 f

177353(075029) 10p11 f 122861 (PLAU) 10q22 t

177291 10p11 f 178365 (NUDT13) 10q22 f

(NM_153368) 166220 10q22 i

183621 10p11 f (NM_152710)

(NM_182755) 122359 (ANXA11) 10q22 f

151025 (Q9ULT3) 10p12 f 182180 (DNAJC9) 10q22 t

095739 (NMA) 10p12 i 182523 10q22 t

150051 10p12 i 180850 10q23 i

(NM_173576) (NM_178512)

182649 10p12 f 152778 (IFT5) 10q23 i

152465 (NMT2) 10p13 i 165678(GHITM) 10q23 t

138180 (C10orf3) 10q23 i 110492(MDK) 11p11 t

148835 (TAF5) 10q24 i 180210 (F2) 11p11 i

095637 (SORBS1) 10q24 i 175104 (TRAF6) 11p12 i

156398 (SFXN2) 10q24 f 110422 (HIPK3) 11p13 t

107816 (Q8N1I9) 10q24 f 186688 11p13 i

171160 10q24 f (NM_181807)

(NM_178832) 121621 11p14 t

075826 10q24 t (NM_031217)

(NM_015490) 187398 (Q86TE4) 11p14 t

065613 10q24 f 134339 (SAA2) 11p15 f

(NM_014720) 133818 (RRAS2) 11p15 i

148820 (LDB1) 10q24 i 151117 11p15 t

138136 (LBX1) 10q24 t (NM_153347)

120053 (GOT1) 10q24 t 166800 11p15 f

107831 (FGF8) 10q24 t (NM_144972)

171311 (CSL4) 10q24 i 166788 11p15 i

165851 10q24 f (NM_138421)

151553 (Q9HCH2) 10q25 f 179826 11p15 f

151884 10q25 i (NM_054031)

151893 10q26 f 151116 11p15 i

(NM_153810) (NM_018314)

154490 10q26 t 129158 11p15 t

(NM_145235) (NM_012139)

107902 10q26 t 129152 (MYOD1) 11p15 f

(NM_022126) 181939 (Q8NGM1) 11q11 t

119979 10q26 f 184741 (Q8NH20) 11q11 i, t

(NM_018472) 186117 (Q8NGL2) 11q11 f

174755(ACADSB) 10q26 i 149150 (SLC43A1) 11q12 i

176584 10q26 f 172685 (Q96PG2) 11q12 i

186730 (DUX4) 10q26 f 176495 (Q8NGI8) 11q12 f

176567 (Q8NH49) 11p11 f 109991 (P2RX3) 11q12 t

165905 11p11 t 162222 11q12 t

(NM_152312) (NM_173810)

149532 11q12 i 137509(PRCP) 11q14 t

(NM_024811) 172946 (Q9P1E1) 11q21 t

162194 11q12 t 150312 11q21 t

(NM_024099) 137693 (YAP1) 11q22 f

166930 (MS4A5) 11q12 i 110723 (SL2B) 11q22 f

149516 (MS4A3) 11q12 f 166253 (Q96LP0) 11q22 i

134825 (C11orf10) 11q12 i 137692 11q22 i

173101 (SIPA1) 11q13 f (NM_032299)

173959 (RBM14) 11q13 f 118113 (MMP8) 11q22 t

175514 (Q8TDT2) 11q13 i 166648 (DNCH2) 11q22 f

179263 (Q8NH65) 11q13 t 176610 11q22 i

166439 (Q8NCN4) 11q13 f 187069 11q22 i

021300 (PLEKHB1) 11q13 t 036672 (USP2) 11q23 i

171631 (P2RY6) 11q13 i 173524 (Q9BXE6) 11q23 t

175591 (P2RY2) 11q13 i 160613 (PCSK7) 11q23 t

178795 11q13 t 154114 11q23 i

(NM_182833) (NM_152715)

173914 11q13 f 095110 11q23 t

(NM_031492) (NM_152315)

172732 11q13 f 137747 11q23 t

(NM_025128) (NM_032046)

132749 (MTL5) 11q13 t 110367 (DDX6) 11q23 i

168056 (LTBP3) 11q13 f 167283 (ATP5L) 11q23 f

172638 (EFEMP2) 11q13 i 110244 (APOA4) 11q23 i

175602(DIPA) 11q13 f 182581 (Q9BXE6) 11q23 f

175315 (CST6) 11q13 i 023171 (Q9ULL9) 11q24 t

175334 (BANF1) 11q13 t 170953 (Q8NGG6) 11q24 U

176245 11q13 t 176952 (Q8N6I7) 11q24 t

184154 11q13 t 165526 11q24 t

(NM_145309) (NM_032795)

186642 (PDE2A) 11q13 f 149552 11q24 f

187040 11q13 t (NM_024556)

118369 (USP35) 11q14 i

110013 11q24 t 111218 (HRMT1L3) 12p13 t

(NM_018978) 111664(GNB3) 12p13 f

120457 (KCNJ5) 11q24 t 118972 (FGF23) 12p13 i

151704 (KCNJ1) 11q24 i 111652 (COPS7A) 12p13 t

109832 (DDX25) 11q24 t 180574 12p13 t

183483 (Q8IZY5) 11q24 f 151229 (SLC2A13) 12q12 t

185688 (Q8NH79) 11q24 i 151233 (Q8IXV1) 12q12 f

187072 11q24 i 186518 (Q96SJ6) 12q12 t

175724 11q25 t 166888 (STAT6) 12q13 i

(NM_152711) 172602 (RHO6) 12q13 i

177340 12p11 f 076067 (RBMS2) 12q13 t

(NM_024799) 167566 (Q9HCH0) 12q13 i

123106 12p11 t 179962 (Q8NGE6) 12q13 t

(NM_018318) 139645 (Q8NB46) 12q13 i

110888 (C1QDC1) 12p11 f 139540 12q13 i

151490(PTPRO) 12p12 f (NM_173596)

067182 12p13 f 139579 12q13 f

(TNFRSF1A) (NM_024068)

121314 (TAS2R8) 12p13 i 139625 (MAP3K12) 12q13 i

121379(TAS2R14) 12p13 i 170477 (KRT4) 12q13 f

013588 (RAI3) 12p13 f 135472 (FAIM2) 12q13 t

126838 (PZP) 12p13 t 139631 (CSAD) 12q13 i

173342 (PRB1) 12p13 f 177981 (ASB8) 12q13 i

173391 (OLR1) 12p13 f 167580 (AQP2) 12q13 i

172322 12p13 f 135409 (AMHR2) 12q13 f

(NM_138337) 185389 12q13 f

111671 12p13 i (NM_018507)

(NM_032641) 185971 12q13 i

171792 12p13 f 186897 (Q86Z23) 12q13 f

(NM_031465) 177221 (Q8WYW9) 12q14 t

126740 12p13 f 155974 (GRIP1) 12q14 t

(NM_007273) 111537(IFNG) 12q15 f

121373 (KLRC4) 12p13 i 166225 (FRS2) 12q15 f

111596 (CNOT2) 12q15 t 177213 (Q96LP1) 12q24 i

185393 (Q9BTS6) 12q15 t 139767 (Q96JH4) 12q24 i

185563 12q15 f 089159 (PXN) 12q24 t

165899 12q21 f 177192 (PUS1) 12q24 f

(NM_173591) 089250 (NOS1) 12q24 i

139330 (KERA) 12q21 i 130921 12q24 t

139292 (GPR49) 12q21 f (NM_152269)

083782 (DSPG3) 12q21 i 111412 12q24 t

180318 (CART1) 12q21 f (NM_024738)

182127 12q21 t 135090 12q24 t

187111 12q21 f (NM_016281)

028203 (VEZA) 12q22 f, i 135148 12q24 i

139343 (SNRPF) 12q23 i (NM_006700)

136051 (Q9NV91) 12q23 f 122965 (K682) 12q24 t

166629 (Q96L24) 12q23 f 158104 (HPD) 12q24 i

151136 12q23 i 135108 (FBXO21) 12q24 i

(NM_152322) 111249 (CUTL2) 12q24 f

111670 12q23 i 111707 12q24

(NM_024312) 184967 12q24 f

139420 12q23 f (NM_024078)

(NM_007076) 132950 (ZNF237) 13q12 t

174600 (CMKLR1) 12q23 i 023957 (TUBA2) 13q12 i

139352 (ASCL1) 12q23 f 139514 (SLC7A1) 13q12 t

120868 (APAF1) 12q23 f 150459 (SAP18) 13q12 t

183395 (PMCH) 12q23 i 180776 (Q8N2S7) 13q12 i

185046 12q23 t 121390 (PSPC1) 13q12 f, t

(NM_020140) 122038 (POLR1D) 13q12 t

139370 (SLC15A4) 12q24 t 150456 13q12 f

061936 (SFRS8) 12q24 t (NM_174928)

089232 (SCA2) 12q24 t 102699 (ADPRTL1) 13q12 t

139697 (SBNO1) 12q24 f, i 073910 13q13 i

178043 (Q9HA69) 12q24 f (NM_023037)

180645 (Q9BUH0) 12q24 t 133101 (CCNA1) 13q13 t

179630 (U124) 13q14 i 185941 14q13 i

180331 (Q8IX95) 13q14 t 092208 (SIP1) 14q21 f

083635 (NUFIP1) 13q14 i 165506 (C14orf104) 14q21 i

171945 13q14 f 182090 (C14orf25) 14q21 i

(NM_030970) 131959 (Q9H373) 14q22 f

102837 13q14 i 131969 (C14orf29) 14q22 f

(NM_006418) 126778 (SIX1) 14q23 f

150506 (PCDH20) 13q21 i 126821 (SGPP1) 14q23 t

118946 (PCDH17) 13q21 i 100612 14q23 f

005810 13q22 f (NM_016029)

(NM_015057) 179008 (C14orf39) 14q23 t

165621 (GPR80) 13q32 i 184902 14q23 f

134900 (TPP2) 13q33 t 140044 14q24 t

139780 13q33 f (NM_130469)

139832 (RAB20) 13q34 i 133997 (MED6) 14q24 t

134905 13q34 t 139985 (ADAM21) 14q24 f

(NM_024537) 072110 (ACTN1) 14q24 t

139835 (GRTP1) 13q34 i 187073 (Q86TS2) 14q24 i

057593 (F7) 13q34 t 165417 (GTF2A1) 14q31 f

130177 (CDC16) 13q34 f 042088 (TDP1) 14q32 i

102606 13q34 i 140090 (SLC24A4) 14q32 f,t

(ARHGEF7) 178069 (Q8WYT3) 14q32 t

129563 (TVA2) 14q11 t 166428 14q32 t

166056 (TCA) 14q11 i (NM_138790)

169488 (Q8NH41) 14q11 i 165943 (MOAP1) 14q32 t

176281 (Q8NGD3) 14q11 i 130076 (IGHA1) 14q32 f

136315(012762) 14q11 t 165521 14q32 f

100813(ACINUS) 14q11 f 183940 14q32 i

182545 14q11 i 153684 (Q8NDK0) 15q13 i

185271 14q11 f 169926 (KLF13) 15q13 t

092108 (C14orf163) 14q12 t 179938 15q13 i

176127 (C14orf128) 14q12 t 175779 (Q8NAA6) 15q14 t

129518 (C14orf11) 14q13 f 178351 (Q8N345) 15q14 f

159495 (TGM7) 15q15 f 185594 15q26 f, i

103932 15q15 i (NM_173499)

(NM_015540) 185907 15q26 f

128928 (IVD) 15q15 t (NM_018621)

166947 (EPB42) 15q15 i 186092 15q26 t

092529 (CAPN3) 15q15 t 180096 (SEPT1) 16p11 t

179646 (Q9UI57) 15q21 f 175995 16p11 t

166262 15q21 i (NM_175901)

(NM_152647) 179755 16p11 f

140274 15q21 f (NM_153227)

166466 15q21 i 179965 16p11 t

170236 15q21 i (NM_016643)

140416 (TPM2) 15q22 t 149925 (ALDOA) 16p11 f

074621 (SLC24A1) 15q22 t 169861 16p11 t

090470 (PDCD7) 15q22 i 181601 16p11 f

140368 (PSTPIP1) 15q24 f 183604 (Q9H2H6) 16p11 i

140367 15q24 i 184110 (EIF3S8) 16p11 t

(NM_173469) 175758 (Y220) 16p12 t

173546 (CSPG4) 15q24 t 169344(UMOD) 16p12 i

169553 (Q8N824) 15q25 i 047578 (Q8N803) 16p12 i

173867 (MRPL46) 15q25 t 179038 16p12 f

140607 15q25 i (NM_145237)

184206 15q25 t 103275 (UBE2I) 16p13 i, t

140545 (SPAG10) 15q26 f 103197 (TSC2) 16p13 i

140534 15q26 i 095917 (TPSD1) 16p13 f

(NM_152259) 162009 (SSTR5) 16p13 i

131873 (CHSY1) 15q26 t 162065 (Q9ULP9) 16p13 t

173607 15q26 f 171559 (Q96EU1) 16p13 i

183000 15q26 i 069651 (NPIP) 16p13 f

183208 15q26 t 161998 16p13 t

184508 (Q8N4P3) 15q26 f (NM_145294)

185442 (Q8NBH7) 15q26 f 153060 16p13 f

(NM_144674)

161995 16p13 i 159708 16q22 i

(NM_053284) (NM_018296)

168101 16p13 i 090857 16q22 f

(NM_032349) (NM_017990)

059122 16p13 i 038358 16q22 i

(NM_032296) (NM_014329)

033011 16p13 i 168625(HYDIN) 16q22 f

(NM_019109) 090863 (GLG1) 16q22 i

100726 16p13 t 135723 (FHOD1) 16q22 i

(NM_016111) 103089 (FAXDC1) 16q22 f

072864 (NDE1) 16p13 f,t 103018 (CYM5) 16q22 i

102858 (MGRN1) 16p13 f 141076 (CIRH1A) 16q22 i

103313(MEFV) 16p13 t 062038 (CDH3) 16q22 i

103222 (ABCC1) 16p13 t 067955 (CBFB) 16q22 t

103229(ABAT) 16p13 i 166454 (Y431) 16q23 i

166737 16p13 i 166455 16q23 t

184629 (Q8NCX2) 16p13 f (NM_152337)

069329 (VPS35) 16q11 t 153815 16q23 i, t

129635 (Q9P1B8) 16q11 t (NM_030629)

103460 (TNRC9) 16q12 t 140905 (GCSH) 16q23 t

103494 (Q9Y2K8) 16q12 t 168589 (DNCL2B) 16q23 f

166152 16q12 f 166522 16q23 f

(NM_144602) 131149 (Y182) 16q24 t

129636 16q12 i 140950 (Q9HCG3) 16q24 f

(NM_030790) 131153 16q24 i

171208 (NETO2) 16q12 (NM_016095)

169715 (MT1E) 16q12 f 124391 (IL17C) 16q24 f

102978 (POLR2C) 16q13 f, i 178773 (CPNE7) 16q24 f

070729 (CNGB1) 16q13 i 182376 16q24 f

187185 (Q86VG7) 16q13 f (NM_182605)

103043 (TAX1BP2) 16q22 f 183967 16q24 f

140824 (TAT) 16q22 i 133030 17p11 t

157405 (Q96JG3) 16q22 f (NM_015134)

072210 (ALDH3A2) 17p11 f 087095 17q11 t

154050 17p11 t (NM_016231)

184185 (KCNJ12) 17p11 f 108651 (HC66) 17q11 i

141028 (Q96T59) 17p12 i 108278 (TRIP3) 17q12 f

108445 (095611) 17p12 i 172660 (TAF15) 17q12 t

175091 17p12 f 174111 (SOC6) 17q12 f

154914 (USP43) 17p13 f 132142 (ACACA) 17q12 i

132388 (UBE2G1) 17p13 t 108270 (AATF) 17q12 i

161955 (TNFSF13) 17p13 t 178655 17q12 i

181856 (SLC2A4) 17p13 t 108379 (WNT3) 17q21 i

141504 (SAT2) 17p13 i 131462 (TUBG1) 17q21 f

161929 (Q96MD0) 17p13 t 073861 (TBX21) 17q21 t

007168 17p13 t 167941 (SOST) 17q21 t

(PAFAH1B1) 131096 (PYY) 17q21 i

129235 17p13 i 108819 (PPP1R9B) 17q21 i

(NM_032731) 141696 (NO55) 17q21 f

132376 17p13 i 167914 17q21 i

(NM_016532) (NM_178171)

141503 (M4K6) 17p13 f 167105 17q21 i

161958 (FGF11) 17p13 i (NM_153229)

178999 (AURKB) 17p13 i 167159 17q21 f

182335 (Q8TE90) 17p13 t (NM_152466)

184166 (OR1D2) 17p13 t 108825 17q21 f

185530 17p13 f (NM_025267)

(NM_030970) 108800 17q21 f

185561 17p13 f (NM_014897)

187071 (GPS2) 17p13 f 159224 (GIP) 17q21 f

076604 (TRAF4) 17q11 i 178743 17q21 t

141316 (SPACA3) 17q11 f 180386 17q21 i

160551 (Q9P2I6) 17q11 f 182076 (NBR2) 17q21 t

173012 (Q8TCQ8) 17q11 i 183978 17q21 i

141298 17q11 f (NM_014019)

(NM_033389) 184502 (GAS) 17q21 f

185845 (Q8N0T2) 17q21 i 185262 17q25 i

186916 17q21 i (NM_182565)

178012 (PECAM1) 17q23 i 185298 17q25 t

153951 (O4D2) 17q23 t 175319 (Q14179) 18p11 t

136490 17q23 i 176014 18p11 t

(NM_030576) (NM_032525)

108375 17q23 t 132199 18p11 t

(NM_017763) (NM_017512)

087995 (METTL2) 17q23 t 168461 18p11 t

187013 (Q86X59) 17q23 i 183206 18p11 f

141331 (HELZ) 17q24 t 141447 (OSBPL1A) 18q11 f, t

108878 (CACNG1) 17q24 i 158201 (ABHD3) 18q11 i

182481 (KPNA2) 17q24 t 134779 18q12 i

132481 (TRIM47) 17q25 i (NM_015476)

178932 (Q8N811) 17q25 f 141434 (MEP1B) 18q12 i

178789 17q25 t 134765 (DSC1) 18q12 t

(NM_174892) 186412 18q12 f, t

173818 17q25 f 186496 (ZNF396) 18q12 f

(NM_173627) 081916 18q21 i

167302 17q25 t (SERPINB8)

(NM_144679) 179981 18q22 t

125457 17q25 t (SDCCAG33)

(NM_020679) 186411 18q23 i

141580 17q25 i 168892 (ZNF253) 19p13

(NM_019613) 132010 (ZNF20) 19p13 i

109065 17q25 i 150732 (YE73) 19p13 f

(NM_015654) 125735 (TNFSF14) 19p13 i

125445 (MRPS7) 17q25 t 181143 (Q9H8T7) 19p13 t

166685 (COG1) 17q25 i 132001 (Q9H0M5) 19p13 f, i

141527 (CARD14) 17q25 t 141933 (Q96GE2) 19p13 t

167281 17q25 f 129933 (Q8N7K4) 19p13 i

184703 (SIRT7) 17q25 f 099817 (POLR2E) 19p13 i

130313(PGLS) 19p13 f

104883 (PEX11G) 19p13 f 183617 (MRPL54) 19p13 i

176995 (OR7C1) 19p13 f 185113 19p13 f

099308 (O60307) 19p13 t (NM_032281)

175217 19p13 i 185293 19p13 t

(NM_138774) 187365 19p13 t

167807 19p13 i (NM_175910)

(NM_080665) 159905 (ZNF234) 19q13 t

130307 19p13 f 018607 (ZNF221) 19q13 i

(NM_031941) 159882 (ZNF155) 19q13 t

129951 19p13 i 063244 (U2AF) 19q13 i

(NM_024888) 063176 (SPHK2) 19q13 t

132000 19p13 f 160296 19q13 t

(NM_024825) (SIGLECL1)

079313 19p13 t 168995 (SIGLEC7) 19q13 i

(NM_020695) 161681 (SHANK1) 19q13 t

125912 19p13 t 180281 (Q8N843) 19q13 i

(NM_020170) 179873 (PYA6) 19q13 f

130813 19p13 f 104960 (PTOV1) 19q13 f

(NM_018381) 011485 (PPP5C) 19q13 f

167487 19p13 i 105568 (PPP2R1A) 19q13 t

(NM_018316) 087074 19q13 f

171466 19p13 f (PPP1R15A)

(NM_017656) 105223 (PLD3) 19q13 t

105229 19p13 i 104967 (NOVA2) 19q13 f

(NM_015897) 179932 19q13 f

127666 19p13 f (NM_178511)

(NM_014261) 176472 19q13 i

064489 (MEF2B) 19p13 i (NM_174945)

099617 (EFNA2) 19p13 t 161652 19q13

123146 (CD97) 19p13 f (NM_152358)

161082 19p13 f 104892 19q13 i

(BRUNOL5) (NM_145275)

115268 19p13 f

142544 19q13 i 187092 (Q8N0S4) 19q13 i

(NM_145232) 187116 19q13 i

105479 19q13 t (NM_181879)

(NM_144577) 187356 19q13 f

160410 19q13 i 177587 (Q96MG3) 20p11 t

(NM_138392) 179447 (Q8N7Z9) 20p11 t

126249 19q13 f 132661 (NXT1) 20p11 i, t

(NM_032346) 101004 20p11 i

104865 19q13 i (NM_025176)

(NM_018111) 173404 (INSM1) 20p11 t

076650 19q13 t 101435 (CST9L) 20p11 i

(NM_018025) 125815 (CST8) 20p11 i

160505 (NAL4) 19q13 t 077984 (CST7) 20p11 i

174562 (KLKF) 19q13 i 125872 (C20orf75) 20p12 f

167749 (KLK4) 19q13 t 101247 (C20orf7) 20p12 i

105063 (KB15) 19q13 f 172296 (C20orf38) 20p12 f

167644 (IMUP) 19q13 t 089177 (C20orf23) 20p12 f

160007 (GRLF1) 19q13 i 089123 (C20orf13) 20p12 f

126262 (GPR43) 19q13 t 132623 (ANKRD5) 20p12 i

105220 (GPI) 19q13 i 149497 (Q9BYW8) 20p13 i

123859 (FPRL2) 19q13 t 171864 (PRND) 20p13 i

104884 (ERCC2) 19q13 f 125787 (GNRH2) 20p13 t

142025 (DMRTC2) 19q13 f 125903 (DEFB129) 20p13 t

105205 (CLC) 19q13 i 125843 (C20orf29) 20p13

170956 19q13 i 149451 (ADAM33) 20p13 t

(CEACAM3) 183994 (Q9Y2V8) 20p13 f

142273 (CBLC) 19q13 t 101400 (SNTA1) 20q11 t

008364 (AP2A1) 19q13 t 125991 20q11 i

142513 (ACPT) 19q13 t (SDBCAG84)

176898 19q13 f 088303 (Q9NQF5) 20q11 i

179930 19q13 t 101464 (CDC91L1) 20q11 f

182582 (Q96GE3) 19q13 f 149611 (C20orf93) 20q11 t

186888 19q13 f 167104 (BPIL3) 20q11 t

182171 20q11 i 185225 (C21orf32) 21q22 t

183566 20q11 t 185397 (C21orf51) 21q22 t

(NM_173859) 185706 (Q8TCY0) 21q22 i

171940 (ZNF217) 20q13 t 187026 21q22 t

064205 (WISP2) 20q13 i (KRTAP21-2)

180305 20q13 t 128218 (VPREB3) 22q11 i

(WFDC10A) 138842 (Q9BWW2) 22q11 f

101150 (TPD52L2) 20q13 i 133525 (099919) 22q11 t

101448 (SPINLW1) 20q13 f 099958 (Q96Q80) 22q11 f

124216 (SNAI1) 20q13 f 178026 (Q8N0S9) 22q11 i, t

174334 (Q9H3Z8) 20q13 t 100034 (PPM1F) 22q11 i

177410 (Q8N5E3) 20q13 f 100023 (PPIL2) 22q11 t

168734 (PKIG) 20q13 t 161149 22q11 f

132786(043713) 20q13 f (NM_145042)

149657 20q13 t 177663 (IL17R) 22q11 f

(NM_144703) 100056 (DGCR14) 22q11 t

124217 (MOCS3) 20q13 f 159664 22q11 i

101052 (C20orf9) 20q13 i 172963 22q11 t

132823 (C20orf111) 20q13 f 172981 22q11 i

130706 (ADRM1) 20q13 i 183229 22q11 f

184402 (SS18L1) 20q13 i 183307 (CECR6) 22q11 i

155282 21q11 f 183506 (Q8WUK7) 22q11 f, i

185272 (RBM11) 21q11 i 183785 (PEX26) 22q11 t

156253 (C21orf6) 21q21 f 184273 22q11 i

182598 21q21 f 099995 (SF3A1) 22q12 i, t

160305 (DIP2) 21q22 f 099985 (OSM) 22q12 f

159055 (C21orf45) 21q22 i 100350 22q12 i

182871 (COL18A1) 21q22 i (NM_024955)

184724 21q22 t 100365 (NCF4) 22q12 t

(KRTAP6-1) 100385 (IL2RB) 22q12 f

184809 (C21orf88) 21q22 i 100118 (HMG1L10) 22q12 f

184836 (C21orf86) 21q22 f 128284 (APOL3) 22q12 t

184900 (SMT3H1) 21q22 t 175329 22q12 i

182763 (Q96EQ7) 22q12 t 073150 (PANX2) 22q13 t

183579 (Q9ULT6) 22q12 f 100266 (PACSIN2) 22q13 i

184117 22q12 f 176177 22q13 i, t

(NIPSNAP1) (NM_152512)

184122 (Q96NJ4) 22q12 i 100101 22q13 i

184654 (Q8N9L7) 22q12 t (NM_024313)

100426 (ZBED4) 22q13 f, t 128285 (GPR24) 22q13 i

100106 (TARA) 22q13 f 184472 (YV02) 22q13 f

100241 (SBF1) 22q13 t 185022 (MAFF) 22q13 i

100413 (RPC8) 22q13 f

Genes used for the calculation of the pairwise interactions, feature selection, and model training are denoted by i, f, and t, respectively. To enhance legibility, the common prefix "ENSG00000" has been dropped from the Ensembl ID. Also listed are gene names and/or GENBANK® Accession Nos. where applicable.
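The legend above can be applied mechanically when reading the table. The following is a minimal sketch, not part of the disclosure, that restores the dropped "ENSG00000" prefix and expands the i/f/t flags for a single table entry. The function name `parse_entry`, the regular expression, and the assumed entry layout (six-digit ID, optional name in parentheses, cytogenetic band, optional flags) are illustrative assumptions based on how the entries appear in the table.

```python
import re

# Prefix that the table legend says was dropped from each Ensembl ID.
ENSEMBL_PREFIX = "ENSG00000"

# Meaning of the one-letter flags, as stated in the table legend.
ROLE_NAMES = {
    "i": "pairwise interactions",
    "f": "feature selection",
    "t": "model training",
}

# Assumed entry layout: 6-digit ID, optional "(NAME)", band, optional flags.
ENTRY = re.compile(
    r"^(?P<id>\d{6})\s*"
    r"(?:\((?P<name>[^)]+)\)\s*)?"
    r"(?P<band>\d{1,2}[pq]\d{1,2})"
    r"(?:\s+(?P<roles>[ift](?:\s*,\s*[ift])*))?$"
)


def parse_entry(text):
    """Parse one table entry into a dict with a full Ensembl ID and role names."""
    match = ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized table entry: {text!r}")
    flags = re.findall(r"[ift]", match.group("roles") or "")
    return {
        "ensembl_id": ENSEMBL_PREFIX + match.group("id"),
        "name": match.group("name"),  # None when no name is listed
        "band": match.group("band"),
        "roles": [ROLE_NAMES[flag] for flag in flags],
    }


if __name__ == "__main__":
    print(parse_entry("113361 (CDH6) 5p13 f"))
    print(parse_entry("182549 5q23 f, i, t"))
```

Entries whose accession number sits on a separate line of the flattened table would need to be joined before parsing; the sketch does not attempt that.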

Table 11

Primers Used for Expression Analysis of DLGAP2 and KCNK9

Name         Sequence 5'-...-3'       SEQ ID NO   Position

DLGAP2-RT1   ACATGAGAAGCTGGGCACTC     3           2585-2604†

DLGAP2-RT2   CGTCACCTCCATCGACTTCT     4           2651-2670‡

DLGAP2-M1R   GGCCGTTTCCACCTGAATC      5           2048-2066†

DLGAP2-M2R   TGATGCTCTGGGAATTCAG      6           2059-2077‡

DLGAP2-M1F   CAGCTACCTTCGAGCCATTC     7           1605-1624†

DLGAP2-1F    TAGGCTAGACGTCCAGGAACA    8           1603779-1603799

DLGAP2-1R    TATTGGCAGGACTGAGTGGAG    9           1604304-1604284

KCNK9-1F     CAAGGCCTTCTGCATGTTCT     10          53849487-53849468

KCNK9-1R     GTGAATGACCATGCTGTTGC     11          53848983-53849002

KCNK9-M1F    TCCTTCTACTTTGCGATCACG    12          53933168-53933148

KCNK9-M1R    CATGGTCAAGAACCTGAGGAC    13          53849058-53849078

Positions for DLGAP2 primers refer to gi:37552484 (see also GENBANK® Accession No. NT_023736), chr. 8.27.24 (†), and chr. 8.27.26 (‡). Positions for KCNK9 primers are given for gi:51467074 (see also GENBANK® Accession No. NT_008046).
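As a consistency check on Table 11, each primer's sequence length can be compared with the length of the coordinate range listed for it. The sketch below is illustrative only and covers the four primers with genomic coordinates. The helper names are hypothetical, and reading a descending coordinate pair as a primer on the opposite strand, or the outermost coordinates of a pair as the amplicon boundaries, is an assumption rather than something the table states.

```python
# Primer name -> (sequence 5'->3', first coordinate, second coordinate),
# copied from Table 11.
PRIMERS = {
    "DLGAP2-1F": ("TAGGCTAGACGTCCAGGAACA", 1603779, 1603799),
    "DLGAP2-1R": ("TATTGGCAGGACTGAGTGGAG", 1604304, 1604284),
    "KCNK9-1F": ("CAAGGCCTTCTGCATGTTCT", 53849487, 53849468),
    "KCNK9-1R": ("GTGAATGACCATGCTGTTGC", 53848983, 53849002),
}


def span_length(first, second):
    """Number of bases covered by an inclusive coordinate range."""
    return abs(second - first) + 1


def coordinate_order(first, second):
    """Report whether the coordinates are listed low-to-high or high-to-low."""
    return "ascending" if first <= second else "descending"


def length_matches(name):
    """True when the primer length equals the length of its listed range."""
    seq, first, second = PRIMERS[name]
    return len(seq) == span_length(first, second)


def outer_span(name_a, name_b):
    """Distance between the outermost coordinates of two primers (assumed amplicon size)."""
    coords = PRIMERS[name_a][1:] + PRIMERS[name_b][1:]
    return max(coords) - min(coords) + 1


if __name__ == "__main__":
    for name, (_, first, second) in PRIMERS.items():
        print(name, coordinate_order(first, second), length_matches(name))
    print("DLGAP2 pair span:", outer_span("DLGAP2-1F", "DLGAP2-1R"))
    print("KCNK9 pair span:", outer_span("KCNK9-1F", "KCNK9-1R"))
```

A mismatch between sequence length and range length would point to a transcription error in either the sequence or the position column.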

REFERENCES

All references listed in the instant disclosure, including but not limited to all patents, patent applications and publications thereof, scientific journal articles, and database entries (including but not limited to GENBANK®, Ensembl, and dbSNP database entries and all annotations available therein) are incorporated herein by reference in their entireties to the extent that they supplement, explain, provide a background for, or teach methodology, techniques, and/or compositions employed herein. Alders et al. (2000) Am J Hum Genet 66:1473-1484. Allen et al. (2003) Proc Natl Acad Sci U S A 100:9940-9945. Altschul et al. (1990) 215 J MoI Biol 403-410. Amiel et al. (1999) Eur J Hum Genet 7:223-230. Arima et al. (2000) Genomics 67:248-255. Arsenian et al. (1998) EMBO J 17:6289-6299. Ausubel ef al. (2002) Short Protocols in Molecular Biology. Fifth ed. Wiley, New

York, New York, United States of America. Ausubel et al. (2003) Current Protocols in Molecular Biology. John Wylie &

Sons, Inc, New York, New York, United States of America. Baghdadli et al. (2002) Encephale 28:248. Bajaj et al. (2004) BMC Genet 5:13.

Bantignies & Cavalli (2006) Curr Opin Cell Biol 18:275-283.

Barlow (1993)' Science 260:309-310.

Barlow et al. (1991) Nature 349:84-87.

Bartlett et al. (2005) Am J Hum Genet 76, 688. Batzer et al. ( 1991 ) 19 Nucleic Acid Res 5081.

Bentley et al. (2003) J Med Genet 40:249-256.

Bertram et al. (2000) Science 290:2302.

Bix & Locksley (1998) Science 281 :1352-1354.

Blagitko et al. (2000) Hum MoI Genet 9:1587-1595. Blin-Wakkach et al. (2001) Proc Natl Acad Sci U S A 98:7336-7341.

Boccaccio et al. (1999) Hum MoI Genet 8:2497-2505.

Bonthron et al. (2000) Hum Genet 107:165-175.

Boyl et al. (2001) lnt J Dev Neurosci 19:353.

Brakenhoff et al. (1999) Clin Cancer Res 5:725. Brandeis et al. (1994) Nature 371 :435-438.

Buettner et al. (2004) Mamm Genome 15:199-209.

Byrne & Smith (1993) Hum Genet 93:275-277.

Byun et al. (2003) lntl J Cancer 104:318-327.

Cai et al. (2000) Carcinogenesis 21 :683-689. Chai et al. (2003) Am J Hum Genet 73:898-925.

Champion et al., Proc Natl Acad Sci U S A 91 , 11338 (1994).

Charlier et al. (2001) Genome Res 11 :850-862.

Chess et al. (1994) Ce// 78:823-834.

Chibuk et al. (2001) BMC Genet 2. Chung et al. (1996) Hum MoI Genet 5:1101-1108.

Cichon et al. (1996) Am J Med Genet 67:229-231.

Cichon et al. (2001) Hum MoI Genet 10:2933.

Clark et al. (2002) BMC Genet 11.

Cooper et al. (1998) Genomics 49:38-51. Cost et al. (1997) Cancer Res 57:926-929.

Dallosso et al. (2004) Hum MoI GeneM 3:405-415.

Dao et al. (1998) Hum MoI Genet 7:597-608.

DeLisi et al. (2002) Am J Psychiatry 159:803.

Dotan et al. (2000) Genes Chromosomes Cancer 27:270-277.

Driscoll et al. (1992) Genomics 13:917-924.

Du et al. (2005) Blood 106:3932-3939.

Eggermann et al. (1999) Ann Genet 42:117-121. Einarsdottir et al., Diabetes 55, 1879 (2006).

Ekelund ef a/., Hum Mo/ GeneMO, 1611 (2001).

Eun Kwon et al. (2004) /Ann NYAcad Sci 1034:1-18.

Evans et al. (2001) Genomics 77:99-104.

Farber et al. (2000) Genomics 65:174-183. Feinberg et al. (2006) Nature Rev Genet 7:21-33.

Furukawa et al. (2005) Cancer Res 65:7102-7110.

Gabriel et al. (1998) Proc Natl Acad Sci U S A 95:14857-14862.

Gilks et al., MoI Cell Biol 13, 1759 (1993).

Glenn et al. (1993) Hum MoI Genet 2:2001-2005. Glenn et al. (1997) MoI Hum Reprod 3:321-332.

Goldberg et al. (2003) Hum Genet 112:334-342.

Goshu et al. (2004) MoI Endocrinol 18:1251.

Gray et al. (1999) Proc Natl Acad Sci U S A 96:5616-5621.

Greally (2002) Proc Natl Acad Sci U S A 99:327-332. Greally et al. (1998) Hum MoI Genet 7:91-95.

Guo et al. (2006) J Clin Endocrinol Metab 91 :4001.

Hayward et al. (1998) Proc Natl Acad Sci 95:15475-15480.

Henikoff & Henikoff (1992) 89 Proc Natl Acad Sci U S A 10915-10919.

Herzing et al. (2001 ) Am J Hum Genet 68: 1501 -1505. Higashimoto et al. (2002) Genomics 80:575-584.

Hitchins et al. (2002) Mamm Genome 13:686-691.

Hollander et al. (1998) Science 279:2118-2121.

Horike et al. (2005) Nat Genet 37:31-40.

Hovatta et al., Am J Hum Genet 65, 1114 (1999). Hu et al. (1996) Hum MoI Genet 5:1743-1748. lshihara et al. (1998) Mamm Genome 9:775-777.

Jay et al. (1997) Nat Genet 17:357-361.

John et al. (2001) Dev Biol 236:387-399.

Jong et al. (1999) Hum MoI Genet 8:783-793.

Kaghad et al. (1997) Ce// 90:809-819.

Kalscheuer et al. (1993) Nat Genet 5:74-78.

Kamiya et al. (2000) Hum MoI Genet 9:453-460. Kananura et al. (2002) Am J Med Genet 114:227-229.

Karlin & Altschul (1993) 90 Proc Natl Acad Sci U S A 5873-5877.

Karolchik et al. (2003) Nucleic Acids Res 31 :51-54.

Kato et al. (1998) Genomics 47:146.

Kayashima et al. (2003) Hum Genet 112:220-226. Ke et al. (2002) Mamm Genome 13:639-645.

Kelly & Locksley (2000) J Immunol 165:2982-2986.

Killian ef a/. (2001) Hum MoI Genet 10:1721-1728.

Kimura et al. (2004) J Hum Genet 49:273-237.

Kitsberg et al. (1993) Nature 364:459-463. Kobayashi et al. (1997) Hum MoI Genet 6:781-786.

Kobayashi et al. (2000) Genes Cells 5:1029-1037.

Koide et al., Nat Genet 6, 9 (1994).

Krishnapuram et al. (2005) IEEE Transactions on Pattern Analysis and Machine

Intelligence (PAMI) pp 957-968. Kurosawa et al., Am J Med Genet 110, 268 (2002).

Lamb et al., J Med Genet 42, 132 (2005).

Lee et al. (1997) Nat Genet 15: 181 -185.

Lee et al., (2000a) Nat Genet 26:470.

Lee et al. (2000b) Hum MoI Genet 9:1813-1819. Lee et al. (2001 ) Mamm Genome 12: 157-162.

Lerer et al. (2005) Hum MoI Genet 14:3911-3920.

Levitsky et al. (2001) Bioinformatics 17:998-1010.

Li et al. (1993) Genomics 16:572.

Li et al. (2002) J Biol Chem 277:13518-13527. Li et al. (2004) Proc Natl Acad Sci U S A ϊ OI :7341-7346.

Lin & Floros (2002) Physiol Genomics 11 :235-243.

Liu et al. (2000) J Clin Invest 106:1167-1174.

Liu et al. (2005) BMC Genet 6 Suppl 1 :S160.

Liu et al., Proc Natl Acad Sci U S A 99, 3717 (2002).

Luedi et al. (2005) Genome Res 15:875-884.

Luo et al. (2001) Biochim Biophys Acta 1519:216-222.

LyIe et al. (2000) Nat Genet 25:19-21. MacDonald & Weyrick, Hum MoI Genet 6, 1873 (1997).

Maecker et al. (1998) Proc Natl Acad Sci U S A 95:2458-2462.

Mager et al. (2003) Nat Genet 33:502-507.

Mai et al. (1998) Oncogene 17:1739-1741.

Mancini-Dinardo et al. (2006) Genes Dev 20:1268-1282. Marker et al. (1995) Genomics 28:576-580.

Matsuoka et al. (1996) Proc Natl Acad Sci U S A 93:3026-3030.

Mayeux et al. (2002) Am J Hum Genet 70:237.

Maynard et al. (2003) Proc Natl Acad Sci U S A 100:14433.

Mclnnis et al. (2003) MoI Psychiatry 8:288-298. Medhurst et al. (2001) Brain Res MoI Brain Res 86:101-114.

Meguro et al. (2001) Nat Genet 28:19-20.

Miltenberger et al. (1995) MoI Cell Biol 15:2527-2535.

Miyoshi et al. (2000) Genes Cells 5:211-220.

Mizuno et al. (2002) Biochem Biophys Res Commun 290:1499-1505. Moens & Selleri (2006) Dev Biol 291 :193-206.

Monk et al. (2006) Proc Natl Acad Sci U S A 103:6623-6628.

Moore et al. (2001) Diabetes 50:199-203.

Morison et al. (2001) Nucleic Acids Res 29:275-276.

Morison et al. (2005) Trends Genet 21 :457-465. Moroy (2005) lnt J Biochem Cell Biol 37:541.

Mostoslavsky et al. (2001) Nature 414:221-225.

Murphy & Jirtle (2003) Bioessays 25:577-588.

Murphy et al. (2001) Genomics 71 :110-117.

Muscheck et al. (2000) Lab Invest 80:1089-1093. Mustanski et al. (2005) Hum Genet 116:272-278.

Myers et al. (2005) Science 310:321-324.

Nabetani et al. (1997) MoI Cell Biol 17:789-798.

Nagafuchi et al. (1994) Nat Genet 8:177.

Nakabayashi et al. (2004) J Med Genet 41 :601-608.

Needleman & Wunsch (1970) 48 J MoI Biol 443-453.

Niemitz & Feinberg (2004) Am J Hum Genet 74:599-609.

Nikaido et al. (2003) Genome Res 13:1402-1409. Niu et al. (1999) Plant MoI Biol 41 :1-13.

Ogawa et al. (1993a) Nature 362:749-751.

Ogawa et al. (1993b) Hum MoI Genet 2:2163-2165.

Ohlsson et al. (1993) Nat Genet 4:94-97.

Ohtsuka et al. (1985) 260 J Biol Chem 2605-2608. Okita et al. (2003) Genomics 81 : 556-559.

Okutsu et al. (2000) J Biochem 127:475-483.

Ono et al. (2001) Genomics 73:232-237.

Overall et al. (1998) Mamm Genome 9:657-659.

Patel & Lazdunski (2004) Pflugers Arch 448:261-273. Paulsen ef a/. (1998) Hum MoI Genet 7:1149-1159.

Paulsen et al. (2000) Hum MoI Genet 9:1829-1841.

Pearson & Lipman (1988) Proc Natl Acad Sci U S A 85:2444-2448.

Pereira et al. (2003) Nat Immunol 4:464-470.

Piras et al. (2000) MoI Cell Biol 20:3308-3315. Qian et al. (1997) Hum MoI Genet 6:2021-2029.

Rachmilewitz et al. (1992) FEBS Lett 309:25-28.

Rachmilewitz et al. (1993) Biochem Biophys Res Commun 196:659-664.

Rainier et al. (1993) Nature 362:747-749.

Ranta et al. (2000) Eur J Hum Genet 8:381-384.

Reik & Walter (2001) Nat Rev Genet 2:21-32.

Rossolini et al. (1994) Mol Cell Probes 8:91-98.

Rougeulle et al. (1997) Nat Genet 17:14-15.

Ruf et al. (2006) Genomics 87:509-519.

Sandell et al. (2003) Proc Natl Acad Sci U S A 100:4622-4627.

Sano et al. (2001) Genome Res 11:1833-1841.

Schratt et al. (2001) Mol Cell Biol 21:2933-2943.

Schweifer et al. (1997) Genomics 43:285-297.

Scott et al. (2000) Am J Hum Genet 66:922.

Seitz et al. (2003) Nat Genet 34:261-262.

Shoichet et al. (2005) Hum Genet 117:536.

Simonaro et al. (2006) Am J Hum Genet 78:865-870.

Singh et al. (2003) Nat Genet 33:339-341.

Smith & Waterman (1981) Adv Appl Math 2:482-489.

Soulez et al. (1996) Mol Cell Biol 16:6065-6074.

Stefansson et al. (2002) Am J Hum Genet 71:877.

Strauch et al. (2001) Genet Epidemiol 21 Suppl 1:S204.

Strauch et al. (2005) BMC Genet 6 Suppl 1:S162.

Strichman-Almashanu et al. (2002) Genome Res 12:543-554.

Takahashi & Ko (1993) Genomics 16:161-168.

Tanamachi et al. (2001) J Exp Med 193:307-315.

Taniguchi et al. (1997) Oncogene 14:1201-1206.

Tierling et al. (2006) Genomics 87:225-235.

Umlauf et al. (2004) Nat Genet 36:1296-1300.

Van den Veyver et al. (2001) J Soc Gynecol Investig 8:305-313.

van Doorninck et al. (1999) J Neurosci 19:RC12.

van Raamsdonk & Tilghman (2000) Development 127:5439-5448.

Vance et al. (2002) Proc Natl Acad Sci U S A 99:868-873.

Vandromme et al. (1992) J Cell Biol 118:1489-1500.

Verri et al. (2004) Ann Genet 47:281.

Vu & Hoffman (1997) Nat Genet 17:12-13.

Wakeling et al. (1998) Eur J Hum Genet 6:158-164.

Wakeling et al. (2000) J Med Genet 37:65-67.

Walter & Paulsen (2003) Hum Mol Genet 12:215-220.

Waterland & Jirtle (2003) Mol Cell Biol 23:5293-5300.

Weber et al. (2001) Mech Devel 101:133-141.

Weyrick et al. (1994) Hum Mol Genet 3:1877-1882.

Williamson et al. (1994) Genomics 22:240-242.

Williamson et al. (1995) Genet Res 65:83-93.

Witten & Frank (2005) Data mining: Practical machine learning tools and techniques (2d ed.). Morgan Kaufmann, San Francisco, United States of America.

Wood et al. (1998) Mol Cell Neurosci 11:149.

Wright et al. (2004) Development 131:5659.

Wylie et al. (2000) Genome Res 10:1711-1718.

Xin et al. (2000) J Biochem 128:847-853.

Xuan et al. (1995) Neuron 14:1141.

Yamada et al. (2002) Gene 288:57-63.

Yamada et al. (2003) Genomics 83:402-412.

Yevtodiyenko et al. (2002) Mamm Genome 13:633-638.

Yoder et al. (1997) Trends Genet 13:335-340.

Yonan et al. (2003) Am J Hum Genet 73:886.

Yoo & Jones (2006) Nature Rev Drug Discovery 37-50.

Yoshihashi et al. (2000) Am J Hum Genet 67:476-482.

Yu et al. (1999) Proc Natl Acad Sci 96:214-219.

Yuan et al. (1996) Hum Mol Genet 5:1931-1937.

Zara et al. (1995) Hum Mol Genet 4:1201-1207.

Zhang & Tycko (1992) Nat Genet 1:40-44.

Zhang et al. (1994) Nature 372:809-812.

Zhu et al. (2000) Gene 256:311-317.

Zimprich et al. (2001) Nat Genet 29:66-69.

It will be understood that various details of the presently disclosed subject matter may be changed without departing from the scope of the presently disclosed subject matter. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation.