Title:
IRF-5 HAPLOTYPES IN SYSTEMIC LUPUS ERYTHEMATOSUS
Document Type and Number:
WIPO Patent Application WO/2007/115207
Kind Code:
A2
Abstract:
Methods and materials involved in diagnosing SLE are provided herein. The methods and materials can be used to diagnose SLE and/or assess a mammal's susceptibility to develop SLE, based on the presence or absence of one or more IRF-5 variants. Figure 13 depicts a summary of IRF5 haplotypes and their association to SLE.

Inventors:
BEHRENS TIMOTHY W (US)
GRAHAM ROBERT (US)
ALTSHULER DAVID (US)
Application Number:
PCT/US2007/065689
Publication Date:
October 11, 2007
Filing Date:
March 30, 2007
Assignee:
UNIV MINNESOTA (US)
GEN HOSPITAL CORP (US)
BEHRENS TIMOTHY W (US)
GRAHAM ROBERT (US)
ALTSHULER DAVID (US)
International Classes:
A63B53/00
Other References:
GRAHAM R.R. ET AL.: 'A common haplotype of interferon regulatory factor 5 (IRF5) regulates splicing and expression and is associated with increased risk of systemic lupus erythematosus' NATURE GENETICS vol. 38, no. 5, May 2006, pages 550 - 555
SIGURDSSON S. ET AL.: 'Polymorphisms in the tyrosine kinase 2 and interferon regulatory factor 5 genes are associated with systemic lupus erythematosus' AMERICAN JOURNAL OF HUMAN GENETICS vol. 76, no. 3, pages 528 - 537
Attorney, Agent or Firm:
KAYTOR, Elizabeth N. et al. (P.O. Box 1022, Minneapolis, Minnesota, US)
Claims:

WHAT IS CLAIMED IS:

1. A method for assessing the predisposition of a mammal to develop systemic lupus erythematosus (SLE), comprising:

(a) determining whether or not said mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and

(b) classifying said mammal as being susceptible to develop SLE if said mammal has said IRF-5 haplotype, or classifying said mammal as not being susceptible to develop SLE if said mammal does not contain said IRF-5 haplotype.

2. The method of claim 1, wherein said mammal is a human.

3. The method of claim 1, further comprising determining whether a biological sample from said mammal contains elevated levels of interferon-α (IFN-α), interleukin-1 receptor antagonist (IL-1RA), interleukin-6 (IL-6), monocyte chemoattractant protein-1 (MCP-1), macrophage inflammatory protein-1α (MIP-1α), macrophage inflammatory protein-1β (MIP-1β), or tumor necrosis factor-α (TNF-α).

4. A method for diagnosing SLE in a mammal, comprising:

(a) determining whether or not said mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and

(b) classifying said mammal as being susceptible to develop SLE if said mammal has said IRF-5 haplotype, or classifying said mammal as not being susceptible to develop SLE if said mammal does not have said IRF-5 haplotype.

5. The method of claim 4, wherein said mammal is a human.

6. The method of claim 4, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

7. A method for assessing the predisposition of a mammal to develop SLE, comprising:

(a) determining whether or not said mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, an rs10954213 A allele, and an rs2070197 C allele; and

(b) classifying said mammal as being susceptible to develop SLE if said mammal has said IRF-5 haplotype, or classifying said mammal as not being susceptible to develop SLE if said mammal does not have said IRF-5 haplotype.

8. The method of claim 7, wherein said mammal is a human.

9. The method of claim 7, further comprising determining whether a biological sample from said mammal contains elevated levels of interferon-α (IFN-α), interleukin-1 receptor antagonist (IL-1RA), interleukin-6 (IL-6), monocyte chemoattractant protein-1 (MCP-1), macrophage inflammatory protein-1α (MIP-1α), macrophage inflammatory protein-1β (MIP-1β), or tumor necrosis factor-α (TNF-α).

10. A method for diagnosing SLE in a mammal, comprising:

(a) determining whether or not said mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, an rs10954213 A allele, and an rs2070197 C allele; and

(b) classifying said mammal as being susceptible to develop SLE if said mammal has said IRF-5 haplotype, or classifying said mammal as not being susceptible to develop SLE if said mammal does not have said IRF-5 haplotype.

11. The method of claim 10, wherein said mammal is a human.

12. The method of claim 10, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

13. A method for assessing the predisposition of a mammal to develop SLE, comprising:

(a) determining whether or not said mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein said mammal and said one or more control mammals are from the same species, and wherein said IRF-5 polypeptide in said mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and

(b) classifying said mammal as being susceptible to develop SLE if said mammal contains said cells, or classifying said mammal as not being susceptible to develop SLE if said mammal does not contain said cells.

14. The method of claim 13, wherein said mammal is a human.

15. The method of claim 13, wherein said one or more control mammals are healthy humans.

16. The method of claim 13, wherein said cells and said control cells are peripheral blood mononuclear cells or whole blood cells.

17. The method of claim 13, wherein said level of IRF-5 polypeptide in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals.

18. The method of claim 13, wherein said level of IRF-5 polypeptide in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals.

19. The method of claim 13, wherein said determining step comprises measuring the level of IRF-5 mRNA encoding said IRF-5 polypeptide.

20. The method of claim 13, wherein said determining step comprises measuring the level of said IRF-5 polypeptide.

21. The method of claim 13, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

22. A method for diagnosing SLE in a mammal, comprising:

(a) determining whether or not said mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein said mammal and said one or more control mammals are from the same species, and wherein said IRF-5 polypeptide in said mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and

(b) classifying said mammal as being susceptible to develop SLE if said mammal contains said cells, or classifying said mammal as not being susceptible to develop SLE if said mammal does not contain said cells.

23. The method of claim 22, wherein said mammal is a human.

24. The method of claim 22, wherein said one or more control mammals are healthy humans.

25. The method of claim 22, wherein said cells and said control cells are peripheral blood mononuclear cells or whole blood cells.

26. The method of claim 22, wherein said level of IRF-5 in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals.

27. The method of claim 22, wherein said level of IRF-5 polypeptide in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals.

28. The method of claim 22, wherein said determining step comprises measuring the level of IRF-5 mRNA encoding said IRF-5 polypeptide.

29. The method of claim 22, wherein said determining step comprises measuring the level of IRF-5 polypeptide.

30. The method of claim 22, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

31. A method for determining the likelihood of a mammal to respond to treatment with a therapy directed to IRF-5, comprising:

(a) determining whether or not said mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and

(b) classifying said mammal as likely to respond to said therapy if said mammal has said IRF-5 haplotype, or classifying said mammal as not being likely to respond to said therapy if said mammal does not have said IRF-5 haplotype.

32. The method of claim 31, wherein said mammal is a human.

33. The method of claim 31, wherein said mammal is diagnosed as having SLE.

34. The method of claim 31, wherein a response to said therapy comprises a reduction in one or more symptoms of SLE.

35. The method of claim 31, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

36. A method for determining the likelihood of a mammal to respond to treatment with a therapy directed to IRF-5, comprising:

(a) determining whether or not said mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein said mammal and said one or more control mammals are from the same species, and wherein said IRF-5 polypeptide in said mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and

(b) classifying said mammal as likely to respond to said therapy if said mammal contains said cells, or classifying said mammal as not being likely to respond to said therapy if said mammal does not contain said cells.

37. The method of claim 36, wherein said mammal is a human.

38. The method of claim 36, wherein said mammal is diagnosed as having SLE.

39. The method of claim 36, wherein said one or more control mammals are healthy humans.

40. The method of claim 36, wherein said cells and said control cells are peripheral blood mononuclear cells or whole blood cells.

41. The method of claim 36, wherein said level of IRF-5 polypeptide in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals.

42. The method of claim 36, wherein said level of IRF-5 polypeptide in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals.

43. The method of claim 36, wherein said determining step comprises measuring the level of IRF-5 mRNA encoding said IRF-5 polypeptide.

44. The method of claim 36, wherein said determining step comprises measuring the level of IRF-5 polypeptide.

45. The method of claim 36, wherein a response to said therapy comprises a reduction in one or more symptoms of SLE.

46. The method of claim 36, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

47. The method of claim 36, comprising determining whether or not said mammal contains detectable levels of an IRF-5 mRNA having a truncated 3' untranslated region.

48. A method for determining the likelihood of a mammal to respond to treatment with a therapy directed to a cytokine or a Toll like receptor (TLR), comprising:

(a) determining whether or not said mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and

(b) classifying said mammal as likely to respond to said treatment if said mammal has said IRF-5 haplotype, or classifying said mammal as not being likely to respond to said treatment if said mammal does not have said IRF-5 haplotype.

49. The method of claim 48, wherein said cytokine is IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

50. The method of claim 48, wherein said TLR is TLR7, TLR8, or TLR9.

51. The method of claim 48, wherein said mammal is a human.

52. The method of claim 48, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

53. A method for determining the likelihood of a mammal to respond to treatment with a therapy directed to a cytokine or a TLR, comprising:

(a) determining whether or not said mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein said mammal and said one or more control mammals are from the same species, and wherein said IRF-5 polypeptide in said mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and

(b) classifying said mammal as likely to respond to said treatment if said mammal contains said cells, or classifying said mammal as not being likely to respond to said treatment if said mammal does not contain said cells.

54. The method of claim 53, wherein said cytokine is IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

55. The method of claim 53, wherein said TLR is TLR7, TLR8, or TLR9.

56. The method of claim 53, wherein said mammal is a human.

57. The method of claim 53, wherein said one or more control mammals are healthy humans.

58. The method of claim 53, wherein said cells and said control cells are peripheral blood mononuclear cells or whole blood cells.

59. The method of claim 53, wherein said level of IRF-5 polypeptide in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals.

60. The method of claim 53, wherein said level of IRF-5 polypeptide in said mammal is greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals.

61. The method of claim 53, wherein said determining step comprises measuring the level of IRF-5 mRNA encoding said IRF-5 polypeptide.

62. The method of claim 53, wherein said determining step comprises measuring the level of IRF-5 polypeptide.

63. The method of claim 53, further comprising determining whether a biological sample from said mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

Description:

IRF-5 HAPLOTYPES IN SYSTEMIC LUPUS ERYTHEMATOSUS

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority from U.S. Provisional Application Serial No. 60/787,767, filed March 31, 2006.

STATEMENT AS TO FEDERALLY SPONSORED RESEARCH

Funding for the work described herein was provided in part by the National Institutes of Health, grant numbers AI 63274-01 and AR 43274-10. The federal government may have certain rights in the invention.

TECHNICAL FIELD

This document relates to materials and methods for diagnosing or predicting risk of systemic lupus erythematosus.

BACKGROUND

Systemic lupus erythematosus (SLE) is a chronic, inflammatory autoimmune disease characterized by antinuclear autoantibodies and deposition of immune complexes, leading to organ damage and early death (Alarcon-Segovia et al. (2005) Arthritis Rheum. 52:1138-1147). SLE autoantibodies mediate organ damage by directly binding to host tissues and by forming immune complexes that deposit in vascular tissues and activate immune cells. Organs targeted in SLE include the skin, kidneys, vasculature, joints, various blood elements, and the central nervous system (CNS). The severity of disease, the spectrum of clinical involvement, and the response to therapy vary widely among patients.

The type I interferon (IFN) pathway is activated in human SLE (Blanco et al. (2001) Science 294:1540-1543; Ronnblom and Alm (2001) J. Exp. Med. 194:F59-63; Baechler et al. (2003) Proc. Natl. Acad. Sci. USA 100:2610-2615). Type I IFN is a central mediator of viral immunity (Isaacs and Lindenmann (1957) Proc. R. Soc. B 147:258-273), and many SLE patients strongly overexpress IFN-responsive genes in blood cells (Baechler et al. supra; Bennett et al. (2003) J. Exp. Med. 197:711-723; Kirou et al. (2004) Arthritis Rheum. 50:3958-3967). However, it is not known whether the IFN expression signature is a general biomarker of a dysregulated immune system, or rather reflects primary genetic variation causal to the pathogenesis of human SLE.

IFN regulatory factor 5 (IRF-5) is a member of a family of transcription factors that controls inflammatory and immune responses (Honda et al. (2005) Int. Immunol. 17:1367-1378). IRF-5 has a critical role in the production of the pro-inflammatory cytokines tumor necrosis factor-α (TNF-α), interleukin-12 (IL-12), and IL-6 following toll-like receptor (TLR) signaling as determined by knockout mouse studies (Takaoka et al. (2005) Nature 434:243-249), and is also important for transactivation of type I IFN and IFN-responsive genes (Barnes et al. (2001) J. Biol. Chem. 276:23382-23390; Barnes et al. (2004) J. Biol. Chem. 279:45194-45207).

The clinical heterogeneity of SLE makes it challenging to diagnose and manage this disease. Moreover, current therapy options for SLE are limited, and therapy strategies are highly individualized and tend to include much trial and error. Thus, there is a need for diagnostic technologies for SLE that can identify patients that will likely respond well to particular therapies.

SUMMARY

This document is based in part on the discovery that several IRF-5 single nucleotide polymorphisms (SNPs) are associated with SLE. For example, the results provided herein demonstrate that the IRF-5 rs2004640 T allele, rs2280714 T allele, rs2070197 C allele, rs10954213 A allele, and exon 6 insertion allele are associated with SLE. The results also demonstrate that the rs2004640 T allele creates a 5' donor splice site in an alternate exon 1 of IRF-5 (exon-1B), and that only individuals with the donor splice site express IRF-5 isoforms initiated at exon-1B. In addition, the results show that rs2280714, an independent cis-acting variant that drives elevated expression of IRF-5 transcripts, is strongly linked to the exon-1B splice donor site. Further, the results presented herein demonstrate that the rs10954213 A allele results in a "short form" IRF-5 mRNA and a truncated 3' untranslated region (UTR). This allele also is associated with elevated levels of IRF-5 expression. Haplotypes with elevated IRF-5 expression in the absence of the exon-1B donor site, however, do not confer risk to SLE. Further, a germline polymorphism has been discovered that results in a 30 nucleotide insertion in exon 6 of IRF-5, and this insertion also has been observed to be associated with SLE. An IRF-5 haplotype that drives elevated expression of multiple unique isoforms of IRF-5 can be an important genetic risk factor for SLE, proving a causal role of type I IFN pathway genes in human autoimmune disease.

In one aspect, this document features a method for assessing the predisposition of a mammal to develop systemic lupus erythematosus (SLE), comprising: (a) determining whether or not the mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and (b) classifying the mammal as being susceptible to develop SLE if the mammal has the IRF-5 haplotype, or classifying the mammal as not being susceptible to develop SLE if the mammal does not contain the IRF-5 haplotype. The mammal can be a human. The method can further include determining whether a biological sample from the mammal contains elevated levels of interferon-α (IFN-α), interleukin-1 receptor antagonist (IL-1RA), interleukin-6 (IL-6), monocyte chemoattractant protein-1 (MCP-1), macrophage inflammatory protein-1α (MIP-1α), macrophage inflammatory protein-1β (MIP-1β), or tumor necrosis factor-α (TNF-α).

In another aspect, this document features a method for diagnosing SLE in a mammal, comprising: (a) determining whether or not the mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and (b) classifying the mammal as being susceptible to develop SLE if the mammal has the IRF-5 haplotype, or classifying the mammal as not being susceptible to develop SLE if the mammal does not have the IRF-5 haplotype. The mammal can be a human. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.
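The haplotype check shared by the two aspects above can be sketched in Python as follows. This is only an illustration: the data layout (one dict per phased haplotype), the key `exon6_indel`, the allele codes other than the letters named in the text, and the function names are all assumptions introduced here and do not come from this document.

```python
# Hypothetical marker keys and allele codes; only the rs numbers and the
# risk allele letters come from the text above.
RISK_ALLELES = {
    "rs2004640": "T",
    "exon6_indel": "insertion",
    "rs10954213": "A",
}


def has_risk_haplotype(haplotype, risk_alleles=RISK_ALLELES):
    """Return True if one phased haplotype carries every listed allele."""
    return all(haplotype.get(marker) == allele
               for marker, allele in risk_alleles.items())


def classify(haplotypes, risk_alleles=RISK_ALLELES):
    """Classify a mammal from its phased haplotypes (assumed input format)."""
    if any(has_risk_haplotype(h, risk_alleles) for h in haplotypes):
        return "susceptible"
    return "not susceptible"


# Made-up subject with one haplotype carrying all three alleles.
subject = [
    {"rs2004640": "T", "exon6_indel": "insertion", "rs10954213": "A"},
    {"rs2004640": "G", "exon6_indel": "deletion", "rs10954213": "G"},
]
print(classify(subject))  # prints "susceptible"
```

The check is made per haplotype because the alleles are required to occur together on one haplotype; a four-marker variant could be expressed by passing a `risk_alleles` dict that also includes the rs2070197 C allele.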

In another aspect, this document features a method for assessing the predisposition of a mammal to develop SLE, comprising: (a) determining whether or not the mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, an rs10954213 A allele, and an rs2070197 C allele; and (b) classifying the mammal as being susceptible to develop SLE if the mammal has the IRF-5 haplotype, or classifying the mammal as not being susceptible to develop SLE if the mammal does not have the IRF-5 haplotype. The mammal can be a human. The method can further include determining whether a biological sample from the mammal contains elevated levels of interferon-α (IFN-α), interleukin-1 receptor antagonist (IL-1RA), interleukin-6 (IL-6), monocyte chemoattractant protein-1 (MCP-1), macrophage inflammatory protein-1α (MIP-1α), macrophage inflammatory protein-1β (MIP-1β), or tumor necrosis factor-α (TNF-α).

In another aspect, this document features a method for diagnosing SLE in a mammal, comprising: (a) determining whether or not the mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, an rs10954213 A allele, and an rs2070197 C allele; and (b) classifying the mammal as being susceptible to develop SLE if the mammal has the IRF-5 haplotype, or classifying the mammal as not being susceptible to develop SLE if the mammal does not have the IRF-5 haplotype. The mammal can be a human. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

In another aspect, this document features a method for assessing the predisposition of a mammal to develop SLE, comprising: (a) determining whether or not the mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein the mammal and the one or more control mammals are from the same species, and wherein the IRF-5 polypeptide in the mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and (b) classifying the mammal as being susceptible to develop SLE if the mammal contains the cells, or classifying the mammal as not being susceptible to develop SLE if the mammal does not contain the cells. The mammal can be a human. The one or more control mammals can be healthy humans. The cells and the control cells can be peripheral blood mononuclear cells or whole blood cells. The level of IRF-5 polypeptide in the mammal can be greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals, or greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals. The determining step can include measuring the level of IRF-5 mRNA encoding the IRF-5 polypeptide, or measuring the level of IRF-5 polypeptide. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.
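The comparison against control cells described above amounts to testing a measured level against the arithmetic mean of the control levels. A minimal Python sketch follows; the function name, the `min_controls` parameter, and the numeric values are hypothetical, and the sketch assumes all levels are expressed in the same arbitrary units.

```python
from statistics import mean


def exceeds_control_average(subject_level, control_levels, min_controls=1):
    """Return True if the subject's level is above the mean control level."""
    if len(control_levels) < min_controls:
        raise ValueError("not enough control measurements")
    return subject_level > mean(control_levels)


# Made-up levels from ten control mammals, in arbitrary units.
controls = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.0, 0.9, 1.1]
print(exceeds_control_average(1.6, controls, min_controls=10))  # prints True
```

Setting `min_controls` to 10 or 20 mirrors the "at least 10" and "at least 20" control mammal variants; the sketch applies no statistical test beyond the comparison with the mean.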

In another aspect, this document features a method for diagnosing SLE in a mammal, comprising: (a) determining whether or not the mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein the mammal and the one or more control mammals are from the same species, and wherein the IRF-5 polypeptide in the mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and (b) classifying the mammal as being susceptible to develop SLE if the mammal contains the cells, or classifying the mammal as not being susceptible to develop SLE if the mammal does not contain the cells. The mammal can be a human. The one or more control mammals can be healthy humans. The cells and the control cells can be peripheral blood mononuclear cells or whole blood cells. The level of IRF-5 polypeptide in the mammal can be greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals, or greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals. The determining step can include measuring the level of IRF-5 mRNA encoding the IRF-5 polypeptide, or measuring the level of IRF-5 polypeptide. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

In yet another aspect, this document features a method for determining the likelihood of a mammal to respond to treatment with a therapy directed to IRF-5, comprising: (a) determining whether or not the mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and (b) classifying the mammal as likely to respond to the therapy if the mammal has the IRF-5 haplotype, or classifying the mammal as not being likely to respond to the therapy if the mammal does not have the IRF-5 haplotype. The mammal can be a human. The mammal can be diagnosed as having SLE. A response to the therapy can include a reduction in one or more symptoms of SLE. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

In still another aspect, this document features a method for determining the likelihood of a mammal to respond to treatment with a therapy directed to IRF-5, comprising: (a) determining whether or not the mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein the mammal and the one or more control mammals are from the same species, and wherein the IRF-5 polypeptide in the mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and (b) classifying the mammal as likely to respond to the therapy if the mammal contains the cells, or classifying the mammal as not being likely to respond to the therapy if the mammal does not contain the cells. The mammal can be a human. The mammal can be diagnosed as having SLE. The one or more control mammals can be healthy humans. The cells and the control cells can be peripheral blood mononuclear cells or whole blood cells. The level of IRF-5 polypeptide in the mammal can be greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals, or greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals. The determining step can include measuring the level of IRF-5 mRNA encoding the IRF-5 polypeptide, or measuring the level of IRF-5 polypeptide. A response to the therapy can include a reduction in one or more symptoms of SLE. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α. The method can include determining whether or not the mammal contains detectable levels of an IRF-5 mRNA having a truncated 3' untranslated region.

In another aspect, this document features a method for determining the likelihood of a mammal to respond to treatment with a therapy directed to a cytokine or a Toll like receptor (TLR), comprising: (a) determining whether or not the mammal has an IRF-5 haplotype comprising an rs2004640 T allele, an IRF-5 exon 6 insertion allele, and an rs10954213 A allele; and (b) classifying the mammal as likely to respond to the treatment if the mammal has the IRF-5 haplotype, or classifying the mammal as not being likely to respond to the treatment if the mammal does not have the IRF-5 haplotype. The cytokine can be IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α. The TLR can be TLR7, TLR8, or TLR9. The mammal can be a human. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

In yet another aspect, this document features a method for determining the likelihood of a mammal to respond to treatment with a therapy directed to a cytokine or a TLR, comprising: (a) determining whether or not the mammal comprises cells containing a level of an IRF-5 polypeptide that is greater than an average level of an IRF-5 polypeptide in control cells from one or more control mammals, wherein the mammal and the one or more control mammals are from the same species, and wherein the IRF-5 polypeptide in the mammal comprises an amino acid sequence encoded by exon 1B and an amino acid sequence encoded by an insertion in exon 6; and (b) classifying the mammal as likely to respond to the treatment if the mammal contains the cells, or classifying the mammal as not being likely to respond to the treatment if the mammal does not contain the cells. The cytokine can be IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α. The TLR can be TLR7, TLR8, or TLR9. The mammal can be a human. The one or more control mammals can be healthy humans. The cells and the control cells can be peripheral blood mononuclear cells or whole blood cells. The level of IRF-5 polypeptide in the mammal can be greater than the average level of IRF-5 polypeptide in control cells from at least 10 control mammals, or greater than the average level of IRF-5 polypeptide in control cells from at least 20 control mammals. The determining step can include measuring the level of IRF-5 mRNA encoding the IRF-5 polypeptide, or measuring the level of IRF-5 polypeptide. The method can further include determining whether a biological sample from the mammal contains elevated levels of IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, or TNF-α.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1a depicts the mRNA isoforms of IRF-5. Three sets of isoforms derive from three alternative promoters in the IRF-5 5' region. The locations of the exons encoding DNA binding, PEST, and protein interaction domains, as well as the 3' UTR, are annotated. Protein translation begins at a consensus ATG that is 10 bp from the 5' end of exon 2. The location of the rs2004640 SNP, 2 bp downstream of exon-1B, is shown in the box. Two polyadenylation sites are present in the IRF-5 3' UTR, and the lengths of the 3' UTRs for V5, V6, V7 and V8 are unknown. The exon/intron structures are not shown to scale. FIG. 1b is a series of graphs summarizing the data from TaqMan real-time quantitative RT-PCR analysis for exon-1A, -1B, and -1C associated transcripts. Each bar represents the mean ± SEM of expression levels (N = 8 SLE cases for each genotype; similar data were obtained for normal controls). Delta Cts were calculated from duplicate samples normalized to human β2-microglobulin, and converted to linear fold-differences.

FIG. 2 is a graph showing levels of IRF-5 mRNA as determined by microarray, compared between EBV transformed cell lines from CEPH individuals typed for rs2004640 and rs2280714. Identical findings were observed when only CEPH founders were examined.

FIGS. 3a and 3b are graphs showing levels of IRF-5 measured by Affymetrix microarrays, compared in whole blood (N=37; FIG. 3a) and in PBMCs (N=41; FIG. 3b) in two sets of independent SLE cases. Total IRF-5 levels were compared by rs2280714 genotype in whole blood SLE samples: TT vs. TC, P = 0.01; TC vs. CC, P = 0.0006; TT vs. CC, P = 0.000002. A similar analysis was performed for the SLE PBMC samples: TT vs. TC, P = NS; TC vs. CC, P = 0.0004; TT vs. CC, P = 0.000006.

FIG. 4 is a depiction of the SLE risk haplotype, showing both the rs2004640 T allele (green) and the rs2280714 T allele (blue). Haplotype frequencies of CEPH founders, as determined by Haploview, are shown. rs729302 is located 5' of the haplotype marked by rs2004640, rs752637, and rs2280714.

FIG. 5 is a graph showing levels of expression of the long IRF-5 isoform (IRF5_Long), the short IRF-5 isoform (IRF5_Short), or both isoforms (IRF5_Common) as determined in individuals homozygous for the rs10954213 A allele (gray bars), heterozygous for the rs10954213 A allele (striped bars), or homozygous for the rs10954213 G allele (white bars).

FIG. 6 is a graph plotting expression levels of IRF5 mRNA in CEU cell lines carrying various genotypes for rs2004640 and rs10954213.

FIG. 7 is a graph plotting microarray expression levels of IRF5 in whole blood RNA samples from SLE patients. Each symbol represents the expression level in a single patient.

FIG. 8 is a schematic of the 3' UTR region of IRF5.

FIG. 9 is a pair of graphs plotting levels of IRF5 isoforms carrying the short (left panel) or long (right panel) 3' UTR, as determined by quantitative TaqMan RT-PCR in EBV cell lines (N = 9) and in control PBMCs (N = 14).

FIG. 10 is a graph plotting the decay of beta-globin 3' IRF5 UTR mRNAs following suppression of new transcription with doxycycline. Results represent 4 independent experiments. * P < 0.05; ** P < 0.01.

FIG. 11 is a diagram showing the location of the three common functional alleles identified in IRF5.

FIG. 12 is a diagram showing IRF5 exon 6 mRNA isoforms determined by the common indel and two alternatively spliced exon 6 start sites (SS1 and SS2). The expected full-length protein isoform lengths in amino acids (aa) are noted. The predicted lengths of PCR fragments from an exon 1B primer site to a region just downstream of the exon 6 indel are shown for each of the isoforms.

FIG. 13 is a summary of IRF5 haplotypes and their association to SLE.

DETAILED DESCRIPTION

This document relates to methods and materials involved in diagnosing SLE in a mammal, assessing a mammal's susceptibility to develop SLE, and determining whether a mammal is likely to respond to therapy directed toward IRF-5. For example, this document relates to materials and methods for determining whether a mammal contains one or more IRF-5 variants, contains an IRF-5 mRNA that results from alternative splicing or alternative polyadenylation due to the presence of one or more IRF-5 variants, or for determining whether a mammal contains cells in which IRF-5 is expressed at a level that is more or less than the average level of IRF-5 expression observed in control cells obtained from control mammals. In some embodiments, for example, a mammal can be diagnosed as having or being at risk of developing SLE if it is determined that the mammal contains one or more IRF-5 variants (e.g., an rs2004640 T allele, an rs2280714 T allele, an rs2070197 C allele, an rs10954213 A allele, and/or an exon 6 insertion allele). In some embodiments, a mammal can be diagnosed as having or being at risk of developing SLE if it is determined that the mammal contains an IRF-5 mRNA comprising exon-1B, a truncated 3' UTR, and/or an exon 6 insertion, as described herein. In some cases, a mammal can be diagnosed as having or being at risk for SLE if it is determined that the mammal contains cells that express a level of IRF-5 mRNA containing exon-1B and/or a truncated 3' UTR and/or an exon 6 insertion that is greater than the level of an IRF-5 mRNA expressed in control cells from control mammals. In still other embodiments, a mammal can be diagnosed as having or being at risk of developing SLE if it is determined that the mammal contains cells having a level of IRF-5 polypeptide that is higher than the average level of IRF-5 polypeptide in control cells obtained from control mammals.

The mammal can be any mammal such as a human, dog, mouse, or rat. Nucleic acids or polypeptides from any cell type can be isolated and evaluated. For example, whole blood cells, peripheral blood mononuclear cells (PBMC), total white blood cells, lymph node cells, spleen cells, or tonsil cells can be isolated from a human patient and evaluated to determine if that patient contains one or more IRF-5 variants (e.g., an rs2004640 T allele, an rs2280714 T allele, an rs10954213 A allele, or an rs2070197 C allele), an IRF-5 mRNA containing exon-1B and/or a truncated 3' UTR and/or an exon 6 insertion, or cells that express IRF-5 at a level that is greater or less than the average level of expression observed in control cells.

As used herein, "IRF-5 variant" and "IRF-5 nucleotide sequence variant" refer to any alteration in an IRF-5 reference sequence. IRF-5 variants include variations that occur in coding and non-coding regions, including exons, introns, and untranslated sequences. As used herein, "untranslated sequence" includes 5' and 3' flanking regions that are outside of the messenger RNA (mRNA) as well as 5' and 3' untranslated regions (5'-UTR or 3'-UTR) that are part of the mRNA, but are not translated. Nucleotides are referred to herein by the standard one-letter designation (A, C, G, or T).

In some embodiments, an IRF-5 nucleotide sequence variant results in an IRF-5 mRNA having an altered nucleotide sequence (e.g., a splice variant that includes exon 1B and/or a variant that includes additional nucleotides in exon 6), or an IRF-5 polypeptide having an altered amino acid sequence (e.g., a polypeptide including a sequence encoded by exon 1B and/or a sequence encoded by an insertion in exon 6). The term "polypeptide" refers to a chain of at least four amino acid residues (e.g., 4-8, 9-12, 13-15, 16-18, 19-21, 22-50, 51-75, 76-100, 101-125 residues, or a full-length IRF-5 polypeptide). IRF-5 polypeptides may or may not have activity, or may have altered activity relative to a reference IRF-5 polypeptide. In some embodiments, polypeptides having an altered amino acid sequence can be useful for diagnostic purposes (e.g., for producing antibodies having specific binding affinity for variant IRF-5 polypeptides).

The presence or absence of IRF-5 nucleotide sequence variants can be determined using any suitable method, including methods that are standard in the art. For example, nucleotide sequence variants can be detected by sequencing exons, introns, 5' untranslated sequences, or 3' untranslated sequences, by performing allele-specific hybridization, allele-specific restriction digests, or mutation specific polymerase chain reactions (MSPCR), by single-stranded conformational polymorphism (SSCP) detection (Schafer et al. (1995) Nat. Biotechnol. 15:33-39), denaturing high performance liquid chromatography (DHPLC, Underhill et al. (1997) Genome Res. 7:996-1005), primer extension of multiplex products (e.g., as described herein), infrared matrix-assisted laser desorption/ionization (IR-MALDI) mass spectrometry (WO 99/57318), or combinations of such methods.

Genomic DNA generally is used in the analysis of IRF-5 nucleotide sequence variants, although mRNA also can be used. Genomic DNA is typically extracted from a biological sample such as a peripheral blood sample, but can be extracted from other biological samples, including tissues (e.g., mucosal scrapings of the lining of the mouth or from renal or hepatic tissue). Routine methods can be used to extract genomic DNA from a blood or tissue sample, including, for example, phenol extraction. Alternatively, genomic DNA can be extracted with kits such as the QIAAMP® Tissue Kit (QIAGEN®, Chatsworth, CA), the WIZARD® Genomic DNA purification kit (PROMEGA™), and the A.S.A.P.™ Genomic DNA isolation kit (BOEHRINGER MANNHEIM™, Indianapolis, IN).

Typically, an amplification step is performed before proceeding with the detection method. For example, exons or introns of the IRF-5 gene can be amplified then directly sequenced. Dye primer sequencing can be used to increase the accuracy of detecting heterozygous samples.

Allele specific hybridization also can be used to detect sequence variants, including complete haplotypes of a subject (e.g., a mammal such as a human). See, Stoneking et al. (1991) Am. J. Hum. Genet. 48:370-382; and Prince et al. (2001) Genome Res. 11:152-162. In practice, samples of DNA or RNA from one or more mammals can be amplified using pairs of primers and the resulting amplification products can be immobilized on a substrate (e.g., in discrete regions). Hybridization conditions are selected such that a nucleic acid probe can specifically bind to the sequence of interest, e.g., the variant nucleic acid sequence. Such hybridizations typically are performed under high stringency as some sequence variants include only a single nucleotide difference. As used herein, high stringency conditions include the use of low ionic strength solutions and high temperatures for washing. In particular, under high stringency conditions, nucleic acid molecules are hybridized at 42°C in 2X SSC (0.3 M NaCl/0.03 M sodium citrate) with 0.1% sodium dodecyl sulfate (SDS) and washed in 0.1X SSC (0.015 M NaCl/0.0015 M sodium citrate), 0.1% SDS at 65°C. Hybridization conditions can be adjusted to account for unique features of the nucleic acid molecule, including length and sequence composition. Probes can be labeled (e.g., fluorescently) to facilitate detection. In some embodiments, one of the primers used in the amplification reaction is biotinylated (e.g., 5' end of reverse primer) and the resulting biotinylated amplification product is immobilized on an avidin or streptavidin coated substrate.

Allele-specific restriction digests can be performed in the following manner. For nucleotide sequence variants that introduce a restriction site, restriction digest with the particular restriction enzyme can differentiate the alleles. For sequence variants that do not alter a common restriction site, mutagenic primers can be designed that introduce a restriction site when the variant allele is present or when the wild type allele is present. A portion of an IRF-5 nucleic acid can be amplified using the mutagenic primer and a wild type primer, followed by digest with the appropriate restriction endonuclease. Certain variants, such as insertions or deletions of one or more nucleotides, change the size of the DNA fragment encompassing the variant. The insertion or deletion of nucleotides can be assessed by amplifying the region encompassing the variant and determining the size of the amplified products in comparison with size standards. For example, a region of an IRF-5 gene can be amplified using a primer set from either side of the variant. One of the primers is typically labeled, for example, with a fluorescent moiety, to facilitate sizing. The amplified products can be electrophoresed through acrylamide gels with a set of size standards that are labeled with a fluorescent moiety that differs from the primer.

PCR conditions and primers can be developed that amplify a product only when the variant allele is present or only when the wild type allele is present (MSPCR or allele-specific PCR). For example, patient DNA and a control can be amplified separately using either a wild type primer or a primer specific for the variant allele. Each set of reactions is then examined for the presence of amplification products using standard methods to visualize the DNA. For example, the reactions can be electrophoresed through an agarose gel and the DNA visualized by staining with ethidium bromide or other DNA intercalating dye. In DNA samples from heterozygous patients, reaction products would be detected with each set of primers. Patient samples containing solely the wild type allele would have amplification products only in the reaction using the wild type primer. Similarly, patient samples containing solely the variant allele would have amplification products only in the reaction using the variant primer. Allele-specific PCR also can be performed using allele-specific primers that introduce priming sites for two universal energy-transfer-labeled primers (e.g., one primer labeled with a green dye such as fluorescein and one primer labeled with a red dye such as sulforhodamine). Amplification products can be analyzed for green and red fluorescence in a plate reader. See, Myakishev et al. (2001) Genome 11(1):163-169.

Mismatch cleavage methods also can be used to detect differing sequences by PCR amplification, followed by hybridization with the wild type sequence and cleavage at points of mismatch. Chemical reagents, such as carbodiimide or hydroxylamine and osmium tetroxide, can be used to modify mismatched nucleotides to facilitate cleavage.

IRF-5 mRNA isoforms can be evaluated using any suitable method, including those known in the art. For example, northern blotting, slot blotting, chip hybridization techniques, or RT-PCR-based methods can be used to determine whether a mammal contains an IRF-5 mRNA that includes exon-1B or that has a truncated 3' UTR.

When IRF-5 expression is evaluated, the expression level can be greater than or less than the average level observed in control cells obtained from control mammals.

Typically, IRF-5 can be classified as being expressed at a level that is greater than or less than the average level observed in control cells if the expression levels differ by at least 1-fold (e.g., 1.5-fold, 2-fold, 3-fold, or more than 3-fold). In addition, the control cells typically are the same type of cells as those isolated from the mammal being evaluated. In some cases, the control cells can be isolated from one or more mammals that are from the same species as the mammal being evaluated. When diagnosing or predicting susceptibility to SLE, the control cells can be isolated from healthy mammals such as healthy humans who do not have SLE. Any number of control mammals can be used to obtain the control cells. For example, control cells can be obtained from one or more healthy mammals (e.g., at least 5, at least 10, at least 15, at least 20, or more than 20 control mammals).
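The comparison against control cells can be illustrated with a minimal sketch. It assumes expression levels are already on a linear scale and uses 1.5-fold, one of the example values listed above, as the cutoff; the function name, the `min_fold` parameter, and the "similar" category are hypothetical and are not part of the described method.

```python
from statistics import mean

def classify_expression(sample_level, control_levels, min_fold=1.5):
    """Compare one sample's expression level with the control average.

    min_fold is an illustrative threshold (assumption), not a prescribed value.
    Returns a label and the fold ratio of sample to control average.
    """
    control_avg = mean(control_levels)
    if sample_level <= 0 or control_avg <= 0:
        raise ValueError("expression levels must be positive")
    fold = sample_level / control_avg
    if fold >= min_fold:
        return "greater", fold
    if fold <= 1 / min_fold:
        return "less", fold
    return "similar", fold
```

With five hypothetical control measurements of 10.0 each, a sample level of 30.0 would be labeled "greater" (3-fold) and a sample level of 5.0 would be labeled "less".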

Further, any suitable method can be used to determine whether or not IRF-5 is expressed at a level that is greater or less than the average level of expression observed in control cells. For example, the level of IRF-5 expression can be measured by assessing the level of IRF-5 mRNA expression. Levels of mRNA expression can be evaluated using, without limitation, northern blotting, slot blotting, quantitative RT-PCR, or chip hybridization techniques. Methods for chip hybridization assays include, without limitation, those described in published U.S. Patent Application No. 20040033498. The level of IRF-5 expression also can be measured by assessing polypeptide levels. Polypeptide levels can be measured using any method, including immuno-based assays (e.g., ELISA), western blotting, or silver staining.

Research has demonstrated that IRF-5 is activated by TLR7 and TLR8, and that IRF-5 is a critical mediator of TLR7 signaling (Schoenemeyer et al. (2005) J. Biol. Chem. 280:17005-17012). TLR7, TLR8, and TLR9 form an evolutionarily related subgroup within the TLR superfamily (Chuang and Ulevitch (2000) Eur. Cytokine Netw. 11:372-378; and Du et al. (2000) Eur. Cytokine Netw. 11:362-371). As described in the Examples herein, subjects containing an rs2004640 T allele and an rs10954213 A allele can secrete elevated levels of cytokines, and also display an enhanced response to TLR7 and IFN-α signaling as compared to subjects having an rs2004640 G allele and an rs10954213 G allele. Thus, the presence of the aforementioned IRF-5 alleles (e.g., the combination of alleles in haplotype 1 described in the Examples herein), or increased IRF-5 levels, also can be ascertained in methods to determine whether a mammal (e.g., a human) is likely to respond to a therapy directed toward IRF-5 (e.g., a therapy aimed at reducing IRF-5 levels), a therapy directed toward a TLR (e.g., TLR7, TLR8, or TLR9), or a therapy directed toward one or more cytokines (e.g., IRF-5 mediated cytokines such as IFN-α, interleukin-1 receptor antagonist (IL-1RA), IL-6, monocyte chemoattractant protein-1 (MCP-1), macrophage inflammatory protein-1α (MIP-1α), MIP-1β, and TNF-α). In some embodiments, the mammal can be diagnosed with SLE. By "respond" is meant that one or more symptoms of SLE are reduced by any amount (e.g., reduced by 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 100%).
Symptoms of SLE include, for example, arthralgia/arthritis, muscle pain, avascular necrosis, osteoporosis, pericarditis, myocarditis, endocarditis, coronary artery problems, kidney problems, pleurisy, pneumonitis, chronic diffuse interstitial lung disease, pulmonary embolism, pulmonary hypertension, liver problems, lupus headache, seizures, CNS vasculitis, psychosis, mouth/nose ulcers, malar rash, discoid rash, hair loss, photosensitivity, hives, Raynaud's phenomenon, purpura, livedo reticularis, anemia, thrombocytopenia, leukopenia, fatigue, fever, weight loss/gain, eye problems, and gastrointestinal problems.

In some embodiments, a method that includes determining whether a mammal contains an IRF-5 variant can further include determining whether cytokine levels are increased in the mammal. For example, a method provided herein can include measuring the level of an IRF-5 mediated cytokine such as IFN-α, IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, and TNF-α. A biological sample from a mammal having an SLE risk haplotype (haplotype 1 as described herein) that is determined to have elevated levels of one or more cytokines can be a further indication that the mammal has SLE or is predisposed to develop SLE.

Any suitable method can be used to measure the level of a cytokine in a biological sample from a mammal. For example, a whole blood sample or a fraction of a blood sample (e.g., peripheral blood mononuclear cells; PBMC) from a mammal can be obtained, and the level of one or more cytokines in the sample can be determined.

When cytokine expression is evaluated, the expression level can be greater than or less than the average level observed in control cells obtained from control mammals. Typically, cytokines can be classified as being expressed at a level that is greater than or less than the average level observed in control cells if the expression levels differ by at least 1-fold (e.g., 1.5-fold, 2-fold, 3-fold, or more than 3-fold). In addition, the control cells typically are the same type of cells as those isolated from the mammal being evaluated. In some cases, the control cells can be isolated from one or more mammals that are from the same species as the mammal being evaluated. When diagnosing or predicting susceptibility to SLE, the control cells can be isolated from healthy mammals such as healthy humans who do not have SLE. Any number of control mammals can be used to obtain the control cells. For example, control cells can be obtained from one or more healthy mammals (e.g., at least 5, at least 10, at least 15, at least 20, or more than 20 control mammals).

Any suitable method can be used to determine whether or not a particular cytokine is expressed at a level that is greater or less than the average level of expression observed in control cells. As described above for IRF-5, for example, the level of expression of a cytokine such as TNF-α can be measured by assessing the level of TNF-α mRNA expression or by assessing polypeptide levels.

Agents targeted to IRF-5, TLRs, or cytokines such as those listed herein can be, for example, drugs, small molecules, antibodies or antibody fragments (such as Fab' fragments, F(ab')2 fragments, or scFv fragments), antisense oligonucleotides, interfering RNAs (RNAi), or combinations thereof.

Methods for producing antibodies and antibody fragments are known in the art. Chimeric antibodies and humanized antibodies made from non-human (e.g., mouse, rat, gerbil, or hamster) antibodies also may be useful. Chimeric and humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art, for example, using methods described in U.S. Patent Nos. 4,816,567; 5,482,856; 5,565,332; 6,054,297; and 6,808,901.

Antisense oligonucleotides typically are at least 8 nucleotides in length, and hybridize to an IRF-5, TLR, or cytokine transcript. For example, a nucleic acid can be about 8, 9, 10-20 (e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides in length), 15-20, 18-25, or 20-50 nucleotides in length. In other embodiments, antisense molecules can be used that are greater than 50 nucleotides in length. As used herein, the term "oligonucleotide" refers to an oligomer or polymer of ribonucleic acid (RNA) or deoxyribonucleic acid (DNA) or analogs thereof. Nucleic acid analogs can be modified at the base moiety, sugar moiety, or phosphate backbone to improve, for example, stability, hybridization, or solubility of a nucleic acid. Modifications at the base moiety include, without limitation, substitution of deoxyuridine for deoxythymidine, substitution of 5-methyl-2'-deoxycytidine or 5-bromo-2'-deoxycytidine for deoxycytidine, and any other suitable base substitution. Modifications of the sugar moiety can include, for example, modification of the 2' hydroxyl of the ribose sugar to form 2'-O-methyl or 2'-O-allyl sugars. The deoxyribose phosphate backbone can be modified to produce morpholino nucleic acids, in which each base moiety is linked to a six-membered morpholino ring, or peptide nucleic acids, in which the deoxyphosphate backbone is replaced by a pseudopeptide backbone (e.g., an aminoethylglycine backbone) and the four bases are retained. See, for example, Summerton and Weller (1997) Antisense Nucleic Acid Drug Dev. 7:187-195; and Hyrup et al. (1996) Bioorgan Med Chem 4:5-23. In addition, the deoxyphosphate backbone can be replaced with, for example, a phosphorothioate or phosphorodithioate backbone, a phosphoroamidite, or an alkyl phosphotriester backbone. See, for example, U.S. Patent Nos. 4,469,863; 5,235,033; 5,750,666; and 5,596,086 for methods of preparing oligonucleotides with modified backbones.

Methods for synthesizing antisense oligonucleotides are known in the art, including solid phase synthesis techniques. Equipment for such synthesis is commercially available from several vendors including, for example, Applied Biosystems (Foster City, CA). Alternatively, expression vectors that contain a regulatory element that directs production of an antisense transcript can be used to produce antisense molecules.

It is understood in the art that the sequence of an antisense oligonucleotide need not be 100% complementary to that of its target nucleic acid to be hybridizable under physiological conditions. Antisense oligonucleotides hybridize under physiological conditions when binding of the oligonucleotide to the native nucleic acid interferes with the normal function of the native nucleic acid, and non-specific binding to non-target sequences is minimal.

Target sites for antisense oligonucleotides include the regions encompassing the translation initiation or termination codon of the open reading frame (ORF) of the gene. In addition, the ORF has been targeted effectively in antisense technology, as have the 5' and 3' untranslated regions. Furthermore, antisense oligonucleotides have been successfully directed at intron regions and intron-exon junction regions. Further criteria can be applied to the design of antisense oligonucleotides. Such criteria are well known in the art, and are widely used, for example, in the design of oligonucleotide primers. These criteria include the lack of predicted secondary structure of a potential antisense oligonucleotide, an appropriate G and C nucleotide content (e.g., approximately 50%), and the absence of sequence motifs such as single nucleotide repeats (e.g., GGGG runs). The effectiveness of antisense oligonucleotides at modulating expression of a nucleic acid can be evaluated by measuring levels of the targeted mRNA or polypeptide (e.g., by Northern blotting, RT-PCR, Western blotting, ELISA, or immunohistochemical staining).
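Two of the design criteria named above (G and C content near 50% and the absence of single nucleotide repeats such as GGGG runs) can be screened with a short script. The following is only a sketch: the 40-60% GC window, the limit of three identical consecutive bases, and the function names are assumptions chosen for illustration, and secondary structure prediction is not attempted.

```python
import re

def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def passes_basic_criteria(seq, gc_min=0.4, gc_max=0.6, max_run=3, min_len=8):
    """Screen a candidate antisense sequence against simple criteria.

    gc_min, gc_max, and max_run are illustrative thresholds (assumptions).
    min_len reflects the stated minimum length of 8 nucleotides.
    """
    seq = seq.upper()
    if not re.fullmatch(r"[ACGTU]+", seq):
        return False          # unexpected characters
    if len(seq) < min_len:
        return False          # too short
    if not gc_min <= gc_fraction(seq) <= gc_max:
        return False          # GC content outside the window
    # Reject any single-nucleotide run longer than max_run (e.g., GGGG).
    if re.search(r"(.)\1{%d,}" % max_run, seq):
        return False
    return True
```

The example sequences used to exercise this filter are arbitrary and are not taken from IRF-5; any candidate that passes would still need to be evaluated experimentally as described above.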

Double-stranded interfering RNA (RNAi) homologous to IRF-5 or cytokine DNA also can be used to reduce expression and consequently, activity, of IRF-5 or cytokines. See, e.g., U.S. Patent No. 6,933,146; Fire et al. (1998) Nature 391:806-811; Romano and Masino (1992) Mol. Microbiol. 6:3343-3353; Cogoni et al. (1996) EMBO J. 15:3153-3163; Cogoni and Masino (1999) Nature 399:166-169; Misquitta and Paterson (1999) Proc. Natl. Acad. Sci. USA 96:1451-1456; and Kennerdell and Carthew (1998) Cell 95:1017-1026. Sense and anti-sense RNA strands of RNAi can be individually constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, each strand can be chemically synthesized using naturally occurring nucleotides or nucleic acid analogs. The sense or anti-sense strand also can be produced biologically using an expression vector into which a target sequence (full-length or a fragment) has been subcloned in a sense or anti-sense orientation. The sense and anti-sense RNA strands can be annealed in vitro before delivery of the dsRNA to cells. Alternatively, annealing can occur in vivo after the sense and anti-sense strands are sequentially delivered to the tumor vasculature or to tumor cells.

The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.

EXAMPLES

Example 1 - A common IRF-5 haplotype that regulates mRNA splicing and expression and is associated with increased genetic risk in human SLE

Materials and Methods

Clinical Samples: A U.S. Caucasian SLE family collection of 187 sib-pair and 223 trio pedigrees was recruited at the University of Minnesota. An additional 63 trios from the NIAMS-sponsored Lupus Multiplex Registry at Oklahoma Medical Research Foundation were included in the analysis. The overall U.S. family cohort was comprised of 681 SLE cases and 824 other family members. 459 probands from the U.S. family collection, 266 cases from the Hopkins Lupus Cohort, 41 controls from Minnesota, and 1393 controls of European ancestry from the New York Health Project (Mitchell et al. (2004) J Urban Health 81:301-310) collection were genotyped for the case/control analysis.

Three additional SLE case/control cohorts were studied. A cohort of 444 Spanish patients with SLE and 541 controls were collected in several clinics in the Andalucia region of Southern Spain. All individuals were of Spanish Caucasian ancestry. A second cohort of 284 SLE patients and 279 matched controls were collected through a multi-center collaboration in Argentina. Individuals were of Caucasian (72.5%) and mixed (20%) ancestry. Six percent were of Amerindian (n=1), Asian (n=2), or unknown ancestry (n=22). A third set of 208 ethnic Swedish patients and 254 controls from the Stockholm-Uppsala area were studied (no overlap with the previously published cases; Sigurdsson et al. (2005) Am. J. Hum. Genet. 76:528-537). All patients fulfilled the revised American College of Rheumatology criteria for SLE (Hochberg (1997) Arthritis Rheum. 40:1725). These studies were approved by the Human Subject Institutional Review Boards at each institution, and informed consent was obtained from all subjects.

Genotyping: Four polymorphisms from IRF5 (rs729302, rs2004640, rs752637, and rs2280714) were genotyped in the 470 families by primer extension of multiplex products with detection by matrix-assisted laser desorption ionization-time of flight mass spectroscopy using a Sequenom platform. Primer sequences were: rs729302 forward, 5'-AGCGGATAACAAATAGACCAGAGACCAGGG-3' (SEQ ID NO:1); rs729302 reverse, 5'-AGCGGATAACAAGTCTAAGTGAGTGGCAGG-3' (SEQ ID NO:2); rs729302 extension, 5'-ATGGGACAAGGTGAAGAC-3' (SEQ ID NO:3); rs2004640 forward, 5'-AGCGGATAACAGGCGCTTTGGAAGTCCCAG-3' (SEQ ID NO:4); rs2004640 reverse, 5'-AGCGGATAACATGAAGACTGGAGTAGGGCG-3' (SEQ ID NO:5); rs2004640 extension, 5'-CCCTGCTGTAGGCACCC-3' (SEQ ID NO:6); rs752637 forward, 5'-AGCGGATAACTCTAAAGGCCCTACTTTGGG-3' (SEQ ID NO:7); rs752637 reverse, 5'-AGCGGATAACAAAGGTGCCCAGAAAGAAGC-3' (SEQ ID NO:8); rs752637 extension, 5'-CTGACCCTGGGAGGAAGC-3' (SEQ ID NO:9); rs2280714 forward, 5'-AGCGGATAACCCATAAATTCTGACCCTGGC-3' (SEQ ID NO:10); rs2280714 reverse, 5'-AGCGGATAACAGGAGGAGTAAGCAAGGAAC-3' (SEQ ID NO:11); rs2280714 extension, 5'-TTCTGACCCTGGCAGGTCC-3' (SEQ ID NO:12). The average genotype completeness for the four assays was 98.3%. The genotyping consensus error rate was 0.7% (9 errors in Mendelian inheritance from 1288 parent-offspring transmissions; all errors were zeroed out). The typing of rs2280714 did not include the OMRF trios.

For the U.S. case-control studies, rs2004640 was typed by TaqMan in the Hopkins cases and in the MN and NYHP controls, and by Sequenom for all other samples. rs2004640 primers were: forward, 5'-CAGCTGCGCCTGGAAAG-3' (SEQ ID NO:13); reverse, 5'-GGGAGGCGCTTTGGAAGT-3' (SEQ ID NO:14); extension (VIC), 5'-TGTAGGCACCCCCCCG-3' (SEQ ID NO:15); extension (FAM), 5'-TGTAGGCACCCACCCG-3' (SEQ ID NO:16). Forty individuals were genotyped on both platforms with 100% concordance of results. Genotyping of rs2004640 was performed separately for the Spanish, Swedish and Argentina cases and controls. Briefly, these three sets were genotyped at the Rudbeck Laboratory in Uppsala using TaqMan assay-on-demand from ABI for rs2004640. The average genotype completeness was 99% for Swedish, 98% for Argentina and 86% for Spanish samples. rs752637 also was typed by TaqMan using the following primers: forward, 5'-GCAAAAGGTGCCCAGAAAGAAG-3' (SEQ ID NO:17); reverse, 5'-TCCCCTGTACCCTGGTCTTC-3' (SEQ ID NO:18); extension (VIC), 5'-CTTCTTTCAGCTTCCTC-3' (SEQ ID NO:19); and extension (FAM), 5'-TCTTTCGGCTTCCTC-3' (SEQ ID NO:20). rs2280714 was typed for the case-control studies on both the Sequenom platform and using a TaqMan assay (Rudbeck Laboratory). Over 1100 individuals were typed on both platforms with 98.2% concordance of results. The following samples were not typed for rs2280714: 63 OMRF trios, 96 Spanish SLE cases, 126 Swedish cases, and 161 Swedish controls. Hardy-Weinberg equilibrium P values for rs2004640 and rs2280714 for each population are presented in Table 6.
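A Hardy-Weinberg equilibrium P value for a biallelic SNP can be obtained in several ways, and the text does not state which test was used. As one possibility, the sketch below applies a chi-square goodness-of-fit test with 1 degree of freedom to genotype counts; the function name and the genotype counts used to exercise it are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP.
    Returns (chi-square statistic, P value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For small samples or rare alleles an exact test is often preferred over this approximation.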

Statistical Analysis: Family-based Association Analysis - The Transmission Disequilibrium Test (TDT) was performed using Haploview v3.2 (available on the World Wide Web at broad.mit.edu/mpg/haploview/) under default settings. Haploview v3.2 examines the transmission patterns of all complete trios within each pedigree. To assess the statistical significance of the results, the transmitted/untransmitted status of each genotype and haplotype was randomly permuted for 1,000,000 iterations and the best chi-square value generated for each permuted dataset was recorded. The number of times the permuted chi-square value exceeded the nominal chi-square value was divided by the number of iterations (1,000,000) to generate the permuted P value. The Pedigree Disequilibrium Test (PDT) was performed as described (Martin et al. (2000) Am. J. Hum. Genet. 67:146-154).

Case Control Analysis - χ² analysis was used to evaluate the significance of differences in genotype and allele frequencies in the case-control samples. The allele frequencies for cases and controls were used to calculate the Odds Ratio (OR) and the 95% confidence interval using Woolf's method (ln(OR) ± 1.96 × (1/A + 1/B + 1/C + 1/D)^0.5). The chi-square value was calculated from the 2×2 contingency tables and P values were determined using 1 degree of freedom.
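The case-control calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the cell labels (a, b for risk and other alleles in cases; c, d for risk and other alleles in controls) are assumptions, and the P value uses the standard identity for a chi-square distribution with 1 degree of freedom.

```python
import math


def woolf_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a, b: risk-allele and other-allele counts in cases (hypothetical labels)
    c, d: risk-allele and other-allele counts in controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR): (1/A + 1/B + 1/C + 1/D)^0.5
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and P value, 1 df."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up allele counts, for illustration only
    print(woolf_odds_ratio(60, 40, 50, 50))
    print(chi_square_2x2(60, 40, 50, 50))
```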

Meta Analysis - Published results of the association of rs2004640 with SLE in Finnish and Swedish collections (Sigurdsson et al. supra) were combined with results for rs2004640 in SLE cases collected in Argentina, Spain, Sweden and the United States using the Mantel-Haenszel meta-analysis of the odds ratios (ORs; Lohmueller et al. (2003) Nat. Genet. 33:177-182; Woolson and Bean (1982) Stat. Med. 1:37-39).
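As a rough illustration of the pooling step, the Mantel-Haenszel estimate of a common odds ratio across several 2×2 tables can be computed as shown below. The function name and the table layout (a, b, c, d as risk/other allele counts in cases, then controls) are assumptions made for this sketch; the confidence interval and P value reported in Table 1 are not reproduced here.

```python
def mantel_haenszel_or(tables):
    """Pooled odds ratio across strata.

    tables: iterable of (a, b, c, d) tuples, one per cohort, where
    a, b are risk/other allele counts in cases and c, d in controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n    # weight of the "concordant" product
        denominator += b * c / n  # weight of the "discordant" product
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator


if __name__ == "__main__":
    # Two made-up cohorts, for illustration only
    print(mantel_haenszel_or([(10, 10, 10, 10), (60, 40, 50, 50)]))
```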

Determination and quantification of IRF-5 UTR-specific transcripts: Total RNA from SLE patients carrying the various genotypes was purified from PBMCs with TRIZOL reagent (Invitrogen). 2 μg of total RNA was reverse transcribed with 2 U of MultiScribe transcriptase in the PCR buffer II containing 5 mM MgCl2, 1 mM dNTPs, 0.4 U of RNase inhibitor and 2.5 μM random hexamers (all results were confirmed using oligo-dT primed cDNA). All reagents were from Applied Biosystems. Synthesis was performed at 42°C for 45 minutes and the reaction was terminated at 95°C for 5 minutes. IRF-5 isoforms with distinct 5'-UTRs were quantified by real-time TaqMan-PCR on ABI PRISM 7700 Sequence Detector (Applied Biosystems) with SDS 1.9.1 software. Primers used to distinguish PCR products with different UTRs were: forward (A) Exon-1A-UTR: 5'-ACGCAGGCGCACCGCAGACA-3' (SEQ ID NO:21), (B) Exon-1B-UTR: 5'-AGCTGCGCCTGGAAAGCGAGC-3' (SEQ ID NO:22), (C) Exon-1C-UTR: 5'-AGGCGGCACTAGGCAGGTGCAAC-3' (SEQ ID NO:23), and a common reverse primer lying in exon 3, 5'-TCGTAGATCTTGTAGGGCTGAGGTGGCA-3' (SEQ ID NO:24). TaqMan probe labeled with FAM and TAMRA was 5'-CCATGAACCAGTCCATCCCAGTGGCTCCCACC-3' (SEQ ID NO:25). 45 or 52 cycles of two-step PCR were run in a buffer containing 1.5 mM MgCl2, 200 μM of each dNTP, 0.5 U of Platinum Taq polymerase (Invitrogen), primer-probe mix and cDNA. Extension/elongation was maintained at 65°C for 1 minute, while denaturation was at 95°C for 15 seconds. Expression levels were normalized using human β2-microglobulin with commercial primer-probe mix (Applied Biosystems).

Standard PCR amplification of diverse isoforms of IRF-5 was performed with the same forward primers as for the TaqMan assay with reverse primer designed so as to allow amplification of all transcripts containing exon 8: 5'-GAAACTTGATCTCCAGGTCGGTCA-3' (SEQ ID NO:26). Cycle conditions were: initial denaturation at 95°C for 3 minutes, followed by 40 cycles of denaturation at 95°C for 15 seconds, annealing at 60°C for 15 seconds and elongation at 72°C for 1.5 minutes. PCR was performed in a 25 μl reaction volume, with 0.5 U of Platinum Taq polymerase (Invitrogen) in the buffer supplied with enzyme. PCR products were electrophoresed on a 1.5% agarose gel.

The statistical analysis of isoform expression was performed using the t-test included in GraphPad Software (World Wide Web at graphpad.com).

Cloning and sequencing of IRF-5 isoforms: To isolate novel isoforms, total RNA isolated from human PBMCs of two rs2004640 TG SLE patients was subjected to RT-PCR with the same forward primers matching to Exon 1 used for the TaqMan RT-PCR assays, and a common reverse primer lying in the last exon: 5'-CTGAGAACATCTCCAGCAGCAG-3' (SEQ ID NO:27). PCR products were analyzed by gel electrophoresis and individual bands were cut out and purified. Sequencing was performed using the Big Dye reaction at the Uppsala Genome Center. Two novel transcripts named V10 and V11 were identified and deposited to GenBank under accession numbers DQ277633 and DQ277634, respectively.

IRF-5 expression analysis: Two IRF-5 region SNPs (rs2004640 and rs2280714) were genotyped using the Sequenom platform described above in 30 CEPH trios (CEU, 90 individuals) from the International Haplotype Map project (Altshuler et al. (2005) Nature 437:1299-1320) and the data were integrated into the Phase II data (HapMap data release #19) for 100 kb flanking IRF-5. In addition, three SNPs (rs726302, rs2004640, and rs2280714) were genotyped in the 233 CEPH individuals (14 extended pedigrees, including 21 trios that are part of the HapMap CEU samples, and 38 unrelated individuals) described in Morley et al. ((2004) Nature 430:743-747), using a Sequenom platform. Linear regression (R statistical package) was used to test the significance of association of genetic variants to IRF-5 expression levels using publicly available gene expression data (GEO accession number GSE1485, IRF-5 probe 205469_s_at; Morley et al. supra) in the 233 CEPH individuals, subdivided by (a) 42 unrelated founders included in the HapMap CEPH (CEU) population, (b) 92 unrelated individuals, and (c) all 233 individuals. Gene expression data were also obtained from the PBMCs of 37 SLE cases (Affymetrix U95A chips, IRF5 probe set 36465_at; Baechler et al. supra) and from PaxGene RNA from whole blood of 41 independent Caucasian SLE cases (Affymetrix 133A chips, IRF5 probe set 205469_s_at).
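The genotype-to-expression regression can be pictured with the sketch below, which fits expression against allele dosage (0, 1 or 2 copies of the tested allele) by ordinary least squares. The analysis in the text was run in R; this pure-Python version, its function name, and the additive dosage coding are assumptions for illustration, and it reports only the slope, intercept and R² rather than a P value.

```python
def regress_expression_on_dosage(dosages, expression):
    """Ordinary least squares fit of expression = intercept + slope * dosage.

    dosages: copies of the tested allele per individual (0, 1 or 2)
    expression: expression value per individual, in the same order
    """
    n = len(dosages)
    if n != len(expression) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0:
        raise ValueError("all individuals have the same genotype")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Fraction of expression variance explained by genotype
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Made-up dosages and expression values, for illustration only
    print(regress_expression_on_dosage([0, 1, 2, 0, 1, 2],
                                       [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]))
```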

Enrichment of the IRF-5 rs2004640 T allele in SLE

Four sets of SLE cases and controls from the United States, Spain, Sweden and Argentina (total of 1,661 cases and 2,508 controls) were genotyped, and association of the IRF-5 rs2004640 T allele was assessed using a standard case-control study design. In all sets, a significant enrichment of the T allele was observed in SLE cases as compared to matched controls (overall 60.4% in cases vs. 51.5% in controls, P = 4.4 x 10^-16; Table 1). The frequency of the T allele was lower in the Argentine sample, possibly due to the mixed ethnicity of the individuals studied (see Example 1). Importantly, in a subset of 470 cases from the U.S. for which family members were available, a family-based association ruled out the possibility that stratification could explain the results (P = 0.0006, Table 3).

When all available case/control data were examined (four independent cohorts described here, together with the two published cohorts from Sweden and Finland; Sigurdsson et al., supra), robust and consistent association of the rs2004640 T allele with SLE was observed, with individual odds ratios (OR) ranging between 1.31 and 1.84 (Table 1). Using the Mantel-Haenszel method for meta-analysis of ORs, the pooled OR for the rs2004640 SNP T allele was found to be 1.47 (1.36-1.60), with an overall P = 4.2 x 10^-21 (Table 1). A single copy of the rs2004640 T allele was found in 45% of cases and conferred modest risk (pooled OR = 1.27, P = 0.0031), while the 38% of cases homozygous for the T allele are at a greater risk for SLE (pooled OR = 2.01, P = 3.7 x 10^-14; Table 2). Based on these results, dominant and recessive models of inheritance can be formally rejected, and the likely mode of inheritance is additive or multiplicative. Thus, the evidence for association of the T allele of rs2004640 is highly significant, well surpassing even correction for testing all common variants in the human genome.

Functional consequences of the IRF-5 rs2004640 T allele

Given the convincing data for association of IRF-5 with SLE risk, the potential functional consequences of the rs2004640 T allele were investigated. Examination of the genomic sequence of IRF-5 revealed that the rs2004640 T allele is located two bp downstream of the intron/exon border of exon-1B, creating a consensus GT donor splice site (Figure 1a). Thus, studies were conducted to determine if rs2004640 influenced IRF-5 splicing, which is highly complex (Mancl et al. (2005) J. Biol. Chem. 280:21078-21090). IRF-5 transcripts are initiated at one of three promoters, giving rise to transcripts containing exon-1A, exon-1B or exon-1C (Figure 1a). Transcripts initiated at exon-1A and exon-1B are constitutively expressed in plasmacytoid dendritic cells and B cells, while exon-1C bearing transcripts are inducible by type-I IFNs. In addition, multiple IRF-5 isoforms are initiated at each promoter, with 9 previously identified isoforms (V1-V9, Figure 1a).

To determine whether rs2004640 affected expression of IRF-5 transcripts bearing exon-1B, PBMCs were isolated from individuals carrying GG, GT or TT rs2004640 genotypes, and first strand cDNA was synthesized. Using specific primers to detect transcripts associated with each of the three exon 1 variants, it was observed that SLE patients and controls homozygous for the G allele expressed IRF-5 isoforms containing exon-1A and exon-1C, but not exon-1B. In contrast, individuals homozygous or heterozygous for the T allele expressed exon-1B, as well as both exon-1A and exon-1C, containing transcripts. TaqMan PCR assays clearly documented that exon-1B transcripts were only detectable in the presence of the rs2004640 T allele (Figure 1b). In all samples studied (N=20), exon-1A containing transcripts were more abundant than the other mRNA classes. Based on the above, it is apparent that only individuals with the rs2004640 T allele will express the multiple isoforms of IRF-5 initiated at exon-1B.

Given the association of the rs2004640 T allele to SLE and the fact that only individuals carrying the SNP express IRF-5 exon-1B transcripts, further studies were conducted to obtain additional IRF-5 isoforms. Two novel isoforms of IRF-5 were cloned from the peripheral blood mRNA of rs2004640 heterozygote donors: V10, which utilizes exon-1B and has an in-frame deletion of 30 nt at the beginning of exon 7, and a predicted protein 10 amino acids shorter than V2; and V11, a transcript derived from exon-1C with a 28 bp deletion of exon 3, predicted to encode a truncated protein translated from an alternate reading frame (Figure 1a). Several of these isoforms, including isoforms initiated at exon-1B, contain splicing variation in and around exon 6, which encodes part of an extended PEST domain. PEST domains are highly enriched for proline, glutamic acid, serine and threonine, and can be associated with control of protein stability. Several unique and constitutively expressed IRF-5 isoforms are initiated at exon-1B, and these isoforms may influence the function of IRF-5 or the transcriptional profile of IRF-5 target genes.

Association between elevated IRF-5 expression and the exon-1B splice site

Experiments were conducted to determine whether elevated expression of IRF-5 might be associated with the exon-1B splice site, using a common variant near IRF-5 that is one of the polymorphisms most strongly associated with variation in gene expression (Morley et al. (2004) Nature 430:743-747; and Cheung et al. (2005) Nature 437:1365-1369). This variant, the rs2280714 T allele, is about 5 kb downstream of IRF-5, and has been identified as being, or being in strong linkage disequilibrium (LD) with, a cis-acting determinant of IRF-5 expression.

The relationship between rs2004640 and rs2280714 was evaluated in 30 independent CEPH trios from the HapMap project. D' for the two SNPs is 0.96; i.e., nearly all copies of the splice site rs2004640 T allele are on haplotypes bearing the rs2280714 T allele. However, r² for these SNPs is only 0.66, since the downstream rs2280714 T allele is also found on haplotypes that lack the splice site rs2004640 T allele (see Table 3 and Figure 4). While these two SNPs are strongly linked, the fact that the 3' rs2280714 T allele can be observed in the absence of the upstream splice site SNP allowed determination of which variant is the best predictor of IRF-5 expression and also SLE risk.
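The contrast between a high D' and a lower r² can be illustrated with the standard two-locus definitions. In the sketch below the function name and the haplotype frequencies are hypothetical and are not taken from Table 3; the example simply shows how D' reaches 1 when one of the four haplotypes is absent, while r² stays well below 1 because the two alleles have different frequencies.

```python
def linkage_disequilibrium(p_ab, p_a_only, p_b_only, p_neither):
    """Return (D', r^2) from the four haplotype frequencies of two SNPs.

    p_ab:      frequency of haplotypes carrying allele A and allele B
    p_a_only:  frequency of haplotypes carrying A but not B
    p_b_only:  frequency of haplotypes carrying B but not A
    p_neither: frequency of haplotypes carrying neither allele
    """
    total = p_ab + p_a_only + p_b_only + p_neither
    if abs(total - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    p_a = p_ab + p_a_only
    p_b = p_ab + p_b_only
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies: allele A never occurs without allele B,
    # but allele B also occurs on haplotypes lacking allele A.
    print(linkage_disequilibrium(0.5, 0.0, 0.2, 0.3))
```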

The association of IRF-5 expression to the two SNPs was tested in expression data from EBV-transformed B cells of CEPH family members, and from peripheral blood cells of two independent sets of SLE cases. The rs2004640 and rs2280714 alleles were genotyped in 233 CEPH individuals, used for a genome-wide survey of determinants of gene expression (Morley et al. supra), and examined for association to IRF-5 expression. The T alleles of both rs2004640 and rs2280714 were found to be associated with higher levels of IRF-5 mRNA expression (Figure 2). However, the rs2280714 T allele was a better predictor of IRF-5 overexpression in 92 unrelated individuals than the rs2004640 T allele (P = 2 x 10^-16 vs. P = 5.3 x 10^-11, respectively), and in the full data set of 233 individuals, consisting of 14 extended pedigrees and 38 unrelated individuals (Figure 2). Similar findings were observed in the peripheral blood cells of two independent groups of SLE cases (Figures 3a and 3b). Based on these data, the hypothesis that the splice site rs2004640 SNP is the cis-acting variant controlling expression can be rejected, since rs2280714 remains significantly associated with IRF-5 expression (P = 4.7 x 10^-7) after logistic regression conditional on rs2004640, whereas rs2004640 no longer remains significant after controlling for rs2280714. Using phase II HapMap genotype data (~5 million SNPs across the genome), all available variants (including rs2004640 and rs2280714) within 100 kb of IRF-5 were tested for association to IRF-5 expression in EBV-transformed B cells from 42 unrelated individuals from the HapMap CEPH (CEU) population. The rs2280714 variant and 4 polymorphisms that are perfect proxies of rs2280714 (r² = 1.0) are the most strongly associated with IRF-5 gene expression (P = 1.0 x 10^-10, Table 4). Given that these variants are well downstream of IRF-5, and that they do not lie in a recognizable regulatory region, there may be additional genetic variation in tight LD with rs2280714 that drives the expression phenotype.

Association of IRF-5 with SLE

Studies were conducted to determine whether over-expression of IRF-5 (rs2280714), the presence of exon-1B initiated IRF-5 isoforms (rs2004640), or both, are associated with SLE. The fact that ~14% of IRF-5 haplotypes are associated with over-expression, but lack the exon-1B splice site, allows the opportunity to test whether the allele associated with overexpression (rs2280714) is independently associated with SLE (Table 3). Indeed, in 470 SLE pedigrees, only haplotypes bearing the exon-1B splice site (rs2004640 T allele) show over-transmission using the transmission disequilibrium test (208:149 T:U, P = 0.0021; Table 3). Haplotypes associated with over-expression of IRF-5 (rs2280714 T allele) but lacking the exon-1B splice site show no evidence for risk to SLE (70:108 T:U; Table 3). Supporting the family-based analysis, there was no difference observed in the frequency of the rs2004640/rs2280714 'G/T' haplotype between SLE cases (n = 1358, 13%) and controls (n = 2278, 15%; P = 0.98; Table 5). Additionally, rs2280714 was not significantly associated with SLE in the case-control analysis after logistic regression conditional on rs2004640 (P = 0.22). Thus, over-expression of IRF-5 in the absence of the exon-1B splice site does not confer risk to SLE.
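For readers unfamiliar with the transmission disequilibrium test, the textbook statistic compares transmitted (T) and untransmitted (U) counts from heterozygous parents as (T - U)² / (T + U), referred to a chi-square distribution with 1 degree of freedom. The sketch below implements only that textbook form; the function name is an assumption, and because the values in Table 3 were produced by Haploview, the asymptotic P value computed here need not match the reported figure exactly.

```python
import math


def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic and asymptotic P value (1 df)."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Transmission counts quoted in the text for the exon-1B haplotypes
    print(tdt(208, 149))
```

The statistic is symmetric in T and U, so the direction of the imbalance (over- or under-transmission) has to be read from the counts themselves.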

Identification of the cis-acting variant linked to IRF-5 over-expression

Additional studies were conducted to determine whether the rs10954213 A allele is the cis-acting variant that causes IRF-5 over-expression, and whether the presence of this variant augments the risk to SLE conferred by the exon-1B splice site. The presence of the rs10954213 A allele results in a "short form" IRF-5 mRNA having a truncated 3' UTR, as compared to the "long form" IRF-5 mRNA that is produced when an rs10954213 G allele is present. To measure mRNA expression, specific primers were used to amplify the short form IRF-5 isoform, the long form IRF-5 isoform, or both isoforms in samples from individuals homozygous or heterozygous for the rs10954213 A allele, as well as individuals homozygous for the rs10954213 G allele. As shown in Figure 5, the short form was predominantly expressed in individuals having the rs10954213 A allele. Expression was significantly greater in homozygous individuals than in heterozygous individuals. The presence of an rs10954213 A allele did not preclude expression of the long form, but levels of the long form were significantly less than levels of the short form, particularly in individuals homozygous for the rs10954213 A allele. Further, the overall level of IRF-5 expression was significantly greater in individuals homozygous for the rs10954213 A allele than in individuals heterozygous for the allele (Figure 5). In turn, the overall level of IRF-5 expression was significantly greater in individuals heterozygous for the rs10954213 A allele than in individuals homozygous for the rs10954213 G allele. Thus, the rs10954213 A allele is linked to increased expression levels of IRF-5.

Genetic analysis of IRF-5 haplotypes demonstrated that the presence of a short-form (rs10954213 A) allele does not confer significant risk for SLE unless an Exon-1B (rs2004640 T) allele also is present (Table 7). Further genetic analysis demonstrated that the presence of a short-form allele augments the risk conferred by the presence of an Exon-1B allele. As presented in Table 8, haplotypes are indicated such that the first letter represents the rs2004640 SNP and the second letter represents the rs10954213 SNP. "Hap1" and "Hap2" represent the two haplotypes present in each group of individuals. Thus, the first row of Table 8 contains data for individuals homozygous for the rs2004640 T allele and the rs10954213 A allele, whereas the second row of Table 8 contains data for individuals homozygous for the rs2004640 T allele and heterozygous for the rs10954213 A allele. "2X" and "1X" thus refer to the number of copies of the risk alleles at each SNP. These data show that having the short-form allele augments the risk that is conferred by having the Exon-1B transcripts. The data also suggest that having the Exon-1B isoforms does not confer risk to SLE in the absence of the short-form allele, although those combinations of haplotypes (TG/TG and TG/GG) are relatively rare.

Cytokine secretion in response to TLR and IFN signaling

PBMCs (~1 x 10^6 cells/ml) were collected from normal donors with various IRF-5 genotypes at the rs2004640 and rs10954213 alleles. Specifically, cells were collected from four donors having a TT/AA haplotype (i.e., homozygous for the rs2004640 T allele and homozygous for the rs10954213 A allele), and three donors having a GG/GG haplotype (i.e., homozygous for the rs2004640 G allele and homozygous for the rs10954213 G allele). Cells were stimulated with optimal concentrations of TLR7 ligand (R848), IFN-α, or CpG oligos. Controls were treated with phosphate buffered saline (PBS). Luminex assays were used to measure levels of various cytokines secreted after 6 hours of stimulation. Specifically, levels of IL-1RA, IL-6, MCP-1, MIP-1α, MIP-1β, and TNF-α were measured using a Luminex xMAP system (Luminex Corp., Austin, TX). As shown in Table 9, cells harvested from individuals having a TT/AA haplotype secreted higher levels of the various cytokines in response to TLR and IFN signaling.

Taken together, the data presented herein confirm the association of IRF-5 to SLE, and identify the IRF-5 risk haplotype as the strongest genetic effect outside the HLA yet discovered in this disease. There are three functional variants within IRF-5: the rs2004640 T allele provides a splice donor site that allows expression of multiple IRF-5 isoforms containing exon-1B, while rs2280714 and its proxies, as well as rs10954213, are associated with elevated IRF-5 expression. The IRF-5 exon-1B isoforms are strongly linked to elevated expression of IRF-5 and to risk of SLE; over-expression of IRF-5 in the absence of exon-1B isoforms does not confer risk. Thus, over-expression of exon-1B transcripts may augment the risk to SLE.

Attorney Docket No.: 09531-248WO1

Table 1. Case/control association analysis of rs2004640 T allele with SLE

a Number of individuals b Number of T alleles of rs2004640 c Number of G alleles of rs2004640 d Odds ratio and 95% confidence intervals e Mantel-Haenszel test of pooled odds ratios and 95% confidence intervals

1 Data from Sigurdsson et al.

Table 2: Genotypic association of rs2004640 with SLE

Table 3: TDT analysis of IRF-5 in 467 U.S. SLE Caucasian pedigrees

a Frequency in parental chromosomes b Transmitted and untransmitted chromosomes, and the transmission ratio (T/U) c P value, uncorrected for multiple tests d P value from 1,000,000 random iterations of the genotype data, as described in methods e Haplotype consisting of markers rs729302, rs2004640, rs752637, rs2280714 f Haplotypes carrying "T" or "G" allele of rs2004640

Table 4: Association of HapMap phase II variants to IRF-5 expression levels

a HapMap Phase II markers with P < 1.0 x 10^-9 are shown, in addition to the results for IRF-5 region markers genotyped in the SLE families (rs729302, rs2004640, rs752637) b Position in HG16 (Build 34). c Minor Allele Frequency in HapMap CEPH (CEU) population. d Correlation to rs2280714. e P calculated using conditional linear regression, testing variants for association to IRF-5 expression in EBV-transformed B cells from CEPH individuals.

Table 5: IRF-5 haplotype frequency in SLE cases and controls

a Haplotype of rs2004640 and rs2280714, phased using Haploview software. Only samples with complete genotype data were analyzed. b P value, uncorrected for multiple tests, 1 degree of freedom c Number of individuals d Pooled P value from Mantel-Haenszel test of pooled odds ratios

Table 6. Hardy-Weinberg equilibrium expectation test in control samples

a P value for deviation from genotype frequencies predicted under Hardy-Weinberg Equilibrium expectations

Table 7: Genetic analysis of IRF-5 haplotypes

Table 8: Genetic analysis of IRF-5 haplotypes

Table 9: In vitro stimulation of PBMC with TLR7 ligand, IFNα, or CpG oligos

Example 2 - Three Functional Variants of IRF-5 Define Risk and Protective Haplotypes for Human Lupus

Resequencing and genotyping in patients with SLE revealed evidence for three functional alleles of IRF5: the exon 1B splice site variant described above, a novel 30 bp in-frame insertion/deletion (indel) variant of exon 6 that alters a PEST domain region, and a novel variant in a conserved polyA+ signal sequence that alters the length of the 3' UTR and stability of IRF5 mRNAs. Haplotypes of these three variants define at least three distinct levels of risk to SLE.

Materials and Methods

Whole blood donors and cell lines. Whole blood cells were collected from 5 healthy self-described European-ancestry donors who have the TT/AA genotype (rs2004640/rs10954213), 5 donors who have TG/AG genotype and 4 donors who have GG/GG genotype, and were used for quantitative PCR analyses. In addition, Epstein-Barr virus (EBV) infected immortalized B lymphocyte cell lines from CEPH family members were obtained from the Coriell Cell Repository and genotyped for rs2004640 and rs10954213. Three cell lines each for the TT/AA genotype (GM12239, GM12154, and GM12761), the TG/AG genotype (GM7034, GM7345, and GM11881), and the GG/GG genotype (GM12145, GM7000, and GM12155) were selected for Northern, qPCR and Western analyses. CEPH cells were cultured in RPMI 1640 medium (Cellgro) supplemented with 2 mM L-glutamine and 15% fetal bovine serum at 37°C in a humidified chamber with 5% CO2. Tet-off 293 cells were purchased from BD Biosciences and were cultured in Eagle Minimum Essential Media (Invitrogen Life Technologies) with 10% FBS, 4 mM L-glutamine, 100 units/ml penicillin G and 100 μg/ml streptomycin.

RNA extraction and cDNA synthesis. Whole blood total RNA was extracted from healthy donors using RNeasy® Mini Kits (Qiagen). Poly-A+ RNA was extracted from CEPH cell lines using FastTrack® 2.0 Kits (Invitrogen). First-strand cDNAs were synthesized from RNAs using SuperScript II reverse transcriptase (Invitrogen) with Oligo(dT) 12-18 primers (Invitrogen).

Quantitative PCR. Expression of IRF5 mRNA was quantified by real-time PCR with TaqMan assays using an ABI PRISM 7900HT Sequence Detector (Applied Biosystems). Primers and probes used to distinguish short form 3' UTRs, long form 3' UTRs, and all 3' UTRs are listed in Table 10. A TaqMan® Gene Expression Assay (Applied Biosystems) was used for glyceraldehyde-3-phosphate dehydrogenase (GAPDH). Fifty-five cycles of two-step PCR (95°C for 15 seconds and 60°C for 1 minute) were carried out for common primer and probe sets and GAPDH, and 55 cycles of three-step PCR (95°C for 15 seconds, 48°C for 15 seconds, and 60°C for 40 seconds) were carried out for the short and long form IRF5 assays. PCR reaction mixtures contained 10 ng of cDNA from total RNAs or 2 ng of cDNA from poly-A+ RNAs, 1X TaqMan Universal PCR Master mix (Applied Biosystems), 1 μM each of forward and reverse primers, and 250 nM of TaqMan® MGB Probe (Applied Biosystems). Expression levels were normalized to GAPDH expression.
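The text does not state how the GAPDH normalization was computed. One common convention, shown here purely as an assumed illustration, is the comparative Ct approach, in which relative expression is 2 raised to the negative difference between the target and reference threshold cycles. The function name and the assumption of roughly 100% amplification efficiency for both assays are not from the source.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression of a target versus a reference gene.

    Assumes the comparative Ct convention (2 ** -dCt) and equal,
    near-100% amplification efficiency for both assays.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


if __name__ == "__main__":
    # Made-up Ct values: target crosses threshold 5 cycles after GAPDH
    print(relative_expression(25.0, 20.0))
```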

Northern Blotting. 0.5 μg of poly-A+ RNA from CEPH cell lines was analyzed by Northern blotting. Poly-A+ RNA was denatured with glyoxal/dimethylsulfoxide (DMSO) sample dye (NorthernMax-Gly Based system, Ambion), resolved on 1.2% agarose gels, and blotted onto BrightStar-Plus Nylon membranes (Ambion). Membranes were crosslinked with UV and hybridized for 16-18 hours with a 32P-labeled probe from the IRF5 proximal 3' UTR region and with a control GAPDH probe. Probes were generated by random primed DNA labeling using a DECAprime II kit (Ambion). Following stringent washes, membranes were exposed to a PhosphorImager® screen overnight and relative RNA levels were assessed using PhosphorImager® software (Molecular Dynamics (Sunnyvale, CA)). Total RNA was isolated from transfected Tet-off 293 cells, and probed with a radiolabeled cDNA fragment of beta-globin and GFP.

Western blotting. 1.5 x 10^7 cells from each of the CEU cell lines were solubilized using 0.6 ml of 1% SDS lysis buffer (150 mM NaCl, 50 mM Tris-HCl, pH 7.5) containing Complete Mini Protease Inhibitor (Roche). Cells were sheared through a 26G needle and incubated on ice for 30 minutes. The lysate was immediately centrifuged for 10 minutes at 14000 rpm and 4°C, and the supernatant was used for subsequent SDS-PAGE and Western blot analyses. Lysates were resolved on 12% SDS-polyacrylamide gels (Invitrogen) and transferred under semi-dry conditions onto polyvinylidene difluoride (PVDF) membrane using Semi-Dry Electroblot Buffer Kit (Owl). Membranes were blocked using Tris Buffered Saline (TBS) containing 0.1% Tween 20 (TBS-T) and 5% non-fat dry milk for 1 hour at room temperature, or overnight at 4°C. All washing stages were carried out using TBS-T. Blots were incubated for 1 hour at room temperature with a 1:2000 dilution of mouse monoclonal anti-IRF5 antibody (M03; Abnova Corp., Taipei City, Taiwan), or a 1:1000 dilution of goat polyclonal anti-IRF5 antibody (ab2932; Abcam Inc., Cambridge, MA). Signals were detected using horseradish peroxidase (HRP) conjugated secondary Abs (1:2000 dilution of rabbit anti-mouse/goat IgG; Zymed Laboratories, Inc., South San Francisco, CA), and ECL chemiluminescence system (Amersham). Membranes also were reprobed with a 1:5000 dilution of rabbit polyclonal anti-GAPDH antibody (sc-154; Santa Cruz Biotechnology, Santa Cruz, CA) and a 1:10000 dilution of goat anti-rabbit IgG HRP conjugate (Zymed).

Transient Transfection and mRNA Decay Assay. Tet-Off 293 cells (1.6 x 10^6 cells/mL) were transfected with 3.0 μg of Tet-responsive reporter constructs that encoded chimeric rabbit beta-globin transcripts linked to the 3' UTR of IRF5 that contained either the A or G allele of rs10954213 and with 1 μg of the pTracer-EF/V5-His/lacZ construct (Invitrogen Life Technologies), which produces GFP, to control for transfection efficiency. Transfections were performed with 2.5 U of TransIT-293 reagent (Mirus, Madison, WI) per μg of plasmid DNA. After 48 hours, 300 ng/ml of doxycycline was added to stop transcription from the Tet-off constructs. Total RNA was isolated at 0, 1, 3 and 6 hours following doxycycline treatment using the TRIzol® reagent (Invitrogen Life Technologies), RNA was further purified and DNase treated using the RNeasy Mini kit (Qiagen) according to the manufacturer's instructions, and Northern blots were performed. The hybridization intensity of each chimeric beta-globin:IRF5 transcript was normalized to the hybridization intensity of the GFP transcript, and the normalized values were used to calculate transcript half-lives.
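The text does not specify how half-lives were derived from the normalized intensities. A common approach, sketched here under the assumption of first-order (single-exponential) decay, is to fit a straight line to the natural log of the normalized signal against time and convert the slope to a half-life. The function name and the example values are hypothetical.

```python
import math


def transcript_half_life(hours, normalized_intensity):
    """Half-life (in hours) from a log-linear least squares fit.

    Assumes first-order decay: intensity(t) = intensity(0) * exp(-k * t),
    so half-life = ln(2) / k.
    """
    if len(hours) != len(normalized_intensity) or len(hours) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(v) for v in normalized_intensity]
    n = len(hours)
    mean_t = sum(hours) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in hours)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(hours, logs))
    slope = stl / stt  # equals -k when the transcript decays
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / (-slope)


if __name__ == "__main__":
    # Made-up normalized intensities at the 0, 1, 3 and 6 hour time points
    print(transcript_half_life([0, 1, 3, 6], [1.0, 0.62, 0.25, 0.06]))
```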

Clinical Samples. A collection of family samples of European descent consisting of 555 pedigrees was recruited at the University of Minnesota and at Imperial College, UK (Gaffney et al. (1998) Proc. Natl. Acad. Sci. USA 95:14875-14879; Gaffney et al. (2000) Am. J. Hum. Genet. 66:547-556; Graham et al. (2006) Hum. Mol. Genet. 15:3195-3205; and Graham et al. (2001) Arthritis Res. 3:299-305). The following independent European descent case/control populations were studied: 173 unrelated SLE cases from the University of Minnesota, 55 unrelated SLE cases from Imperial College in London, UK, 540 cases from the UCSF Lupus Genetics Project collection (Parsa et al. (2002) Genes Immunol. 3 Suppl. 1:S42-S46), and 1439 controls from the NYCP project (Mitchell et al. (2004) J. Urban Health 81:301-310). The study also included 338 SLE patients from Sweden, 213 of them recruited at the Karolinska Hospital in Stockholm (Svenungsson et al. (2003) Arthritis Rheum. 48:2533-2540) and 125 at Uppsala University Hospital (Sigurdsson et al. (2005) Am. J. Hum. Genet. 76:528-537), with 363 healthy, age- and sex-matched controls from the same geographical regions as the SLE patients. The SLE patients fulfilled the American College of Rheumatology revised criteria for SLE (Tan et al. (1982) Arthritis Rheum. 25:1271-1277). In addition, 270 samples from the International Haplotype Map Consortium ((2005) Nature 437:1299-1320) and 233 CEPH individuals (14 extended pedigrees, including 21 trios that are part of the HapMap CEU samples, and 38 unrelated individuals) described in Morley et al. ((2004) Nature 430:743-747) were genotyped for IRF5 region markers.

Resequencing and genotyping. IRF5 was resequenced in 8 controls and 40 SLE cases collected at Uppsala, Sweden using 23 PCR fragments that covered 1 kb upstream of exon 1A, and all exons and introns. In addition, all exons of IRF5 and 1 kb upstream of exon 1A were resequenced in 96 SLE cases of European descent from the Minnesota SLE cohort. Bidirectional sequencing was conducted using an ABI 3700 and standard methodology. Polymorphisms were identified using Sequencher (Gene Codes Corp) or SNPcompare (de Bakker et al. (2005) Nat. Genet. 37:1217-1223), an algorithm that assigns a confidence score to putative SNPs. All putative SNPs were manually verified by examining the traces. All exonic SNPs and SNPs seen in 2 or more samples were validated in the HapMap CEU population.

In the Swedish samples the SNPs were genotyped at the SNP technology platform in Uppsala (available on the World Wide Web at genotyping.se) by multiplex, fluorescent single-base extension using the SNPstream system (Beckman Coulter), with the exception of SNP rs4728142, which was typed by homogeneous fluorescent single-base extension with detection by fluorescence polarization (Analyst AD, Molecular Probes). The exon 6 deletion was amplified as a 145 bp or 115 bp PCR fragment with primers located in exon 6, and the amplified fragments were separated on 2% agarose gels. The genotype call rate was on average 97.2%, and the accuracy estimated from 5156 genotype comparisons between repeated assays (61% of the genotypes) was 99.3%. The genotypes conformed to Hardy-Weinberg equilibrium (Fisher's exact test, P > 0.01). Fragment analysis and the sequencing runs for the Swedish samples were performed by the core facility of the Rudbeck Laboratory in Uppsala, Sweden.

Genotype data in the MN and UK samples were generated using iPLEX and hME chemistries on the Sequenom platform (see Table 10 for assay information). The following quality standards were applied: no more than 1 Mendel error per 100 trios, HWE P > 0.001, genotyping completeness > 95%, and samples with < 75% genotyping were excluded from the analysis. The exon 6 deletion was genotyped by amplifying the region using primers listed in Table 10 at an annealing temperature of 63°C. Fragments were separated using a 4% agarose gel (E-Gel 48, Invitrogen). All allele calls were made independently by two individuals blinded to sample ID.
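The marker-level quality standards above can be sketched as a simple filter. This is a minimal illustration only: the function names are hypothetical, and a one-degree-of-freedom chi-square test is assumed for Hardy-Weinberg equilibrium, since the text does not state which HWE test was used for the MN and UK samples.

```python
import math

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) P value for departure from Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used here for illustration; the source
    does not specify the test applied to the MN and UK genotypes.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))

def marker_passes_qc(n_aa, n_ab, n_bb, n_missing,
                     min_hwe_p=0.001, min_completeness=0.95):
    """Apply the HWE P > 0.001 and completeness > 95% thresholds from the text."""
    called = n_aa + n_ab + n_bb
    completeness = called / (called + n_missing)
    return completeness > min_completeness and hwe_chi2_p(n_aa, n_ab, n_bb) > min_hwe_p
```

The Mendel-error criterion and the per-sample 75% threshold would need pedigree and per-sample data and are not covered by this sketch.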

Expression analysis in EBV cell lines. Normalized IRF5 mRNA expression levels were obtained from data made available by the GENEVAR project at the Sanger Centre from EBV-transformed B cells derived from the 270 HapMap samples (IRF5 exon 9 probe GI_38683858-A). In addition, IRF5 expression values (probeset 205469_s_at) were obtained from a dataset of 233 CEPH EBV-transformed B cell lines (Cheung et al. (2005) Nature 437:1365-1369; GEO accession number GSE1485). Association of genotype to IRF5 expression levels and conditional logistic regression analyses were conducted using WHAP (available online at pngu.mgh.harvard.edu/purcell//whap).

Association analysis. Family-based and case/control association analyses, including permutation testing, were conducted using Haploview v3.3 (Barrett et al. (2005) Bioinformatics 21:263-265). Single marker association results for the population-based cohorts are shown in Table 11. Conditional logistic regression analyses of single markers and haplotypes were performed using the WHAP software program. Haplotypic association results in the family-based US and UK cohort, the case-control cohort collected in the US and UK, and the Swedish case-control cohort were combined using the Mantel-Haenszel meta-analysis of the odds ratios (ORs) (Lohmueller et al. (2003) Nat. Genet. 33:177-182; and Woolson and Bean (1982) Stat. Med. 1:37-39).
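For illustration, the Mantel-Haenszel pooled odds ratio across cohorts can be sketched as follows. The function name and the counts in the usage note are hypothetical; each cohort is assumed to be summarized as a 2x2 table of haplotype counts, which is a simplification of how the family-based data would be handled in practice.

```python
def mantel_haenszel_or(tables):
    """Pooled odds ratio over strata (cohorts).

    Each table is (a, b, c, d):
      a = case chromosomes carrying the haplotype
      b = case chromosomes not carrying it
      c = control chromosomes carrying the haplotype
      d = control chromosomes not carrying it
    OR_MH = sum(a*d/n) / sum(b*c/n), with n the stratum total.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator
```

With two hypothetical strata, mantel_haenszel_or([(10, 20, 5, 40), (20, 10, 10, 20)]) pools the stratum-specific odds ratios into a single estimate, weighting each stratum by its size.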

Expression Analysis in Whole Blood. Total RNA was isolated from whole blood drawn into PAXgene tubes from 38 independent Caucasian SLE cases (Affymetrix 133A chips, IRF5 probe set 205469_s_at). The analysis included 23 patients that were AA at the rs10954213 SNP (17 TT and 6 GT at rs2004640), 11 patients that were GA at rs10954213 (8 GT and 3 GG at rs2004640), and 4 patients that were GG at rs10954213 (1 GT and 3 GG at rs2004640).

Characterization of sequence variation at IRF5

To more fully characterize genetic variation at IRF5, the exons and 1 kb upstream of the IRF5 exon 1A were sequenced in DNA from 136 cases of SLE. Each of the introns in 40 SLE cases and 8 controls also were sequenced (Tables 12 and 13). In total, 52 variants were observed, of which 32 were novel, while 20 had been previously identified (present in dbSNP). Of the novel variants, 13 had minor allele frequency greater than 1%. Each such variant was genotyped in the HapMap CEU samples, allowing them to be integrated with data from the International HapMap Project. While no common single nucleotide missense variants of IRF5 were observed, a 30 bp in-frame insertion/deletion (indel) in exon 6 was observed. The exon 6 indel is located in a proline-, glutamic acid-, serine- and threonine-rich (PEST) domain, a motif previously shown to influence protein stability and function in the IRF family of proteins (Levi et al. (2002) J. Interferon Cytokine Res. 22:153-160). TagSNPs were selected to serve as proxies (r² > 0.8) for all SNPs with minor allele frequency > 1% in the combined data from HapMap Phase II ((2005) Nature 437:1299-1320) and genotype data in the same samples for the SNPs discovered in the sequencing effort.
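The r² threshold used for tagSNP selection can be made concrete with a short sketch. It assumes phased two-SNP haplotypes coded 0/1; the function names are hypothetical, and this is not the selection software used in the study.

```python
def pairwise_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (allele_at_snp1, allele_at_snp2), alleles coded 0 or 1.
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def is_proxy(haplotypes, threshold=0.8):
    """True if one SNP can serve as a proxy for the other at the stated threshold."""
    return pairwise_r2(haplotypes) > threshold
```

Two SNPs whose alleles always occur together give r² = 1 and qualify as proxies for each other; two SNPs whose alleles are statistically independent give r² = 0.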

Association of common variation in IRF5 to risk of SLE

Each tagSNP was individually tested for association to SLE in a combined trio and family collection of 555 families from the US and the UK (Table 14). The strongest association with SLE was for three highly correlated SNPs (rs2070197, rs10488631, and rs12539741, pairwise r² > 0.95). These SNPs (referred to herein as "Group 1") do not include the exon 1B splice site variant (rs2004640) described above, and showed highly significant association: Transmitted/Untransmitted (T/U) ratio = 1.8; P = 1.2 × 10⁻⁷. To assess whether the Group 1 variants could explain the association to SLE, conditional logistic regression incorporating one of the Group 1 SNPs (rs2070197) was performed. This model was rejected, because a second set of correlated SNPs (rs729302, rs4728142, rs2004640, and rs6966125; referred to herein as Group 2) were independently associated with risk to SLE (P < 0.002-0.008, Table 11). Group 2 includes rs2004640.
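As a reading aid, the transmitted/untransmitted ratio can be related to the standard transmission disequilibrium test (TDT) statistic. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the counts in the usage note are invented solely to give a ratio of 1.8; they are not the counts from the study, whose analysis was run in Haploview.

```python
import math

def tdt(transmitted, untransmitted):
    """T/U ratio and TDT chi-square (1 df) from heterozygous-parent transmissions.

    chi2 = (T - U)^2 / (T + U); the P value is the 1-df chi-square tail.
    """
    ratio = transmitted / untransmitted
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return ratio, chi2, p_value
```

For example, tdt(180, 100) uses hypothetical counts of 180 transmissions and 100 non-transmissions, which give a T/U ratio of 1.8. The resulting P value depends on the total number of informative transmissions, so it should not be compared with the value reported above.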

To test the hypothesis that the combination of Group 1 and Group 2 variants fully account for the association observed to SLE, the conditional logistic regression analysis was repeated, including a Group 1 and a Group 2 variant in the model (represented by rs2070197 and rs2004640, respectively). A third set of six highly correlated SNPs (rs4728142, rs3807135, rs752637, rs10954213, rs2280714, and rs17166351; referred to as Group 3) was associated with risk of SLE (p<0.001-0.01; Table 11). These results indicate that three independent sets of correlated IRF5 variants (Groups 1, 2, and 3) each provide statistically independent evidence for association with risk of SLE. Thus, while the exon 1B splice site (rs2004640) has been shown to be strongly associated with SLE, it is clear that rs2004640 does not explain all of the effect of IRF5 on risk to SLE - indeed, it is not even the strongest contributor. As such, experiments were conducted to identify other putative functional alleles that might explain the independent signals of association observed for Groups 1 and 3.

Cis-acting alleles underlying variation in IRF5 expression

One approach to finding causal alleles is to examine other phenotypes that might be less complex in their inheritance, providing power to distinguish the effects of highly correlated alleles, and offer in vitro assays to assess function. In vitro expression levels provide one such phenotype. Given the previous observation that one of the Group 3 variants (rs2280714) is associated with levels of IRF5 mRNA expression, the more complete set of IRF5 variants was systematically examined for alleles that might be associated to levels of IRF5 mRNA expression in lymphoblastoid cell lines. The same set of tagSNPs genotyped in the SLE family cohort was studied in the HapMap samples, allowing correlation of genotype to mRNA expression data collected at the Sanger Institute (on the World Wide Web at sanger.ac.uk/humgen/genevar/). A variant in the 3' UTR (rs10954213, Group 3) showed the strongest association with IRF5 expression: P = 3.5 x 10 " " (Table 14). This variant and one other (rs10954214) reside in conserved elements within the 3' UTR, a region that often contains sequences that influence mRNA expression (Conne et al. (2000) Nat. Med. 6:637-641).

To increase the power to distinguish effects of correlated SNPs, a subset of the associated IRF5 variants was genotyped in an independent dataset in 233 CEPH samples for which microarray gene expression data were publicly available (Morley et al., supra) (Table 15). Again, rs10954213 was the best predictor of IRF5 expression. Specifically, rs10954213 showed stronger association than either the neighboring rs10954214 or the rs2280714 SNP studied previously (Table 15, Figure 7). Formally, rs10954213 remained strongly associated with IRF5 mRNA levels after conditioning on rs2280714 (P = 5 × 10⁻⁹), while conditioning on rs10954213 nearly eliminated association of rs2280714 to IRF5 expression (P = 0.004). Finally, similar findings were observed for expression of IRF5 in whole blood of SLE cases (Figure 7).

These results indicated that rs10954213 was the best predictor of IRF5 expression in this survey of lymphoblastoid cell lines, clearly distinguishable in its effect from the other SNPs with which it is in strong linkage disequilibrium. As rs10954213 also is a member of Group 3, it became a candidate to explain the association of Group 3 SNPs to SLE. It is noted that the greater strength of the signal of association of IRF5 expression levels (P < 10⁻⁵⁵) allowed the signal of rs10954213 to be distinguished from the other members of Group 3 for IRF5 expression. The weaker signals of association to risk of SLE were not able to be clearly distinguished.

While rs10954213 was the strongest determinant of IRF5 expression in the survey of common variation at IRF5, conditioning on this SNP did not account for all variance in IRF5 expression. After conditioning on rs10954213, the exon 1B splice site (rs2004640) and other linked SNPs showed the next strongest association to IRF5 levels (Table 15). Specifically, the presence of the T allele at rs2004640, which allows expression of exon 1B isoforms, was associated with significantly higher levels of IRF5 expression in cell lines carrying GG or AG genotype at rs10954213 (Figure 6). After incorporating a two-locus model of both rs10954213 and rs2004640, no other SNP had a nominally significant association to IRF5 expression in the CEU cell lines (Tables 15 and 16).

Thus, the systematic search for common variation that influences levels of IRF5 mRNA led to identification of rs10954213, a SNP in a conserved element within the 3' UTR and a member of Group 3, as well as the exon 1B splice site variant (rs2004640), a member of Group 2.

A Group 3 variant alters a polyadenylation signal and influences IRF5 expression

While the data described in Example 1 show that the exon 1B SNP influences IRF5 mRNA levels through its effect on splicing (Graham et al. (2006) Nat. Genet. 38:550-555), the function, if any, of rs10954213 was unknown. The sequence surrounding rs10954213 has been highly conserved throughout evolution. Moreover, the rs10954213 G allele is predicted to disrupt a polyA+ signal sequence (AAUAAA → AAUGAA) located 552 bp downstream of the stop codon of IRF5 in the 3' UTR region of exon 9. The canonical motif is a binding site for a protein complex known as cleavage and polyadenylation specificity factor (CPSF). During RNA polymerase II transcription, CPSF binds to the AAUAAA sequence and is part of a complex that cuts the mRNA strand 10-30 bp downstream of the polyA+ signal and initiates polyadenylation of the transcripts (Edmonds (2002) Prog. Nucl. Acid Res. Mol. Biol. 71:285-389).

Based on the location of rs10954213 in a conserved CPSF site, it was hypothesized that the different alleles of rs10954213 might influence polyadenylation, and thereby the length and stability of the IRF5 message. Specifically, the A allele of rs10954213 might allow efficient polyadenylation approximately 12 bp downstream, while the G allele favors the use of a distal polyA+ site 648 bp downstream (Figure 8).
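The predicted effect of the G allele on the canonical hexamer can be illustrated with a toy sequence. The sequence and function names below are hypothetical and are not taken from the IRF5 3' UTR; the sketch only shows that an A to G change at the fourth position of AAUAAA removes the exact-match signal.

```python
CANONICAL_SIGNAL = "AAUAAA"

def find_polya_signals(rna):
    """Return 0-based start positions of exact AAUAAA matches (DNA 'T' read as 'U')."""
    rna = rna.upper().replace("T", "U")
    width = len(CANONICAL_SIGNAL)
    return [i for i in range(len(rna) - width + 1)
            if rna[i:i + width] == CANONICAL_SIGNAL]

def with_allele(rna, index, allele):
    """Return a copy of the sequence carrying the given allele at a 0-based index."""
    return rna[:index] + allele + rna[index + 1:]

# Hypothetical 15-nt fragment with one canonical signal starting at index 3
toy_utr = "GCCAAUAAAGGCUUC"
a_allele_sites = find_polya_signals(toy_utr)                          # signal present
g_allele_sites = find_polya_signals(with_allele(toy_utr, 6, "G"))    # AAUGAA, no match
```

An exact-match scan is a deliberate simplification; it does not model variant hexamers or the downstream elements that also contribute to cleavage site choice.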

To directly test this hypothesis, Northern blotting and quantitative PCR were performed using IRF5 mRNA from cell lines and PBMC of known genotype at rs10954213, as well as chimeric mRNAs that attach the two alleles of the 3' UTR to heterologous expression constructs. Total and polyA+-enriched RNA were isolated from the HapMap CEU population, selecting individuals based on genotype at rs10954213. Northern blotting of polyA+ RNA showed that cell lines homozygous for the A allele at rs10954213, carrying the wild-type AAUAAA on both alleles, expressed mainly a short version of IRF5 mRNAs. In contrast, cell lines homozygous for the G allele (AAUGAA) expressed almost exclusively a longer mRNA that utilized the second downstream polyA+ site. AG heterozygote cell lines showed expression of both isoforms. Identical results were obtained in Northern blots of total RNA isolated from the cell lines. These results were confirmed with TaqMan quantitative PCR assays in both EBV-transformed cell lines and normal donor PBMCs (Figure 9). These data confirmed that the allele at rs10954213 determines the site of polyadenylation. Thus, rs10954213 is referred to hereafter as the polyA+ variant, with the A allele termed the "short" allele, and the G allele the "long" allele.

To determine whether the long allele of the 3' UTR might be unstable, the two versions of the 3' UTR were cloned downstream from the coding region of rabbit beta-globin, and 293 Tet-Off kidney cells were transfected with expression plasmids driving chimeric cDNAs carrying either the short or long allele. Northern blotting of mRNA isolated 48 hours after transfection showed that chimeric cDNAs used the expected polyA+ site, and that the long mRNAs had a shorter half-life than short chimeric transcripts (Figure 10). Estimates for the half-life of these transcripts, based on regression curves, were 342 ± 88 min for the short allele, and 125 ± 21 min for the long allele. By comparison, the calculated half-life of beta-globin mRNA alone (lacking the IRF5 3' UTR) was 11,631 ± 1,574 min. These experiments document that disruption of the proximal polyA+ signal by rs10954213 leads to transcription of long and relatively unstable IRF5 mRNA transcripts. These effects on IRF5 mRNA are reflected in levels of IRF5 protein, as shown by Western blots of whole cell lysates from EBV cell lines carrying the various polyA+ SNP genotypes: cells carrying the AA genotype showed ~5-fold higher levels of immunoreactive IRF5 protein than cells carrying the GG genotype.
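A half-life estimate of the kind reported above can be sketched as a log-linear regression of transcript level on time. This assumes first-order (exponential) decay and an ordinary least-squares fit; the text says only that the estimates were "based on regression curves", so the model and the function name are assumptions made for illustration.

```python
import math

def half_life_from_decay(times_min, levels):
    """Estimate mRNA half-life (minutes) assuming first-order decay.

    Fits ln(level) = intercept + slope * time by least squares and
    returns ln(2) / (-slope).
    """
    logs = [math.log(v) for v in levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx
    return -math.log(2) / slope
```

Applied to noise-free synthetic measurements generated with a 125 min half-life, the fit recovers 125 min; real Northern blot densitometry would add scatter and hence the uncertainty ranges quoted in the text.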

The exon 6 indel and risk of SLE

The experimental results discussed herein suggest that (a) the association of Group 2 SNPs to SLE is likely explained by the exon 1B splice site allele (rs2004640), and (b) the association of the Group 3 SNPs is likely due to the polyA+ variant (rs10954213). In contrast, none of the Group 1 SNPs were found to alter the coding region of IRF5, lie in evolutionarily conserved regions, or change an annotated sequence motif. This suggests either that the Group 1 SNPs (or an undiscovered but strongly linked mutation) have an as yet unrecognized effect on IRF5 function, or that the Group 1 SNPs have no functional consequence but instead tag a combination of other functional variants in IRF5. To assess the second model (having found no evidence for a functional allele among the Group 1 SNPs), the conditional logistic regression analysis was performed not in order of statistical significance (as above), but instead starting with the two putative functional alleles identified above (exon 1B splice site variant and polyA+ variant). Multiple variants were observed that showed significant association to SLE in this analysis (Table 11), including the 30 bp in-frame insertion/deletion (indel) polymorphism that was discovered within exon 6 (Figure 11). This indel is located in a PEST domain known to influence stability and function of the IRF family of proteins. Previous studies have shown that IRF5 protein isoforms which, in part, differ by the 30 bp (10 aa) exon 6 indel (which had previously been observed in cDNA, but not recognized to be a germline polymorphism) have differential ability to initiate transcription of IRF5 target genes (Barnes et al. (2004) J. Biol. Chem. 279:45194-45207; Mancl et al. (2005) J. Biol. Chem. 280:21078-21090; and Barnes et al. (2002) Mol. Cell Biol. 22:5721-5740).

It is noted that association of the exon 6 indel to SLE was only observed when conditioned on the exon 1B splice site and polyA+ variants. The association previously had been masked by the signal of the Group 1 variants in the initial analysis that proceeded in order of statistical significance. Consistent with a model in which the three putative functional alleles (exon 1B, polyA+, and exon 6 indel) are sufficient to explain the observed association to SLE, a logistic regression that includes these three variants revealed no additional SNP with p < 0.01. That is, the effect of Group 1 SNPs is statistically indistinguishable from their linkage disequilibrium with the three alleles that have putative functional effects on the structure of IRF5 protein and/or its expression.

Haplotype analysis identifies three levels of SLE risk

To better understand the observed combinations of the three putative functional alleles (and the Group 1 SNPs), the four-marker haplotypes defined by: (a) the exon 1B splice site (rs2004640, Group 2), (b) the polyA+ variant (rs10954213, Group 3), (c) the exon 6 indel, and (d) Group 1 (using rs2070197 as a proxy) were examined (Table 17). These four variants defined five common haplotypes, each carrying unique combinations of the exon 1B splice site, the exon 6 indel, and the polyA+ variant.

These haplotypes were studied for association to SLE in large family-based and case-control samples totaling 2,188 case and 3,596 control chromosomes. Haplotype 1 (Table 17) was strongly associated with risk of SLE, appearing on 19.0% of SLE chromosomes in comparison to 11.9% of control chromosomes (P = 1.4 × 10⁻¹⁹, Table 17). In the case-control sample, a single copy of haplotype 1 was associated with an odds ratio (OR) of 1.46, while two copies were associated with an OR of 2.96 (Table 18). No other IRF5 haplotypes showed positive association with SLE. The high-risk haplotype 1 is predicted to be the only haplotype with the ability to express exon 1B isoforms (due to rs2004640), carries the exon 6 insertion, and is expressed at high levels due to the polyA+ variant.
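For orientation, the haplotype 1 frequencies quoted above can be turned into a crude per-chromosome odds ratio. This is a back-of-the-envelope sketch with a hypothetical function name; it pools family-based and case-control chromosomes and is not the Mantel-Haenszel estimate in Table 17, nor the genotype-based ORs of 1.46 and 2.96 in Table 18.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the haplotype on a case chromosome relative to a control chromosome."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

# Frequencies of haplotype 1 quoted in the text (19.0% of case, 11.9% of control chromosomes)
crude_or = allelic_odds_ratio(0.190, 0.119)   # approximately 1.74
```

The crude value falls between the single-copy and two-copy genotype ORs, as expected for a per-chromosome summary.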

Alternative proximal splice acceptors for exon 6, termed SS1 and SS2, which are proximate to the exon 6 indel, have been shown to influence activation of downstream genes (Barnes et al. (2004), supra; Mancl et al., supra; and Barnes et al. (2002), supra). As shown in Figure 12, both SS1 and SS2 are used regardless of the exon 6 indel genotype.

While haplotypes 2 and 3 showed no evidence for association to SLE as compared to the overall population (OR = 1.09 and 0.95, respectively; P ≥ 0.05), haplotypes 4 and 5 showed strong evidence for protection. Specifically, each was associated with a ~25% reduction in risk (OR = 0.76) that was statistically highly significant (P < 5 × 10⁻⁸ and 3 × 10⁻⁵, respectively). Moreover, individuals that carry haplotype 1 in trans with either of the haplotypes that lack exon 1B isoform expression (4 and 5) show a reduction in risk of SLE (Table 18).

Frequency of IRF5 haplotypes in world populations

The Human Genome Diversity Panel was genotyped to assess the frequency of IRF5 alleles in world populations, and genotype data were submitted to the Human Genome Diversity Panel (HGDP) database (Rosenberg et al. (2002) Science 298:2381-2385; and Cann et al. (2002) Science 296:261-262). It was noted that high-risk haplotype 1 is common in a European-derived population, but rare in West-African and East-Asian HapMap populations (15% in CEU, 0% in YRI, < 1% in JPT/CHB).

Extending these observations into a broader array of populations in the HGDP revealed that haplotype 1 is found in Central Asia and derived populations (European and Native American), but is rare in other world populations (Table 19). Haplotype 1 was examined for evidence of recent rapid positive selection using extended haplotype homozygosity algorithms (Sabeti et al. (2002) Nature 419:832-837; and Walsh et al. (2006) Hum. Genet. 119:92-102), but there was no evidence for selection.

In summary, these data reveal that the highest risk for SLE is observed with a haplotype that is predicted to express high levels of transcripts containing exon 1B and the exon 6 insertion (Figure 13). Haplotypes 2 and 3, which carry only 2 of the 3 risk-associated functional alleles, show average risk to SLE. Haplotypes 4 and 5, which carry only 1 of the 3 risk-associated functional alleles - and, in particular, lack exon 1B isoforms - are protective for SLE.

Table 10: Assay Information

Table 11: Single marker transmission and conditional analyses in SLE trios from US and UK

1 Number assigned to variant by dbSNP (World Wide Web at ncbi.nlm.nih.gov/entrez/query.fcgi?db=snp); 2 Position in the HG17 assembly of the Human Genome; 3 Minor Allele Frequency; 4 Number of chromosomes with high quality data

Table 13: Variants discovered by resequencing all IRF5 exons and 1 kb upstream of exon 1A in 96 US SLE cases

1 Number assigned to variant by dbSNP (World Wide Web at ncbi.nlm.nih.gov/entrez/query.fcgi?db=snp); 2 Position in the HG17 assembly of the Human Genome; 3 Minor Allele Frequency; 4 Number of chromosomes with high quality data

Table 14: Association with IRF5 mRNA expression in transformed B-cells from HapMap CEU, CHB, JPT, and YRI populations

Position of variant in the HG17 assembly of the human genome

Association of variant to IRF5 mRNA levels in 210 unrelated EBV-transformed B cell lines derived from the HapMap samples (GENEVAR dataset; World Wide Web at sanger.ac.uk/humgen/genevar/)

1 Position of marker in the HG17 assembly of the human genome; 2 Minor Allele Frequency

3 Uncorrected P value for association of the indicated marker to IRF5 mRNA levels in 233 CEPH EBV-transformed B cells; 4 Association of the indicated marker under the model that rs10954213 fully explains all the variance in IRF5 expression; 5 Association of the indicated marker under the model that rs10954213 and rs2004640 fully explain all the variance in IRF5 expression; 6 NA indicates that the association to IRF5 expression cannot be calculated because it is statistically indistinguishable from the proposed model

Table 16: Association of IRF5 region markers with IRF5 expression in the HapMap CEU population

Table 17: Association of IRF5 haplotypes with SLE

2 In-frame insertion/deletion of 30 bp in exon 6 of IRF5, chr7:128,181,324-54 (HG17)

3 PolyA+ signal variant ("A" allele is associated with 561 bp 3' UTR; "G" allele is associated with enrichment of 1214 bp 3' UTR)

4 Number of transmitted haplotypes

5 Number of untransmitted haplotypes

6 Odds Ratio and 95% confidence intervals

7 Nominal P value for association to SLE

8 Frequency of haplotypes in SLE cases

9 Frequency of haplotypes in controls

Table 18: IRF5 genotype frequencies in SLE cases and controls

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.