

Title:
CARDIOVASCULAR DISEASE
Document Type and Number:
WIPO Patent Application WO/2022/189445
Kind Code:
A1
Abstract:
The invention relates to cardiovascular disease, and particularly, although not exclusively, to cardiovascular disease biomarkers and their use in methods of diagnosis and prognosis. The invention also extends to diagnostic and prognostic kits utilising the biomarkers of the invention for diagnosing or prognosing cardiovascular disease.

Inventors:
ALLEBRANDT KARLA (DE)
FRAU FRANCESCA (DE)
Application Number:
PCT/EP2022/055925
Publication Date:
September 15, 2022
Filing Date:
March 08, 2022
Assignee:
SANOFI SA (FR)
International Classes:
G01N33/68; G06N7/00; G16H50/70
Foreign References:
US10670611B2, 2020-06-02
US20140255424A1, 2014-09-11
Other References:
WANG WEI ET AL: "Prediction of risk of cardiovascular events in patients with mild to moderate coronary artery lesions using naive Bayesian networks", JOURNAL OF GERIATRIC CARDIOLOGY : JGC, 1 January 2016 (2016-01-01), China, pages 899 - 905, XP055928106, Retrieved from the Internet [retrieved on 20220603], DOI: 10.11909/j.issn.1671-5411.2016.11.004
GUANGLIN CUI ET AL: "Polymorphism of tumor necrosis factor alpha (TNF-alpha) gene promoter, circulating TNF-alpha level, and cardiovascular risk factor for ischemic stroke", JOURNAL OF NEUROINFLAMMATION, BIOMED CENTRAL LTD., LONDON, GB, vol. 9, no. 1, 10 October 2012 (2012-10-10), pages 235, XP021129331, ISSN: 1742-2094, DOI: 10.1186/1742-2094-9-235
DRAGUT R ET AL: "Relationship between TNF alpha, IL-6 and cardiovascular risk in patients with type 2 diabetes and metabolic syndrome", ATHEROSCLEROSIS, vol. 235, no. 2, 2014, XP028878483, ISSN: 0021-9150, DOI: 10.1016/J.ATHEROSCLEROSIS.2014.05.712
GERSTEIN HERTZEL C. ET AL: "Identifying Novel Biomarkers for Cardiovascular Events or Death in People With Dysglycemia", CIRCULATION, vol. 132, no. 24, 15 December 2015 (2015-12-15), US, pages 2297 - 2304, XP055928164, ISSN: 0009-7322, DOI: 10.1161/CIRCULATIONAHA.115.015744
FUSTER-PARRA P ET AL: "Bayesian network modeling: A case study of an epidemiologic system analysis of cardiovascular risk", COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, ELSEVIER, AMSTERDAM, NL, vol. 126, 30 December 2015 (2015-12-30), pages 128 - 142, XP029433116, ISSN: 0169-2607, DOI: 10.1016/J.CMPB.2015.12.010
"GeneBank", Database accession no. P35398-2
"UniProtKB", Database accession no. Q04206
THOMPSON ET AL., NUCLEIC ACIDS RESEARCH, vol. 22, 1994, pages 4673 - 4680
THOMPSON ET AL., NUCLEIC ACIDS RESEARCH, vol. 24, 1997, pages 4876 - 4882
Attorney, Agent or Firm:
VENNER SHIPLEY LLP (GB)
Claims:
Claims

1. A method of determining, diagnosing and/or prognosing an individual’s risk of suffering from cardiovascular disease, the method comprising: a. detecting, in a sample obtained from an individual, the expression level, amount and/or activity of two or more biomarkers selected from a group consisting of: Tumour Necrosis Factor (TNF)-α; Glutathione S-Transferase Alpha 1 (GSTA1); N-terminal-pro hormone BNP (NT-proBNP); Retinoic Acid Receptor-Related Orphan Receptor Alpha (RORA); Tenascin C (TNC); Growth Hormone Receptor (GHR); Alpha-2-Macroglobulin (A2M); Insulin Like Growth Factor Binding Protein 2 (IGFBP2); Apolipoprotein B (APOB); Selenoprotein P (SEPP1); Trefoil Factor (TFF3); Interleukin 6 (IL6); Chitinase 3 Like 1 (CHI3L1); Hepatocyte Growth Factor Receptor (MET); Growth Differentiation Factor 15 (GDF15); Chemokine (C-C Motif) Ligand 22 (CCL22); Tumour Necrosis Factor Receptor Superfamily, Member 11 (TNFRSF11); Angiopoietin 2 (ANGPT2); and v-Rel Avian Reticuloendotheliosis Viral Oncogene Homolog A Nuclear Factor-kappa B (RelA NF-KB); b. comparing the expression level, amount and/or activity of the biomarker with a reference from a healthy control population; and c. determining, diagnosing and/or prognosing the risk of an individual suffering from cardiovascular disease if the expression level, amount, and/or activity of the biomarker deviates from the reference from a healthy control population.

2. The method of claim 1, wherein a decrease in expression, amount and/or activity of TNF-α, GSTA1, NT-proBNP, RORA and/or TNC, when compared to the reference, is indicative of an individual having a higher risk of suffering from cardiovascular disease or a negative prognosis.

3. The method of either claim 1 or claim 2, wherein an increase in expression, amount and/or activity of GHR, A2M, IGFBP2, APOB, SEPP1, TFF3, IL6 and/or CHI3L1, when compared to the reference, is indicative of an individual having a higher risk of suffering from cardiovascular disease or a negative prognosis.

4. The method of any preceding claim, wherein a decrease in expression, amount and/or activity of MET, GDF15, CCL22, TNFRSF11, ANGPT2 and/or RelA NF-KB, when compared to the reference, is indicative of an individual having a lower risk of suffering from cardiovascular disease or a positive prognosis.

5. The method according to any preceding claim, wherein step a) comprises detecting, in a sample obtained from the individual, the expression levels, amount and/or activities of at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17 or at least 18 of the biomarkers selected from the group consisting of: TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2; and RelA NF-KB.

6. The method according to any preceding claim, wherein the cardiovascular disease is selected from the group consisting of: cardiovascular death; myocardial infarction; stroke; and heart failure.

7. The method according to any preceding claim, wherein the sample comprises blood, urine or tissue.

8. A kit for determining, diagnosing and/or prognosing the risk of an individual suffering from cardiovascular disease, the kit comprising: a. detection means for detecting, in a sample obtained from a test subject, the expression level, amount and/or activity of two or more biomarkers selected from the group consisting of TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB; and b. a reference value from a healthy control population for expression level, amount and/or activity of two or more biomarkers selected from the group consisting of TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB, wherein the kit is used to identify: i) a decrease in expression, amount and/or activity of TNF-α; GSTA1; NT-proBNP; RORA and/or TNC when compared to the reference; and/or an increase in expression, amount and/or activity of GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6 and/or CHI3L1 when compared to the reference to determine, diagnose and/or prognose that an individual has a higher risk of suffering from cardiovascular disease; and/or ii) a decrease in expression, amount and/or activity of MET; GDF15; CCL22; TNFRSF11; ANGPT2 and/or RelA NF-KB when compared to the reference to determine, diagnose and/or prognose that an individual has a lower risk of suffering from cardiovascular disease.

9. A method of determining, diagnosing and/or prognosing an individual’s risk of suffering from cardiovascular disease, the method comprising detecting, in a sample obtained from an individual, a single nucleotide polymorphism (SNP) in the RORA gene, wherein the presence of the SNP is indicative of an individual having an increased risk of suffering from cardiovascular disease.

10. The method according to claim 9, wherein the SNP is Reference SNP cluster ID: rs73420079.

11. A method of determining, diagnosing and/or prognosing an individual’s risk of suffering from cardiovascular disease, the method comprising detecting, in a sample obtained from an individual, a single nucleotide polymorphism (SNP) in the GHR gene, wherein the presence of the SNP is indicative of an individual having an increased risk of suffering from cardiovascular disease.

12. The method according to claim 11, wherein the SNP is Reference SNP cluster ID: rs4314405.

13. GHR and/or RORA, for use in diagnosis or prognosis.

14. GHR and/or RORA, for use in diagnosing or prognosing cardiovascular disease.

15. GHR and/or RORA for use according to claim 13 or claim 14, wherein RORA comprises an SNP as defined in claim 10 and/or GHR comprises an SNP as defined in claim 12.

Description:
Cardiovascular Disease

The present invention relates to cardiovascular disease, and particularly, although not exclusively, to cardiovascular disease biomarkers and their use in methods of diagnosis and prognosis. The invention also extends to diagnostic and prognostic kits utilising the biomarkers of the invention for diagnosing or prognosing cardiovascular disease.

Due to the complexity of cardiovascular disease (CVD), the ability of single biomarkers to predict cardiovascular outcome (CVO) risk is limited (De Lemos et al., 2017), though many genetic loci and protein biomarkers (BMKs) have been discovered. Combining genetic information with other types of omics data should have more predictive power to detect sub-groups of the population and the corresponding molecular signatures related to the clinical outcomes (Vilne et al., 2018). This should point to molecular signatures for CV risk and show their potential to stratify the population in relation to disease progression. A better understanding of the drivers of type 2 diabetes (T2D)-associated cardiovascular disease will help to define sub-populations that may differ with regard to disease progression.

The inventors developed and tested a workflow to identify substructure in a trial population using molecular signatures of protein and genetic biomarkers. They discovered signatures defining sub-populations that progressed differently towards CVD, suggesting that these signatures reflect different stages of disease progression. The identified combinations of biomarkers, reflecting differences in CVD progression, will lead to strategies to optimize clinical trial plans and drug efficacy, and so to optimize treatment.

Accordingly, in a first aspect of the invention, there is provided a method of determining, diagnosing and/or prognosing an individual’s risk of suffering from cardiovascular disease, the method comprising: a. detecting, in a sample obtained from an individual, the expression level, amount and/or activity of two or more biomarkers selected from a group consisting of: Tumour Necrosis Factor (TNF)-α; Glutathione S-Transferase Alpha 1 (GSTA1); N-terminal-pro hormone BNP (NT-proBNP); Retinoic Acid Receptor-Related Orphan Receptor Alpha (RORA); Tenascin C (TNC); Growth Hormone Receptor (GHR); Alpha-2-Macroglobulin (A2M); Insulin Like Growth Factor Binding Protein 2 (IGFBP2); Apolipoprotein B (APOB); Selenoprotein P (SEPP1); Trefoil Factor (TFF3); Interleukin 6 (IL6); Chitinase 3 Like 1 (CHI3L1); Hepatocyte Growth Factor Receptor (MET); Growth Differentiation Factor 15 (GDF15); Chemokine (C-C Motif) Ligand 22 (CCL22); Tumour Necrosis Factor Receptor Superfamily, Member 11 (TNFRSF11); Angiopoietin 2 (ANGPT2); and v-Rel Avian Reticuloendotheliosis Viral Oncogene Homolog A Nuclear Factor-kappa B (RelA NF-KB); b. comparing the expression level, amount and/or activity of the biomarker with a reference from a healthy control population; and c. determining, diagnosing and/or prognosing the risk of an individual suffering from cardiovascular disease if the expression level, amount, and/or activity of the biomarker deviates from the reference from a healthy control population.

Advantageously, the method of the invention enables the identification of individuals who are at risk of suffering from a CVD event, and thus enables early intervention to prevent, or reduce the risk of, the individual suffering from a CVD event. In particular, detection of each biomarker in isolation enables the identification of individuals who are at risk of suffering from a CVD event, and detection of multiple biomarkers, i.e. the biomarker signature, provides a particularly effective means of enabling early intervention to prevent, or reduce the risk of, the individual suffering from a CVD event.

The skilled person would understand that the term prognosis may relate to predicting the rate of progression or improvement and/or the duration of cardiovascular disease in an individual suffering from cardiovascular disease. The method may be performed in vivo, in vitro or ex vivo. Preferably, the method is performed in vitro or ex vivo. Most preferably, the method is performed in vitro.

The expression level may relate to the level or concentration of a biomarker polynucleotide sequence. The polynucleotide sequence may be DNA or RNA. The DNA may be genomic DNA. The RNA may be mRNA.

The amount of biomarker may relate to the concentration of biomarker polypeptide sequence.

The biomarker activity may relate to the activity of the biomarker protein, preferably activity in relation to interacting biomarkers.

Categorical Biomarker(s):

LLQQ = 2 -> inactivation (-1)

LLQQ from 3 to 4 -> no activation (0)

LLQQ from 5 to higher levels -> activation (+1)

For the quantitative biomarker(s) we used the tertiles of the distribution.

Accordingly, the expression level, amount, and/or activity of the biomarker may be detected by: sequencing methods (e.g., Sanger, Next Generation Sequencing, RNA-SEQ), hybridization-based methods, including those employed in biochip arrays, mass spectrometry (e.g., laser desorption/ionization mass spectrometry), fluorescence (e.g., sandwich immunoassay), surface plasmon resonance, ellipsometry and atomic force microscopy. Expression levels of markers (e.g., polynucleotides, polypeptides, or other analytes) may be compared by procedures well known in the art, such as RT-PCR, Northern blotting, Western blotting, flow cytometry, immunocytochemistry, binding to magnetic and/or antibody-coated beads, in situ hybridization, fluorescence in situ hybridization (FISH), flow chamber adhesion assay, ELISA, microarray analysis, or colorimetric assays. Methods may further include one or more of electrospray ionization mass spectrometry (ESI-MS), ESI-MS/MS, ESI-MS/(MS)n, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS), surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), desorption/ionization on silicon (DIOS), secondary ion mass spectrometry (SIMS), quadrupole time-of-flight (Q-TOF), atmospheric pressure chemical ionization mass spectrometry (APCI-MS), APCI-MS/MS, APCI-(MS)n, atmospheric pressure photoionization mass spectrometry (APPI-MS), APPI-MS/MS, and APPI-(MS)n, quadrupole mass spectrometry, Fourier transform mass spectrometry (FTMS), and ion trap mass spectrometry, where n is an integer greater than zero.
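The discretisation stated before the list of detection methods (categorical levels mapped to -1, 0 or +1, and quantitative biomarkers split at the tertiles of their distribution) might be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, levels below 2 are rejected because the text does not assign them a state, and the tertile index is returned as 0, 1 or 2 because the text does not say how tertiles map onto states.

```python
from statistics import quantiles


def categorical_state(level: int) -> int:
    """Map a categorical biomarker level to -1, 0 or +1 as listed in the text."""
    if level == 2:
        return -1  # inactivation
    if 3 <= level <= 4:
        return 0   # no activation
    if level >= 5:
        return 1   # activation
    # Assumption: the text defines no state for levels below 2, so refuse them.
    raise ValueError(f"no state defined for level {level}")


def tertile_bins(values):
    """Assign each value to tertile 0, 1 or 2 of its own distribution."""
    lower, upper = quantiles(values, n=3)  # the two tertile cut points
    return [0 if v <= lower else 1 if v <= upper else 2 for v in values]
```

For example, `tertile_bins` applied to nine evenly spaced measurements places the three lowest in tertile 0, the middle three in tertile 1 and the three highest in tertile 2.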

Thus, the method of detection may relate to a probe that is capable of hybridizing to the biomarker DNA or RNA sequence. The term "probe" may be defined as an oligonucleotide. A probe can be single stranded at the time of hybridization to a target. Probes include but are not limited to primers, i.e., oligonucleotides that can be used to prime a reaction, for example in a PCR reaction.
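Because a single-stranded probe hybridizes to its target through base complementarity, a candidate probe for a short DNA target can be written down as the target's reverse complement. The sketch below illustrates only that relationship; the function name and the example target are hypothetical, and practical probe design would also weigh factors such as melting temperature and specificity, which are not addressed here.

```python
# Translation table for the four DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical 10-base target, chosen only for illustration.
probe = reverse_complement("ATGGAGTCAG")
print(probe)  # CTGACTCCAT
```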

Preferably, a decrease in expression, amount and/or activity of TNF-α, GSTA1, NT-proBNP, RORA and/or TNC, when compared to the reference, is indicative of an individual having a higher risk of suffering from cardiovascular disease or a negative prognosis.

Preferably, an increase in expression, amount and/or activity of GHR, A2M, IGFBP2, APOB, SEPP1, TFF3, IL6 and/or CHI3L1, when compared to the reference, is indicative of an individual having a higher risk of suffering from cardiovascular disease or a negative prognosis.

Preferably, a decrease in expression, amount and/or activity of MET, GDF15, CCL22, TNFRSF11, ANGPT2 and/or RelA NF-KB, when compared to the reference, is indicative of an individual having a lower risk of suffering from cardiovascular disease or a positive prognosis.

Preferably, step a) comprises detecting, in a sample obtained from the individual, the expression levels, amount and/or activities of at least 3 of the biomarkers selected from the group consisting of: TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2; and RelA NF-KB. However, step a) may comprise detecting expression levels, amount and/or activities of at least 4 biomarkers or at least 5 biomarkers.

Step a) may comprise detecting expression levels, amount and/or activities of at least 6 biomarkers or at least 7 biomarkers. Alternatively, step a) may comprise detecting expression levels, amount and/or activities of at least 8 biomarkers or at least 9 biomarkers. In another embodiment, step a) may comprise detecting expression levels, amount and/or activities of at least 10 biomarkers, at least 11 biomarkers, at least 12 biomarkers, at least 13 biomarkers, at least 14 biomarkers or at least 15 biomarkers. In another embodiment, step a) may comprise detecting expression levels, amount and/or activities of at least 16 biomarkers, at least 17 biomarkers or at least 18 biomarkers.

Preferably, step a) comprises detecting, in a sample obtained from the individual, the expression levels, amount and/or activities of the biomarkers: TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB.
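The direction rules and the minimum panel size set out above can be illustrated with a small sketch. It is not the inventors' implementation: the function and variable names are hypothetical, the key "TNF-alpha" stands for the first biomarker in the list, and any difference from the reference counts as a deviation, since the text does not state how large a deviation must be.

```python
# Biomarker groups as stated in the text.
HIGHER_RISK_IF_DECREASED = {"TNF-alpha", "GSTA1", "NT-proBNP", "RORA", "TNC"}
HIGHER_RISK_IF_INCREASED = {"GHR", "A2M", "IGFBP2", "APOB", "SEPP1",
                            "TFF3", "IL6", "CHI3L1"}
LOWER_RISK_IF_DECREASED = {"MET", "GDF15", "CCL22", "TNFRSF11",
                           "ANGPT2", "RelA NF-KB"}
ALL_BIOMARKERS = (HIGHER_RISK_IF_DECREASED | HIGHER_RISK_IF_INCREASED
                  | LOWER_RISK_IF_DECREASED)


def risk_indications(sample, reference, min_biomarkers=2):
    """Return {biomarker: 'higher' or 'lower'} for listed deviations.

    sample and reference map biomarker names to measured values; the
    reference is taken to come from a healthy control population.
    """
    measured = [m for m in sample if m in ALL_BIOMARKERS and m in reference]
    if len(measured) < min_biomarkers:
        raise ValueError(f"need at least {min_biomarkers} listed biomarkers")
    indications = {}
    for marker in measured:
        delta = sample[marker] - reference[marker]
        # Assumption: no tolerance band; any nonzero difference is a deviation.
        if delta < 0 and marker in HIGHER_RISK_IF_DECREASED:
            indications[marker] = "higher"
        elif delta > 0 and marker in HIGHER_RISK_IF_INCREASED:
            indications[marker] = "higher"
        elif delta < 0 and marker in LOWER_RISK_IF_DECREASED:
            indications[marker] = "lower"
    return indications
```

Biomarkers that match the reference, or that deviate in a direction the text does not interpret, are simply omitted from the result. Raising `min_biomarkers` to 3 or more corresponds to the preferred embodiments that require larger panels.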

The cardiovascular disease may be selected from the group consisting of: cardiovascular death; myocardial infarction; stroke; and heart failure.

In one embodiment, RORA is provided by gene bank locus ID: HGNC: 10258; Entrez Gene: 6095; and/or Ensembl: ENSG00000069667. The protein sequence may be represented by the GeneBank ID P35398-2, which is provided herein as SEQ ID No: 1, as follows:

MESAPAAPDPAASEPGSSGADAAAGSRETPLNQESARKSEPPAPVRRQSYS

STSRGISVTKKTHTSQIEIIPCKICGDKSSGIHYGVITCEGCKGFFRRSQQSNAT

YSCPRQKNCLIDRTSRNRCQHCRLQKCLAVGMSRDAVKFGRMSKKQRDSLY

AEVQKHRMQQQQRDHQQQPGEAEPLTPTYNISANGLTELHDDLSNYIDGHTP

EGSKADSAVSSFYLDIQPSPDQSGLDINGIKPEPICDYTPASGFFPYCSFTNGE

TSPTVSMAELEHLAQNISKSHLETCQYLREELQQITWQTFLQEEIENYQNKQR

EVMWQLCAIKITEAIQYVVEFAKRIDGFMELCQNDQIVLLKAGSLEVVFIRMCR

AFDSQNNTVYFDGKYASPDVFKSLGCEDFISFVFEFGKSLCSMHLTEDEIALFS

AFVLMSADRSWLQEKVKIEKLQQKIQLALQHVLQKNHREDGILTKLICKVSTLR

ALCGRHTEKLMAFKAIYPDIVRLHFPPLYKELFTSEFEPAMQIDG

[SEQ ID No: 1]

Accordingly, preferably RORA comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 1, or a fragment or variant thereof.

In one embodiment, RORA is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 2, as follows:

GT ACCAT AG AGTTGCT CT G AAAACAG AAG AT AG AGGG AGT CT CGG AGCT C GCCAT CT CCAGCG AT CT CT ACATTGGG AAAAAACAT GG AGT CAGCT CCGG CAGCCCCCGACCCCGCCGCCAGCGAGCCAGGCAGCAGCGGCGCGGACG CGGCCGCCGGCT CCAGGG AG ACCCCGCT G AACCAGG AAT CCGCCCGCAA G AGCG AGCCGCCT GCCCCGGTGCGCAG ACAG AGCT ATT CCAGCACCAGC AG AGGT AT CT CAGT AACG AAGAAG ACACAT ACAT CT CAAATT G AAATT ATT C CATGCAAG AT CT GTGG AG ACAAAT CAT CAGG AAT CCATT ATGGT GT CATT A CAT GT G AAGGCTGCAAGGGCTTTTT CAGG AG AAGT CAGCAAAGCAATGCC ACCT ACT CCT GT CCT CGT CAG AAG AACT GTTT GATT GAT CG AACCAGT AG A AACCGCTGCCAACACT GT CG ATT ACAG AAATGCCTTGCCGT AGGG AT GT C T CG AGATGCT GT AAAATTTGGCCG AAT GT CAAAAAAGCAG AG AG ACAGCTT GT ATGCAG AAGT ACAG AAACACCGGATGCAGCAGCAGCAGCGCG ACCAC CAGCAGCAGCCTGG AG AGGCT G AGCCGCT G ACGCCCACCT ACAACAT CT CGGCCAACGGGCT GACGG AACTT CACG ACG ACCT CAGT AACT ACATT G AC GGGCACACCCCT G AGGGG AGT AAGGCAGACT CCGCCGT CAGCAGCTT CT ACCTGG ACAT ACAGCCTT CCCCAG ACCAGT CAGGT CTT GAT AT CAATGG AA T CAAACCAG AACCAAT AT GT G ACT ACACACCAGCAT CAGGCTT CTTT CCCT ACT GTT CGTT CACCAACGGCG AG ACTT CCCCAACT GT GT CCATGGCAG AA TT AG AACACCTTGCACAG AAT AT AT CT AAAT CGCAT CTGG AAACCTGCCAA T ACTT G AG AG AAG AGCT CCAGCAGAT AACGTGGCAG ACCTTTTT ACAGG AA G AAATT GAG AACT AT CAAAACAAGCAGCGGG AGGT GAT GTGGCAATT GT G TGCCAT CAAAATT ACAG AAGCT AT ACAGT AT GTGGTGG AGTTTGCCAAACG CATT G ATGG ATTT ATGG AACT GT GT CAAAAT GAT CAAATT GTGCTT CT AAAA GCAGGTT CT CT AGAGGTGGT GTTT AT CAG AAT GTGCCGTGCCTTT G ACT CT CAG AACAACACCGT GT ACTTT G ATGGG AAGT ATGCCAGCCCCG ACGT CTT CAAAT CCTT AGGTT GT G AAG ACTTT ATT AGCTTT GT GTTT G AATTTGGAAAG AGTTT AT GTT CT AT GCACCT G ACT G AAG AT G AAATTGCATT ATTTT CTGCAT TT GT ACT GAT GT CAGCAG AT CGCT CAT GGCTGCAAGAAAAGGT AAAAATT G AAAAACTGCAACAG AAAATT CAGCT AGCT CTT CAACACGT CCT ACAG AAG A AT CACCG AG AAG ATGG AAT ACT AACAAAGTT AAT ATGCAAGGT GT CT ACCT T AAG AGCCTT AT GTGG ACG ACAT ACAG AAAAGCT AATGGCATTT AAAGCAA T AT ACCCAG ACATT GTGCG ACTT CATTTT CCT CCATT AT ACAAGG AGTT GTT CACTT CAG AATTT G 
AGCCAGCAATGCAAATT G ATGGGT AAAT GTT AT CACC T AAGCACTT CT AG AAT GT CT G AAGT ACAAACAT G AAAAACAAACAAAAAAAT T AACCG AG ACACTTT AT ATGGCCCTGCACAG ACCTGG AGCGCCACACACT GCACAT CTTTTGGT GAT CGGGGT CAGGCAAAGG AGGGG AAACAAT G AAAA CAAAT AAAAGTT G AACTT GTTTTT CT CAT GCAT AT GATTT CCATT ATGCCT AC AG AT ATGGACCCTTTTT CT GT CTT G ACTT CTT GAT CATT G ACCT CT GTTT AC AACAGG AGG AGGGT ACT AAAGT CGG AGGATTT CCTTTT CTT GT AGCT CACT GCCCACAG ACTTT CT ACAG AGT CACCAAT CT GT CAGT AACAACAG AG AGT C CAGCAAT AAT CGGT G ACTGGT GTGCAT AGCGG AGGTTGCGGCATT ACTTT GCACAACT AGCT CTTT GTTT CAT G AAGG AAGTTTTT ATTTTTT CACCG ATT A TTGCCAGT CCGCAGG ATGGCAT G AAAAGGGT CCAT AGCAGT AGCAACAAT AG CATT AT AAT AT ATT ACAGG GT AAATGGGCAT G AAG ACT AT AT AT AGCT A A AAG AG AT ATT GTTT AT AT ATT GTTTT AAGT AAT AT AAAAT GT AGTT ACTGGT G T AGCTTTT CCT GTT G AATT GAT AAGGCACTTT CATTTTGCACCTTTTT CTTT A AATT AAATGCT AGCGT GTT CACT GT CGT GT CGCAT GTGCACCAG AAACACA AGTTT AACT G AG AAGGCTTGGAAGGT ACGTTGGG AGGT ATTT ATGCTGCTG TTT ACAAAATT ATTTTT AAG AG ACTGGCTGGT CAT AT CT AG AAAT CACCACG TTGGATTTTTTTTTT AACAT GT G AATTTGGAATT AG AAACGG AACT CT CCCT AAATT AT ACTTTGCTTTTTGGT AAGTTT AAT GAT AG AT GT GTTT ATGCTT CAT ACAAAGTT G AAT GATT G ATTGGCGTGGT GG ACAT AT ACCAT CATGCT CATT TTTTTTTTTT AAAGCTTTTT AAAATGCCACCT CATGG AGGCG AGGGGG AGG AG AAGCT CATTTT ACACAATT CAGT AGTT AAAT ATGG ACT CGGT CT CAACTT GGAATT CTT ATGCTTT G AG AACAAAT CAACAACCAG AAT ATTT ATTGGAAT C T AGCTTTT ATT AT AAG AAGG ACCCAAAG ATT AT AT CCT G AGCAAATGCACAC T CCCCAT GT G AGG ACAT G AAGT ATTT ACTTT GT G AAT GTTT AT GTT CTTGGT AT AAT CT AGG AACCCT AT G AGTTT AT CT CAG AGT G AACT AATT CT AG ATTT G TT GT CAAT AG ATGCT AT AATT CAAG AAT GTTGCT CT CCAT ATTT GAAAAACG AT GG AT AGG AGGGT G AGGG AAGCAT ACAAT GTT GAACCAGTTT CT CTT ATT T AAAT ATT AAAT ACTTT AAGCCTTT AAAGT G AAGTT G ATGCT AGCTGCAAAC ATTT ACT CTT GT ATTT AT CTT CACT AGG AAACT GT GG ACT GT AATTT ATTTT A TT AAAT ATTT AG AAG ATT ATTT GGCTT GT GT GTT CAGGT GAG AAAT ACT GAG TT GTTTTT GTTT AATTT 
CATGGTTTTTTTTTTTTT AAT GAT CCCT AGT GGGGG AAGGGG AAAGG AAT AGT CT GAT AAACAGAT GTGCAT ATTTT AAAAACAAGT G ACCTTTTGGG AAT GT AGGCATTT AG ACG AT G ATTTT AGT CGCACT AGGGG TGGG ATT CAAACT ACTGGT CAAAG ACCATTTT GT ACAGAAAAGGG AACAT C T CCG ATGGGT GTT AAGGT AGGAGTTT CCATGCAGCCCTTT AT GT CT G AG AA AT AGT CT CCT CTGCCATTGGGGT CCCTGCGG AAT CTT CT ACAGG AATTGCA GCT CTT CACGT CATGCT AGGTT ACCAGCATG GGCTT CCCAGAGCACTT CA CT CT GTTT CT AACT CCACT G ACTTTT CT G ACTT GTTT CTT G AGGG ACTT GG A AAAGGGG AAGGT ATT ATT CACACAGAT GT GT GT AT G AAGCCT AAAT AACAT CCAACTTTTTTT CCAAAAACAT AGT AAG AGTTT AACCAT AAAT AG AAG AGT A AT ATT CCTTT AAATTT CTT ACAGGCAT GT AAAATTTT AT GT GTTT AT AG AG AG ATGCT AACT GT CAGCAT AT AGT ATTT AT ATTGGGGCAAGAAGGGTT AAAT C AAAT CTTT AATTT AAGT AAGCAT AGTT CCTTT AAAG AT CAGT AGT ATTT AT AC T CT G AAAAG AGT ACCAAGCTT AGTT CAGTT ATTT ATT CAT CCATTTTTTTT CT ATT GTTT CT CCT CT GGGGGG AAATGGT GTTTT GTTTTT GTTT G AGTTTT GT G TTTT AGTTTTTTGGTTTTGGTTTT GTTTT GTTTTTTT GT CATT CAG ATT CACT A AATTTGGGG ATGGTTTT ATT AAAGT AT ATT AACTT CTTTTT AACCAAAGCTTT CT G AAT AT G ACCAGCCT CAGGTGCT AGCGCATT AAAG AGCAACT AAACCT A ATT CCAGT GT CG ATTT GT G AAAT AAT AAAAGCT AATT G AATTT CT CT AAGGG T G AG AG AG AACAT AAT GAT G AACTT AAAAAG AAATT CCTTTT CCCAG AAGG GATT CTGCAT GT ACCT ACAAACAAT CATGCT CT AACCACAGT AAT AGT CAT G CCATGGT G ACATT GCTTTT ACGGT AAACCAAG AT AGGT AAT AT G ATGCTT C TT CCCAGGT GTTT CT G AAAG AAAAGCAGCGGTGG AGCTT AAG AGCCAAGT CCACT GATT G ACAT AAT AGG AT GG AATT GT AG ACAG AG ACATGCT CCAT G A AACAAGG AAACAACT G ACT ACT ATTT GG ATCT AAGTT GGG ATCTG ATGTTA AACAAT AAATT CAGTTT AAAAAAAAT ATGGAGCT CAG AAAAGG AT GT GAAAA ACTTTGCATTTT CCTTT CTT CATT ATT ACAAAAACACCT AT GT GAT G ACAT AA AAT ACTTGGGT GAT AT CAG AAC AATT AT CCTT AAT AGTTTTT AT AATT AAAGT TT CCT AAACTT CAGT CT CCAAT AGT CTTTT AAGG ATT GG AACCACAT CACT G TC AG CCCCG CTG CT AC A AT G CCTTT GT ACA ATTTTTTT ACAT AAG G C AA AT A AGGCTTTT GT ACAAAGCCT G AAT ACCTT CT GTT CCAAT GGT GT CCAAATGG TT ATT AT ATT GT G 
AAAAACCTGGCCTT GGT CACAATGCAAACAAAAGT ACAG AT GAAAGTGCTTTTTGG ACAGTTTGCAAATT GT GTT AAAGCT ACGG ATTTTT TTT AAAGT GTT CAGCAT CTT AACGT GT ATT AAAGCT ATGGATTTTTTTTT AAA GT CTT CAGCAT CCT AATT CACCCTTT CT AACTT AAGAAAAACAT G ACTT AAG ACACTGCTT CT AAGTTTGGTT GTT CTTT AT AGT GTGG AT ACCCAGTT AT CT G TGAGCGTATGGGGGTGGGCTGAGGGTCAGGTGAGGAAGGAGTGTGTGTG T GT GT GT GT GT GT GT GT GT GT GT GT GT GT G ATTTGCAT GT GT AT GAT GT GT GTGCGT CGGACCGCTT CT AGGCT ACT AAGT GT CAATGG AAAAGAAAAT GT A TT CAAAAT ACTT AAAT CAAAACT AG AAG ATGGGG AAAAAAAG ATTT ATT CT A T ACAAAGCCTT GT CTGG ACCACTTT AG AG AG ACTT CT ATTTTTTT AACCCTT CT AT AAAT ATTT G ATGGCACTT G AAAT ATT CCTGCAAT AAAAT GT G ATTT GT GT AAAG AAAAAAAG ATTTT GT AAT GT G AAAC AAAG AAAG AAAGT AAT GT AAT TTT CT AAAAAAAAAAAT ACAAACAAACAAACTTT GT ATT ATTTT CTT G ATGG A ATTT GT CT AT CT GT CTTTGG AAAACTTTTT ATTT CATT G AAT GTGCCAT AGT A G AAAT GT GT GTTTTT AGTTTT AG ACT AAGGAAT AGCT AGTT GTT GT GTT CCG ACATT CCAAAATGCAAAACAACCT AGT AGT AT CTTT CAT G AAAAGGTTT AAG TAGT ATGT A AG CTT CT CT C ATT GTTG CTTTTTT G C AC AT GTT CTT C ATT CCTC T CT AGTGCAAT AT GT ACAT AG AGCACTT GCGGGT GT ACCTT GAT CCCT CAG GGAAAAAT ACAT ATTT GT ACAGTTTTTT GGGGTTTTTTT GTTTTTTG GTTTTT TTT GTTTTGCCTTTT GTTTTTGGCT AAGGAAT GT CG AT CG AAT CACTT GTT A TT GTT G AGGGGCAGCCAG AT A AT AAT CCT A AAG CC ACT GTTT CCA AC ATT G ATT GTTT AAAT CAT AT GT CCTT CCAATGCT ATT ATTTT AAG AT AAT AAT AAAA AGTT ATTTT CT G ACAGTT CTTT GTGCT G ACTGGT GAAAAACAAGGGT AAAT A AGCACCTT AT AATT G ACTT ACT GT G AAT G ACAAT CCAT CTTGGT AT CAACG A T AGAAGCCCT AT CATTTTGG AGTTGGGGTT AAG AGT CAG AAACAAT GTGCT CAGGG AT CT CCT AAAACT CTT AAAACAGGGTGGCCAGT ACT ACTGGG ACA AATT GT GTTTTTT ATT ATT AAT AAT AAT GAT AAT AAT ACT AT CCCT CCAAG GC ACAAGT G AACT AT AT AG AACTGCGT GTGTGT AAACT CTTT ACT CT CGT CT CA TTTT GTT G AGTTT AGAACTT GAT GTGCT CGT CAGT CTT GT GTTT CAAAACAC T G AAT AACTT CCAAAGCAAAGTT ATGCCAGT GT GTT CAAAAGAT AAATT AAT AAT GT ACCAGCAAAG AGCAT CATT CAAAGT AT AGT CCTTGCAT ATT CCAGTT ACC ATTT CT 
CT AAT AATT AAA AT ATT CAT GAT AAAT AT AT AT AT AG CAT CAT A TGTT AAAAACT ATTT CAAATT CT ACAT ATT AAGG AT G AAAATTTT AAAAT CCA GT AAT AAG AGG AG AG ACCTGCCT AT CAGT ACAGT GAT AT AGGT AAAAG AT G AAAT AT GTTTTT AAAAT ACC AG C A ATT AACT ATT GTTTT CATGGGTT CT CCT C T AG A AG CAA ACC A AAA ATT CCTT CATGGAAAACAATT CTT ACTT CT ACAT GT GT AGTTT AT AT CT G ATGCATT ACAAG AGCT AAT GTTT AAAGACAAAACAAAA CCTGCCT GT AT G ACAGCAGCAACT CG AGCCAACATTT AGT GTT ACAT GTT A T ATTTTT G AAAT ACCGTTTT GTT AT CAT ATT CCACAT ATT ACTTT CCAT AAAG T CAG AG AAGTT CAAT GT AATT GTTGGCT CT GATT CTT CCACCTT GGG AT AC ACATT CACAAG AAG ATTT AT AT ATTTT CTT ACT AT GAT AG AGG AACAT AAT CT GGG AAAACTT CCCAT GT CT GT AAG AT G AAAAGG AT ACCTTT ACCAT GTT GT TTTT GAT AAT AAAG AAT ATGG AAAATGGT AG AAAT CT CT CCT CT CT AT GT GT ATTT GT AT AT GT GT GT AT ACACACACACGTT G ACATTTTT ACAT AACAT GT G ATTGCCACTT CTT AT AAAGTT AT GAT AT AGAACCCCAT G AAAACAACTTT AT ATT AT AG AT CAAAT AAT GTGCCCAG AAACGCAATTGCAACAGT AAAT CTTG A T CT ATTGGT AAG AGT CT AT GAT ACAGGT CTT CATT CT AT CCCT AAACAT AAT GTT AGT AAAG AGGTT CCAT CAG ATT GT ATT AT AG AG ACCTT CCT ATTGCT AT TT ATTTTT AAGAG AT GAG AAG ACT G ACT AGCAAT GT CT CCACAGTGCAATT GGTTT CACTT CTGGGT CT GT CT GT CT GTTT GT GT AGGT GAG AT CAGT GTT G T CAGGCT CT CAG AAT AATTTTT AAAAG AAT CCAG AAGCCT G ACTT CACACC AAGT AGCCAT CCT AG ATGGGCGGGGGGG AT CT CCAT GTT CAACCAAACCC T CAG AACTT AG AGCAAT AAAT AC ATT CAGT CATTT ATTT AT AAAT G AAT G AC AG AAT ATT CAAT CCAAAT AG AAACAACCTTTT GT CAGT GATGCAACAT AACA TGG AGT CTT AT ATT G AG AAAGT G AAT AT GAAT ATTTT AAAT AG ATGCCT AGG AAAT CT GT GTTTGCT GTTT ACATTT AAT GT ACTTTGCCACATT AGCAGT ACC ACACTT CTTTT ACTTT CAT CCTT CT AAG AACT AAT GAAAGAT AGCTTGCT ATT AGCTTT GACAT GTT GCACATGCCATT AT GTT GTT CT CT AAG AACAACT AAAT T CT CT CCAGT GTT CAT GTGT GATT CCTTTTT CAT CATT ATT ACTTT AAT GTGG AT GAAT ACT ATTT CT AGG AACTT GATT AT CACT G AACAG ATTGCAT AT GT AA ATGCAG AT AATT CTT G AGCAAT AT AGT G AAAAT G ATTT CACAAAAAAAAT CC ATATGT ACT GT AT ACAT 
CTT CCT GAAT CCAAAATT CT CCT GTT AT CCACAC A GTTT G AAAT CT CAT ATT AAACAGTGGT AAATTTTT AAACATT AGGACATT AG CTGGCTGCTT CT AT AAAT GT ACCT GT ATT GT CTGCCTGCT CCTTTT GTGGA GAT GT GTTTTT GT ATTGG AT GTTTGGGCCAATGGG AGGCTT CAAGCT ACAA CAT ATTTT CCCACTTT AAAT AAT ATTT AACAT AT G ACAAACCCCAG ACAAGG CTTTT AAT AT GT AT AAAG AACTGCAGT GTT AGGTGGTTT ATTTGCAAACT GT TTT CAGT GTT GT CCATTTTT ATGGCACCTTT AACAAT ATT CTT GTTT CCAAAG TACAAAAAAAAT AGGTT AAAT AATCT AGCCAT AACTGAATGTCAAGCAATAA AACAAAT GACTTTT GTGCAT AGACTT AAAAAAT ACAAATTTT AGACAT CTGC AT GT AAATGCAAT CT CTT GTTT ACTGCTTT CTT CAACTT G AGCAAT CG ATT C CTT ATTT ACT ATT AACTT AAGAACTTGTAACGGCCTGTAAACATTTTCTCTCT T ACT CTT CTTT CAAATTT ACCT GT CCCTGCTTTT G AGT CAT AAT ATTT AAT GT T GTT CTGCACAT CT CT AT ACAGTT AACTTTTTGGCTTT CATT CT GT AT AG AT A AG AAAAT GTT AT ATT AT AAACAGCCT ACT CAGTGCAAAT ATTT AT CT GTTT AT CA AAT CC AC AAT ATG CTGT AT AAT ACCG GTTTT ACT AT AT AAT CT ATTTT AG A CAT AGCT GTTT AG AACT AG AGT GTGCT ATTTTT GT GTTTTT CT GAT GTGTGG TGCT AG ACAAGTT ACTTTT GT G AACAACAAAAATT AT CCCTTTT ATT CCT AG ACAAT ACCACCTTTGGGT CTT GTT AATTT CACT G AGT AT AACT AT AT ATTT GT AT AT AT AT AC AT AT AT AT AT AT AT CTACCT AT G CCC AACT G G C AG CTGT ATC AG AGTGCTGG ATTTGGG ACATGCTTTT CT CTTT AAAT ACAT AAT AT CATT AT AT AAATT ATT CT AG AGT GT ATTT AATT AGG AT AAAATT ACTT CCTT AGT ATGG ATATTTGACATCTATAGGGTG AATTT G TTT AT AAAT ATG G CT AT ATG G A A AC TT ATT AG CATTT ACTTT AT GTTT G CT ACTT G G CTTT AC AG CAT AT CT CCT AAG CT G AAAAAT AATTT GCCAGGCCTT CAAG AT CCT AAAGAAACTT GTTT AATGG AGT AAT AT ACTTTTTTTT CTT ATT AAG G A ATT GT ATT ACTGGCACCT A AC AC A GTT GT ATT CTT AGCT CCT ATT AT AG AT AATGGGCATTT ACAT AAAAT AT CCT A G ATGGCTT G ATGGCAG AAT AAACCTTT CCCCT CCT ACCT G AGT CAT G AG AA GGATGG AG ACGT CCT CTGCCAT AACAT GGGCCAT AAAGCAAATT CG ACAT GGG AT GTT CT GTTT CAGT AT G ACCT CAACCAGTT CCAT G AACT G AGT G AAG G ACCTT CATTTT CAAAGTT ATTT AAT AAGT AGCTT AATT AAGCCTTT CT ACCC ATT CT CCCAAG AT CT ACTGGCATT ATT G AAAAGCAAAGTTT AT 
CAAAT AT CT AACT AAGG AT GTAGTT AACCTT ATT AAAT ATT GATT AG AATT GTT CT GT AAT A TT ACT G AATTT GT AAG AT CTTT AGCAAAG ATTTTT G AGCAATTT AT AAAT GTA G AGCAAAT GTTT CT GTTT ACTGCACTTTTT GT AACT G AAGGT GAT AAATT CT CAAG CCAT GATT ATT G G CTT CCAT G C ACTG C AAT ATTT AT CC AC AATT CT AG ACATTTT CCATTTTT GT GG AAG AGTTGCT GTT ACCTT AATT AT AAATGCAATT GT GT GGTT AAT G AGAGCT AATGCT AGT AGTT AACCTTTT AAAGTGG ATTGG CT ACAGTT G AGGG AG AAAT CT CTTTT AAT AT AAAT CACAT CATT CCTT AACT GCCT CT CTTGG AAAG AG ATT G AAACCTTTTTTTT AAAGCACG ATTT AGCAT C CT AAGCTT CCT G AGGGT AG AG ATT GT AT CTTTTTGCGT CTGCACAAT GGCT AGCACAT GT CAGCATTT G ACAATT GTT AAAT GAT AACAAGT GTGCCCCAATT AAAACGTTTTT CCT GGGTT GTTTT GTT AAATTT ACAAAGT AAGCCAAGCCTT ACGGTT AACATT CT CCT CT ACAACCAAGT ATT AAAG CC AC ATTT AAA AAG AC CACAT G AAATGCT GATT CT AATT GT GT GT AGGT CTT G AGG ATT AAGCACAC AAATTT CACAAACTT CT GTTT G AGT AAACAAACT CAGCCTT CT GT AAAT AT A CATGCAAGTTTGG AAACAGT AAT ACT GT ACCT AT AAAT AT ATGCT GT CT GTT TT GT GT ACAGT AT GT AAAAACT CCTTTT CTGCCACACT AAAAATG CAAG CCA TTT ATGGG AAT CCT AAAACT AGT ATT G AACT AAAACTTTGCT AAT GAT CTTT A TT AG AGG AT CGT CCAACTTTT CACTT ACCTTGGGTTTT CTTTT CAATT CACT CTT ACACT AGT CTGCTT ATTT CCAGCT GTTT ATTTT ATT G AGT CCT G AATTT A AAAAAAAAAT ATTTT GATT CATTTT GT AAAT ACAAGCT GT ACAAAAAAG AG A G ATTT AAT GTT GT CTTTT AAAT ACT CCAATTTT CATT CT AAT AT G AAT GTT GT T AT ATT GT ACTT AG AAACT GT ACCTTT AAT ATT ACATT ACCTTT ATT AAAAGT GCATT G AACACAT CAATTTT AG AT GTGCTTT AT GT ACT GTT AT CCT AT AAT AA AACTT CAGCTT CT AATGG AA

[SEQ ID No: 2]

Accordingly, preferably RORA comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 2, or a fragment or variant thereof.

In one embodiment, GHR is provided by gene bank locus ID: HGNC: 4263; Entrez Gene: 2690; and/or Ensembl: ENSG00000112964. The protein sequence is represented by the GeneBank ID P10912, which is provided herein as SEQ ID No: 4, as follows:

MDLWQLLLTLALAGSSDAFSGSEATAAILSRAPWSLQSVNPGLKTNSSKEPKF

TKCRSPERETFSCHWTDEVHHGTKNLGPIQLFYTRRNTQEWTQEWKECPDY

VSAGENSCYFNSSFTSIWIPYCIKLTSNGGTVDEKCFSVDEIVQPDPPIALNWT

LLNVSLTGIHADIQVRWEAPRNADIQKGWMVLEYELQYKEVNETKWKMMDPIL

TTSVPVYSLKVDKEYEVRVRSKQRNSGNYGEFSEVLYVTLPQMSQFTCEEDF

YFPWLLIIIFGIFGLTVMLFVFLFSKQQRIKMLILPPVPVPKIKGIDPDLLKEGKLE

EVNTILAIHDSYKPEFHSDDSWVEFIELDIDEPDEKTEESDTDRLLSSDHEKSH

SNLGVKDGDSGRTSCCEPDILETDFNANDIHEGTSEVAQPQRLKGEADLLCLD

QKNQNNSPYHDACPATQQPSVIQAEKNKPQPLPTEGAESTHQAAHIQLSNPS

SLSNIDFYAQVSDITPAGSVVLSPGQKNKAGMSQCDMHPEMVSLCQENFLMD

NAYFCEADAKKCIPVAPHIKVESHIQPSLNQEDIYITTESLTTAAGRPGTGEHVP

GSEMPVPDYTSIHIVQSPQGLILNATALPLPDKEFLSSCGYVSTDQLNKIMP

[SEQ ID No:4]

Accordingly, preferably GHR comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 4, or a fragment or variant thereof.

In one embodiment, GHR is encoded by a nucleotide sequence, which is provided herein as SEQ ID No: 5, as follows:

AACT AGCAATGGT GGT ACAGTGG AT G AAAAGT GTTT CT CT GTT GAT G AAAT AGGCGGCGGCGGCGGCAGCGGCAGCAGCAGCTGCTACAGTGGCGGTGG CGGCGGCGGCTGCTGCTGAGCCCGGGCGGCGGCGGGGACCCCGGGCT GGGGCCACGCGGGCCGGAGGCCCCGGCACCATTGGCCCCAGCGCAGAC GCG AACCCGCGCT CT CT GAT CAGAGGCGAAGCT CGG AGGT CCT ACAGGT AT GG AT CT CT GGCAGCTGCT GTT G ACCTTGGCACTGGCAGG AT CAAGT G A TGCTTTTT CT GG AAGT G AGGCCACAGCAGCT AT CCTT AGCAG AGCACCCT GGAGT CTGCAAAGT GTT AAT CCAGGCCT AAAG ACAAATT CTT CT AAGG AGC CT AAATT CACCAAGTGCCGTT CACCT G AGCG AG AG ACTTTTT CATGCCACT GGACAG AT G AGGTT CAT CATGGT ACAAAG AACCT AGG ACCCAT ACAGCT G TT CT AT ACCAG AAGG AACACT CAAG AATGG ACT CAAG AATGGAAAG AATGC CCT GATT AT GTTT CTGCTGGGG AAAACAGCT GTT ACTTT AATT CAT CGTTT A CCT CCAT CTGG AT ACCTT ATT GT AT CAAGCTTGCAACCAGAT CCACCCATT GCCCT CAACTGG ACTTT ACT G AACGT CAGTTT AACTGGG ATT CATGCAGAT AT CCAAGT GAG AT GGG AAGCACCACGCAATGCAG AT ATT CAGAAAGG AT G G ATGGTTCTGG AGTAT G AACTT CAAT ACAAAG AAGT AAAT G AAACT AAAT G G AAAAT G ATGG ACCCT AT ATT G ACAACAT CAGTT CCAGT GT ACT CATT G AA AGTGG AT AAGG AAT AT GAAGTGCGT GT GAG AT CCAAACAACG AAACT CT G G AAATT AT GGCG AGTT CAGT GAGGTGCT CT AT GT AACACTT CCT CAG AT G A GCCAATTT ACAT GT GAAG AAG ATTT CT ACTTT CCATGGCT CTT AATT ATT AT CTTTGG AAT ATTTGGGCT AACAGT G ATGCT ATTT GT ATT CTT ATTTT CT AAAC AGCAAAGG ATT AAAATGCT GATT CTGCCCCCAGTT CCAGTT CCAAAG ATT A AAGG AAT CG AT CCAG AT CT CCT CAAGGAAGG AAAATT AGAGGAGGT G AAC AC AAT CTT AGCCATT CAT GAT AGCT AT AAACCCG AATT CCACAGT GAT G ACT CTTGGGTT GAATTT ATT G AGCT AG AT ATT GAT GAGCCAG AT G AAAAG ACT G AGG AAT CAG ACACAG ACAG ACTT CT AAGCAGT G ACCAT GAG AAAT CACAT A GT AACCT AGGGGT G AAGG AT GGCG ACT CTGG ACGT ACCAGCT GTT GT GAA CCT G ACATT CTGG AG ACT G ATTT CAATGCCAAT GACAT ACAT G AGGGT ACC TCAGAGGTTGCTCAGCCACAG AGGTT AAAAGGGGAAGCAGATCTCTTATG CCTT G ACCAG AAG AAT CAAAAT AACT CACCTT AT CAT G ATGCTTGCCCTGC T ACT CAGCAGCCCAGT GTT AT CCAAGCAG AG AAAAACAAACCACAACCACT T CCT ACT G AAGG AGCT G AGT CAACT CACCAAGCTGCCCAT ATT CAGCT AAG CAAT CCAAGTT CACT GT CAAACAT CG 
ACTTTT ATGCCCAGGT G AGCG ACAT T ACACCAGCAGGT AGT GT GGT CCTTT CCCCGGGCCAAAAG AAT AAGGCAG GGAT GT CCCAAT GT G ACATGCACCCGG AAATGGT CT CACT CTGCCAAG AA AACTT CCTT ATGG ACAATGCCT ACTT CT GT G AGGCAG ATGCCAAAAAGTGC AT CCCT GTGGCT CCT CACAT CAAGGTT G AAT CACACAT ACAGCCAAGCTT A AACCAAG AGG ACATTT ACAT CACCACAGAAAGCCTT ACCACTGCTGCTGG G AGGCCTGGG ACAGG AG AACAT GTT CCAGGTT CT G AG ATGCCT GT CCCAG ACT AT ACCT CCATT CAT AT AGT ACAGT CCCCACAGGGCCT CAT ACT CAAT G CG ACTGCCTTGCCCTTGCCT G ACAAAG AGTTT CT CT CAT CAT GTGGCT ATG T G AGCACAG ACCAACT G AACAAAAT CAT GCCTT AGCCTTT CTTTGGTTT CC CAAG AGCT ACGT ATTT AAT AGCAAAG AATT G ACT GGGGCAAT AACGTTT AA GCCAAAACAAT GTTT AAACCTTTTTTGGGGG AGT G ACAGG ATGGGGT AT G GATT CT AAAATGCCTTTT CCCAAAAT GTT G AAAT AT GAT GTT AAAAAAAT AA G AAG AATGCTT AAT CAG AT AG AT ATT CCT ATT GTGCAAT GT AAAT ATTTT AA AG AATT GT GT CAG ACT GTTT AGT AGCAGT GATT GT CTT AAT ATT GTGGGT GT T AATTTTT GAT ACT AAGCATT G AAT GGCTAT GTTTTT AAT GT AT AGT AAAT CA CGCTTTTTG A A A AAG CG A A A A A AT CAGGTGGCTTTTGCGGTTCAGG A A A AT T G AAT G C A AACC AT AG C AC AG G CT AATTTTTT GTT GTTT CTT AAAT AAG AAA CTTTTTT ATTT AAAAAACT AAAAACT AG AGGT G AG AAATTT AAACT AT AAG CA AG AAGGCAAAAAT AGTTTGGAT AT GT AAAACATTT ATTTT GACAT AAAGTT G AT AAAGATTTTTT AAT AATTT AG ACTT CAAG C ATG GCT ATTTT AT ATT AC ACT ACACACT GT GT ACTGCAGTTGGT AT G ACCCCT CT AAGG AGT GT AGCAACT A CAGT CT AAAGCTGGTTT AAT GTTTT GGCCAATGCACCT AAAG AAAAACAAA CT CGTTTTTT ACAAAGCCCTTTT AT ACCT CCCCAG ACT CCTT CAACAATT CT AAAAT GATT GT AGT AAT CTGCATT ATTGG AAT AT AATT GTTTT AT CT G AATTT TT AAACAAGT ATTT GTT AATTT AG AAAACTTT AAAGCGTTTGCACAG AT CAA CTT ACCAGGCACCAAAAG AAGT AAAAGCAAAAAAG AAAACCTTT CTT CACC AAAT CTTGGTT G ATGCCAAAAAAAAAT ACATGCT AAG AG AAGT AG AAAT CAT AGCT GGTT CACACT G ACCAAG AT ACTT AAGTGCTGCAATTGCACGCGG AG T G AGTTTTTT AGTGCGTGCAG ATGGT GAG AG AT AAG AT CT AT AGCCT CTGC AGCGG AAT CT GTT CACACCCAACTTGGTTTTGCT ACAT AATT AT CCAGG AA GGG AAT AAGGT ACAAG AAGCATTTT GT AAGTT G AAGCAAAT CG AAT G AAAT T AACT 
GGGT AAT G AAACAAAG AGTT CAAG AAAT AAGTTTTT GTTT CACAGCC T AT AACCAG ACACAT ACT CATTTTT CAT GAT AAT G AACAGAACAT AG ACAG A AG AAACAAGGTTTT CAGT CCCCACAG AT AACT G AAAATT ATTT AAACCGCT A AAAG AAACTTT CTTT CT CACT AAAT CTTTT AT AGG ATTT ATTT AAAAT AGCAA AAG AAG AAGTTT CAT CATTTTTT ACTT CCT CT CT G AGTGG ACT GGCCT CAAA GCAAGCATT CAG AAG AAAAAG AAGCAACCT CAGT AATTT AG AAAT CATTTT GCAAT CCCTT AAT AT CCT AAACAT CATT CATTTTT GTT GTT GTT GTT GTT GTT G AG ACAGAGT CT CGCT CT GT CGCCAGGCT AG AGTGCGGTGGCGCG AT CT T G ACT CACTGCAAT CT CCACCT CCCACAGGTT CAGGCG ATT CCCGTGCCT CAGCCT CCT GAGT AGCTGGG ACT ACAGGCACGCACCACCATGCCAGGCT A ATTTTTTT GT ATTTT AGCAG AG ACGGGGTTT CACCAT GTTGGCCAGG ATGG T CT CG AT CT CCT G ACCT CGT GAT CCACCCG ACT CGGCCT CCCAAAGTGCT GGG ATT ACAGGT GT AAGCCACCGTGCCCAGCCCT AAACAT CATT CTT GAG AGCATTGGG AT AT CT CCT G AAAAGGTTT AT G AAAAAG AAG AAT CT CAT CT C AGT G AAG AAT ACTT CT CATTTTTT AAAAAAGCTT AAAACTTT G AAGTT AGCTT T AACTT AAAT AGT ATTT CCCATTT AT CGCAG ACCTTTTTT AGG AAGCAAGCT T AATGGCT GAT AATTTT AAATT CT CT CT CTTGCAGG AAGG ACT AT G AAAAGC T AG AATT GAGT GTTT AAAGTT CAACAT GTT ATTT GT AAT AG AT GTTT GAT AG ATTTT CTGCT ACTTTGCTGCT ATGGTTTT CT CCAAG AGCT ACAT AATTT AGT TT CAT AT AAAGT AT CAT CAGT GT AG AACCT AATT CAATT CAAAGCT GT GT GT TTGGAAG ACT AT CTT ACT ATTT CACAACAGCCT G ACAACATTT CT AT AGCCA AAAAT AGCT AAAT ACCT CAAT CAGT CT CAG AAT GT CATTTTGGT ACTTTGGT GGCCACAT AAGCCATT ATT CACT AGT AT G ACT AGTT GT GT CTGGCAGTTT A T ATTT AACT CT CTTT ATGTCTGT GG ATTTTTT CCTT CAAAGTTT AAT AAATTT A TTTT CTTGG ATT CCT GAT AGT GTGCTT CT GTT AT CAAACACCAACAT AAAAA TGATCTAAACC

[SEQ ID No: 5]

Accordingly, preferably GHR comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 5, or a fragment or variant thereof.

In one embodiment, TNF-α is provided by gene bank locus ID: HGNC: 11892; Entrez Gene: 7124; and/or Ensembl: ENSG00000232810. The protein sequence may be represented by the GeneBank ID P01375, which is provided herein as SEQ ID No: 6, as follows:

MSTESMIRDVELAEEALPKKTGGPQGSRRCLFLSLFSFLIVAGATTLFCLLHFG

VIGPQREEFPRDLSLISPLAQAVRSSSRTPSDKPVAHVVANPQAEGQLQWLN

RRANALLANGVELRDNQLVVPSEGLYLIYSQVLFKGQGCPSTHVLLTHTISRIA

VSYQTKVNLLSAIKSPCQRETPEGAEAKPWYEPIYLGGVFQLEKGDRLSAEIN

RPDYLDFAESGQVYFGIIAL

[SEQ ID No: 6]

Accordingly, preferably TNF-α comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 6, or a fragment or variant thereof.

In one embodiment, TNF-α is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 7, as follows:

AGCAG ACGCT CCCT CAGCAAGG ACAGCAG AGG ACCAGCT AAG AGGG AGA G AAGCAACT ACAG ACCCCCCCT G AAAACAACCCT CAG ACGCCACAT CCCC T G ACAAGCTGCCAGGCAGGTT CT CTT CCT CT CACAT ACT G ACCCACGGCT CCACCCT CT CT CCCCTGG AAAGG ACACCAT G AGCACT G AAAGCAT GAT CC GGGACGTGGAGCTGGCCGAGGAGGCGCTCCCCAAGAAGACAGGGGGGC CCCAGGGCT CCAGGCGGTGCTT GTT CCT CAGCCT CTT CT CCTT CCT GAT C GTGGCAGGCGCCACCACGCT CTT CTGCCTGCTGCACTTTGG AGT GAT CGG CCCCCAG AGGG AAG AGTT CCCCAGGG ACCT CT CT CT AAT CAGCCCT CT GG CCCAGGCAGT CAG AT CAT CTT CT CG AACCCCG AGT G ACAAGCCT GT AGCC CAT GTT GT AGCAAACCCT CAAGCT G AGGGGCAGCT CCAGTGGCT G AACCG CCGGGCCAATGCCCT CCTGGCCAATGGCGTGG AGCT GAG AG AT AACCAG CTGGTGGTGCCAT CAG AGGGCCT GT ACCT CAT CT ACT CCCAGGT CCT CTT CAAGGGCCAAGGCTGCCCCT CCACCCAT GTGCT CCT CACCCACACCAT CA GCCGCAT CGCCGT CT CCT ACCAG ACCAAGGT CAACCT CCT CT CTGCCAT C AAGAGCCCCTGCCAGAGGGAGACCCCAGAGGGGGCTGAGGCCAAGCCCT GGT AT G AGCCCAT CT AT CTGGG AGGGGT CTT CCAGCTGG AG AAGGGT G AC CG ACT CAGCGCT GAG AT CAAT CGGCCCG ACT AT CT CG ACTTTGCCG AGT C TGGGCAGGT CT ACTTTGGG AT CATTGCCCT GT G AGG AGG ACG AACAT CCA ACCTT CCCAAACGCCT CCCCTGCCCCAAT CCCTTT ATT ACCCCCT CCTT CA G ACACCCT CAACCT CTT CT GGCT CAAAAAG AG AATT GGGGGCTT AGGGT C GGAACCCAAGCTT AG AACTTT AAGCAACAAG ACCACCACTT CGAAACCTGG GATT CAGG AAT GT GTGGCCTGCACAGT G AAGTGCTGGCAACCACT AAG AA TT CAAACT GGGGCCT CCAG AACT CACT GGGGCCT ACAGCTTT GAT CCCT G ACAT CTGGAAT CTGG AG ACCAGGG AGCCTTTGGTT CTGGCCAG AATGCT G CAGG ACTT G AG AAG ACCT CACCT AG AAATT G ACACAAGTGGACCTT AGGC CTT CCT CT CT CCAG AT GTTT CCAG ACTT CCTT G AG ACACGG AGCCCAGCCC T CCCCAT GG AGCCAGCT CCCT CT ATTT AT GTTTGCACTT GT GATT ATTT ATT ATTT ATTT ATT ATTT ATTT ATTT ACAGAT G AAT GT ATTT ATTTGGGAG ACCGG GGT AT CCTGGGGG ACCCAAT GT AGG AGCTGCCTTGGCT CAG ACAT GTTTT CCGT G AAAACGGAGCT G AACAAT AGGCT GTT CCCAT GT AGCCCCCTGGCC T CT GTGCCTT CTTTT GATT AT GTTTTTT AAAAT ATTT AT CT GATT AAGTT GTC T AAACAATGCT G ATTTGGT GACCAACT GT CACT CATTGCT G AGCCT CTGCT CCCCAGGGG AGTT GT GT CT GT AAT CGCCCT ACT ATT CAGTGGCG AG AAAT 
AAAGTTTGCTT AG AAAAG AAA

[SEQ ID No: 7]

Accordingly, preferably TNF-α comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 7, or a fragment or variant thereof.

In one embodiment, GSTA1 is provided by gene bank locus ID: HGNC: 4626; Entrez Gene: 2938; Ensembl: ENSG00000243955; OMIM: 138359; and/or UniProtKB: P08263. The protein sequence may be represented by the GeneBank ID: ENST00000334575.6, which is provided herein as SEQ ID No: 8, as follows:

MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMF

QQVPMVEIDGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILL

LPVCPPEEKDAKLALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVEL

LYYVEELDSSLISSFPLLKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARK

IFRF

[SEQ ID No: 8]

Accordingly, preferably GSTA1 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 8, or a fragment or variant thereof.

In one embodiment GSTA1 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 9, as follows:

GT CG AGCCAGG ACGGT GACAGCGTTT AACAAAGCTT AG AGAAACCT CCAG G AG ACTGCT AT CATGGCAG AGAAGCCCAAGCT CCACT ACTT CAATGCACG GGGCAG AATGG AGT CCACCCGGTGGCT CCTGGCTGCAGCTGGAGT AG AG TTT G AAG AG AAATTT AT AAAAT CTGCAGAAG ATTTGG ACAAGTT AAG AAAT G AT GG AT ATTT GAT GTT CCAGCAAGTGCCAATGGTT GAG ATT GATGGG AT G A AGCT GGTGCAG ACCAG AGCCATT CT CAACT AC ATTG CCAG C AAAT ACAACC T CT ATGGGAAAG ACAT AAAGG AG AG AGCCCT GATT GAT AT GT AT AT AG AAG GTAT AGCAG ATTT GGGT G AAAT GAT CCT CCTT CTGCCCGT ATGT CCACCT G AGG AAAAAGATGCCAAGCTTGCCTT GAT CAAAG AG AAAAT AAAAAAT CGCT ACTT CCCTGCCTTT GAAAAAGT CTT AAAG AGCCATGG ACAAG ACT ACCTT G TTGGCAACAAGCT GAGCCGGGCT G ACATT CAT CTGGTGG AACTT CT CT ACT ACGT CG AGG AGCTT G ACT CCAGT CTT AT CT CCAGCTT CCCT CTGCT G AAG GCCCT G AAAACCAG AAT CAGCAACCTGCCCACAGT GAAG AAGTTT CT ACA GCCTGGCAGCCCAAGG AAGCCT CCCATGG AT GAG AAAT CTTT AG AAG AAG CAAGG AAG ATTTT CAGGTTTT AAT AACGCAGT CAT GGAGGCCAAG AACTT G CAAT ACCAAT GTT CT AAAGTTTTGCAACAAT AAAGT ACTTT ACCT AAGT GTT GATT GTGCCT GTT GT G AAGCT AAT G AACT CTTT CAAATT AT AT GCT AATT AA AT AAT ACAACT CCT ATT CGCT G ACTT AGTT AAAATT GATTT GTTTT CATT AGG AT CT GAT GT G AATT CAG ATTT CCAAT CTT CT CCT AGCCAACCATTTT CCTGG AATT AAAAATT CAGT AAAAAAGG AAACT AT AG ATT AT GTGGTTT GTTT G ACT TTT CCAAG AATT GT CCCGT AACAT ACAATTT GT CAT ACAAT CT ATT AAAAT GT CAAT GT AG AAATGCACTT CT G ACATTTT CAGGT ATGCACAGG AG AAG AGTT ACCAT CCTGG AT AATGGCAT AAAG ACATTTT CTT CTTTT CCTGG ACAGT CAT TTT ATTT CT GAT AAAAGCGTT CTTT CTT ATGCATTTGCAAAA

[SEQ ID No: 9]

Accordingly, preferably GSTA1 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 9, or a fragment or variant thereof.

In one embodiment, NT-proBNP is provided by gene bank locus ID: HGNC: 7940; Entrez Gene: 4879; Ensembl: ENSG00000120937; OMIM: 600295; and/or UniProtKB: P16860. The protein sequence may be represented by the GeneBank ID P16860, which is provided herein as SEQ ID No: 10, as follows:

MDPQTAPSRALLLLLFLHLAFLGGRSHPLGSPGSASDLETSGLQEQRNHLQG

KLSELQVEQTSLEPLQESPRPTGVWKSREVATEGIRGHRKMVLYTLRAPRSP

KMVQGSGCFGRKMDRISSSSGLGCKVLRRH

[SEQ ID No: 10]

Accordingly, preferably NT-proBNP comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 10, or a fragment or variant thereof.

In one embodiment, NT-proBNP is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 11, as follows:

AGGAGGAGCACCCCGCAGGCTGAGGGCAGGTGGGAAGCAAACCCGGAC GCAT CGCAGCAGCAGCAGCAGCAGCAG AAGCAGCAGCAGCAGCCT CCGC AGT CCCT CCAG AG ACATGG AT CCCCAG ACAGCACCTT CCCGGGCGCT CCT GCT CCTGCT CTT CTTGCAT CTGGCTTT CCTGGG AGGT CGTT CCCACCCGC TGGGCAGCCCCGGTT CAGCCT CGG ACTTGG AAACGT CCGGGTT ACAGG A GCAGCGCAACCATTTGCAGGGCAAACT GT CGG AGCTGCAGGTGG AGCAG ACAT CCCTGG AGCCCCT CCAGG AGAGCCCCCGT CCCACAGGT GT CTGGA AGT CCCGGGAGGT AGCCACCG AGGGCAT CCGTGGGCACCGCAAAATGGT CCT CT ACACCCTGCGGGCACCACG AAGCCCCAAG ATGGTGCAAGGGT CT GGCTGCTTTGGGAGG AAG ATGG ACCGG AT CAGCT CCT CCAGTGGCCTGG GCTGCAAAGTGCT G AGGCGGCATT AAG AGG AAGT CCTGGCT GCAG ACAC CTGCTT CT GATT CCACAAGGGGCTTTTT CCT CAACCCT GTGGCCGCCTTT G AAGT G ACT CATTTTTTT AAT GT ATTT ATGT ATTT ATTT GATT GTTTT AT AT AAG AT GGTTT CTT ACCTTT G AGCACAAAATTT CCACGGT G AAAT AAAGT CAACAT TATAAGCTTTA

[SEQ ID No: 11]

Accordingly, preferably NT-proBNP comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 11, or a fragment or variant thereof.

In one embodiment, TNC is provided by gene bank locus ID: HGNC: 5318; Entrez Gene: 3371; Ensembl: ENSG00000041982; OMIM: 187380; and/or UniProtKB: P24821. The protein sequence may be represented by the GeneBank ID P24821, which is provided herein as SEQ ID No: 12, as follows:

MGAMTQLLAGVFLAFLALATEGGVLKKVIRHKRQSGVNATLPEENQPVVFNH

VYNIKLPVGSQCSVDLESASGEKDLAPPSEPSESFQEHTVDGENQIVFTHRINI

PRRACGCAAAPDVKELLSRLEELENLVSSLREQCTAGAGCCLQPATGRLDTR

PFCSGRGNFSTEGCGCVCEPGWKGPNCSEPECPGNCHLRGRCIDGQCICD

DGFTGEDCSQLACPSDCNDQGKCVNGVCICFEGYAGADCSREICPVPCSEE

HGTCVDGLCVCHDGFAGDDCNKPLCLNNCYNRGRCVENECVCDEGFTGED

CSELICPNDCFDRGRCINGTCYCEEGFTGEDCGKPTCPHACHTQGRCEEGQ

CVCDEGFAGVDCSEKRCPADCHNRGRCVDGRCECDDGFTGADCGELKCPN

GCSGHGRCVNGQCVCDEGYTGEDCSQLRCPNDCHSRGRCVEGKCVCEQG

FKGYDCSDMSCPNDCHQHGRCVNGMCVCDDGYTGEDCRDRQCPRDCSNR

GLCVDGQCVCEDGFTGPDCAELSCPNDCHGQGRCVNGQCVCHEGFMGKD

CKEQRCPSDCHGQGRCVDGQCICHEGFTGLDCGQHSCPSDCNNLGQCVSG

RCICNEGYSGEDCSEVSPPKDLVVTEVTEETVNLAWDNEMRVTEYLVVYTPT

HEGGLEMQFRVPGDQTSTIIQELEPGVEYFIRVFAILENKKSIPVSARVATYLPA

PEGLKFKSIKETSVEVEWDPLDIAFETWEIIFRNMNKEDEGEITKSLRRPETSY

RQTGLAPGQEYEISLHIVKNNTRGPGLKRVTTTRLDAPSQIEVKDVTDTTALIT

WFKPLAEIDGIELTYGIKDVPGDRTTIDLTEDENQYSIGNLKPDTEYEVSLISRR

GDMSSNPAKETFTTGLDAPRNLRRVSQTDNSITLEWRNGKAAIDSYRIKYAPIS

GGDHAEVDVPKSQQATTKTTLTGLRPGTEYGIGVSAVKEDKESNPATINAATE

LDTPKDLQVSETAETSLTLLWKTPLAKFDRYRLNYSLPTGQWVGVQLPRNTTS

YVLRGLEPGQEYNVLLTAEKGRHKSKPARVKASTEQAPELENLTVTEVGWDG

LRLNWTAADQAYEHFIIQVQEANKVEAARNLTVPGSLRAVDIPGLKAATPYTVS

IYGVIQGYRTPVLSAEASTGETPNLGEVVVAEVGWDALKLNWTAPEGAYEYFF

IQVQEADTVEAAQNLTVPGGLRSTDLPGLKAATHYTITIRGVTQDFSTTPLSVE

VLTEEVPDMGNLTVTEVSWDALRLNWTTPDGTYDQFTIQVQEADQVEEAHNL

TVPGSLRSMEIPGLRAGTPYTVTLHGEVRGHSTRPLAVEVVTEDLPQLGDLAV

SEVGWDGLRLNWTAADNAYEHFVIQVQEVNKVEAAQNLTLPGSLRAVDIPGL

EAATPYRVSIYGVIRGYRTPVLSAEASTAKEPEIGNLNVSDITPESFNLSWMAT

DGIFETFTIEIIDSNRLLETVEYNISGAERTAHISGLPPSTDFIVYLSGLAPSIRTKT

ISATATTEALPLLENLTISDINPYGFTVSWMASENAFDSFLVTVVDSGKLLDPQE

FTLSGTQRKLELRGLITGIGYEVMVSGFTQGHQTKPLRAEIVTEAEPEVDNLLV

SDATPDGFRLSWTADEGVFDNFVLKIRDTKKQSEPLEITLLAPERTRDITGLRE

ATEYEIELYGISKGRRSQTVSAIATTAMGSPKEVIFSDITENSATVSWRAPTAQ

VESFRITYVPITGGTPSMVTVDGTKTQTRLVKLIPGVEYLVSIIAMKGFEESEPV

SGSFTTALDGPSGLVTANITDSEALARWQPAIATVDSYVISYTGEKVPEITRTV

SGNTVEYALTDLEPATEYTLRIFAEKGPQKSSTITAKFTTDLDSPRDLTATEVQ

SETALLTWRPPRASVTGYLLVYESVDGTVKEVIVGPDTTSYSLADLSPSTHYT

AKIQALNGPLRSNMIQTIFTTIGLLYPFPKDCSQAMLNGDTTSGLYTIYLNGDKA

EALEVFCDMTSDGGGWIVFLRRKNGRENFYQNWKAYAAGFGDRREEFWLGL

DNLNKITAQGQYELRVDLRDHGETAFAVYDKFSVGDAKTRYKLKVEGYSGTA

GDSMAYHNGRSFSTFDKDTDSAITNCALSYKGAFWYRNCHRVNLMGRYGDN

NHSQGVNWFHWKGHEHSIQFAEMKLRPSNFRNLEGRRKRA

[SEQ ID No: 12]

Accordingly, preferably TNC comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 12, or a fragment or variant thereof.

In one embodiment TNC is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 13, as follows:

GGCCACAGCCTGCCT ACT GT CACCCGCCT CT CCCGCGCGCAG AT ACACG CCCCCGCCT CCGTGGGCACAAAGGCAGCGCTGCTGGGG AACT CGGGGG AACGCGCACGTGGG AACCGCCGCAGCT CCACACT CCAGGT ACTT CTT CCA AGG ACCT AGGT CT CT CGCCCAT CGG AAAG AAAAT AATT CTTT CAAG AAG AT CAGGG ACAACT GATTT G AAGT CT ACT CT GTGCTT CT AAAT CCCCAATT CT G CT G AAAGT GAG AT ACCCT AG AGCCCT AGAGCCCCAGCAGCACCCAGCCAA ACCCACCT CCACCATGGGGGCCAT G ACT CAGCT GTTGGCAGGT GT CTTT C TTGCTTT CCTTGCCCT CGCT ACCG AAGGTGGGGT CCT CAAG AAAGT CAT C CGGCACAAGCGACAGAGTGGGGTGAACGCCACCCTGCCAGAAGAGAACC AGCCAGTGGT GTTT AACCACGTTT ACAACAT CAAGCTGCCAGTGGG AT CC CAGT GTT CGGTGG AT CTGG AGT CAGCCAGTGGGG AG AAAG ACCTGGCAC CGCCTT CAG AGCCCAGCG AAAGCTTT CAGG AGCACACAGT GG ATGGGG A AAACCAG ATT GT CTT CACACAT CGCAT CAACAT CCCCCGCCGGGCCT GT G GCT GTGCCGCAGCCCCT GAT GTT AAGG AGCTGCT G AGCAG ACTGG AGG A GCTGG AG AACCTGGT GT CTT CCCT G AGGG AGCAAT GT ACTGCAGG AGCAG GCTGCT GT CT CCAGCCTGCCACAGGCCGCTTGG ACACCAGGCCCTT CT GT AGCGGT CGGGGCAACTT CAGCACT G AAGG AT GTGGCT GT GT CTGCG AAC CTGGCTGGAAAGGCCCCAACTGCT CT GAGCCCG AAT GT CCAGGCAACT GT CACCTT CGAGGCCGGTGCATT GATGGGCAGTGCAT CT GT G ACG ACGGCTT CACGGGCG AGG ACTGCAGCCAGCT GGCTTGCCCCAGCG ACTGCAAT GAC CAGGGCAAGTGCGT AAATGG AGT CTGCAT CT GTTT CG AAGGCT ACGCCGG GGCT G ACTGCAGCCGT G AAAT CTGCCCAGTGCCCTGCAGT G AGG AGCAC GGCACAT GT GT AG ATGGCTT GT GT GT GTGCCACG ATGGCTTTGCAGGCG A T G ACTGCAACAAGCCT CT GT GT CT CAACAATTGCT ACAACCGTGG ACG AT G CGTGG AG AAT G AGTGCGT GT GT GAT G AGGGTTT CACGGGCG AAGACTGC AGT G AGCT CAT CT GCCCCAAT G ACTGCTT CG ACCGGGGCCGCTGCAT CAA TGGCACCTGCT ACTGCG AAG AAGGCTT CACAGGT G AAG ACT GCGGG AAAC CCACCTGCCCACATGCCTGCCACACCCAGGGCCGGT GT G AGG AGGGGCA GT GT GT AT GT GAT G AGGGCTTTGCCGGT GT GG ACTGCAGCG AG AAG AGGT GT CCTGCT G ACT GT CACAAT CGTGGCCGCT GT GT AG ACGGGCGGT GT G A GT GT GAT G ATGGTTT CACTGG AGCT G ACT GTGGGG AGCT CAAGT GT CCCA AT GGCTGCAGTGGCCATGGCCGCT GT GT CAAT GGGCAGT GT GT GT GT GAT G AGGGCT AT ACTGGGG AGG ACTGCAGCCAGCT ACGGTGCCCCAAT G ACT GT CACAGT CGGGGCCGCT GT GT CG AGGGCAAAT GT GT 
AT GT GAGCAAGG CTT CAAGGGCT AT GACTGCAGT G ACAT GAGCTGCCCT AAT GACT GT CACC AGCACGGCCGCT GT GT G AATGGCAT GT GT GTTT GT GAT G ACGGCT ACACA GGGG AAG ACTGCCGGG AT CGCCAATGCCCCAGGG ACTGCAGCAACAGGG GCCT CT GT GTGG ACGG ACAGTGCGT CT GT G AGG ACGGCTT CACCGGCCC T GACT GTGCAG AACT CT CCT GT CCAAAT GACTGCCATGGCCAGGGT CGCT GT GT GAATGGGCAGTGCGT GTGCCAT G AAGG ATTT ATGGGCAAAG ACTGC AAGG AGCAAAGAT GT CCCAGT GACT GT CATGGCCAGGGCCGCTGCGTGG ACGGCCAGTGCAT CTGCCACG AGGGCTT CACAGGCCTGG ACT GT GGCCA GCACT CCTGCCCCAGT GACTGCAACAACTT AGG ACAATGCGT CT CGGGCC GCTGCAT CTGCAACG AGGGCT ACAGCGGAG AAGACTGCT CAGAGGT GT CT CCT CCCAAAG ACCT CGTT GT G ACAG AAGT G ACGG AAGAG ACGGT CAACCT GGCCTGGG ACAAT GAG ATGCGGGT CACAG AGT ACCTT GT CGT GT ACACGC CCACCCACG AGGGTGGT CTGGAAATGCAGTT CCGT GTGCCT GGGG ACCA G ACGT CCACCAT CAT CCAGG AGCTGG AGCCTGGT GTGG AGT ACTTT AT CC GT GT ATTTGCCAT CCTGG AG AACAAG AAG AGCATT CCT GT CAGCGCCAGG GTGGCCACGT ACTT ACCTGCACCT G AAGGCCT G AAATT CAAGT CCAT CAA GGAG ACAT CT GTGG AAGTGG AGTGGG AT CCT CT AG ACATTGCTTTT G AAA CCTGGG AG AT CAT CTT CCGG AAT AT G AAT AAAG AAG AT G AGGG AG AG AT C ACCAAAAGCCT G AGG AGGCCAG AGACCT CTT ACCGGCAAACTGGT CT AGC T CCTGGGCAAG AGT AT GAG AT AT CT CTGCACAT AGT G AAAAACAAT ACCCG GGGCCCTGGCCT G AAG AGGGT G ACCACCACACGCTTGG ATGCCCCCAGC CAG AT CG AGGT GAAAG AT GT CACAG ACACCACTGCCTT GAT CACCTGGTT CAAGCCCCTGGCT GAG AT CG ATGGCATT G AGCT G ACCT ACGGCAT CAAAG ACGTGCCAGG AG ACCGT ACCACCAT CG AT CT CACAG AGG ACG AG AACCAG T ACT CCAT CGGG AACCT G AAGCCT G ACACT G AGT ACG AGGT GT CCCT CAT CT CCCGCAG AGGT GACAT GT CAAGCAACCCAGCCAAAG AG ACCTT CACAA CAGGCCT CG ATGCT CCCAGG AAT CTT CG ACGT GTTT CCCAG ACAG AT AAC AGCAT CACCCTGG AATGG AGG AATGGCAAGGCAGCT ATT G ACAGTT ACAG AATT AAGT ATGCCCCCAT CT CTGG AGGGGACCACGCT G AGGTT GAT GTT C CAAAG AGCCAACAAGCCACAACCAAAACCACACT CACAGGT CT G AGGCCG GGAACT G AAT ATGGG ATTGG AGTTT CTGCT GT G AAGG AAG ACAAGG AG AG CAAT CCAGCGACCAT CAACGCAGCCACAG AGTTGG ACACGCCCAAGG ACC TT CAGGTTT CT G AAACTGCAG AG ACCAGCCT G ACCCTGCT CTGG 
AAG ACA CCGTTGGCCAAATTT G ACCGCT ACCGCCT CAATT ACAGT CT CCCCACAGG CCAGTGGGTGGG AGTGCAGCTT CCAAGAAACACCACTT CCT AT GT CCT G A G AGGCCTGG AACCAGG ACAGG AGT ACAAT GT CCT CCT G ACAGCCG AG AAA GGCAG ACACAAG AGCAAGCCCGCACGT GT G AAGGCAT CCACT G AACAAG CCCCT G AGCTGG AAAACCT CACCGT G ACT G AGGTTGGCTGGG ATGGCCT C AG ACT CAACTGG ACCGCAGCT G ACCAGGCCT AT G AGCACTTT AT CATT CA GGTGCAGG AGGCCAACAAGGTGG AGGCAGCT CGG AACCT CACCGTGCCT GGCAGCCTT CGGGCT GTGG ACAT ACCGGGCCT CAAGGCTGCT ACGCCTT AT ACAGT CT CCAT CTATGGGGT GAT CCAGGGCT AT AG AACACCAGTGCT CT CTGCT G AGGCCT CCACAGGGG AAACT CCCAATTT GGG AGAGGT CGTGGT GGCCG AGGTGGGCTGGG ATGCCCT CAAACT CAACTGG ACT GCT CCAG AA GGGGCCT AT G AGT ACTTTTT CATT CAGGTGCAGG AGGCT G ACACAGT AG A GGCAGCCCAG AACCT CACCGT CCCAGG AGG ACT G AGGT CCACAG ACCT G CCTGGGCT CAAAGCAGCCACT CATT AT ACCAT CACCAT CCGCGGGGT CAC T CAGGACTT CAGCACAACCCCT CT CT CT GTT GAAGT CTT G ACAG AGGAGGT T CCAG AT ATGGG AAACCT CACAGT G ACCG AGGTT AGCTGGG ATGCT CT CA G ACT GAACTGGACCACGCCAGATGG AACCT AT G ACCAGTTT ACT ATT CAG GT CCAGGAGGCT G ACCAGGT GGAAG AGGCT CACAAT CT CACGGTT CCT G GCAGCCTGCGTT CCAT GG AAAT CCCAGGCCT CAGGGCTGGCACT CCTT AC ACAGT CACCCTGCACGGCG AGGT CAGGGGCCACAGCACT CGACCCCTT G CT GT AGAGGT CGT CACAG AGG AT CT CCCACAGCTGGG AG ATTT AGCCGT G T CT G AGGTTGGCTGGG ATGGCCT CAG ACT CAACTGG ACCGCAGCT G ACAA TGCCT AT G AGCACTTT GT CATT CAGGTGCAGG AGGT CAACAAAGTGG AGG CAGCCCAG AACCT CACGTTGCCTGGCAGCCT CAGGGCT GTGGACAT CCC GGGCCT CG AGGCT GCCACGCCTT AT AG AGT CT CCAT CT ATGGGGT GAT CC GGGGCT AT AG AACACCAGT ACT CT CTGCT G AGGCCT CCACAGCCAAAG AA CCT G AAATTGG AAACTT AAAT GTTT CT G ACAT AACT CCCG AG AGCTT CAAT C T CT CCTGGATGGCT ACCG ATGGG AT CTT CG AG ACCTTT ACCATT G AAATT A TT GATT CCAAT AGGTTGCTGG AG ACT GT GG AAT AT AAT AT CT CTGGTGCTG AACG AACTGCCCAT AT CT CAGGGCT ACCCCCT AGT ACT G ATTTT ATT GT CT ACCT CT CTGG ACTTGCT CCCAGCAT CCGG ACCAAAACCAT CAGTGCCACA GCCACG ACAG AGGCCCTGCCCCTT CTGGAAAACCT AACCATTT CCG ACAT T AAT CCCT ACGGGTT CACAGTTT CCTGGAT GGCAT CGG AGAATGCCTTT G A 
CAGCTTT CT AGT AACGGTGGTGG ATT CT GGG AAGCTGCTGG ACCCCCAGG AATT CACACTTT CAGGAACCCAG AGG AAGCTGG AGCTT AG AGGCCT CAT A ACTGGCATTGGCT AT G AGGTT ATGGT CT CTGGCTT CACCCAAGGGCAT CA AACCAAGCCCTT GAGGGCT GAG ATT GTT ACAG AAGCCG AACCGG AAGTT G ACAACCTT CTGGTTT CAG ATGCCACCCCAG ACGGTTT CCGT CT GT CCTGG A CAGCT GAT G AAGGGGT CTT CG ACAATTTT GTT CT CAAAAT CAG AG AT ACCA AAAAGCAGT CT GAGCCACTGG AAAT AACCCT ACTTGCCCCCG AACGT ACC AGGG ACAT AACAGGT CT CAG AG AGGCT ACT G AAT ACG AAATT G AACT CT AT GGAAT AAGCAAAGG AAGGCG AT CCCAG ACAGT CAGTGCT AT AGCAACAAC AGCCATGGGCT CCCCAAAGG AAGT CATTTT CT CAG ACAT CACT G AAAATT C GGCT ACT GT CAGCTGG AGGGCACCCACAGCCCAAGT GG AG AGCTT CCGG ATT ACCT AT GTGCCCATT ACAGG AGGT ACACCCT CCATGGT AACT GTGG AC GGAACCAAG ACT CAG ACCAGGCTGGT G AAACT CAT ACCTGGCGT GG AGT A CCTT GT CAGCAT CAT CGCCAT G AAGGGCTTT G AGG AAAGT G AACCT GT CT CAGGGT CATT CACCACAGCT CTGGATGGCCCAT CTGGCCT GGT G ACAGCC AACAT CACT G ACT CAG AAGCCTTGGCCAGGTGGCAGCCAGCCATTGCCAC T GTGGACAGTT AT GT CAT CT CCT ACACAGGCG AG AAAGTGCCAG AAATT AC ACGCACGGT GT CCGGG AACACAGT GG AGT ATGCT CT G ACCG ACCT CG AG CCTGCCACGG AAT ACACACT GAG AAT CTTTGCAG AG AAAGGGCCCCAG AA G AGCT CAACCAT CACTGCCAAGTT CACAACAG ACCT CG ATT CT CCAAG AG A CTT G ACTGCT ACT G AGGTT CAGT CGG AAACTGCCCT CCTT ACCTGGCG AC CCCCCCGGGCAT CAGT CACCGGTT ACCTGCTGGT CT AT G AAT CAGTGG AT GGCACAGT CAAGGAAGT CATT GTGGGT CCAG AT ACCACCT CCT ACAGCCT GGCAG ACCT G AGCCCAT CCACCCACT ACACAGCCAAG AT CCAGGCACT CA AT GGGCCCCT G AGG AGCAAT AT GAT CCAG ACCAT CTT CACCACAATT GG A CT CCT GT ACCCCTT CCCCAAGG ACTGCT CCCAAGCAATGCT G AATGG AGA CACG ACCT CTGGCCT CT ACACCATTT AT CT G AATGGT GAT AAGGCT G AGGC GCTGG AAGT CTT CT GT GACAT G ACCT CT G ATGGGGGTGG ATGG ATT GT GT TCCTGAGACGCAAAAACGGACGCGAGAACTTCTACCAAAACTGGAAGGCA T ATGCTGCTGG ATTTGGGG ACCGCAG AGAAG AATT CTGGCTTGGGCTGG A CAACCTGAACAAAATCACAGCCCAGGGGCAGTACGAGCTCCGGGTGGAC CTGCGGG ACCATGGGG AG ACAGCCTTT GCT GT CT AT G ACAAGTT CAGCGT GGG AG ATGCCAAG ACT CGCT ACAAGCT GAAGGTGGAGGGGT ACAGTGGG ACAGCAGGT G 
ACT CCATGGCCT ACCACAATGGCAG AT CCTT CT CCACCTTT G ACAAGGACACAG ATT CAGCCAT CACCAACT GTGCT CT GT CCT ACAAAGG GGCTTT CTGGT ACAGG AACT GT CACCGT GT CAACCT G ATGGGG AG AT AT G GGG ACAAT AACCACAGT CAGGGCGTT AACTGGTT CCACTGG AAGGGCCAC

G AACACT CAAT CCAGTTTGCT GAG AT G AAGCT GAG ACCAAGCAACTT CAGA AAT CTT G AAGGCAGGCGCAAACGGGCAT AAATT CCAGGG ACCACTGGGT G AGAGAGGAATAAGGCCCAGAGCGAGGAAAGGATTTTACCAAAGCATCAAT ACAACCAGCCCAACCAT CGGT CCACACCTGGGCATTTGGT G AG AGT CAAA GCT G ACCAT GG AT CCCTGGGGCCAACGGCAACAGCATGGGCCT CACCT C CT CT GT G ATTT CTTT CTTTGCACCAAAGACAT CAGT CT CCAACAT GTTT CT G TTTT GTT GTTT GATT CAGCAAAAAT CT CCCAGT G ACAACAT CGCAAT AGTTT TTT ACTT CT CTT AGGTGGCT CTGGG AATGGG AG AGGGGT AGGAT GT ACAG GGGT AGTTT GTTTT AG AACCAGCCGT ATTTT ACAT GAAGCT GT AT AATT AAT T GT CATT ATTTTT GTT AGCAAAGATT AAAT GT GT CATTGG AAGCCAT CCCTT TTTTT ACATTT CAT ACAACAG AAACCAG AAAAGCAAT ACT GTTT CCATTTT AA GGAT AT GATT AAT ATT ATT AAT AT AAT AAT GAT GAT GAT GAT GAT G AAAACT A AGG ATTTTT CAAG AGAT CTTT CTTT CCAAAACATTT CTGG ACAGT ACCT GAT TGT ATTTTTTTTTT AAAT AAAAGCACAAGT ACTTTT G AGTTT GTT ATTTT G CTT T G AATT GTT G AGT CT G AATTT CACCAAAGCCAAT CATTT G AACAAAGCGGG G AAT GTTGGGAT AGG AAAGGT AAGT AGGG AT AGTGGT CAAGTGGG AGGG GTGG AAAGG AG ACT AAAG ACTGGG AG AG AGGG AAGCACTTTTTTT AAAT AA AGTT G AACACACTTGGG AAAAGCTT ACAGGCCAGGCCT GT AAT CCCAACA CTTTGGG AGGCCAAGGTGGG AGG AT AGCTT AACCCCAGG AGTTT GAG ACC AGCCT G AGCAACAT AGT G AG AACTT GTCTCT ACAG AAAAAAAAAAAAAAAA AAATTT AATT AGGCAAGCGTGGT AGTGCGCACCT GT CGT CCCAGCT ACT CA GGAGGCT G AGGT AGG AAAAT CACTGG AGCCCAGG AGTT AG AGGTT ACAGT G AGCT AT GAT CACACT ACTGCACT CCAGCCTGGGCAACAG AGGG AG ACCC TGTCTCT AAAT AAAAAAAG AAAAG AAAAAAAAAGCTT AC AACTT GAG ATT CA GCAT CTTGCT CAGT ATTT CCAAG ACT AAT AG ATT ATGGTTT AAAAG ATGCTT TT AT ACT CATTTT CT AATGCAACT CCT AGAAACT CT AT GAT AT AGTT G AGGT AAGT ATT GTT ACCACACAT GGGCT AAGAT CCCCAG AGGCAG ACTGCCT G A GTT CAATT CTTGGCT CCACCATT CCCAAGTT CCCT AACCT CT CT ATGCCT CA GTTT CCT CTT CT GT AAAGT AGGG ACACT CAT ACTT CT CATTT CAG AACATTT TT GT G AAG AAT AAATT AT GTT AT CCATTT G AGGCCCTT AG AATGGT ACCCG GTGTAT ATT AAGTGCT AGT ACAT GTT AGCT AT CAT CATT AT CACTTT AT AT G A G ATGG ACTGGGGTT CAT AGAAACCCAAT G ACTT GATT GTGGCT ACT ACT CA AT AAAT 
AAT AG AATTTGG ATTT AAA

[SEQ ID No: 13]

Accordingly, preferably TNC comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 13, or a fragment or variant thereof.

In one embodiment, A2M is provided by gene bank locus ID: HGNC: 7; Entrez Gene: 2; Ensembl: ENSG00000175899; OMIM: 103950; and/or UniProtKB: P01023. The protein sequence may be represented by the GeneBank ID P01023, which is provided herein as SEQ ID No: 14, as follows:

MGKNKLLHPSLVLLLLVLLPTDASVSGKPQYMVLVPSLLHTETTEKGCVLLSYL

NETVTVSASLESVRGNRSLFTDLEAENDVLHCVAFAVPKSSSNEEVMFLTVQV

KGPTQEFKKRTTVMVKNEDSLVFVQTDKSIYKPGQTVKFRVVSMDENFHPLN

ELIPLVYIQDPKGNRIAQWQSFQLEGGLKQFSFPLSSEPFQGSYKVVVQKKSG

GRTEHPFTVEEFVLPKFEVQVTVPKIITILEEEMNVSVCGLYTYGKPVPGHVTV

SICRKYSDASDCHGEDSQAFCEKFSGQLNSHGCFYQQVKTKVFQLKRKEYE

MKLHTEAQIQEEGTVVELTGRQSSEITRTITKLSFVKVDSHFRQGIPFFGQVRL

VDGKGVPIPNKVIFIRGNEANYYSNATTDEHGLVQFSINTTNVMGTSLTVRVNY

KDRSPCYGYQWVSEEHEEAHHTAYLVFSPSKSFVHLEPMSHELPCGHTQTV

QAHYILNGGTLLGLKKLSFYYLIMAKGGIVRTGTHGLLVKQEDMKGHFSISIPVK

SDIAPVARLLIYAVLPTGDVIGDSAKYDVENCLANKVDLSFSPSQSLPASHAHL

RVTAAPQSVCALRAVDQSVLLMKPDAELSASSVYNLLPEKDLTGFPGPLNDQ

DNEDCINRHNVYINGITYTPVSSTNEKDMYSFLEDMGLKAFTNSKIRKPKMCP

QLQQYEMHGPEGLRVGFYESDVMGRGHARLVHVEEPHTETVRKYFPETWIW

DLVVVNSAGVAEVGVTVPDTITEWKAGAFCLSEDAGLGISSTASLRAFQPFFV

ELTMPYSVIRGEAFTLKATVLNYLPKCIRVSVQLEASPAFLAVPVEKEQAPHCI

CANGRQTVSWAVTPKSLGNVNFTVSAEALESQELCGTEVPSVPEHGRKDTVI

KPLLVEPEGLEKETTFNSLLCPSGGEVSEELSLKLPPNVVEESARASVSVLGDI

LGSAMQNTQNLLQMPYGCGEQNMVLFAPNIYVLDYLNETQQLTPEIKSKAIGY

LNTGYQRQLNYKHYDGSYSTFGERYGRNQGNTWLTAFVLKTFAQARAYIFID

EAHITQALIWLSQRQKDNGCFRSSGSLLNNAIKGGVEDEVTLSAYITIALLEIPLT

VTHPVVRNALFCLESAWKTAQEGDHGSHVYTKALLAYAFALAGNQDKRKEVL

KSLNEEAVKKDNSVHWERPQKPKAPVGHFYEPQAPSAEVEMTSYVLLAYLTA

QPAPTSEDLTSATNIVKWITKQQNAQGGFSSTQDTVVALHALSKYGAATFTRT

GKAAQVTIQSSGTFSSKFQVDNNNRLLLQQVSLPELPGEYSMKVTGEGCVYL

QTSLKYNILPEKEEFPFALGVQTLPQTCDEPKAHTSFQISLSVSYTGSRSASNM

AIVDVKMVSGFIPLKPTVKMLERSNHVSRTEVSSNHVLIYLDKVSNQTLSLFFT

VLQDVPVRDLKPAIVKVYDYYETDEFAIAEYNAPCSKDLGNA

[SEQ ID No: 14]

Accordingly, preferably A2M comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 14, or a fragment or variant thereof.

In one embodiment, A2M is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 15, as follows:

G AAAAGCTT ATT AGCTGCT GT ACGGT AAAAGT G AGCT CTT ACGGG AATGG G AAT GT AGTTTT AGCCCT CCAGGG ATT CT ATTT AGCCCGCCAGG AATT AAC CTT G ACT AT AAAT AGGCCAT CAAT G ACCTTT CCAG AG AAT GTT CAG AG ACC T CAACTTT GTTT AGAG AT CTT GT GT GGGTGG AACTT CCT GTTTGCACACAG AGCAGCAT AAAGCCCAGTTGCTTTGGGAAGT GTTTGGG ACCAG ATGG ATT GT AGGG AGT AGGGT ACAAT ACAGT CT GTT CT CCT CCAGCT CCTT CTTT CT G CAACATGGGG AAG AACAAACT CCTT CAT CCAAGT CT GGTT CTT CT CCT CTT GGT CCT CCTGCCCACAG ACGCCT CAGT CT CTGG AAAACCGCAGT AT ATGG TT CTGGT CCCCT CCCTGCT CCACACT G AG ACCACT GAG AAGGGCT GT GT C CTT CT GAGCT ACCT G AAT GAG ACAGT G ACT GT AAGTGCTT CCTTGG AGT CT GT CAGGGG AAACAGG AGCCT CTT CACT GACCTGG AGGCGG AGAAT G ACG T ACT CCACT GT GT CGCCTT CGCT GT CCCAAAGT CTT CAT CCAAT G AGG AGG T AAT GTT CCT CACT GT CCAAGT G AAAGG ACCAACCCAAG AATTT AAG AAGC GGACCACAGT G ATGGTT AAG AACGAGG ACAGT CTGGT CTTT GT CCAG ACA G ACAAAT CAAT CT ACAAACCAGGGCAG ACAGT G AAATTT CGT GTT GT CT CC ATGG AT G AAAACTTT CACCCCCT G AAT G AGTT GATT CCACT AGT AT ACATT C AGG AT CCCAAAGG AAAT CGCAT CGCACAATGGCAG AGTTT CCAGTT AGAG GGTGGCCT CAAGCAATTTT CTTTT CCCCT CT CAT CAG AGCCCTT CCAGGGC TCCTACAAGGTGGTGGTACAGAAGAAATCAGGTGGAAGGACAGAGCACCC TTT CACCGTGG AGG AATTT GTT CTT CCCAAGTTT G AAGT ACAAGT AACAGT GCCAAAG AT AAT CACCAT CTTGG AAG AAG AG AT G AAT GT AT CAGT GTGTGG CCT AT ACACAT ATGGG AAGCCT GT CCCTGG ACAT GT G ACT GT G AGCATTT G CAG AAAGT AT AGT G ACGCTT CCG ACTGCCACGGT GAAG ATT CACAGGCTT T CT GT G AGAAATT CAGT GG ACAGCT AAACAGCCAT GGCTGCTT CT AT CAGC AAGT AAAAACCAAGGT CTT CCAGCT GAAG AGG AAGG AGT AT GAAAT G AAAC TT CACACT G AGGCCCAG AT CCAAG AAG AAGG AACAGTGGTGGAATT G ACT GGAAGGCAGT CCAGT GAAAT CACAAG AACCAT AACCAAACT CT CATTT GT G AAAGTGG ACT CACACTTT CG ACAGGG AATT CCCTT CTTTGGGCAGGTGCG CCT AGT AGATGGG AAAGGCGT CCCT AT ACCAAAT AAAGT CAT ATT CAT CAG AGG AAAT G AAGCAAACT ATT ACT CCAAT GCT ACCACGG AT G AGCATGGCCT T GT ACAGTT CT CT AT CAACACCACCAAT GTT ATGGGT ACCT CT CTT ACT GTT AGGGT CAATT ACAAGG AT CGT AGT CCCT GTT ACGGCT ACCAGTGGGT GT C AG 
AAGAACACG AAG AGGCACAT CACACTGCTT AT CTT GT GTT CT CCCCAAG CAAG AGCTTT GT CCACCTT G AGCCCAT GT CT CAT G AACT ACCCT GTGGCCA T ACT CAG ACAGT CCAGGCACATT AT ATT CT G AATGG AGGCACCCTGCTGG GGCT G AAG AAGCT CT CCTT CT ATT AT CT GAT AATGGCAAAGGGAGGCATT G T CCG AACTGGG ACT CATGG ACTGCTT GT G AAGCAGG AAG ACAT G AAGGGC CATTTTT CCAT CT CAAT CCCT GT G AAGT CAG ACATTGCT CCT GT CGCT CGG TTGCT CAT CT ATGCT GTTTT ACCT ACCGGGG ACGT G ATTGGGG ATT CTGCA AAAT AT GAT GTT G AAAATT GT CT GGCCAACAAGGTGG ATTT G AGCTT CAGC CCAT CACAAAGT CT CCCAGCCT CACACGCCCACCTGCG AGT CACAGCGGC T CCT CAGT CCGT CTGCGCCCT CCGTGCT GT GG ACCAAAGCGTGCTGCT CA T G AAGCCT G ATGCT G AGCT CT CGGCGT CCT CGGTTT ACAACCTGCT ACCA G AAAAGG ACCT CACTGGCTT CCCTGGGCCTTT G AAT GACCAGG ACAAT G A AG ACTGCAT CAAT CGT CAT AAT GT CT AT ATT AATGG AAT CACAT AT ACT CCA GT AT CAAGT ACAAAT G AAAAGG AT AT GT ACAGCTT CCT AG AGG ACAT GGGC TT AAAGGCATT CACCAACT CAAAG ATT CGT AAACCCAAAAT GT GT CCACAG CTT CAACAGT AT G AAATGCATGG ACCT G AAGGT CT ACGT GT AGGTTTTT AT G AGT CAG AT GT AATGGG AAGAGGCCAT GCACGCCTGGTGCAT GTT G AAG A GCCT CACACGG AG ACCGT ACG AAAGT ACTT CCCT GAG ACATGG AT CT GGG ATTTGGTGGTGGT AAACT CAGCAGGT GTGGCT GAGGT AGG AGT AACAGT C CCT G ACACCAT CACCG AGTGG AAGGCAGGGGCCTT CTGCCT GT CT G AAG A TGCTGGACTTGGT AT CT CTT CCACTGCCT CT CT CCG AGCCTT CCAGCCCTT CTTT GTGG AGCT CACAATGCCTT ACT CT GT GATT CGT GG AG AGGCCTT CAC ACT CAAGGCCACGGT CCT AAACT ACCTT CCCAAATGCAT CCGGGT CAGT G TGCAGCTGG AAGCCT CT CCCGCCTT CCT AGCT GT CCCAGTGGAG AAGG AA CAAGCGCCT CACTGCAT CT GTGCAAACGGGCGGCAAACT GT GT CCTGGGC AGT AACCCCAAAGT CATT AGG AAAT GT GAATTT CACT GT GAGCGCAG AGGC ACT AG AGT CT CAAG AGCT GT GTGGG ACT G AGGTGCCTT CAGTT CCT G AAC ACGG AAGG AAAG ACACAGT CAT CAAGCCT CT GTTGGTT G AACCT G AAGG A CT AG AG AAGG AAACAACATT CAACT CCCT ACTTT GT CCAT CAGGT GGT GAG GTTT CT G AAG AATT AT CCCT G AAACTGCCACCAAAT GTGGT AGAAG AAT CT GCCCG AGCTT CT GT CT CAGTTTTGGG AG ACAT ATT AGGCT CTGCCATGCAA AACACACAAAAT CTT CT CCAG ATGCCCT ATGGCT GTGG AG AGCAG AAT AT G GT CCT 
CTTTGCT CCT AACAT CTATGT ACT GG ATT AT CT AAAT G AAACACAGC AGCTT ACT CCAG AGAT CAAGT CCAAGGCCATTGGCT AT CT CAACACTGGTT ACCAG AG ACAGTT GAACT ACAAACACT AT G ATGGCT CCT ACAGCACCTTT G GGG AGCG AT ATGGCAGG AACCAGGGCAACACCTGGCT CACAGCCTTT GTT CT G AAG ACTTTTGCCCAAGCT CG AGCCT ACAT CTT CAT CG AT GAAGCACAC ATT ACCCAAGCCCT CAT ATGGCT CT CCCAG AGGCAG AAGG ACAATGGCT G TTT CAGGAGCT CTGGGT CACTGCT CAACAATGCCAT AAAGGG AGG AGT AG AAG AT G AAGT G ACCCT CT CCGCCT AT AT CACCAT CGCCCTT CTGG AG ATT C CT CT CACAGT CACT CACCCT GTT GT CCGCAATGCCCT GTTTT GCCTGG AGT CAGCCTGG AAG ACAGCACAAG AAGGGGACCATGGCAGCCAT GT AT AT ACC AAAGCACTG CTGG CCT ATGCTTTTGCCCTGGCAGGT AACCAGG ACAAG AG G AAGG AAGT ACT CAAGT CACTT AAT G AGG AAGCT GT G AAG AAAG ACAACT C T GT CCATTGGG AGCGCCCT CAG AAACCCAAGGCACCAGTGGGGCATTTTT ACG AACCCCAGGCT CCCT CTGCT G AGGTGG AG AT GACAT CCT AT GTGCTC CT CGCTT AT CT CACGGCCCAGCCAGCCCCAACCT CGG AGG ACCT G ACCT C TGCAACCAACAT CGT G AAGTGG AT CACG AAGCAGCAGAATGCCCAGGGCG GTTT CT CCT CCACCCAGG ACACAGTGGTGGCT CT CCATGCT CT GT CCAAAT AT GG AGCAGCCACATTT ACCAGGACT GGG AAGGCTGCACAGGT G ACT AT C CAGT CTT CAGGG ACATTTT CCAGCAAATT CCAAGTGG ACAACAACAACCGC CT GTT ACTGCAGCAGGT CT CATTGCCAG AGCTGCCTGGGG AAT ACAGCAT G AAAGT G ACAGGAGAAGG AT GT GT CT ACCT CCAG ACAT CCTT G AAAT ACAA T ATT CT CCCAG AAAAGGAAGAGTT CCCCTTTGCTTT AGGAGTGCAG ACT CT GCCT CAAACTT GT GAT G AACCCAAAGCCCACACCAGCTT CCAAAT CT CCCT AAGT GT CAGTT ACACAGGG AGCCGCT CTGCCT CCAACATGGCG AT CGTT G AT GT G AAGATGGT CT CTGGCTT CATT CCCCT G AAGCCAACAGT G AAAATGC TT G AAAG AT CT AACCAT GT G AGCCGG ACAG AAGT CAGCAGCAACCAT GT C TT G ATTT ACCTT GAT AAGGT GT CAAAT CAG ACACT G AGCTT GTT CTT CACG GTT CTGCAAG AT GT CCCAGT AAG AG AT CT G AAACCAGCCAT AGT G AAAGT C T AT GATT ACT ACG AG ACGG AT G AGTTTGCAATTGCT G AGT ACAATGCT CCT TGCAGCAAAGAT CTTGGAAATGCTT G AAG ACCACAAGGCT G AAAAGTGCTT TGCTGGAGT CCT GTT CT CAG AGCT CCACAG AAGACACGT GTTTTT GT AT CT TT AAAG ACTT GAT G AAT AAACACTTTTT CTGGT CAAAGACCACAAGGCT G AA AAGTGCTTTGCTGGAGT CCT GTT CT CAG AGCT 
CCACAG AAG ACACGT GTTT TT GT AT CTTT AAAG ACTT GAT G AAT AAACACTTTTT CTGGT CAA

[SEQ ID No: 15]

Accordingly, preferably A2M comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 15, or a fragment or variant thereof.
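The nucleotide listing above carries stray spaces from text extraction, so it cannot be compared with a reference record as printed. The following is a minimal sketch, not part of the application, of how such a listing might be normalised before any comparison. The function name `normalize_nucleotides` is hypothetical, and the fragment is a short run copied from the listing as extracted.

```python
import re


def normalize_nucleotides(raw: str) -> str:
    """Strip whitespace from an extracted nucleotide listing and validate it.

    Hypothetical helper: assumes the listing uses only the letters A, C, G, T.
    """
    seq = re.sub(r"\s+", "", raw).upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq


# Short run copied from the listing, with the extraction spaces left in.
fragment = "AAGAACACG AAG AGGCACAT"
print(normalize_nucleotides(fragment))
```

Raising on unexpected characters, rather than silently dropping them, keeps a damaged listing from being mistaken for a clean one.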

In one embodiment, IGFBP2 is provided by gene bank locus ID HGNC: 5471; Entrez Gene: 3485; Ensembl: ENSG00000115457; OMIM: 146731; and/or UniProtKB: P18065.

The protein sequence may be represented by the GeneBank ID P18065, which is provided herein as SEQ ID No: 16, as follows:

MLPRVGCPALPLPPPPLLPLLLLLLGASGGGGGARAEVLFRCPPCTPERLAAC

GPPPVAPPAAVAAVAGGARMPCAELVREPGCGCCSVCARLEGEACGVYTPR

CGQGLRCYPHPGSELPLQALVMGEGTCEKRRDAEYGASPEQVADNGDDHS

EGGLVENHVDSTMNMLGGGGSAGRKPLKSGMKELAVFREKVTEQHRQMGK

GGKHHLGLEEPKKLRPPPARTPCQQELDQVLERISTMRLPDERGPLEHLYSL

HIPNCDKHGLYNLKQCKMSLNGQRGECWCVNPNTGKLIQGAPTIRGDPECHL

FYNEQQEARGVHTQRMQ

[SEQ ID No: 16]

Accordingly, preferably IGFBP2 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 16 or a fragment or variant thereof.

In one embodiment IGFBP2 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 17, as follows: G AGG AAG AAGCGG AGG AGGCGGCT CCCGCGCT CGCAGGGCCGTGCCAC CTGCCCGCCCGCCCGCT CGCT CGCT CGCCCGCCGCGCCGCGCTGCCG A CCGCCAGCATGCT GCCG AG AGTGGGCTGCCCCGCGCTGCCGCTGCCGC CGCCGCCGCTGCTGCCGCTGCTGCTGCTGCTACTGGGCGCGAGTGGCGG CGGCGGCGGGGCGCGCGCGGAGGTGCTGTTCCGCTGCCCGCCCTGCAC ACCCGAGCGCCTGGCCGCCTGCGGGCCCCCGCCGGTTGCGCCGCCCGC CGCGGTGGCCGCAGTGGCCGGAGGCGCCCGCATGCCATGCGCGGAGCT CGTCCGGGAGCCGGGCTGCGGCTGCTGCTCGGTGTGCGCCCGGCTGGA GGGCGAGGCGTGCGGCGTCTACACCCCGCGCTGCGGCCAGGGGCTGCG CTGCT AT CCCCACCCGGGCT CCG AGCTGCCCCTGCAGGCGCTGGT CAT G GGCGAGGGCACTTGTGAGAAGCGCCGGGACGCCGAGTATGGCGCCAGC CCGG AGCAGGTTGCAG ACAATGGCG AT GACCACT CAG AAGG AGGCCTGG TGG AG AACCACGTGG ACAGCACCAT G AACAT GTTGGGCGGGGG AGGCAG TGCTGGCCGGAAGCCCCT CAAGT CGGGT AT GAAGG AGCTGGCCGT GTT C CGGG AG AAGGT CACT G AGCAGCACCGGCAG ATGGGCAAGGGTGGCAAGC AT CACCTTGGCCT GG AGG AGCCCAAG AAGCTGCG ACCACCCCCTGCCAG G ACT CCCTGCCAACAGG AACTGGACCAGGT CCTGG AGCGG AT CT CCACCA TGCGCCTT CCGGAT G AGCGGGGCCCT CTGG AGCACCT CT ACT CCCTGCA CAT CCCCAACT GT GACAAGCATGGCCT GT ACAACCT CAAACAGTGCAAG AT GT CT CT G AACGGGCAGCGTGGGGAGTGCTGGT GT GT G AACCCCAACACC GGG AAGCT GAT CCAGGG AGCCCCCACCAT CCGGGGGG ACCCCG AGT GT C AT CT CTT CT ACAAT GAGCAGCAGGAGGCT CGCGGGGTGCACACCCAGCG G ATGCAGT AGACCGCAGCCAGCCGGTGCCTGGCGCCCCTGCCCCCCGCC CCT CT CCAAACACCGGCAG AAAACGG AGAGTGCTT GGGTGGTGGGTGCT GGAGG ATTTT CCAGTT CT G ACACACGT ATTT AT ATTTGG AAAG AG ACCAGC ACCG AGCT CGGCACCT CCCCGGCCT CT CT CTT CCCAGCTGCAG ATGCCAC ACCTGCT CCTT CTT GCTTT CCCCGGGGG AGG AAGGGGGTT GTGGT CGGG G AGCTGGGGT ACAGGTTTGGGG AGGGGG AAG AG AAATTTTT ATTTTT G AA CCCCT GT GT CCCTTTTGCAT AAG ATT AAAGG AAGGAAAAGT AAA

[SEQ ID No: 17]

Accordingly, preferably IGFBP2 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 17, or a fragment or variant thereof.
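As an illustration of how the nucleotide and protein listings relate, the sketch below translates the first codons of the coding region in SEQ ID No: 17 (starting at the ATG) and compares them with the opening residues of SEQ ID No: 16. It is not part of the application. It assumes the standard genetic code, and the names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, ignoring whitespace, up to the first stop."""
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# First 21 bases after the start codon position, as extracted (spaces included).
print(translate("ATGCT GCCG AG AGTGGGCTGC"))  # prints MLPRVGC
```

The output matches the first seven residues of SEQ ID No: 16 (MLPRVGC).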

In one embodiment, APOB is provided by gene bank locus ID: HGNC: 603; Entrez Gene: 338; Ensembl: ENSG00000084674; OMIM: 107730; and/or UniProtKB: P04114. The protein sequence may be represented by the GeneBank ID P04114, which is provided herein as SEQ ID No: 18, as follows:

MDPPRPALLALLALPALLLLLLAGARAEEEMLENVSLVCPKDATRFKHLRKYTY

NYEAESSSGVPGTADSRSATRINCKVELEVPQLCSFILKTSQCTLKEVYGFNP

EGKALLKKTKNSEEFAAAMSRYELKLAIPEGKQVFLYPEKDEPTYILNIKRGIISA

LLVPPETEEAKQVLFLDTVYGNCSTHFTVKTRKGNVATEISTERDLGQCDRFK

PIRTGISPLALIKGMTRPLSTLISSSQSCQYTLDAKRKHVAEAICKEQHLFLPFS

YKNKYGMVAQVTQTLKLEDTPKINSRFFGEGTKKMGLAFESTKSTSPPKQAE

AVLKTLQELKKLTISEQNIQRANLFNKLVTELRGLSDEAVTSLLPQLIEVSSPITL

QALVQCGQPQCSTHILQWLKRVHANPLLIDVVTYLVALIPEPSAQQLREIFNMA

RDQRSRATLYALSHAVNNYHKTNPTGTQELLDIANYLMEQIQDDCTGDEDYTY

LILRVIGNMGQTMEQLTPELKSSILKCVQSTKPSLMIQKAAIQALRKMEPKDKD

QEVLLQTFLDDASPGDKRLAAYLMLMRSPSQADINKIVQILPWEQNEQVKNFV

ASHIANILNSEELDIQDLKKLVKEALKESQLPTVMDFRKFSRNYQLYKSVSLPSL

DPASAKIEGNLIFDPNNYLPKESMLKTTLTAFGFASADLIEIGLEGKGFEPTLEA

LFGKQGFFPDSVNKALYWVNGQVPDGVSKVLVDHFGYTKDDKHEQDMVNGI

MLSVEKLIKDLKSKEVPEARAYLRILGEELGFASLHDLQLLGKLLLMGARTLQGI

PQMIGEVIRKGSKNDFFLHYIFMENAFELPTGAGLQLQISSSGVIAPGAKAGVK

LEVANMQAELVAKPSVSVEFVTNMGIIIPDFARSGVQMNTNFFHESGLEAHVA

LKAGKLKFIIPSPKRPVKLLSGGNTLHLVSTTKTEVIPPLIENRQSWSVCKQVFP

GLNYCTSGAYSNASSTDSASYYPLTGDTRLELELRPTGEIEQYSVSATYELQR

EDRALVDTLKFVTQAEGAKQTEATMTFKYNRQSMTLSSEVQIPDFDVDLGTIL

RVNDESTEGKTSYRLTLDIQNKKITEVALMGHLSCDTKEERKIKGVISIPRLQAE

ARSEILAHWSPAKLLLQMDSSATAYGSTVSKRVAWHYDEEKIEFEWNTGTNV

DTKKMTSNFPVDLSDYPKSLHMYANRLLDHRVPQTDMTFRHVGSKLIVAMSS

WLQKASGSLPYTQTLQDHLNSLKEFNLQNMGLPDFHIPENLFLKSDGRVKYTL

NKNSLKIEIPLPFGGKSSRDLKMLETVRTPALHFKSVGFHLPSREFQVPTFTIPK

LYQLQVPLLGVLDLSTNVYSNLYNWSASYSGGNTSTDHFSLRARYHMKADSV

VDLLSYNVQGSGETTYDHKNTFTLSCDGSLRHKFLDSNIKFSHVEKLGNNPVS

KGLLIFDASSSWGPQMSASVHLDSKKKQHLFVKEVKIDGQFRVSSFYAKGTY

GLSCQRDPNTGRLNGESNLRFNSSYLQGTNQITGRYEDGTLSLTSTSDLQSGI

IKNTASLKYENYELTLKSDTNGKYKNFATSNKMDMTFSKQNALLRSEYQADYE

SLRFFSLLSGSLNSHGLELNADILGTDKINSGAHKATLRIGQDGISTSATTNLKC

SLLVLENELNAELGLSGASMKLTTNGRFREHNAKFSLDGKAALTELSLGSAYQ

AMILGVDSKNIFNFKVSQEGLKLSNDMMGSYAEMKFDHTNSLNIAGLSLDFSS

KLDNIYSSDKFYKQTVNLQLQPYSLVTTLNSDLKYNALDLTNNGKLRLEPLKLH

VAGNLKGAYQNNEIKHIYAISSAALSASYKADTVAKVQGVEFSHRLNTDIAGLA

SAIDMSTNYNSDSLHFSNVFRSVMAPFTMTIDAHTNGNGKLALWGEHTGQLY

SKFLLKAEPLAFTFSHDYKGSTSHHLVSRKSISAALEHKVSALLTPAEQTGTWK

LKTQFNNNEYSQDLDAYNTKDKIGVELTGRTLADLTLLDSPIKVPLLLSEPINIID

ALEMRDAVEKPQEFTIVAFVKYDKNQDVHSINLPFFETLQEYFERNRQTIIVVLE

NVQRNLKHINIDQFVRKYRAALGKLPQQANDYLNSFNWERQVSHAKEKLTALT

KKYRITENDIQIALDDAKINFNEKLSQLQTYMIQFDQYIKDSYDLHDLKIAIANIID

EIIEKLKSLDEHYHIRVNLVKTIHDLHLFIENIDFNKSGSSTASWIQNVDTKYQIRI

QIQEKLQQLKRHIQNIDIQHLAGKLKQHIEAIDVRVLLDQLGTTISFERINDILEHV

KHFVINLIGDFEVAEKINAFRAKVHELIERYEVDQQIQVLMDKLVELAHQYKLKE

TIQKLSNVLQQVKIKDYFEKLVGFIDDAVKKLNELSFKTFIEDVNKFLDMLIKKLK

SFDYHQFVDETNDKIREVTQRLNGEIQALELPQKAEALKLFLEETKATVAVYLE

SLQDTKITLIINWLQEALSSASLAHMKAKFRETLEDTRDRMYQMDIQQELQRYL

SLVGQVYSTLVTYISDWWTLAAKNLTDFAEQYSIQDWAKRMKALVEQGFTVP

EIKTILGTMPAFEVSLQALQKATFQTPDFIVPLTDLRIPSVQINFKDLKNIKIPSRF

STPEFTILNTFHIPSFTIDFVEMKVKIIRTIDQMLNSELQWPVPDIYLRDLKVEDIP

LARITLPDFRLPEIAIPEFIIPTLNLNDFQVPDLHIPEFQLPHISHTIEVPTFGKLYSI

LKIQSPLFTLDANADIGNGTTSANEAGIAASITAKGESKLEVLNFDFQANAQLSN

PKINPLALKESVKFSSKYLRTEHGSEMLFFGNAIEGKSNTVASLHTEKNTLELS

NGVIVKINNQLTLDSNTKYFHKLNIPKLDFSSQADLRNEIKTLLKAGHIAWTSSG

KGSWKWACPRFSDEGTHESQISFTIEGPLTSFGLSNKINSKHLRVNQNLVYES

GSLNFSKLEIQSQVDSQHVGHSVLTAKGMALFGEGKAEFTGRHDAHLNGKVI

GTLKNSLFFSAQPFEITASTNNEGNLKVRFPLRLTGKIDFLNNYALFLSPSAQQ

ASWQVSARFNQYKYNQNFSAGNNENIMEAHVGINGEANLDFLNIPLTIPEMRL

PYTIITTPPLKDFSLWEKTGLKEFLKTTKQSFDLSVKAQYKKNKHRHSITNPLAV

LCEFISQSIKSFDRHFEKNRNNALDFVTKSYNETKIKFDKYKAEKSHDELPRTF

QIPGYTVPVVNVEVSPFTIEMSAFGYVFPKAVSMPSFSILGSDVRVPSYTLILPS

LELPVLHVPRNLKLSLPDFKELCTISHIFIPAMGNITYDFSFKSSVITLNTNAELF

NQSDIVAHLLSSSSSVIDALQYKLEGTTRLTRKRGLKLATALSLSNKFVEGSHN

STVSLTTKNMEVSVATTTKAQIPILRMNFKQELNGNTKSKPTVSSSMEFKYDF

NSSMLYSTAKGAVDHKLSLESLTSYFSIESSTKGDVKGSVLSREYSGTIASEAN

TYLNSKSTRSSVKLQGTSKIDDIWNLEVKENFAGEATLQRIYSLWEHSTKNHL

QLEGLFFTNGEHTSKATLELSPWQMSALVQVHASQPSSFHDFPDLGQEVALN

ANTKNQKIRWKNEVRIHSGSFQSQVELSNDQEKAHLDIAGSLEGHLRFLKNIIL

PVYDKSLWDFLKLDVTTSIGRRQHLRVSTAFVYTKNPNGYSFSIPVKVLADKFII

PGLKLNDLNSVLVMPTFHVPFTDLQVPSCKLDFREIQIYKKLRTSSFALNLPTLP

EVKFPEVDVLTKYSQPEDSLIPFFEITVPESQLTVSQFTLPKSVSDGIAALDLNA

VANKIADFELPTIIVPEQTIEIPSIKFSVPAGIVIPSFQALTARFEVDSPVYNATWS

ASLKNKADYVETVLDSTCSSTVQFLEYELNVLGTHKIEDGTLASKTKGTFAHR

DFSAEYEEDGKYEGLQEWEGKAHLNIKSPAFTDLHLRYQKDKKGISTSAASPA

VGTVGMDMDEDDDFSKWNFYYSPQSSPDKKLTIFKTELRVRESDEETQIKVN

WEEEAASGLLTSLKDNVPKATGVLYDYVNKYHWEHTGLTLREVSSKLRRNLQ

NNAEWVYQGAIRQIDDIDVRFQKAASGTTGTYQEWKDKAQNLYQELLTQEGQ

ASFQGLKDNVFDGLVRVTQEFHMKVKHLIDSLIDFLNFPRFQFPGKPGIYTREE

LCTMFIREVGTVLSQVYSKVHNGSEILFSYFQDLVITLPFELRKHKLIDVISMYR

ELLKDLSKEAQEVFKAIQSLKTTEVLRNLQDLLQFIFQLIEDNIKQLKEMKFTYLI

NYIQDEINTIFSDYIPYVFKLLKENLCLNLHKFNEFIQNELQEASQELQQIHQYIM

ALREEYFDPSIVGWTVKYYELEEKIVSLIKNLLVALKDFHSEYIVSASNFTSQLS

SQVEQFLHRNIQEYLSILTDPDGKGKEKIAELSATAQEIIKSQAIATKKIISDYHQ

QFRYKLQDFSDQLSDYYEKFIAESKRLIDLSIQNYHTFLIYITELLKKLQSTTVMN

PYMKLAPGELTIIL

[SEQ ID No: 18]

Accordingly, preferably APOB comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 18, or a fragment or variant thereof.

In one embodiment APOB is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 19, as follows:

ATT CCCACCGGG ACCTGCGGGGCT G AGTGCCCTT CT CGGTT GCTGCCGC TGAGGAGCCCGCCCAGCCAGCCAGGGCCGCGAGGCCGAGGCCAGGCCG CAGCCCAGGAGCCGCCCCACCGCAGCTGGCGATGGACCCGCCGAGGCC CGCGCTGCTGGCGCTGCTGGCGCTGCCTGCGCTGCTGCTGCTGCTGCTG GCGGGCGCCAGGGCCG AAG AGGAAATGCTGG AAAAT GT CAGCCTGGT CT GT CCAAAAG ATGCGACCCG ATT CAAGCACCT CCGG AAGT ACACAT ACAAC T AT G AGGCT G AG AGTT CCAGTGG AGT CCCTGGGACTGCT GATT CAAG AAG TGCCACCAGG AT CAACTGCAAGGTT G AGCTGG AGGTT CCCCAGCT CTGCA GCTT CAT CCT GAAG ACCAGCCAGTGCACCCT G AAAG AGGT GT ATGGCTT C AACCCT G AGGGCAAAGCCTTGCT G AAG AAAACCAAG AACT CT G AGG AGTT TGCTGCAGCCAT GT CCAGGT AT G AGCT CAAGCTGGCCATT CCAG AAGGG A AGCAGGTTTT CCTTT ACCCGG AG AAAG AT G AACCT ACTT ACAT CCT G AACA T CAAG AGGGGCAT CATTT CTGCCCT CCTGGTT CCCCCAG AG ACAG AAGAA GCCAAGCAAGT GTT GTTT CTGG AT ACCGT GT ATGG AAACTGCT CCACT CAC TTT ACCGT CAAG ACG AGG AAGGGCAAT GTGGCAACAG AAAT AT CCACT G A AAG AGACCTGGGGCAGT GT GAT CGCTT CAAGCCCAT CCGCACAGGCAT CA GCCCACTTGCT CT CAT CAAAGGCAT GACCCGCCCCTT GT CAACT CT GAT CA GCAGCAGCCAGT CCT GT CAGT ACACACTGG ACGCT AAG AGG AAGCAT GT G GCAG AAGCCAT CTGCAAGG AGCAACACCT CTT CCTGCCTTT CT CCT ACAAG AAT AAGT ATGGG ATGGT AGCACAAGT G ACACAG ACTTT G AAACTT GAAG AC ACACCAAAG AT CAACAGCCGCTT CTTT GGT G AAGGT ACT AAGAAG ATGGG CCT CGCATTT GAG AGCACCAAAT CCACAT CACCT CCAAAGCAGGCCG AAG CT GTTTT GAAG ACT CT CCAGG AACT G AAAAAACT AACCAT CT CT G AGCAAA AT AT CCAG AG AGCT AAT CT CTT CAAT AAGCTGGTT ACT G AGCT G AG AGGCC T CAGT GAT G AAGCAGT CACAT CT CT CTTGCCACAGCT GATT G AGGT GT CCA GCCCCAT CACTTT ACAAGCCTTGGTT CAGT GTGG ACAGCCT CAGTGCT CC ACT CACAT CCT CCAGTGGCT GAAACGT GTGCATGCCAACCCCCTT CT GAT A GAT GTGGT CACCT ACCTGGTGGCCCT GAT CCCCG AGCCCT CAGCACAGCA GCTGCG AG AG AT CTT CAACAT GGCG AGGG AT CAGCGCAGCCG AGCCACC TT GT ATGCGCT GAGCCACGCGGT CAACAACT AT CAT AAG ACAAACCCT ACA GGG ACCCAGG AGCTGCTGGACATTGCT AATT ACCT G ATGG AACAG ATT CA AG AT G ACTGCACT GGGG AT G AAG ATT ACACCT ATTT GATT CT GCGGGT CAT TGG AAAT ATGGGCCAAACCATGG AGCAGTT AACT CCAG AACT CAAGT CTT C AAT CCT G AAAT GT GT 
CCAAAGT ACAAAGCCAT CACT GAT GAT CCAG AAAGC TGCCAT CCAGGCT CTGCGGAAAATGG AGCCT AAAG ACAAGG ACCAGG AG GTT CTT CTT CAG ACTTT CCTT GAT G ATGCTT CT CCGGG AG AT AAGCG ACT G GCTGCCT AT CTT AT GTT GAT GAGG AGT CCTT CACAGGCAG AT ATT AACAAA ATT GT CCAAATT CT ACCATGGG AACAG AAT G AGCAAGT GAAG AACTTT GT G

GCTT CCCAT ATTGCCAAT AT CTT G AACT CAG AAGAATTGG AT AT CCAAG A

[SEQ ID No: 19]

Accordingly, preferably APOB comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 19, or a fragment or variant thereof.

In one embodiment, SEPP1 is provided by gene bank locus ID: HGNC: 10751; Entrez Gene: 6414; Ensembl: ENSG00000250722; OMIM: 601484; and/or UniProtKB: P49908. The protein sequence may be represented by the GeneBank ID P49908, which is provided herein as SEQ ID No: 20, as follows:

MWRSLGLALALCLLPSGGTESQDQSSLCKQPPAWSIRDQDPMLNSNGSVTV

VALLQASUY

LCILQASKLEDLRVKLKKEGYSNISYIVVNHQGISSRLKYTHLKNKVSEHIPVYQ

QEENQ

TDVWTLLNGSKDDFLIYDRCGRLVYHLGLPFSFLTFPYVEEAIKIAYCEKKCGN

CSLTTL

KDEDFCKRVSLATVDKTVETPSPHYHHEHHHNHGHQHLGSSELSENQQPGA

PNAPTHPAP

PGLHHHHKHKGQHRQGHPENRDMPASEDLQDLQKKLCRKRCINQLLCKLPT

DSELAPRSU

CCHCRHLIFEKTGSAITUQCKENLPSLCSUQGLRAEENITESCQURLPPAAUQI

SQQLIP

TEASASURUKNQAKKUEUPSN

[SEQ ID No: 20]

Accordingly, preferably SEPP1 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 20, or a fragment or variant thereof.
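SEQ ID No: 20 contains the letter U, the one-letter code for selenocysteine, which validators that accept only the 20 standard amino acids will reject. The sketch below, which is not part of the application, shows one way to validate such a listing and count its selenocysteine residues. The names `PROTEIN_ALPHABET`, `validate_protein` and `count_residue` are hypothetical, and the fragment is the final line of the listing above.

```python
# The 20 standard one-letter amino acid codes plus U (selenocysteine).
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY") | {"U"}


def validate_protein(raw: str) -> str:
    """Strip whitespace and check every residue against the extended alphabet."""
    seq = "".join(raw.split()).upper()
    invalid = set(seq) - PROTEIN_ALPHABET
    if invalid:
        raise ValueError(f"unexpected residues: {sorted(invalid)}")
    return seq


def count_residue(raw: str, residue: str) -> int:
    """Count occurrences of one residue in a validated sequence."""
    return validate_protein(raw).count(residue.upper())


# Final line of the SEQ ID No: 20 listing.
tail = "TEASASURUKNQAKKUEUPSN"
print(count_residue(tail, "U"))  # prints 4
```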

In one embodiment SEPP1 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 21 as follows:

ACAG ACAGGCAGGTGCAGG AG ACGGGGT G AGCGCTTTTGGGCT CT AGCC CC AT G G C AG CAT CTAGGGGTGTGTT AC AATT AAT G CT CTTTT AG C AGTT G C T GTT CGCGG ATGGCT AAGT GTT AAACCAGCT CAGTGG AG AGT CAGGGTGG CAGCCTTTT ACGCCCT GT CCT CTTGGT ACCCGGGT CTTT GT CT AGCAT CCA GGAAG AAT CAGGT CCT ACAGACTT G AAGG ATGGT G AAT ACAGAG ATTTT AT T G AAT G ATGG AGG AGGCTTT CAGCAGG ATGGATGGGG AGCT GG AAAGCG G ATGG AGT G AG AAGAT AAT CTT CCCCTGGGGTTTGGCCAT CCCATGGCT G AT CT CT CT GACAGT CCCCAGCCG AACT CCT CT CG AT GTT CAG ACACT CTTT CT CCT CT CT CCT CT GCCACACTGCT CTGCTGCT CTT CTGCT CATGG AGCCT G AGG ACAACCCCAGCAAT GTGG AGAAGCCTGGGGCTTGCCCTGGCT CT C T GT CT CCT CCCAT CGGG AGG AACAG AG AGCCAGG ACCAAAGCT CCTT AT G T AAGCAACCCCCAGCCTGG AGCAT AAG AG AT CAAG AT CCAATGCT AAACT C CAATGGTT CAGT G ACT GTGGTTGCT CTT CTT CAAGCCAGCT GAT ACCT GT G CAT ACTGCAGGCAT CT AAATT AG AAG ACCTGCG AGT AAAACT GAAG AAAG A AGG AT ATT CT AAT ATTT CTT AT ATT GTT GTT AAT CAT CAAGG AAT CT CTT CT C GATT AAAAT ACACACAT CTT AAG AAT AAGGTTT CAG AGCAT ATT CCT GTTT A T CAACAAG AAG AAAACCAAACAG AT GT CTGG ACT CTTTT AAATGG AAGCAA AG AT G ACTT CCT CAT AT AT GAT AG AT GTGGCCGT CTT GT AT AT CAT CTTGGT TTGCCTTTTT CCTT CCT AACTTT CCCAT AT GT AG AAGAAGCCATT AAG ATT G CTT ACT GT G AAAAGAAAT GTGG AAACTGCT CT CT CACG ACT CT CAAAG AT G AAG ACTTTT GT AAACGT GT AT CTTTGGCT ACT GTGG AT AAAACAGTT GAAAC T CCAT CGCCT CATT ACCAT CAT G AGCAT CAT CACAAT CATGG ACAT CAGCA CCTTGGCAGCAGT G AGCTTT CAGAG AAT CAGCAACCAGG AGCACCAAAT G CT CCT ACT CAT CCT GCT CCT CCAGGCCTT CAT CACCACCAT AAGCACAAGG GTCAGCATAGGCAGGGTCACCCAGAGAACCGAGATATGCCAGCAAGTGAA G ATTT ACAAG ATTT ACAAAAG AAGCT CTGT CG AAAG AG AT GT AT AAAT CAAT T ACT CT GT AAATTGCCCACAG ATT CAG AGTTGGCT CCT AGG AGCT G ATGCT GCCATT GT CG ACAT CT GAT ATTT GAAAAAACAGGGT CTGCAAT CACCT G AC AGT GT AAAG AAAACCT CCCAT CTTT AT GT AGCT G ACAGGGACTT CGGGCAG AGG AG AACAT AACT G AAT CTT GT CAGT GACGTTTGCCT CCAGCTGCCT G AC AAAT AAGT CAGCAGCTT AT ACCCACAG AAGCCAGTGCCAGTT GACGCT G A AAG AAT CAGGCAAAAAAGT GAG AAT G ACCTT CAAACT 
AAAT ATTT AAAAT AG G ACAT ACT CCCCAATTT AGT CT AGACACAATTT CATTT CCAGCATTTTT AT A AACT ACCAAATT AGT G AACCAAAAAT AG AAATT AG ATTT GTGCAAACATGG A G AAAT CT ACT G AATTGGCTT CCAG ATTTT AAATTTT AT GT CAT AGAAAT ATT G ACT CAAACCAT ATTTTTT AT G ATGG AGCAACT G AAAGGT G ATTGCAGCTTTT GGTT AAT AT GT CTTTTTTTTT CTTTTT CCAGT GTT CT ATTTGCTTT AAT GAGA AT AG AAACGT AAACT AT G ACCT AGGGGTTT CT GTTGG AT AATT AGCAGTTT A G AATGG AGG AAG AACAACAAAG ACATGCTTT CCATTTTTTT CTTT ACTT AT C T CT CAAAACAAT ATT ACTTT GT CTTTT CAAT CTT CT ACTTTT AACT AAT AAAAT AAGTGG ATTTT GT ATTTT AAG AT CCAG AAAT ACTT AACACGT G AAT ATTTT G CTAAAAAAGCAT AT AT AACT ATTTT AAAT ATCCATTTATCTTTTGTATATCTAA G ACT CAT CCT G ATTTTT ACT AT CACACAT G AAT AAAGCCTTT GT AT CTT T [SEQ ID No: 21 ]

Accordingly, preferably SEPP1 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 21, or a fragment or variant thereof.

In one embodiment, TFF3 is provided by gene bank locus ID: HGNC: 11757; Entrez Gene: 7033; Ensembl: ENSG00000160180; OMIM: 600633; and/or UniProtKB: Q07654. The protein sequence may be represented by the GeneBank ID Q07654, which is provided herein as SEQ ID No: 22, as follows:

MKRVLSCVPEPTVVMAARALCMLGLVLALLSSSSAEEYVGLSANQCAVPAKD

RVDCGYPH

VTPKECNNRGCCFDSRIPGVPWCFKPLQEAECTF

[SEQ ID No: 22]

Accordingly, preferably TFF3 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 22, or a fragment or variant thereof.

In one embodiment TFF3 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 23, as follows:

CGCT CCCCAGT AG AGG ACCCGG AACCAGAACTGG AAT CCGCCCTT ACCG CTTGCTGCCAAAACAGTGGGGGCT G AACT G ACCT CT CCCCTTTGGG AG AG AAAAACT GT CTGGGAGCTT G ACAAAGGCATGCAGG AG AG AACAGG AGCAG CCACAGCCAGG AGGG AG AGCCTT CCCCAAGCAAACAAT CCAGAGCAGCT GTGCAAACAACGGTGCAT AAAT G AGGCCT CCTGG ACCAT G AAGCG AGT CC T G AGCTGCGT CCCGG AGCCCACGGTGGT CATGGCTGCCAG AGCGCT CT G CATGCTGGGGCTGGT CCTGGCCTTGCT GT CCT CCAGCT CTGCT G AGG AGT ACGT GGGCCT GT CTGCAAACCAGT GTGCCGTGCCAGCCAAGG ACAGGGT GGACTGCGGCTACCCCCATGTCACCCCCAAGGAGTGCAACAACCGGGGC TGCTGCTTT G ACT CCAGG AT CCCTGG AGTGCCTTGGT GTTT CAAGCCCCT GCAGG AAGCAG AATGCACCTT CT G AGGCACCT CCAGCTGCCCCCGGCCG GGGG ATGCG AGGCT CGG AGCACCCTTGCCCGGCT GT G ATTGCTGCCAGG CACT GTT CAT CT CAGCTTTT CT GT CCCTTTGCT CCCGGCAAGCGCTT CTGC T G AAAGTT CAT AT CTGG AGCCT GAT GT CTT AACG AAT AAAGGT CCCATGCT CCACCCG AGG ACAGTT CTT CGTGCCT GAG ACTTT CT G AGGTT GTGCTTT AT TTCTGCTGCGTCGTGGGAGAGGGCGGGAGGGTGTCAGGGGAGAGTCTGC CCAGGCCT CAAGGGCAGG AAAAG ACT CCCT AAGG AGCTGCAGTGCATGC AAGG AT ATTTT GAAT CCAG ACTGGCACCCACGT CACAGG AAAGCCT AGG A ACACT GT AAGTGCCGCTT CCT CGGG AAAGCAG AAAAAAT ACATTT CAGGT A G AAGTTTT CAAAAAT CACAAGT CTTT CTTGGT GAAG ACAGCAAGCCAAT AA AACT GT CTT CCAAAGTGGT CCTTT ATTT CACAACCACT CT CGCT ACT GTT CA AT ACTT GT ACT ATT CCTGGGTTTT GTTT CTTT GT ACAGT AAACATT AT G AACA AACAGGCA

[SEQ ID No: 23]

Accordingly, preferably TFF3 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 23, or a fragment or variant thereof.

In one embodiment, IL6 is provided by gene bank locus ID: HGNC: 6018; Entrez Gene: 3569; Ensembl: ENSG00000136244; OMIM: 147620; and/or UniProtKB: P05231. The protein sequence may be represented by the GeneBank ID P05231, which is provided herein as SEQ ID No: 24, as follows:

MNSFSTSAFGPVAFSLGLLLVLPAAFPAPVPPGEDSKDVAAPHRQPLTSSERI

DKQIRYILDGISALRKETCNKSNMCESSKEALAENNLNLPKMAEKDGCFQSGF

NEETCLVKNTGLLEFEVYLEYLQNRFESSEEQARAVQMSTKVLIQFLQKKAKN

LDAITTPDPTTNASLLTKLQAQNQWLQDMTTHLILRSFKEFLQSSLRALRQM

[SEQ ID No: 24]

Accordingly, preferably IL6 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 24 or a fragment or variant thereof.

In one embodiment IL6 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 25, as follows:

AGGT AACACCAT GTTTGGT AAAT AAGT GTTTTGGT GTT GTGCAAGGGT CT G GTTT CAGCCT G AAGCCAT CT CAG AGCT GT CTGGGT CT CTGG AG ACT GG AG GGACAACCT AGT CT AG AGCCCATTTGCAT G AG ACCAAGGAT CCT CCTGCA AG AG ACACCAT CCT G AGGG AAG AGGGCTT CT G AACCAGCTT G ACCCAAT A AG AAATT CTTGGGTGCCG ACGCGG AAGCAG ATT CAG AGCCT AG AGCCGT G CCTGCGT CCGT AGTTT CCTT CT AGCTT CTTTT G ATTT CAAAT CAAG ACTT AC AGGG AG AGGG AGCG AT AAACACAAACT CTGCAAG ATGCCACAAGGT CCT C CTTT G ACAT CCCCAACAAAG AGG ACTGG AG AT GT CT G AGGCT CATT CTGC CCT CG AGCCCACCGGG AACG AAAG AG AAGCT CT AT CT CCCCT CCAGG AGC CCAGCT AT G AACT CCTT CT CCACAAGCGCCTT CGGT CCAGTTGCCTT CT CC CTGGGGCTGCT CCTGGT GTTGCCTGCTGCCTT CCCTGCCCCAGT ACCCCC AGG AG AAG ATT CCAAAG AT GT AGCCGCCCCACACAG ACAGCCACT CACCT CTT CAG AACG AATT G ACAAACAAATT CGGT ACAT CCT CG ACGGCAT CT CAG CCCT G AG AAAGGAGACAT GT AACAAG AGT AACAT GT GT G AAAGCAGCAAA G AGGCACTGGCAG AAAACAACCT G AACCTT CCAAAG ATGGCT G AAAAAG A TGG ATGCTT CCAAT CTGG ATT CAAT G AGG AG ACTTGCCTGGT G AAAAT CAT CACTGGT CTTTTGG AGTTT G AGGT AT ACCT AG AGT ACCT CCAG AACAG ATT T GAG AGT AGT G AGG AACAAGCCAG AGCT GTGCAG AT G AGT ACAAAAGT CC T GAT CCAGTT CCTGCAG AAAAAGGCAAAG AAT CT AG ATGCAAT AACCACCC CT G ACCCAACCACAAATGCCAGCCTGCT G ACG AAGCTGCAGGCACAG AAC CAGT GGCTGCAGG ACAT G ACAACT CAT CT CATT CTGCGCAGCTTT AAGG A GTT CCTGCAGT CCAGCCT G AGGGCT CTT CGGCAAAT GT AGCATGGGCACC T CAG ATT GTT GTT GTT AATGGGCATT CCTT CTT CT GGT CAG AAACCT GT CCA CTGGGCACAG AACTT AT GTT GTT CT CT ATGG AG AACT AAAAGT AT GAGCGT T AGG ACACT ATTTT AATT ATTTTT AATTT ATT AAT ATTT AAAT ATGT G AAG CT G AGTT AATTT ATGT AAGT CAT ATTT AT ATTTTT AAG AAGT ACCACTT G AAACAT TTT AT GT ATT AGTTTT G AAAT AAT AATGGAAAGTGGCT ATGCAGTTT G AAT A T CCTTT GTTT CAG AGCCAG AT CATTT CTT GG AAAGT GT AGGCTT ACCT CAAA TAAATGGCT AACTT ATACAT ATTTTT AAAGAAAT ATTT AT ATTGTATTT AT ATA AT GT AT AAATGGTTTTT AT ACCAAT AAAT GGCATTTT AAAAAATT CAGCA

[SEQ ID No: 25]

Accordingly, preferably IL6 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 25, or a fragment or variant thereof.

In one embodiment, CHI3L1 is provided by gene bank locus ID: HGNC: 1932; Entrez Gene: 1116; Ensembl: ENSG00000133048; OMIM: 601525; and/or UniProtKB: P36222. The protein sequence may be represented by the GeneBank ID P36222, which is provided herein as SEQ ID No: 26, as follows:

MGVKASQTGFVVLVLLQCCSAYKLVCYYTSWSQYREGDGSCFPDALDRFLC

THIIYSFANISNDHIDTWEWNDVTLYGMLNTLKNRNPNLKTLLSVGGWNFGSQ

RFSKIASNTQSRRTFIKSVPPFLRTHGFDGLDLAWLYPGRRDKQHFTTLIKEMK

AEFIKEAQPGKKQLLLSAALSAGKVTIDSSYDIAKISQHLDFISIMTYDFHGAWR

GTTGHHSPLFRGQEDASPDRFSNTDYAVGYMLRLGAPASKLVMGIPTFGRSF

TLASSETGVGAPISGPGIPGRFTKEAGTLAYYEICDFLRGATVHRILGQQVPYA

TKGNQWVGYDDQESVKSKVQYLKDRQLAGAMVWALDLDDFQGSFCGQDLR

FPLTNAIKDALAAT

[SEQ ID No: 26]

Accordingly, preferably CHI3L1 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 26, or a fragment or variant thereof.

In one embodiment CHI3L1 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 27, as follows:

G AGGCCCT GT CT AGGT AGCTGGCACCAGG AGCCGTGGGCAAGGG AAGAG GCCACACCCTGCCCTGCT CTGCTGCAGCCAG AATGGGT GT G AAGGCGT CT CAAACAGGCTTT GTGGT CCTGGTGCTGCT CCAGTGCTGCT CTGCAT ACAA ACTGGT CTGCT ACT ACACCAGCTGGT CCCAGT ACCGGG AAGGCG ATGGG A GCTGCTT CCCAG ATGCCCTT G ACCGCTT CCT CT GT ACCCACAT CAT CT ACA GCTTTGCCAAT AT AAGCAACG AT CACAT CG ACACCTGGG AGTGG AAT GAT GT G ACGCT CT ACGGCATGCT CAACACACT CAAG AACAGG AACCCCAACCT G AAG ACT CT CTT GT CT GT CGG AGG ATGGAACTTTGGGT CT CAAAG ATTTT C CAAG AT AGCCT CCAACACCCAG AGT CGCCGG ACTTT CAT CAAGT CAGT AC CGCCATTT CTGCGCACCCATGGCTTT G ATGGGCTGG ACCTTGCCTGGCT C T ACCCTGG ACGGAGAG ACAAACAGCATTTT ACCACCCT AAT CAAGG AAAT G AAGGCCG AATTT AT AAAGG AAGCCCAGCCAGGG AAAAAGCAGCT CCTGCT CAGCGCAGCACT GT CTGCGGGG AAGGT CACCATT G ACAGCAGCT AT G ACA TTGCCAAGAT AT CCCAACACCTGGATTT CATT AGCAT CAT G ACCT ACG ATTT T CATGG AGCCTGGCGT GGG ACCACAGGCCAT CACAGT CCCCT GTT CCG A GGT CAGG AGG ATGCAAGT CCT G ACAG ATT CAGCAACACT G ACT ATGCT GT GGGGTACATGTTGAGGCTGGGGGCTCCTGCCAGTAAGCTGGTGATGGGC AT CCCCACCTT CGGG AGG AGCTT CACT CTGGCTT CTT CT GAGACTGGT GTT GGAGCCCCAAT CT CAGG ACCGGG AATT CCAGGCCGGTT CACCAAGG AGG CAGGG ACCCTTGCCT ACT AT GAG AT CT GT G ACTT CCT CCGCGG AGCCACA GT CCAT AGAAT CCT CGGCCAGCAGGT CCCCT ATGCCACCAAGGGCAACCA GTGGGTAGGATACGACGACCAGGAAAGCGTCAAAAGCAAGGTGCAGTAC CT G AAGG ACAGGCAGCTGGCGGGCGCCATGGT ATGGGCCCTGG ACCT GG AT GACTT CCAGGGCT CCTT CT GTGGCCAGG AT CTGCGCTT CCCT CT CACC AATGCCAT CAAGGATGCACT CGCTGCAACGT AGCCCT CT GTT CTGCACAC AGCACGGGGGCCAAGG ATGCCCCGT CCCCCT CTGGCT CCAGCTGGCCGG G AGCCT GAT CACCTGCCCTGCT G AGT CCCAGGCT GAGCCT CAGT CT CCCT CCCTTGGGGCCT ATGCAG AGGT CCACAACACACAG ATTT G AGCT CAGCCC TGGTGGGCAGAGAGGTAGGGATGGGGCTGTGGGGATAGTGAGGCATCGC AAT GT AAG ACT CGGG ATT AGT ACACACTT GTT GATT AATGGAAAT GTTT ACA GAT CCCCAAGCCTGGCAAGGGAATTT CTT CAACT CCCTGCCCCCCAGCCC T CCTT AT CAAAGG ACACCATTTTGGCAAGCT CT AT CACCAAGGAGCCAAAC AT CCT ACAAG ACACAGT G ACCAT ACT AATT AT ACCCCCTGCAAAGCCCAGC TT G AAACCTT CACTT AGG AACGT AAT CGT GT CCCCT AT CCT ACTT CCCCTT C 
CT AATT CCACAGCTGCT CAAT AAAGT ACAAG AGCTT AACAGT GGT ATCTGG GCT AGCCAAGGTT AAT CCAT CAG AGTT GTGGGTTTT CAGGCCCAG ACAGC CCGCAG AGCCAT CTGCCTGCTGGGT GAGGG ACT AAGGG AGT GGGCAG AG GGGG AGG AG AAGCAG AGCCAGGGG AGGG ACT G AGGCTGCAACCAGG AG GTGGGGGTGGGGGAGGTGGGTCTCAGTTGCTTGGGGGAGGGAGCAGGG CGG AAGGGCAGG ATGCACTTGCAGGGGT CT CAT CCT GG ATTT CT CTT CAG GT G AGT AAT CCCT CCACCT CCACTTTT AAGT CCAG AGGCGTGGCG AGGGC ACAGGGCAGGT GTGG AGG AGGT CTT AGCT CCAAGGG AACACTTTGCCAG GTT CTT CT GTGCTT CCAAT GACTT CG AAAT AGT CACGTTTGCT AAACTGGA GT G AGG AGT AT AT GG AT GTTT ATTTTGCT AT ACT CTTTGGT AT AT GT G AAAA TT CT CAAAAT AAG AAACTTT AAAAGGT CCTGGTT CAGT G AAGGCTTT G ACT A ACAGCCCCTGGG AGCCGCT AGT ACAGTTGCCCAGT AGCTGCTTGGAT GAG TGATGCCCACATTGTTTGCAGCAGGAGAAACAGAATTAGAGGCGAGGAAG G ATTT CCTGGT CCATT GT GAT ATT GTGCCCCAGCCTGGGGG ACATGCCG A GGG AGCACAGG ACTGCCATT CCGGGTGGGTT ACAAGTT AGAGGCACT CT C ACCAGG AGGCAG AG AAAGGT GGGCCAG AGT CCCT CATGG AT G AAGG ACC T CACTGCAGGT CACT AAAGGTGCCAGCCT G AAAGCCAGGGCCAT CT GT G A CACAGGCT AT CTTGGGCCTT CCCCT CCCACAAAGCCACTGCTT CCAGG AA GCCTGCT GTGCTGGCACCAAGT CCCATTT CAT CCTT ATT CCCCAGCT CCT A CCCTT CCCCCACACAGTGCCT GT AGCGT ATT CAT CCCCTGCATTGG AT CTT TT CT AAGCATTTT CAG AAT AGGCCTT ATTTT CACCAGT CAGGG AACT CCCC AG AGGCAGGGGG ACCT G AACCAGCT CCTTT CATTT AAG AATT CT G ACTTT G CT CACAAT CTGCACCCCACT CCT CCCT ACCCAACCCCACACCCCATTGGT GCT CAGCT CTGGGCT CAGCCT G AGGT CT CTTGCCG AAT CCT GGCCCGGG CCCCAT CCCAG ACTT CT CCTT CAGGTTGGGCCAGGGCTGGCCAGG AG AG G AGGG ACCAGCTGGGCTGGT CCCAGGGCAGGGCAACCCCACCCAGG AG G AAGGGCG AGGCCCCACCCTGCT CACCT AT GTTTGGAG AGTT AG AGGGG CTTGGCTT AAATGGGT GG AGGGG AT G ACT GT AGGTGCTGGGCAAT ATTT G GGGT GAT GAG AG ACT GT AG ACGG AAGCGCAGCT G AGCCCCT CTGGG AG A GGGG AACCCT CT CCTGCCCCT AACGTGCT CT CCAGGG AGCATGG AAAAT G AGCAAG ACT CT ACTTTGCCAT ACT CT GT CTT CT CCACT GGGG AAAACAAAA AT GGGCAGCAG ACCT CAG AGT AACT CCCAT G AAGCCT GT G ACCCCT CAG A GAT CCACCACT AT G AAT CT CT AT GGG ATT CCCAT GAT GT CCT ATGGG AGGA CAT CAGCGGCCT AACCCAGCCT CT 
CACCCAAAGCAGG AAACCTT CTATGG CCT CT CAGACATGGGGCCACCCAGT GT CAT GACAAT GT CATT CCACT CCT GCCCT CCCCACCT CCCT GTGCT CCACAGGT AGG AGCCT CT CCCCAGGGG CAGG ACGGCAGAGG AT G ATGGCAT AGG AGT G AGG AGCTTGGGT CT CCCG CAT CCACT GT ATGG AT GTT CCAGGGGCT CCAG ACT AG AT AGGT ACAAGGC CT ACCT GTTT GT CAAGGGCCTGCCT GAT GT GT AGCAAAGAAAGCAG AGCC CAG AG AGAG AGCAGG ACTTGCCAAG AGT CACACAGCAAGTT AACAGTGG A T CT CCCAATT CCCT GCCAATGCT CT ATTT ACT ACCT CACAGGCCCG AAAAT AT GGG ACTT CTGGGGCT ACCACCATT AGGGCTGG AAAG AG AG AG ATGG AA ACCAATGGGG ACATT G AG AAGTGGGG AG ACCCTGGG AGG AGT CTTTGG AT T AGGTGGGGTT GG AGCAGGGCTT G AAATGGGGGTGGTT ATGGCCAT CGTT GGCACCCAT GTGCTGGGT ACT CTTT CT GT GT CCAGCACCCT GT CT CCCCA T GAT CCCAT GTT GT CCT CACAACAGCAAGGT G AGCT GTT AT ATTT GT CCCC ATTTT ACAG AT GAT GAAACT G AGACT CAGGTT AT AACCTTT CACAGTGGCT GGCT AGT AAGT GAT AG AGCCAG ACTTGG AAACCTGG AATGCCTGGCT CT A AGGCACATGCATGCCTGG AGGCG ACCCCT GT CCCAAT CATGCCCT CCCAG AGCT GT GTGGCCT CAGG AT CCCAGCT CTGCAGGT CCT GG AAACCCCACCA G AGGCCCAAGGCACCT AGCAT AT CAGT GCT G AGCATGCT ACAGGGCT GAT TTTGGT CCT AGGTGCTGGT CAGGT CAGGG ACAGGGAGGG AGGTGGGCAG GCAT GAGGGAATGGGGTGGGCT AG AAGCCGGCGT CAGCTGCT GT CCT CC T G AGG ACAGGT AAAG AGGG ACTT CAGCCT CAGGGCAGT GT CCTGGG ACC CT GTGCCCT GAAG AT CT CACAT AGCAAGG AAGGCTT CTT GT GACCATGGT GGG AGGTGGG AAT GGGGTT CT AAG AGGTGG AAGGTT GT G ACT G AGCAG A GCACCCACTT AT AACT ACCCACTT AGTGCATTGCCCATTGCCCACCCTT CA AT CCCAT ACT GAT GCCACCCAT ACCCAGCATGCACT GT GT CCAGCAGCT CT CAGCT CTGCCTGGGGGCAGCCT AAGT AT CT CCAGTT ACCAGGGGCAG AAG GAG AGT CT AAACAAATT GTT CACAAT ACCAGCAGCAAT CACTTT AACTTTGG G ATTT CATGCACCAGCT AGAACAAAGCACAG ACCG AAAGGCAAAT GT CT C CCAAAGT CAT CT GTGGGCCAAT AGGGGT CACCCACT GT CCG ATGCT CCCT CCAGG ACAGGGAGT AATT G AGTGCT G ATGGGT GTGCATGGGTTTGGGG A GAAG ATT AGT CAGTTTTTT AGG AAAAGG ATGGGT G ACTT G AGG ATGGG AC CACT GAT GAGCCACTTT AT CACTT CCACT AGGCCCCAGGCAGGTTGGGG A GT ACAGCAAAT GGGTTGCGCAGAG ACCT AGT CCGCCCT CGAT G AGT CT AC CT CT CATGCCACTTGGG ACCCTTT CT CT CACCTGCCT GT 
CT CCCAT CT AGG T G AGT CT CCTGCCCCAGGCAGCCT CCCCAG AACCT CTGCCAACACT CCT G TTT CCCCGCCACCCCCAGCCCT AGCACT CCAT GT AGACT G AGG ACACT GG GCTT CCAGT CCCAGTT CTT CCAAAGCCTT GAT GT GT G ACCTT G AGCAAGT C ACTT CCCTGCCCT GTGCCT CAGTTT CCCCAT CT GT CAAAT G AGGCAT AACA AT CCTTGCCTTT CCTTTTT CACCT GGGCT GTTGGCT G AGCAGTT GAG AG AG CTGCT AT G AAAGT ATTTTGG AAAGGG AAGTGGAAAAAGCT ATT CAAACCCT CCAGCT ATTGCAGAGT ATTTT CT CCACACCAGGCAG AGCAG AGTGCTGGG CCCTGG AG AGGCCACACAGCAGCCT CTTT CT AG AGTGCTGGGG AGT CCT C AAGGCT CT CT CAT ACACACAGGCCT CCCCCTTT CCT GTTGGCCCCACGCC T CAT CCT CACCCCACCCCT CATGCCT GT G AGG AACAAGGAACAGCCAAGC AGCCT CAT CT CT CCT G AG AGCAG AGCCATGGGT CT CGGG AAACCCAGG AG T AAGG AACAAAGACT CTT CAAAAT G ACTT CAG AGCTTT CTTT AGG AT CCCA GGG AGGT GT AAACT CAGTGCTT AATT AAATGGATT CTTT AG AGGGTGGG AA ACAGGTGG AT GT CAACCATTTGCCCCACAT ACCCGT ATT CATGCAAT CCAC CCCCAGTGGGCT CACCGGTGCGGGT GT GCAAAACCT CCT CCCACCCCAG T CAT CTT AGGTT CAGGT CAT CCTTTGGT CCTGCT CTTT CCCCGCCAGGCT G T CT GTT G ATGCT ACTTTT AAT CTGCTTT CACT AAG AT AAGCCTGGCAG AAGT GGTGGGGGTAAGGTGGGCTGTAGGCCAGCTCCCAAATGTGTGCTGGGCA T AACAG AAG ACCCATT CTT G ACT G AAGT GCCCTT GT GGG ACCCT G AGCCC GTGCCCTGGAGTGGCACAGGGAGGTGTGCCAGCAATGGGGACCGAGGTC ACTGGGGACTGCTGGGGTTCGGGCTAGTGGCCTGTCTGGCTGCTGTCTG CTT CCCT CGTT CACAT CCTGCT GG AGCCTT AAAT AGGAGCCCAAAAGCTTT T CTTT CTT ACTTTTTTTT G AG ACAAAAT CT CACT CT GTTGCCCAGGCCGG AG TGCAGTGGCACAACCT CTGCCGCCTGGGTT CAAGCG ATT CT CCTGCCT CA GCCT CCCG AGT AGCTGGG ATT ACAGGTGCCTGCCACCATGCCTGGCT AAT TTT CGT AGTT AGAGT AG AG ACGGGGTTT CACCAT CTTGGCCAGGCTGGT C TT G AACT CCT CACCT CAT GAT CCACCTGCCT CGGCCT CCCAAAGTGCTGG GATT CCAGGT AT G AGCCACCACGCCCGGCCT AAAGCTTTT CT ATT AAT AAT TT CCTGCCT CACCCT CCAT CCCCTT CCT CCT CAGGT GAGTT CCCAG AAGG A GGG AGGGCCAGG AGGGGGCCGCAAGACCTGCCAT CTGCCAGTGCT CACA CACCAAT CT CT CT AGCCCT CAGT ACAGT CCTGCAAG AGGGGGGT CAT GAG GCCCATT CCACAG AT AAGG AAACT G AGGCCCAAAGT CTGGGCATGCTGCG TTGCT CTGGG AAGGT GAT CTGCAGGGT AAATGG AGT G AGGGCAGGGGGC CG AATGGGGAG AGGCTGGG 
AGCCG AGGAGGT AGG AGT CATT GTGCCCT C AG AGCCAACCACCT G ATTT CTGCAT CT GT CAAAT AGT AAT AGCCCCTT CCT ATGCCT CAAAGG ATTTTTTT CAAG AAT G AAACT GT AAAATT CACTTT AAAGT G ACAT GAT CCGTT CCCGG AGGG ACAGGGG AAT CCCCAGTGCACCAT ACAC CAAT AACCCCTGCT AAGGCAGCAGT ATT AATTGCT CAACCTT CCGCACCT G TGCACT AACTGCT CACGTT GTT CCCACCCT ACCCCACCCAGGT AATGG AGT TGGGG AG AAGGT AGG ATT CCCCCACACCACCAGCAGCCCTGGGG AAGGT GATT CCCCACCACGTT CTTGCTTTTT CT CCTTTGGG AAT AAAGAAAAT GT CC GGTTGCCCCAGCATGCCT G AGG AAGTGGAAGGG AG AGGTT AGG ACATTT GTTGCT G AAAAT CT CCAGCAGGT ACAGGCACATGGGCCTGCACCACT AGG CACCTGGG AT AGCCCT CTGGCT ATGGGGCT G AGGT CTT CCTT CCAGCCCA GGAAAG AGCAG AGGT CAAGAGGCAG ATTTTTT GTTT CACT CT AGCCT CTGC T ACT CT GT GTGGCCTTGGGCCT GT CCT CAGT GT CAACCAGCAGGCCT CAC AT CT CT GTTT AAAT GG AAG AAGCT AAGCAGGGCCAAGGCAGCCACCAT CT CTGGGT CATTTGCCT CTGGTTT GT AT AAACTT GT GT G ATGCATGCAGCCT G CAG ACCCTGCAGAGAGT G AGGCTGCAGG ATGG AGCAGG AGCT AAAAGAG ATTTGGAG AGTGGCGT CT CCTGGT G ACCTGCAAGGT CT CGGCACG ACT CC CCACACTGCCTTTT CCCT GTT AT CTGCT CAGGT AGGTTT CCCCAAGGCCAC ACCT CAGG ACAAAGAG AAAG AAGG AGGCCCCGCT CCCAGCAGCCAG ATT CCT GT CCCTTGCACT G AGGT CTGGGCT GGGCT CACAG AGCACAT GTGCCC T GT ACACACT CTGGGT CAGGG AACCCAGCCCTGCT CCT CTGGGCCT CCCC TGCCAGCCCT CT GTT CTGCACACAGCACGGGGGCCAAGG ATGCCCCGT C CCCCT CTGGCT CCAGCTGGCCGGG AGCCT GAT CACCTGCCCTGCT G AGT CCCAGGCT G AGCCT CAGT CT CCCT CCCTTGGGGCCT ATGCAG AGGT CCAC AACACACAG ATTT GAGCT CAGCCCTGGT GGGCAG AG AGGT AGGG ATGGG GCT GTGGGG AT AGT G AGGCAT CGCAAT GT AAG ACT CGGG ATT AGT ACACA CTT GTT GATT AATGG AAAT GTTT ACAG AT CCCCAAGCCTGGCAAGGG AATT T CTT CAACT CCCTGCCCCCCAGCCCT CCTT AT CAAAGG ACACCATTTTGGC AAGCT CT AT CACCAAGG AGCCAAACAT CCT ACAAG ACACAGT G ACCAT ACT AATT AT ACCCCCTGCAAAGCCCAGCTT G AAACCTT CACTT AGGAACGT AAT CGT GT CCCCT AT CCT ACTT CCCCTT CCT AATT CCACAGCTGCT CAAT AAAG T ACAAG AGCTT AACAGTGG AGGCCCT GT CT AGGT AGCTGGCACCAGG AGC CGTGGGCAAGGG AAG AGGCCACACCCTGCCCTGCT CTGCTGCAGCCAG A G AGGCCCT GT CT AGGT AGCTGGCACCAGG AGCCGTGGGCAAGGGAAGAG 
GCCACACCCTGCCCTGCT CTGCTGCAGCCAG AATGGGT GT G AAGGCGT CT CAAACAGGT ATCTGGGCT AGCCAAGGTT AAT CCAT CAG AGTT GTGGGTTTT CAGGCCCAG ACAGCCCGCAG AGCCAT CTGCCTGCTGGGT G AGGG ACT AA GGGAGTGGGCAGAGGGGGAGGAGAAGCAGAGCCAGGGGAGGGACTGAG GCTGCAACCAGGAGGTGGGGGTGGGGGAGGTGGGTCTCAGTTGCTTGGG GGAGGGAGCAGGGCGG AAGGGCAGG ATGCACTTGCAGGGGT CT CAT CCT GGATTT CT CTT CAGGCTTT GTGGT CCT GGTGCTGCT CCAGTGCT GT G AGT A AT CCCT CCACCT CCACTTTT AAGT CCAGAGGCGTGGCG AGGGCACAGGGC AGGT GTGG AGG AGGT CTT AGCT CCAAGGG AACACTTTGCCAGGTT CTT CT GTGCTT CCAAT G ACTT CG AAAT AGT CACGTTTGCT AAACTGGAGT G AGGAG T AT ATGG AT GTTT ATTTTGCT AT ACT CTTTGGT AT AT GT G AAAATT CT CAAAA T AAG AAACTTT AAAAGGT CCTGGTT CAGT G AAGGCTTT G ACT AACAGCCCC TGGGAGCCGCTAGTACAGTTGCCCAGTAGCTGCTTGGATGAGTGATGCCC ACATT GTTTGCAGCAGG AG AAACAG AATT AG AGGCG AGG AAGG ATTT CCT GGT CCATT GT GAT ATT GTGCCCCAGCCT GGGGG ACATGCCG AGGG AGCA CAGG ACTGCCATT CCGGGTGGGTT ACAAGTT AG AGGCACT CT CACCAGG A GGCAG AG AAAGGTGGGCCAG AGT CCCT CATGG AT G AAGG ACCT CACTGC AGGT CACT AAAGGTGCCAGCCT G AAAGCCAGGGCCAT CT GT G ACACAGGC T AT CTTGGGCCTT CCCCT CCCACAAAGCCACTGCTT CCAGG AAGCCTGCT GTGCTGGCACCAAGT CCCATTT CAT CCTT ATT CCCCAGCT CCT ACCCTT CC CCCACACAGTGCCT GT AGCGT ATT CAT CCCCTGCATTGG AT CTTTT CT AAG CATTTT CAG AAT AGGCCTT ATTTT CACCAGT CAGGG AACT CCCCAG AGGCA GGGGG ACCT G AACCAGCT CCTTT CATTT AAG AATT CT G ACTTT GCT CACAA T CTGCACCCCACT CCT CCCT ACCCAACCCCACACCCCATT GGTGCT CAGC TCTGGGCT CAGCCT G AGGT CT CTTGCCGAAT CCTGGCCCGGGCCCCAT CC CAG ACTT CT CCTT CAGGCT CTGCAT ACAAACTGGT CTGCT ACT ACACCAGC TGGT CCCAGT ACCGGG AAGGCGATGGG AGCTGCTT CCCAG ATGCCCTT G ACCGCTT CCT CT GT ACCCACAT CAT CT ACAGCTTTG CCAAT AT AAGCAACG AT CACAT CG ACACCTGGG AGTGG AAT GAT GT GACGCT CT ACGGCATGCT C AACACACTCAAGAACAGGTTGGGCCAGGGCTGGCCAGGAGAGGAGGGAC CAGCTGGGCTGGT CCCAGGGCAGGGCAACCCCACCCAGG AGG AAGGGC G AGGCCCCACCCTGCT CACCT AT GTTTGG AG AGTT AG AGGGGCTTGGCTT AAATGGGTGG AGGGG AT G ACT GT AGGT GCTGGGCAAT ATTTGGGGT GAT G AG AG ACT GT AG ACGG AAGCGCAGCT G AGCCCCT CTGGG AG AGGGG AACC CT CT 
CCTGCCCCT AACGTGCT CT CCAGGG AGCATGG AAAAT GAGCAAG AC T CT ACTTTGCCAT ACT CT GT CTT CT CCACTGGGG AAAACAAAAATGGGCAG CAG ACCT CAGAGT AACT CCCAT G AAGCCT GT GACCCCT CAG AG AT CCACC ACT AT GAAT CT CT ATGGG ATT CCCAT GAT GT CCT ATGGG AGG ACAT CAGCG GCCT AACCCAGCCT CT CACCCAAAGCAGG AAACCTT CT ATGGCCT CT CAG ACATGGGGCCACCCAGT GT CAT G ACAAT GT CATT CCACT CCTGCCCT CCC CACCT CCCT GTGCT CCACAGG AACCCCAACCT G AAG ACT CT CTT GT CT GT C GGAGG ATGG AACTTTGGGT CT CAAAGGT AGG AGCCT CT CCCCAGGGGCA GGACGGCAG AGG AT G ATGGCAT AGG AGT G AGG AGCTTGGGT CT CCCGCA T CCACT GT ATGG AT GTT CCAGGGGCT CCAG ACT AG AT AGGT ACAAGGCCT ACCT GTTT GT CAAGGGCCTGCCT GAT GT GT AGCAAAG AAAGCAG AGCCCA G AG AG AG AGCAGG ACTTGCCAAG AGT CACACAGCAAGTT AACAGTGG AT C T CCC A ATT CCCT G CC AAT G CT CT ATTT ACT ACCT C AC AG G CCCG A AA AT AT GGG ACTT CT GGGGCT ACCACCATT AGGGCTGG AAAGAG AG AG ATGG AAAC CAATGGGG ACATT G AG AAGT GGGG AG ACCCTGGG AGG AGT CTTTGGATT A GGTGGGGTTGGAGCAGGGCTT G AAATGGGGGTGGTT ATGGCCAT CGTT G GCACCCAT GTGCTGGGT ACT CTTT CT GT GT CCAGCACCCT GT CT CCCCAT GAT CCCAT GTT GT CCT CACAACAGCAAGGT GAGCT GTT AT ATTT GT CCCCA TTTT ACAGAT GAT G AAACT GAG ACT CAGGTT AT AACCTTT CACAGT GGCT G GCT AGT AAGT GAT AG AGCCAG ACTTGG AAACCTGG AATGCCTGGCT CT AA GGCACATGCATGCCTGGAGGCG ACCCCT GT CCCAAT CATGCCCT CCCAG A GCT GT GTGGCCT CAGG AT CCCAGCT CT GCAGGT CCTGG AAACCCCACCAG AGGCCCAAGGCACCT AGCAT AT CAGTGCT G AGCATGCT ACAGGGCT GATT TTGGT CCT AG ATTTT CCAAG AT AGCCT CCAACACCCAG AGT CGCCGG ACTT T CAT CAAGT CAGT ACCGCCATTT CTGCGCACCCATGGCTTT G ATGGGCT G G ACCTTGCCTGGCT CT ACCCTGG ACGG AG AG ACAAACAGCATTTT ACCAC CCT AAT CAAGGTGCTGGT CAGGT CAGGGACAGGG AGGG AGGTGGGCAGG CAT G AGGG AATGGGGTGGGCT AG AAGCCGGCGT CAGCTGCT GT CCT CCT G AGG ACAGGT AAAGAGGG ACTT CAGCCT CAGGGCAGT GT CCTGGG ACCC T GTGCCCT G AAG AT CT CACAT AGCAAGGAAGGCTT CTT GT G ACCATGGT G GGAGGTGGG AATGGGGTT CT AAG AGGTGG AAGGTT GT G ACT GAGCAG AG CACCCACTT AT AACT ACCCACTT AGTGCATTGCCCATTGCCCACCCTT CAA T CCCAT ACT G ATGCCACCCAT ACCCAGCATGCACT GT GT CCAGCAGCT 
CT CAGCT CTGCCTGGGGGCAGCCT AAGT AT CT CCAGTT ACCAGGGGCAG AAG GAG AGT CT AAACAAATT GTT CACAAT ACCAGCAGCAAT CACTTT AACTTTGG G ATTT CATGCACCAGCT AGAACAAAGCACAG ACCG AAAGGCAAAT GT CT C CCAAAGT CAT CT GTGGGCCAAT AGGGGT CACCCACT GT CCG ATGCT CCCT CCAGG ACAGGGAGT AATT G AGTGCT G ATGGGT GTGCATGGGTTTGGGG A G AAG ATT AGT CAGTTTTTT AGG AAAAGG ATGGGT G ACTT G AGG ATGGG AC CACT GAT GAGCCACTTT AT CACTT CCACT AGGCCCCAGGCAGGTTGGGG A GT ACAGCAAAT GGGTTGCGCAGAG ACCT AGT CCGCCCT CGAT G AGT CT AC CT CT CATGCCACTTGGG ACCCTTT CT CT CACCTGCCT GT CT CCCAT CT AGG AAAT G AAGGCCG AATTT AT AAAGG AAGCCCAGCCAGGG AAAAAGCAGCT C CTGCT CAGCGCAGCACT GT CTGCGGGG AAGGT CACCATT G ACAGCAGCT A T G ACATTGCCAAG AT AT CCCAGT G AGT CT CCTGCCCCAGGCAGCCT CCCC AG AACCT CTGCCAACACT CCT GTTT CCCCGCCACCCCCAGCCCT AGCACT CCAT GT AGACT G AGG ACACTGGGCTT CCAGT CCCAGTT CTT CCAAAGCCTT GAT GT GT G ACCTT GAGCAAGT CACTT CCCTGCCCT GTGCCT CAGTTT CCCC AT CT GT CAAAT G AGGCAT AACAAT CCTT GCCTTT CCTTTTT CACCTGGGCT GTTGGCT G AGCAGTT G AG AG AGCTGCT AT G AAAGT ATTTTGG AAAGGGAA GTGG AAAAAGCT ATT CAAACCCT CCAGCT ATTGCAG AGT ATTTT CT CCACA CCAGGCAG AGCAG AGTGCTGGGCCCTGGAG AGGCCACACAGCAGCCT CT TT CT AG AGTGCTGGGG AGT CCT CAAGGCT CT CT CAT ACACACAGGCCT CC CCCTTT CCT GTTGGCCCCACGCCT CAT CCT CACCCCACCCCT CATGCCT G T G AGG AACAAGGAACAGCCAAGCAGCCT CAT CT CT CCT GAG AGCAG AGCC AT GGGT CT CGGG AAACCCAGG AGT AAGG AACAAAG ACT CTT CAAAAT G AC TT CAG AGCTTT CTTT AGG AT CCCAGGG AGGT GT AAACT CAGT GCTT AATT A AATGG ATT CTTT AG AGGGTGGG AAACAGGTGG AT GT CAACCATTTGCCCC ACAT ACCCGT ATT CATGCAAT CCACCCCCAGTGGGCT CACCGGTGCGGGT GTGCAAAACCT CCT CCCACCCCAGT CAT CTT AGGTT CAGGT CAT CCTTTGG T CCTGCT CTTT CCCCGCCAGGCT GT CT GTT G ATGCT ACTTTT AAT CTGCTTT CACT AAG AT AAGCCTGGCAG AAGTGGTGGGGGT AAGGTGGGCT GT AGGC CAGCT CCCAAAT GT GTGCTGGGCAT AACAG AAGACCCATT CTT G ACT GAA GTGCCCTT GTGGGACCCT G AGCCCGTGCCCTGG AGTGGCACAGGG AGGT GTGCCAGCAATGGGGACCGAGGTCACTGGGGACTGCTGGGGTTCGGGCT AGTGGCCT GT CTGGCTGCT GT CTGCTT CCCT CGTT CACAT CCTGCTGGAG CCTT AAAT AGG 
AGCCCAAAAGCTTTT CTTT CTT ACTTTTTTTT GAG ACAAAAT CT CACT CT GTTGCCCAGGCCGG AGTGCAGTGGCACAACCT CTGCCGCCT G GGTT CAAGCG ATT CT CCTGCCT CAGCCT CCCG AGT AGCT GGG ATT ACAGG TGCCTGCCACCATGCCTGGCT AATTTT CGT AGTT AG AGT AG AGACGGGGT TT CACCAT CTT GGCCAGGCT GGT CTT G AACT CCT CACCT CAT GAT CCACCT GCCT CGGCCT CCCAAAGTGCTGGG ATT CCAGGT AT G AGCCACCACGCCC GGCCT AAAGCTTTT CT ATT AAT AATTT CCTGCCT CACCCT CCAT CCCCTT CC T CCT CAG ACACCT GG ATTT CATT AGCAT CAT GACCT ACG ATTTT CATGGAG CCTGGCGTGGGACCACAGGCCAT CACAGT CCCCT GTT CCG AGGT CAGG A GGATGCAAGT CCT G ACAG ATT CAGCAACACT GT G AGTT CCCAG AAGGAGG G AGGGCCAGG AGGGGGCCGCAAG ACCTGCCAT CTGCCAGTGCT CACACA CCAAT CT CT CT AGCCCT CAGT ACAGT CCTGCAAG AGGGGGGT CAT G AGGC CCATT CCACAG AT AAGG AAACT G AGGCCCAAAGT CTGGGCATGCTGCGTT GCT CTGGGAAGGT GAT CTGCAGGGT AAATGG AGT G AGGGCAGGGGGCCG AATGGGGAG AGGCTGGG AGCCG AGG AGGT AGG AGT CATT GTGCCCT CAG AGCCAACCACCT G ATTT CTGCAT CT GT CAAAT AGT AAT AGCCCCTT CCT AT GCCT CAAAGG ATTTTTTT CAAG AAT G AAACT GT AAAATT CACTTT AAAGT G A CAT GAT CCGTT CCCGG AGGG ACAGGGG AAT CCCCAGTGCACCAT ACACCA AT AACCCCTGCT AAGGCAGCAGT ATT AATTGCT CAACCTT CCGCACCT GT G CACT AACTGCT CACGTT GTT CCCACCCT ACCCCACCCAGG ACT ATGCT GT G GGGT ACAT GTT G AGGCTGGGGGCT CCT GCCAGT AAGCTGGT G ATGGGCA T CCCCACCTT CGGG AGG AGCTT CACT CTGGCTT CTT CT G AG ACTGGT GTT GGAGCCCCAAT CT CAGG ACCGGG AATT CCAGGCCGGTT CACCAAGG AGG CAGGG ACCCTTGCCT ACT AT G AGGT AAT GG AGTTGGGGAG AAGGT AGG AT T CCCCCACACCACCAGCAGCCCTGGGG AAGGT GATT CCCCACCACGTT CT TGCTTTTT CT CCTTTGGG AAT AAAGAAAAT GT CCGGTTGCCCCAGCATGCC T G AGG AAGTGG AAGGG AG AGGTT AGG ACATTT GTTGCT G AAAAT CT CCAG CAGGT ACAGGCACATGGGCCTGCACCACT AGGCACCTGGG AT AGCCCT CT GGCT ATGGGGCT G AGGT CTT CCTT CCAGCCCAGG AAAG AGCAG AGGT CAA G AGGCAG ATTTTTT GTTT CACT CT AGCCT CTGCT ACT CT GT GTGGCCTTGG GCCT GT CCT CAGT GT CAACCAGCAGGCCT CACAT CT CT GTTT AAATGG AAG AAGCT AAGCAGGGCCAAGGCAGCCACCAT CT CT GGGT CATTT GCCT CTGG TTT GT AT AAACTT GT GT GATGCATGCAGCCTGCAG ACCCTGCAG AG AGT G A GGCTGCAGG ATGG AGCAGG AGCT 
AAAAGAG ATTTGG AG AGT GGCGT CT C CTGGT GACCTGCAAGGT CT CGGCACGACT CCCCACACTGCCTTTT CCCT G TT AT CTGCT CAG AT CT GT G ACTT CCT CCGCGG AGCCACAGT CCAT AG AAT C CT CGGCCAGCAGGT CCCCT ATGCCACCAAGGGCAACCAGT GGGT AGG AT ACG ACG ACCAGG AAAGCGT CAAAAGCAAGGT AGGTTT CCCCAAGGCCACA CCT CAGG ACAAAG AG AAAGAAGG AGGCCCCGCT CCCAGCAGCCAG ATT C CT GT CCCTTGCACT G AGGT CTGGGCTGGGCT CACAG AGCACAT GTGCCCT GT ACACACT CTGGGT CAGGG AACCCAGCCCTGCT CCT CTGGGCCT CCCCT GCCAGGTGCAGT ACCT G AAGG ACAGGCAGCTGGCGGGCGCCATGGT AT G GGCCCTGG ACCTGG AT G ACTT CCAGGGCT CCTT CT GT GGCCAGG AT CTGC GCTT CCCT CT CACCAATGCCAT CAAGG ATGCACT CGCTGCAACGT AGCCC T CT GTT CTGCACACAGCACGGGGGCCAAGG ATGCCCCGT CCCCCT CTGG CT CCAGCTGGCCGGG AGCCT GAT CACCTGCCCTGCT G AGT CCCAGGCT G AGCCT CAGT CT CCCT CCCTTGGGGCCT ATGCAG AGGT CCACAACACACAG ATTT GAGCT CAGCCCTGGTGGGCAG AGAGGT AGGG ATGGGGCT GTGGGG AT AGT G AGGCAT CGCAAT GT AAG ACT CGGG ATT AGT ACACACTT GTT GATT AATGG AAAT GTTT ACAG AT CCCCAAGCCTGGCAAGGGAATTT CTT CAACT C CCTGCCCCCCAGCCCT CCTT AT CAAAGGACACCATTTTGGCAAGCT CT AT C ACCAAGG AGCCAAACAT CCT ACAAG ACACAGT GACCAT ACT AATT AT ACCC CCTGCAAAGCCCAGCTT G AAACCTT CACTT AGG AACGT AAT CGT GT CCCCT AT CCT ACTT CCCCTT CCT AATT CCACAGCTGCT CAAT AAAGT ACAAG AGCTT AACAGTG

[SEQ ID No: 27]

Accordingly, preferably CHI3L1 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 27, or a fragment or variant thereof.

In one embodiment, MET is provided by gene bank locus ID: HGNC: 7029; Entrez Gene: 4233; Ensembl: ENSG00000105976; OMIM: 164860; and/or UniProtKB: P08581. The protein sequence may be represented by the GeneBank ID P08581, which is provided herein as SEQ ID No: 28, as follows:

MKAPAVLAPGILVLLFTLVQRSNGECKEALAKSEMNVNMKYQLPNFTAETPIQ

NVILHEHHIFLGATNYIYVLNEEDLQKVAEYKTGPVLEHPDCFPCQDCSSKANL

SGGVWKDNINMALVVDTYYDDQLISCGSVNRGTCQRHVFPHNHTADIQSEVH

CIFSPQIEEPSQCPDCVVSALGAKVLSSVKDRFINFFVGNTINSSYFPDHPLHSI

SVRRLKETKDGFMFLTDQSYIDVLPEFRDSYPIKYVHAFESNNFIYFLTVQRET

LDAQTFHTRIIRFCSINSGLHSYMEMPLECILTEKRKKRSTKKEVFNILQAAYVS

KPGAQLARQIGASLNDDILFGVFAQSKPDSAEPMDRSAMCAFPIKYVNDFFNK

IVNKNNVRCLQHFYGPNHEHCFNRTLLRNSSGCEARRDEYRTEFTTALQRVD

LFMGQFSEVLLTSISTFIKGDLTIANLGTSEGRFMQVVVSRSGPSTPHVNFLLD

SHPVSPEVIVEHTLNQNGYTLVITGKKITKIPLNGLGCRHFQSCSQCLSAPPFV

QCGWCHDKCVRSEECLSGTWTQQICLPAIYKVFPNSAPLEGGTRLTICGWDF

GFRRNNKFDLKKTRVLLGNESCTLTLSESTMNTLKCTVGPAMNKHFNMSIIISN

GHGTTQYSTFSYVDPVITSISPKYGPMAGGTLLTLTGNYLNSGNSRHISIGGKT

CTLKSVSNSILECYTPAQTISTEFAVKLKIDLANRETSIFSYREDPIVYEIHPTKSF

ISTWWKEPLNIVSFLFCFASGGSTITGVGKNLNSVSVPRMVINVHEAGRNFTV

ACQHRSNSEIICCTTPSLQQLNLQLPLKTKAFFMLDGILSKYFDLIYVHNPVFKP

FEKPVMISMGNENVLEIKGNDIDPEAVKGEVLKVGNKSCENIHLHSEAVLCTVP

NDLLKLNSELNIEWKQAISSTVLGKVIVQPDQNFTGLIAGVVSISTALLLLLGFFL

WLKKRKQIKDLGSELVRYDARVHTPHLDRLVSARSVSPTTEMVSNESVDYRA

TFPEDQFPNSSQNGSCRQVQYPLTDMSPILTSGDSDISSPLLQNTVHIDLSAL

NPELVQAVQHVVIGPSSLIVHFNEVIGRGHFGCVYHGTLLDNDGKKIHCAVKSL

NRITDIGEVSQFLTEGIIMKDFSHPNVLSLLGICLRSEGSPLVVLPYMKHGDLRN

FIRNETHNPTVKDLIGFGLQVAKGMKYLASKKFVHRDLAARNCMLDEKFTVKV

ADFGLARDMYDKEYYSVHNKTGAKLPVKWMALESLQTQKFTTKSDVWSFGV

LLWELMTRGAPPYPDVNTFDITVYLLQGRRLLQPEYCPDPLYEVMLKCWHPK

AEMRPSFSELVSRISAIFSTFIGEHYVHVNATYVNVKCVAPYPSLLSSEDNADD

EVDTRPASFWETS

[SEQ ID No: 28]

Accordingly, preferably MET comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 28, or a fragment or variant thereof. In one embodiment MET is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 29, as follows:

AG ACACGTGCTGGGGCGGGCAGGCG AGCGCCT CAGT CTGGT CGCCTGG CGGTGCCTCCGGCCCCAACGCGCCCGGGCCGCCGCGGGCCGCGCGCGC CGATGCCCGGCTGAGTCACTGGCAGGGCAGCGCGCGTGTGGGAAGGGG CGGAGGGAGTGCGGCCGGCGGGCGGGCGGGGCGCTGGGCTCAGCCCG GCCGCAGGTGACCCGGAGGCCCTCGCCGCCCGCGGCGCCCCGAGCGCT TTGTGAGCAGATGCGGAGCCGAGTGGAGGGCGCGAGCCAGATGCGGGG CGACAGCTGACTTGCTGAGAGGAGGCGGGGAGGCGCGGAGCGCGCGTG TGGT CCTTGCGCCGCT G ACTT CT CCACTGGTT CCTGGGCACCG AAAGAT A AACCT CT CAT AAT G AAGGCCCCCGCT GT GCTTGCACCTGGCAT CCT CGT G CT CCT GTTT ACCTTGGTGCAG AGG AGCAATGGGG AGT GT AAAGAGGCACT AGCAAAGT CCG AG AT G AAT GT G AAT AT G AAGT AT CAGCTT CCCAACTT CAC CGCGG AAACACCCAT CCAG AAT GT CATT CT ACAT G AGCAT CACATTTT CCT TGGTGCCACT AACT ACATTT AT GTTTT AAAT G AGG AAG ACCTT CAG AAGGTT GCT G AGT ACAAG ACTGGGCCT GTGCTGGAACACCCAG ATT GTTT CCCAT G T CAGGACTGCAGCAGCAAAGCCAATTT AT CAGG AGGT GTTTGG AAAGAT AA CAT CAACATGGCT CT AGTT GT CGACACCT ACT AT GAT GAT CAACT CATT AG CT GTGGCAGCGT CAACAG AGGG ACCTGCCAGCG ACAT GT CTTT CCCCACA AT CAT ACTGCT G ACAT ACAGT CGG AGGTT CACTGCAT ATT CT CCCCACAG A T AGAAG AGCCCAGCCAGT GT CCT G ACT GT GTGGT G AGCGCCCTGGG AGC CAAAGT CCTTT CAT CT GT AAAGG ACCGGTT CAT CAACTT CTTT GT AGGCAAT ACCAT AAATT CTT CTT ATTT CCCAG AT CAT CCATTGCATT CG AT AT CAGT G A G AAGGCT AAAGGAAACG AAAGATGGTTTT AT GTTTTT G ACGG ACCAGT CCT AC ATT GAT GTTTT ACCT G AGTT CAG AG ATT CTT ACCCCATT AAGT ATGT CCA TG CCTTT G AAAGCAACAATTTT ATTT ACTT CTT G ACGGT CCAAAGGG AAACT CT AG ATGCT CAG ACTTTT CACACAAGAAT AAT CAGGTT CT GTT CCAT AAACT CTGG ATTGCATT CCT ACATGG AAATGCCT CTGG AGT GT ATT CT CACAG AAA AG AG AAAAAAG AG AT CCACAAAG AAGG AAGT GTTT AAT AT ACTT CAGGCT G CGT AT GT CAGCAAGCCTGGGGCCCAGCTTGCT AGACAAAT AGG AGCCAGC CT G AAT GAT G ACATT CTTTT CGGGGT GTT CGCACAAAGCAAGCCAG ATT CT GCCG AACCAATGG AT CG AT CTGCCAT GT GTGCATT CCCT AT CAAAT AT GT C AACG ACTT CTT CAACAAG AT CGT CAACAAAAACAAT GT GAG AT GTCT CCAG CATTTTT ACGG ACCCAAT CAT G AGCACTGCTTT AAT AGG ACACTT CT GAGA AATT CAT CAGGCT GT G AAGCGCGCCGT GAT G AAT AT CG AACAGAGTTT ACC ACAGCTTTGCAGCGCGTT G 
ACTT ATT CATGGGT CAATT CAGCGAAGT CCT C TT AACAT CT AT AT CCACCTT CATT AAAGG AG ACCT CACCAT AGCT AAT CTT G GGACAT CAG AGGGT CGCTT CATGCAGGTT GTGGTTT CT CG AT CAGG ACCA T CAACCCCT CAT GT G AATTTT CT CCTGG ACT CCCAT CCAGT GT CT CCAG AA GT GATT GTGG AGCAT ACATT AAACCAAAATGGCT ACACACTGGTT AT CACT GGG AAG AAGAT CACG AAG AT CCCATT G AAT GGCTT GGGCTGCAG ACATTT CCAGT CCTGCAGT CAATGCCT CT CTGCCCCACCCTTT GTT CAGT GT GGCT GGTGCCACG ACAAAT GT GTGCGAT CGG AGG AATGCCT G AGCGGG ACAT G G ACT CAACAG AT CT GT CTGCCTGCAAT CT ACAAGGTTTT CCCAAAT AGTGC ACCCCTT GAAGG AGGG ACAAGGCT GACCAT AT GTGGCTGGGACTTTGG AT TT CGG AGG AAT AAT AAATTT G ATTT AAAGAAAACT AG AGTT CT CCTTGG AAA T G AG AGCTGCACCTT G ACTTT AAGT G AG AGCACG AT G AAT ACATT G AAAT G CACAGTT GGT CCT GCCAT G AAT AAGCATTT CAAT AT GT CCAT AATT ATTT CA AATGGCCACGGG ACAACACAAT ACAGT ACATT CT CCT AT GTGG AT CCT GT A AT AACAAGT ATTT CGCCG AAAT ACGGT CCT AT GGCTGGTGGCACTTT ACTT ACTTT AACTGG AAATT ACCT AAACAGTGGG AATT CT AG ACACATTT CAATT G GTGG AAAAACAT GT ACTTT AAAAAGT GTGT CAAACAGT ATT CTT G AAT GTT A T ACCCCAGCCCAAACCATTT CAACT G AGTTTGCT GTT AAATT G AAAATT G AC TT AGCCAACCG AG AG ACAAGCAT CTT CAGTT ACCGT G AAG AT CCCATT GT C T AT G AAATT CAT CCAACCAAAT CTTTT ATT AGT ACTTGGTGG AAAG AACCT C T CAACATT GT CAGTTTT CT ATTTTGCTTT GCCAGTGGTGGG AGCACAAT AAC AGGT GTTGGG AAAAACCT G AATT CAGTT AGT GT CCCG AGAAT GGT CAT AAA T GTGCAT GAAGCAGG AAGGAACTTT ACAGTGGCAT GT CAACAT CGCT CT AA TT CAG AG AT AAT CT GTT GT ACCACT CCTT CCCTGCAACAGCT G AAT CTGCA ACT CCCCCT G AAAACCAAAGCCTTTTT CAT GTT AGATGGG AT CCTTT CCAA AT ACTTT GAT CT CATTT AT GT ACAT AAT CCT GT GTTT AAGCCTTTT G AAAAGC CAGT GAT GAT CT CAATGGGCAAT G AAAAT GT ACTGG AAATT AAGGG AAAT G AT ATT G ACCCT G AAGCAGTT AAAGGT G AAGT GTT AAAAGTT GG AAAT AAG A GCT GT GAG AAT AT ACACTT ACATT CT G AAGCCGTTTT ATGCACGGT CCCCA AT GACCTGCT G AAATT G AACAGCG AGCT AAAT AT AG AGTGGAAGCAAGCAA TTT CTT CAACCGT CCTTGG AAAAGT AAT AGTT CAACCAGAT CAGAATTT CAC AGG ATT G ATTGCTGGT GTT GT CT CAAT AT CAACAGCACT GTT ATT ACT 
ACTT GGGTTTTT CCT GTGGCT G AAAAAG AGAAAGCAAATT AAAG AT CTGGGCAGT G AATT AGTT CGCT ACG ATGCAAG AGT ACACACT CCT CATTTGGAT AGGCTT GT AAGTGCCCG AAGT GT AAGCCCAACT ACAG AAAT GGTTT CAAAT GAAT CT GT AG ACT ACCG AGCT ACTTTT CCAG AAGAT CAGTTT CCT AATT CAT CT CAG A ACGGTT CATGCCG ACAAGTGCAGT AT CCT CT G ACAG ACAT GT CCCCCAT C CT AACT AGTGGGG ACT CT GAT AT AT CCAGT CCATT ACTGCAAAAT ACT GT C CACATT G ACCT CAGTGCT CT AAAT CCAG AGCTGGT CCAGGCAGTGCAGCA T GT AGT G ATTGGGCCCAGT AGCCT GATT GTGCATTT CAAT G AAGT CAT AGG AAG AGGGCATTTT GGTT GT GT AT AT CATGGG ACTTT GTTGG ACAAT G ATGG CAAG AAAATT CACT GTGCT GT G AAAT CCTT G AACAG AAT CACT G ACAT AGG AG AAGTTT CCCAATTT CT G ACCG AGGG AAT CAT CAT G AAAG ATTTT AGT CAT CCCAAT GT CCT CT CGCT CCTGGG AAT CTGCCTGCG AAGT G AAGGGT CT CC GCTGGT GGT CCT ACCAT ACAT GAAACATGG AG AT CTT CG AAATTT CATT CG AAAT GAG ACT CAT AAT CCAACT GT AAAAG AT CTT ATTGGCTTTGGT CTT CAA GT AGCCAAAGGCAT G AAAT AT CTTGCAAGCAAAAAGTTT GT CCACAG AG AC TTGGCTGCAAG AAACT GT ATGCTGG AT GAAAAATT CACAGT CAAGGTTGCT G ATTTTGGT CTTGCCAG AG ACAT GT AT GAT AAAG AAT ACT AT AGT GT ACACA ACAAAACAGGTGCAAAGCTGCCAGT G AAGTGG ATGGCTTTGG AAAGT CT G CAAACT CAAAAGTTT ACCACCAAGT CAG AT GT GTGGT CCTTTGGCGTGCT C CT CTGGG AGCT GAT G ACAAG AGG AGCCCCACCTT AT CCT G ACGT AAACAC CTTT GAT AT AACT GTTT ACTT GTTGCAAGGG AG AAG ACT CCT ACAACCCG A AT ACTGCCCAG ACCCCTT AT AT G AAGT AATGCT AAAATGCTGGCACCCT AA AGCCG AAATGCGCCCAT CCTTTT CT G AACTGGT GT CCCGG AT AT CAGCG A T CTT CT CT ACTTT CATTGGGG AGCACT AT GT CCAT GT G AACGCT ACTT AT GT G AACGT AAAAT GTGT CGCT CCGT AT CCTT CTCTGTTGT CAT CAG AAG AT AA CGCT GAT GAT G AGGTGGACACACG ACCAGCCT CCTT CTGGG AG ACAT CAT AGTGCT AGT ACT AT GT CAAAGCAACAGT CCACACTTT GT CCAATGGTTTTTT CACTGCCT G ACCTTT AAAAGGCCAT CG AT ATT CTTTGCT CTT GCCAAAATT G CACT ATT AT AGG ACTT GT ATT GTT ATTT AAATT ACT GG ATT CT AAGG AATTT C TT AT CT G ACAG AGCAT CAGAACCAG AGGCTTGGT CCCACAGGCCACGG AC CAATGGCCTGCAGCCGT G ACAACACT CCT GT CAT ATT GG AGT CCAAAACTT G AATT CTGGGTT G AATTTTTT AAAAAT CAGGT 
ACCACTT G ATTT CAT ATGGG AAATT G AAGCAGGAAAT ATT G AGGGCTT CTT GAT CACAGAAAACT CAG AAG AG AT AGT AATGCT CAGG ACAGG AGCGGCAGCCCCAG AACAGGCCACT CAT TT AG AATT CT AGT GTTT CAAAACACTTTT GTGTGTTGTATGGT CAAT AACATT TTT CATT ACT G ATGGT GT CATT CACCCATT AGGT AAACATT CCCTTTT AAAT GTTT GTTT GTTTTTT GAG ACAGG AT CT CACT CT GTTGCCAGGGCT GT AGT G CAGT GGT GT GAT CAT AGCT CACTGCAACCT CCACCT CCCAGGCT CAAGCC T CCCG AAT AGCT GGG ACT ACAGGCGCACACCACCAT CCCCGGCT AATTTT T GT ATTTTTT GT AG AG ACGGGGTTTTGCCAT GTTGCCAAGGCTGGTTT CAA ACT CCTGG ACT CAAG AAAT CCACCCACCT CAGCCT CCCAAAGTGCT AGG A TT ACAGGCAT G AGCCACTGCGCCCAGCCCTT AT AAATTTTT GT AT AG ACAT T CCTTTGGTTGG AAG AAT ATTT AT AGGCAAT ACAGT CAAAGTTT CAAAAT AG CAT CACACAAAACAT GTTT AT AAAT G AACAGG AT GT AAT GT ACAT AG AT G AC ATT AAG AAAATTT GT AT G AAAT AATTT AGT CAT CAT G AAAT ATTT AGTTGT CA T AT AAAAACCCACT GTTT GAG AAT G ATGCT ACT CT GAT CT AAT G AAT GTG AA CAT GT AG AT GTTTT GTGTGT ATTTTTTT AAAT G AAAACT CAAAAT AAG ACAA GT AATTT GTTG AT AAAT ATTTTT AAAG AT AACT CAGCAT GTTT GT AAAGCAG GAT ACATTTT ACT AAAAGGTT CATTGGTT CCAAT CACAGCT CAT AGGT AG AG CAAAG AAAGGGTGGAT GG ATT G AAAAG ATT AGCCT CT GT CT CGGT GGCAG GTT CCCACCT CGCAAGCAATTGG AAACAAAACTTTTGGGG AGTTTT ATTTT GCATT AGGGT GT GTTTT AT GTT AAGCAAAACAT ACTTT AG AAACAAAT G AAA AAGGCAATT G AAAAT CCCAGCT ATTT CACCT AG ATGG AAT AGCCACCCT G A GCAG AACTTT GT G ATGCTT CATT CT GTGG AATTTT GTGCTTGCT ACT GT AT A GTGCAT GTGGT GT AGGTT ACT CT AACT GGTTTT GT CG ACGT AAACATTT AA AGTGTTAT ATTTTTT AT AAAAATGTTT ATTTTT AATGATATGAGAAAAATTTTG TT AGGCCACAAAAACACTGCACT GT G AACATTTT AG AAAAGGT AT GT CAG A CTGGG ATT AAT G ACAGCAT G ATTTT CAAT G ACT GT AAATTGCGAT AAGG AA AT GT ACT G ATTGCCAAT ACACCCCACCCT CATT ACAT CAT CAGG ACTT G AA GCCAAGGGTT AACCCAGCAAGCT ACAAAG AGGGT GT GT CACACT G AAACT - 6o -

CAAT AGTT G AGTTTGGCT GTT GTTGCAGGAAAAT GATT AT AACT AAAAGCT C T CT GAT AGTGCAG AG ACTT ACCAGAAG ACACAAGG AATT GT ACT G AAG AGC T ATT ACAAT CCAAAT ATTGCCGTTT CAT AAAT GT AAT AAGT AAT ACT AATT CA CAG AGT ATT GT AAATGGTGG AT G ACAAAAG AAAAT CTGCT CT GTGG AAAG A AAG AACT GT CT CT ACCAGGGT CAAG AGCAT G AACGCAT CAAT AG AAAGAAC T CGGGG AAACAT CCCAT CAACAGG ACT ACACACTT GT AT AT ACATT CTT G A G AACACTGCAAT GT G AAAAT CACGTTTGCT ATTT AT AAACTT GT CCTT AG AT T AAT GT GT CTGG ACAG ATT GTGGG AGT AAGT GATT CTT CT AAGAATT AG AT ACTT GT CACTGCCT AT ACCTGCAGCT G AACT G AATGGT ACTT CGT AT GTT A AT AGTT GTT CT GAT AAAT CATGCAATT AAAGT AAAGT GATGCAA

[SEQ ID No: 29]

Accordingly, preferably MET comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 29, or a fragment or variant thereof.

In one embodiment, GDF15 is provided by gene bank locus ID: HGNC: 30142; Entrez Gene: 9518; Ensembl: ENSG00000130513; OMIM: 605312; and/or UniProtKB: Q99988. The protein sequence may be represented by the GeneBank ID Q99988, which is provided herein as SEQ ID No: 30, as follows:

MPGQELRTVNGSQMLLVLLVLSWLPHGGALSLAEASRASFPGPSELHSEDSR

FRELRKRYEDLLTRLRANQSWEDSNTDLVPAPAVRILTPEVRLGSGGHLHLRI

SRAALPEGLPEASRLHRALFRLSPTASRSWDVTRPLRRQLSLARPQAPALHLR

LSPPPSQSDQLLAESSSARPQLELHLRPQAARGRRRARARNGDHCPLGPGR

CCRLHTVRASLEDLGWADWVLSPREVQVTMCIGACPSQFRAANMHAQIKTSL

HRLKPDTVPAPCCVPASYNPMVLIQKTDTGVSLQTYDDLLAKDCHCI

[SEQ ID No: 30]

Accordingly, preferably GDF15 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 30, or a fragment or variant thereof.

In one embodiment GDF15 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 31 , as follows: AGT CCCAGCT CAG AGCCGCAACCTGCACAGCCATGCCCGGGCAAG AACT CAGG ACGGT G AAT GGCT CT CAG ATGCT CCTGGT GTTGCTGGTGCT CT CGT GGCTGCCGCATGGGGGCGCCCTGTCTCTGGCCGAGGCGAGCCGCGCAA GTTT CCCGGG ACCCT CAG AGTTGCACT CCG AAG ACT CCAG ATT CCG AG AG TTGCGGAAACGCT ACG AGG ACCTGCT AACCAGGCTGCGGGCCAACCAG A GCTGGG AAGATT CG AACACCG ACCT CGT CCCGGCCCCTGCAGT CCGG AT ACT CACGCCAG AAGTGCGGCTGGG AT CCGGCGGCCACCTGCACCTGCGT AT CT CT CGGGCCGCCCTT CCCG AGGGGCT CCCCG AGGCCT CCCGCCTT C ACCGGGCT CT GTT CCGGCT GT CCCCG ACGGCGT CAAGGT CGTGGG ACGT G ACACG ACCGCTGCGGCGT CAGCT CAGCCTTGCAAGACCCCAGGCGCCC GCGCTGCACCTGCG ACT GT CGCCGCCGCCGT CGCAGT CGGACCAACTGC TGGCAG AAT CTT CGT CCGCACGGCCCCAGCTGG AGTTGCACTTGCGGCC GCAAGCCGCCAGGGGGCGCCGCAGAGCGCGTGCGCGCAACGGGGACCA CTGTCCGCTCGGGCCCGGGCGTTGCTGCCGTCTGCACACGGTCCGCGCG TCGCTGGAAGACCTGGGCTGGGCCGATTGGGTGCTGTCGCCACGGGAGG TGCAAGT G ACCAT GTGCAT CGGCGCGT GCCCG AGCCAGTT CCGGGCGGC AAACATGCACGCGCAG AT CAAG ACG AGCCTGCACCGCCT GAAGCCCG AC ACGGTGCCAGCGCCCTGCTGCGTGCCCGCCAGCT ACAAT CCCATGGTGC T CATT CAAAAG ACCG ACACCGGGGT GT CGCT CCAG ACCT AT GAT G ACTT G TT AGCCAAAG ACTGCCACTGCAT AT G AGCAGT CCTGGT CCTT CCACT GTGC ACCTGCGCGG AGG ACGCG ACCT CAGTT GT CCTGCCCT GTGG AATGGGCT CAAGGTT CCT G AG ACACCCG ATT CCTGCCCAAACAGCT GT ATTT AT AT AAG TCTGTT ATTT ATT ATT A ATTT ATT G G G GT G ACCTT CTT G G G G ACT CG G G G G CTGGT CT G ATGG AACT GT GT ATTT ATTT AAAACT CTGGT GAT AAAAAT AAAG CTGTCT G AACT GTT

[SEQ ID No: 31]

Accordingly, preferably GDF15 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 31, or a fragment or variant thereof.

In one embodiment, CCL22 is provided by gene bank locus ID: HGNC: 10621; Entrez Gene: 6367; Ensembl: ENSG00000102962; OMIM: 602957; and/or UniProtKB: O00626. The protein sequence may be represented by the GeneBank ID O00626, which is provided herein as SEQ ID No: 32, as follows:

MDRLQTALLVVLVLLAVALQATEAGPYGANMEDSVCCRDYVRYRLPLRVVKH

FYWTSDSCPRPGVVLLTFRDKEICADPRVPWVKMILNKLSQ

[SEQ ID No: 32]

Accordingly, preferably CCL22 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 32, or a fragment or variant thereof.

In one embodiment CCL22 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 33, as follows:

GCAG ACACCTGGGCT G AG ACAT ACAGG ACAG AGCATGG AT CGCCT ACAG A CTGCACT CCTGGTT GT CCT CGT CCT CCTTGCT GTGGCGCTT CAAGCAACT G AGGCAGGCCCCT ACGGCGCCAACATGGAAG ACAGCGT CTGCTGCCGT G ATT ACGT CCGTT ACCGT CTGCCCCTGCGCGTGGT G AAACACTT CT ACTGG ACCT CAG ACT CCT GCCCG AGGCCTGGCGTGGT GTTGCT AACCTT CAGGG A T AAGG AG AT CT GT GCCG AT CCCAG AGTGCCCTGGGT G AAG AT GATT CT CA AT AAGCT G AGCCAAT G AAG AGCCT ACT CT GAT G ACCGTGGCCTTGGCT CC T CCAGG AAGGCT CAGG AGCCCT ACCT CCCTGCCATT AT AGCTGCT CCCCG CCAG AAGCCT GTGCCAACT CT CTGCATT CCCT GAT CT CCAT CCCT GTGGCT GT CACCCTTGGT CACCT CCGTGCT GT CACTGCCAT CT CCCCCCT G ACCCC T CT AACCCAT CCT CTGCCT CCCT CCCTGCAGT CAG AGGGT CCT GTT CCCAT CAGCG ATT CCCCTGCTT AAACCCTT CCAT G ACT CCCCACTGCCCT AAGCT G AGGT CAGT CT CCCAAGCCTGGCAT GTGGCCCT CTGG AT CT GGGTT CCAT C T CT GT CT CCAGCCTGCCCACTT CCCTT CAT GAAT GTTGGGTT CT AGCT CCC T GTT CT CCAAACCCAT ACT ACACAT CCCACTT CTGGGT CTTTGCCTGGGAT GTTGCT G ACACCCAG AAAGT CCCACCACCTGCACAT GT GT AGCCCCACCA GCCCT CCAAGGCATTGCT CGCCCAAGCAGCTGGT A ATT CCATTT CAT GT AT T AGAT GT CCCCTGGCCCT CT GT CCCCT CTT AAT AACCCT AGT CACAGT CT C CGCAG ATT CTTGGG ATTTGGGGGTTTT CT CCCCCACCT CT CCACT AGTTGG ACCAAGGTTT CT AGCT AAGTT ACT CT AGT CT CCAAGCCT CT AGCAT AGAGC ACTGCAG ACAGGCCCTGGCT CAG AAT CAG AGCCCAG AAAGT GGCTGCAG ACAAAAT CAAT AAAACT AAT GT CCCT CCCCT CT CCCTGCCAAAAGGCAGTT ACAT AT CAAT ACAG AG ACT CAAGGT CACT AG AAATGGGCCAGCTGGGT CA AT GT G AAGCCCCAAATTTGCCCAG ATT CACCTTT CTT CCCCCACT CCCTTTT TTTTTTTTTTTTT G AGAT GG AGTTT CGCT CTT GT CACCCACGCTGG AGTGCA AT GGT GTGGT CTT GGCTT ATT G AAGCCT CTGCCT CCTGGGTT CAAGT GATT CT CTTGCCT CAGCCT CCT G AGT AGCT GGG ATT ACAGGTT CCTGCT ACCAC GCCCAGCT AATTTTT GT ATTTTT AGT AG AG ACG AGGCTT CACCAT GTTGGC CAGGCT GGT CT CG AACT CCT GT CCT CAGGT AAT CCGCCCACCT CAGCCT C CCAAAGTGCTGGG ATT ACAGGCGT G AGCCACAGTGCCTGGCCT CTT CCCT CT CCCCACCCCCCCCCCAACTTTTTTTTTTTTTT ATGGCAGGGT CT CACT CT GT CGCCCAGGCTGG AGTGCAGTGGCGT GAT CT CGGCT CACT ACAACCT C G ACCT CCTGGGTT CAAGCG ATT CT CCCACCCCAGCCT CCCAAGT 
AGCTGG GATT ACAGGT GT GTGCCACT ACGGCTGGCT AATTTTT GT ATTTTT AGT AG A G ACAGGTTT CACCAT ATTGGCCAGGCTGGT CTT G AACT CCT GACCT CAAGT GAT CCACCTT CCTT GTGCT CCCAAAGTGCT GAG ATT ACAGGCGT GAGCT AT CACACCCAGCCT CCCCCTTTTTTT CCT AAT AGG AG ACT CCT GT ACCTTT CTT CGTTTT ACCT AT GT GT CGT GT CTGCTT ACATTT CCTT CT CCCCT CAGGCTTT TTTTGGGTGGT CCT CCAACCT CCAAT ACCCAGGCCTGGCCT CTT CAG AGT A CCCCCCATT CCACTTT CCCTGCCT CCTT CCTT AAAT AGCT G ACAAT CAAATT CATGCT ATGGT GT GAAAG ACT ACCTTT G ACTTGGT ATT AT AAGCTGG AGTT AT AT AT GT ATTT G AAAACAG AGT AAAT ACTT AAG AGGCCAAAT AG AT G AAT G G AAG AATTTT AGG AACT GT G AG AGGGGG ACAAGGTGG AGCTTT CCTGGCC CTGGG AGG AAGCTGGCT GTGGT AGCGT AGCGCT CT CT CT CT CT GT CT GT G GCAGG AGGCAAAG AGT AGGGT GT AATT G AGT G AAGG AAT CCTGGGT AG AG ACCATT CT CAGGTGGTTGGGCCAGGCT AAAG ACTGGGATTTGGGT CT AT C T ATGCCTTT CTGGCT G ATTTTT GT AG AG ACGGGGTTTTGCCAT GTT ACCCA GGCTGGT CT CAAACT CCTGGGCT CAAGCG AT CCT CCTGGCT CAGCCT CCC AAAGTGCTGGG ATT ACAGGCGT G AGT CACTGCGCCTGGCTT CCT CTT CCT CTT GAG AAAT ATT CTTTT CAT ACAGCAAGT ATGGG ACAGCAGT GT CCCAGG T AAAGG ACAT AAAT GTT ACAAGT GT CTGGT CCTTT CT G AGGG AGGCTGGT G CCGCT CTGCAGGGT ATTT GAACCT GTGG AATTGG AGG AGGCCATTT CACT CCCT G AACCCAGCCT G ACAAAT CACAGT GAG AAT GTT CACCTT AT AGGCTT GCT GTGGGGCT CAGGTT G AAAGT GTGGGG AGT G ACACTGCCT AGGCAT C CAGCT CAGT GT CAT CCAGGGCCT GT GT CCCT CCCG AACCCAGGGT CAACC TGCCT ACCACAGGCACT AGAAGG ACG AAT CTGCCT ACTGCCCAT G AACGG GGCCCT CAAGCGT CCTGGG AT CT CCTT CT CCCT CCT GT CCT GT CCTTGCC CCT CAGG ACTGCTGG AAAAT AAAT CCTTT AAAAT AG

[SEQ ID No: 33]

Accordingly, preferably CCL22 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 33, or a fragment or variant thereof.

In one embodiment, TNFRSF11 is provided by gene bank locus ID: HGNC: 11909; Entrez Gene: 4982; Ensembl: ENSG00000164761; OMIM: 602643; and/or UniProtKB: O00300. The protein sequence may be represented by the GeneBank ID O00300, which is provided herein as SEQ ID No: 34, as follows:

MNNLLCCALVFLDISIKWTTQETFPPKYLHYDEETSHQLLCDKCPPGTYLKQH

CTAKWKTVCAPCPDHYYTDSWHTSDECLYCSPVCKELQYVKQECNRTHNRV

CECKEGRYLEIEFCLKHRSCPPGFGVVQAGTPERNTVCKRCPDGFFSNETSS

KAPCRKHTNCSVFGLLLTQKGNATHDNICSGNSESTQKCGIDVTLCEEAFFRF

AVPTKFTPNWLSVLVDNLPGTKVNAESVERIKRQHSSQEQTFQLLKLWKHQN

KDQDIVKKIIQDIDLCENSVQRHIGHANLTFEQLRSLMESLPGKKVGAEDIEKTI

KACKPSDQILKLLSLWRIKNGDQDTLKGLMHALKHSKTYHFPKTVTQSLKKTIR

FLHSFTMYKLYQKLFLEMIGNQVQSVKISCL

[SEQ ID No: 34]

Accordingly, preferably TNFRSF11 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 34, or a fragment or variant thereof.

In one embodiment TNFRSF11 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 35, as follows:

GGAG ACGCACCGGAGCGCT CGCCCAGCCGCCGCCT CCAAGCCCCT GAG GTTT CCGGGG ACCACAAT GAACAACTTGCT GTGCTGCGCGCT CGT GTTT C TGG ACAT CT CCATT AAGTGG ACCACCCAGG AAACGTTT CCT CCAAAGT ACC TT CATT AT G ACGAAG AAACCT CT CAT CAGCT GTT GT GT G ACAAAT GT CCT C CTGGT ACCT ACCT AAAACAACACT GT ACAGCAAAGTGGAAG ACCGT GTGC GCCCCTTGCCCT G ACCACT ACT ACACAGACAGCTGGCACACCAGT G ACG A GT GT CT AT ACTGCAGCCCCGT GTGCAAGG AGCTGCAGT ACGT CAAGCAGG AGTGCAAT CGCACCCACAACCGCGT GT GCG AATGCAAGGAAGGGCGCT A CCTT GAG AT AG AGTT CTGCTT G AAACAT AGG AGCTGCCCT CCTGG ATTTGG AGTGGTGCAAGCTGG AACCCCAG AGCG AAAT ACAGTTTGCAAAAG AT GT C CAG ATGGGTT CTT CT CAAAT GAG ACGT CAT CT AAAGCACCCT GT AG AAAAC ACACAAATTGCAGT GT CTTTGGT CT CCTGCT AACT CAG AAAGG AAATGCAA CACACG ACAACAT AT GTT CCGG AAACAGT G AAT CAACT CAAAAAT GTGG AA T AGAT GTT ACCCT GT GT GAGG AGGCATT CTT CAGGTTTGCT GTT CCT ACAA AGTTT ACGCCT AACTGGCTT AGT GT CTT GGT AG ACAATTTGCCTGGCACCA AAGTAAACGCAGAGAGTGTAGAGAGGATAAAACGGCAACACAGCTCACAA G AACAG ACTTT CCAGCTGCT G AAGTT ATGG AAACAT CAAAACAAAG ACCAA GAT AT AGT CAAG AAG AT CAT CCAAG AT ATT G ACCT CT GT G AAAACAGCGT G CAGCGGCACATTGGACATGCT AACCT CACCTT CG AGCAGCTT CGT AGCTT G ATGG AAAGCTT ACCGGG AAAG AAAGT GGG AGCAG AAG ACATT G AAAAAA CAAT AAAGGCATGCAAACCCAGT G ACCAG AT CCT G AAGCTGCT CAGTTT GT GGCG AAT AAAAAATGGCG ACCAAG ACACCTT G AAGGGCCT AATGCACGCA CT AAAGCACT CAAAG ACGT ACCACTTT CCCAAAACT GT CACT CAG AGT CT A AAG AAG ACCAT CAGGTT CCTT CACAGCTT CACAAT GT ACAAATT GT AT CAG AAGTT ATTTTT AG AAAT GAT AGGT AACCAGGT CCAAT CAGT AAAAAT AAGCT GCTT AT AACTGG AAATGGCCATT GAGCT GTTT CCT CACAATTGGCG AG AT C CCATGG AT G AGT AAACT GTTT CT CAGGCACTT G AGGCTTT CAGT GAT AT CT TT CT CATT ACCAGT G ACT AATTTTGCCACAGGGT ACT AAAAG AAACT AT GAT GTGG AG AAAGG ACT AACAT CT CCT CCAAT AAACCCCAAATGGTT AAT CCAA CT GT CAG AT CTGGAT CGTT AT CT ACT G ACT AT ATTTT CCCTT ATT ACTGCTT GCAGT AATT CAACTGG AAATT AAAAAAAAAAAACT AG ACT CCATT GTGCCTT ACT AAAT ATGGG AAT GT CT AACTT AAAT AGCTTT G AG ATTT CAGCT ATGCTA G AGGCTTTT ATT AG AAAGCCAT ATTTTTTT CT GT 
AAAAGTT ACT AAT AT AT CT GT AACACT ATT ACAGT ATTGCT ATTT AT ATT CATT CAG AT AT AAGATTT GT AC AT ATT AT CAT CCT AT AAAG AAACGGT AT G ACTT AATTTT AG AAAG AAAATT AT ATT CT GTTT ATT AT GACAAAT G AAAG AG AAAAT AT AT ATTTTT AATGGAAAGT TT GT AGCATTTTT CT AAT AGGT ACTGCCAT ATTTTT CT GT GTGGAGT ATTTTT AT AATTTT AT CT GT AT AAGCT GT AAT AT CATTTT AT AG AAAATGCATT ATTT A GT CAATT GTTT AAT GTTGG AAAACAT AT G AAAT AT AAATT AT CT G AAT ATT AG ATGCT CT G AG AAATT G AAT GT ACCTT ATTT AAAAGATTTT ATGGTTTT AT AAC T AT AT AAAT G ACATT ATT AAAGTTTT CAAATT ATTTTTT A

[SEQ ID No: 35]

Accordingly, preferably TNFRSF11 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 35, or a fragment or variant thereof.

In one embodiment, ANGPT2 is provided by gene bank locus ID: HGNC: 485; Entrez Gene: 285; Ensembl: ENSG00000091879; OMIM: 601922; and/or UniProtKB: O15123. The protein sequence may be represented by the GeneBank ID O15123, which is provided herein as SEQ ID No: 36, as follows:

MWQIVFFTLSCDLVLAAAYNNFRKSMDSIGKKQYQVQHGSCSYTFLLPEMDN

CRSSSSPYVSNAVQRDAPLEYDDSVQRLQVLENIMENNTQWLMKLENYIQDN

MKKEMVEIQQNAVQNQTAVMIEIGTNLLNQTAEQTRKLTDVEAQVLNQTTRLE

LQLLEHSLSTNKLEKQILDQTSEINKLQDKNSFLEKKVLAMEDKHIIQLQSIKEEK

DQLQVLVSKQNSIIEELEKKIVTATVNNSVLQKQQHDLMETVNNLLTMMSTSN

SAKDPTVAKEEQISFRDCAEVFKSGHTTNGIYTLTFPNSTEEIKAYCDMEAGG

GGWTIIQRREDGSVDFQRTWKEYKVGFGNPSGEYWLGNEFVSQLTNQQRYV

LKIHLKDWEGNEAYSLYEHFYLSSEELNYRIHLKGLTGTAGKISSISQPGNDFS

TKDGDNDKCICKCSQMLTGGWWFDACGPSNLNGMYYPQRQNTNKFNGIKW

YYWKGSGYSLKATTMMIRPADF

[SEQ ID No: 36]

Accordingly, preferably ANGPT2 comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 36, or a fragment or variant thereof. In one embodiment ANGPT2 is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 37, as follows:

CTGGTTGG AGGGCAGGCATT CTGCT CT GATTTTT CCT GTTGCCTGGCT AGT G ACCCCCT ACAGG AAG AT AACGGCT AAGCCAGG AGGGCGGAGCAGCCCA CT ACACAT GT CTGGCTGCT CTT AT CAACTT AT CAT AT AAGG AAAGG AAAGT GATT GATT CGG AT ACT G ACACT GT AGG AT CTGGGGAG AG AGG AACAAAGG ACCGT G AAAGCTGCT CT GT AAAAGCT G ACACAGCCCT CCCAAGT G AGCAG G ACT GTT CTT CCCACTGCAAT CT G ACAGTTT ACTGCATGCCT GG AG AG AAC ACAGCAGT AAAAACCAGGTTTGCT ACT GG AAAAAG AGG AAAG AG AAG ACTT T CATT G ACGG ACCCAGCCATGGCAGCGT AGCAGCCCTGCGTTTT AGACGG CAGCAGCT CGGG ACT CTGG ACGT GT GTTTGCCCT CAAGTTTGCT AAGCT G CTGGTTT ATT ACT G AAG AAAG AAT GT GGCAG ATT GTTTT CTTT ACT CT G AGC T GT GAT CTT GT CTTGGCCGCAGCCT AT AACAACTTT CGG AAG AGCATGG AC AGCAT AGG AAAG AAGCAAT AT CAGGT CCAGCATGGGT CCTGCAGCT ACAC TTT CCT CCTGCCAG AG ATGG ACAACTGCCGCT CTT CCT CCAGCCCCT ACG T GT CCAATGCT GTGCAG AGGGACGCGCCGCT CG AAT ACGAT G ACT CGGT G CAG AGGCTGCAAGTGCTGGAG AACAT CATGGAAAACAACACT CAGT GGCT AAT G AAGCTT GAG AATT AT AT CCAGG ACAACAT G AAGAAAGAAATGGT AG A GAT ACAGCAGAAT GCAGT ACAG AACCAGACGGCT GT GAT GAT AG AAAT AG GGACAAACCT GTT GAACCAAACAGCGG AGCAAACGCGG AAGTT AACT GAT GTGG AAGCCCAAGT ATT AAAT CAG ACCACG AG ACTT G AACTT CAGCT CTT G G AACACT CCCT CT CG ACAAACAAATTGGAAAAACAG ATTTTGG ACCAG ACC AGT G AAAT AAACAAATTGCAAG AT AAG AACAGTTT CCT AG AAAAG AAGGT G CT AGCT ATGG AAGACAAGCACAT CAT CCAACT ACAGT CAAT AAAAG AAG AG AAAG AT CAGCT ACAGGT GTT AGT AT CCAAGCAAAATT CCAT CATT G AAG AA CT AG AAAAAAAAAT AGT G ACTGCCACGGT G AAT AATT CAGTT CTT CAG AAG CAGCAACAT GAT CT CATGG AG ACAGTT AAT AACTT ACT G ACT AT GAT GT CC ACAT CAAACT CAGCT AAGG ACCCCACT GTTGCT AAAG AAG AACAAAT CAGC TT CAG AG ACT GTGCT G AAGT ATT CAAAT CAGG ACACACCACG AATGGCAT C T ACACGTT AACATT CCCT AATT CT ACAG AAG AG AT CAAGGCCT ACT GT G AC AT GG AAGCT GG AGGAGGCGGGTGG ACAATT ATT CAGCG ACGT G AGG AT G GCAGCGTT G ATTTT CAG AGG ACTTGGAAAG AAT AT AAAGTGGG ATTT GGT A ACCCTT CAGG AG AAT ATTGGCTGGG AAAT G AGTTT GTTT CGCAACT G ACT A AT CAGCAACGCT AT GTGCTT AAAAT ACACCTT AAAG ACTGGGAAGGG AAT G AGGCTT ACT CATT GT 
AT G AACATTT CT AT CT CT CAAGT G AAG AACT CAATT A T AGG ATT CACCTT AAAGG ACTT ACAGGG ACAGCCGGCAAAAT AAGCAGCA T CAGCCAACCAGG AAAT G ATTTT AGCACAAAGG AT GG AG ACAACG ACAAAT GT ATTTGCAAAT GTT CACAAATGCT AACAGG AGGCTGGT GGTTT G ATGCAT GTGGT CCTT CCAACTT G AACGG AAT GT ACT AT CCACAG AGGCAG AACACAA AT AAGTT CAACGGCATT AAATGGT ACT ACTGGAAAGGCT CAGGCT ATT CGC T CAAGGCCACAACCAT GAT GAT CCG ACCAGCAG ATTT CT AAACAT CCCAGT CCACCT G AGG AACT GT CT CG AACT ATTTT CAAAG ACTT AAGCCCAGTGCAC T G AAAGT CACGGCTGCGCACT GT GT CCT CTT CCACCACAG AGGGCGT GT G CT CGGTGCT G ACGGG ACCCACATGCT CCAG ATT AG AGCCT GT AAACTTT AT CACTT AAACTTGCAT CACTT AACGG ACCAAAGCAAGACCCT AAACAT CCAT AATT GT GATT AGACAG AACACCT ATGCAAAG AT G AACCCG AGGCT G AGAAT CAG ACT G ACAGTTT ACAG ACGCTGCT GT CACAACCAAG AAT GTT AT GTGCA AGTTT AT CAGT AAAT AACT GG AAAACAG AACACTT AT GTT AT ACAAT ACAG A T CAT CTTGG AACT GCATT CTT CT G AGCACT GTTT AT ACACT GT GT AAAT ACC CAT AT GT CCT G AATT CACCAT CACT AT CACAATT AAAAGG AAG AAAAAAACT CT CT AAGCCAT AAAAAG ACAT ATT CAGGG AT ATT CT G AG AAGGGGTT ACT A G AAGTTT AAT ATTTGG AAAAACAGTT AGTGCATTTTT ACT CCAT CT CTT AGG TGCTTT AAATTTTT ATTT CAAAAACAGCGT ATTT ACATTT AT GTT G ACAGCTT AGTT AT AAGTT AAT GCT CAAAT ACGT ATTT CAAATTT AT ATGGT AG AAACTT C CAG AAT CT CT GAAATT AT CAACAG AAACGTGCCATTTT AGTTT AT ATGCAG A CCGT ACT ATTTTTTT CTGCCT GATT GTT AAAT AT G AAGGT ATTTTT AGT AATT AAAT AT AACTT ATT AG G G G AT ATG CCT AT GTTT AACTTTT AT GAT AAT ATTT A CAATTTT AT AATTT GTTT CCAAAAG ACCT AATT GTGCCTT GT GAT AAGG AAA CTT CTT ACTTTT AAT GAT G AGG AAAATT AT ACATTT CATT CT AT GACAAAG AA ACTTT ACT AT CTT CT CACT ATT CT AAAACAG AGGT CT GTTTT CTTT CCT AGT A AG AT AT ATTTTT AT AGAACTAGACT AC AATTT AATTT CTGGTT GAGAAAAGC CTT CT ATTT AAG AAATTT ACAAAGCT AT AT GT CT CAAGATT CACCCTT AAATT T ACTT AAGG AAAAAAAT AATT G ACACT AGT AAGTTTTTTT AT GT CAAT CAGC AAACT G AAAAAAAAAAAAGGGTTT CAAAGTGCAAAAACAAAAT CT GAT GTT C AT AAT AT ATTT AAAT ATTT ACCAAAAATTT G AG AACACAGGGCTGGGCGCAG TGGCT CACACCT AT AAT CCCAGT 
ACATTGGT AGGCAAGGTGGGCAG AT CA CCT G AGGT CAGG AGTT CAAGACCAGCCTGG ACAACATGGT G AAACCCT GT CT CT ACT AAAT AAT ACAAAAATT AGCCAGGCGTGCTGGCGGGCACCT GT AA T CCCAGCT ACT CGGG AGGCT G AGGCAGGG AG AATTGCTTGCACCAGGG A GGT AG AGGTTGCAGT G AGCCAAG AT CGCACCACTGCACT CCAGCCGGGG CAACAG AGCAAG ACT CCAT CT CAAAAAAAAAAAAAAAAAAAG AAAG AAAAG AAAATTT G AG AACACAGCTTT AT ACT CGGG ACT ACAAAACCAT AAACT CCT GGAGTTTT AACT CCTTTT G AAATTTT CAT AGT ACAATT AAT ACT AAT GAACAT TT GT GT AAAGCTTT AT AATTT AAAGGCAATTT CT CAT AT ATT CTTTT CT G AAT CATTTGCAAGG AAGTT CAG AGT CCAGT CT GT AACT AGCAT CT ACT AT AT GT CT GT CTT CACCTT ACAGT GTT CT ACCATT ATTTTTT CTTT ATT CCATTT CAAA AT CT AATTT ATTTT ACCCCAACTT CT CCCCACCACTT G ACGT AGTTTT AG AA CACACAGGT GTTGCT ACAT ATTTGG AGT CAAT G ATGGACT CTGGCAAAGT C AAGGCT CT GTTTT ATTT CCACCAAGGTGCACTTTT CCA AC A ACT ATTT AACT AGTT AAG AACCT CCCT AT CTT AG AACT GT AT CT ACTTT AT ATTT AAG AAGGT TTT AT G AATT CAACAACGGT AT CATGGCCTT GT AT CAAGTT G AAAAACAACT GAAAATAAGAAAATTTCACAGCCTCGAAAGACAACAACAAGTTTCTAGGAT AT CT CAAT GACAAGAGT G ATGG AT ACTT AGGT AGGG AAACGCT AATGCAG G AAAAACTGGCAACAACACAATTT AT AT CAATT CT CTTT GT AGGCAGGT GAT AAA AAATT C AAG G AC AAAT CT CATT ATGT C ATT GTG CAT CAT AT AT AAT CT CT T AT G AGCG AGAAT GGGGGG AATTT GT GTTTTT ACTTT ACACTT CAATT CCTT ACACGGT ATTT CAAACAAACAGTTTTGCT G AG AGGAGCTTTT GT CT CT CCTT AAG AAAAT GTTT AT AAAGCT G AAAGG AAAT CAAACAGT AAT CTT AAAAATG A AAACAAAACAACCCAACAACCT AG AT AACT ACAGT GAT CAGGGAGCACAGT T CAACT CCTT GTT AT GTTTT AGT CAT ATGGCCT ACT CAAACAGCT AAAT AAC AACACCAGTGGCAG AT AAAAAT CACCATTT AT CTTT CAGCT ATT AAT CTTTT G AAT G AAT AAACT GT G ACAAACAAATT AACATTTTT GAACAT G AAAGGCAAC TT CTGCACAAT CCT GT AT CCAAGCAAACTTT AAATT AT CCACTT AATT ATT AC TT AAT CTT AAAAAAAATT AG AACCCAG AACTTTT CAAT G AAGCATTT G AAAG TT G AAGTGG AATTT AGG AAAGCCAT AAAAAT AT AAAT ACT GTT AT CACAGCA CCAGCAAGCCAT AAT CTTT AT ACCT AT CAGTT CT ATTT CT ATT AACAGT AAA AACATT AAGCAAG AT AT AAG ACT ACCTGCCCAAG AATT CAGT CTTTTTT CAT TTTT GTTTTT CT 
CAGTT CT G AGGAT GTT AAT CGT CAAATTTT CTTTGG ACT G CATT CCT CACT ACTTTTTGCACAATGGT CT CACGTT CT CACATTT GTT CT CG CG AAT AAATT GAT AAAAGGT GTT AAGTT CTGT G AAT GT CTTTTT AATT ATG G GCAT AATT GTGCTT GACTGGAT AAAAACTT AAGT CCACCCTT AT GTTT AT AA T AATTT CTT G AG AACAGCAAACTGCATTT ACCAT CGT AAAACAACAT CT G AC TT ACGGG AGCTGCAGGG AAGTGGT G AGACAGTT CG AACGGCT CCT CAG AA AT CCAGT GACCCAATT CT AAAG ACCAT AGCACCTGCAAGT GACACAACAAG CAG ATTT ATT AT ACATTT ATT AGCCTT AGCAGGCAAT AAACCAAG AAT CACT TT G AAG ACACAGCAAAAAGT GAT ACACT CCGCAG AT CT GAAAT AG AT GT GT T CT CAG ACAACAAAGT CCCTT CAG AAT CTT CAT GTTGCAT AAAT GTT AT G AA T ATT AAT AAAAAGTT GATT G AG AAAAA

[SEQ ID No: 37]

Accordingly, preferably ANGPT2 comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 37, or a fragment or variant thereof.

In one embodiment, RelA NF-KB is provided by gene bank locus ID: HGNC: 9955; Entrez Gene: 5970; Ensembl: ENSG00000173039; OMIM: 164014; and/or UniProtKB: Q04206. The protein sequence may be represented by the GeneBank ID Q04206, which is provided herein as SEQ ID No: 38, as follows:

MDELFPLIFPAEPAQASGPYVEIIEQPKQRGMRFRYKCEGRSAGSIPGERSTD

TTKTHPTIKINGYTGPGTVRISLVTKDPPHRPHPHELVGKDCRDGFYEAELCPD

RCIHSFQNLGIQCVKKRDLEQAISQRIQTNNNPFQEEQRGDYDLNAVRLCFQV

TVRDPSGRPLRLPPVLSHPIFDNRAPNTAELKICRVNRNSGSCLGGDEIFLLCD

KVQKEDIEVYFTGPGWEARGSFSQADVHRQVAIVFRTPPYADPSLQAPVRVS

MQLRRPSDRELSEPMEFQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPT

DPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQA

SALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKP

TQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQ

GIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED

FSSIADMDFSALLSQISS

[SEQ ID No: 38]

Accordingly, preferably RelA NF-KB comprises or consists of an amino acid sequence substantially as set out in SEQ ID NO: 38, or a fragment or variant thereof.

In one embodiment, RelA NF-KB is encoded by a nucleotide sequence which is provided herein as SEQ ID No: 39, as follows:

CGT CCT CGGCG AGGCGCGCACTTGGCCCCG ACCCCCGGCAGCGGCT GT GCGTGCAGCCT CTT CGT CCT CCGCGCGGCGTGCACTTGCT CCCGGCCCC TGCGCCGGGCGGCGGCGGGGCAGCGCGCAGGCGCGGCCGGATTCCGG GCAGT G ACGCG ACGGCGGGCCGCGCGGCGCATTT CCGCCT CTGGCG AAT GGCTCGTCTGTAGTGCACGCCGCGGGCCCAGCTGCGACCCCGGCCCCG CCCCCGGG ACCCCGGCCAT GG ACG AACT GTT CCCCCT CAT CTT CCCGGC AG AGCCAGCCCAGGCCT CT GGCCCCT AT GT GG AG AT CATT G AGCAGCCCA AGCAGCGGGGCAT GCGCTT CCGCT ACAAGTGCG AGGGGCGCT CCGCGG GCAGCAT CCCAGGCG AG AGG AGCACAG AT ACCACCAAGACCCACCCCAC CAT CAAG AT CAAT GGCT ACACAGG ACCAGGG ACAGTGCGCAT CT CCCTGG T CACCAAGG ACCCT CCT CACCGGCCT CACCCCCACG AGCTT GT AGG AAAG G ACTGCCGGG ATGGCTT CT AT G AGGCT GAGCT CTGCCCGG ACCGCTGCA T CCACAGTTT CCAG AACCTGGG AAT CCAGT GT GT G AAG AAGCGGG ACCT G G AGCAGGCT AT CAGT CAGCGCAT CCAGACCAACAACAACCCCTT CCAAG A AG AGCAGCGTGGGG ACT ACG ACCT G AATGCT GTGCGGCT CTGCTT CCAG GT G ACAGTGCGGG ACCCAT CAGGCAGGCCCCT CCGCCTGCCGCCT GT CC TTT CT CAT CCCAT CTTT G ACAAT CGTGCCCCCAACACTGCCG AGCT CAAG A T CTGCCG AGT GAACCG AAACT CT GGCAGCTGCCT CGGTGGGG AT GAG AT C TT CCT ACT GT GT G ACAAGGTGCAG AAAG AGG ACATT G AGGT GT ATTT CACG GGACCAGGCTGGGAGGCCCG AGGCT CCTTTT CGCAAGCT GAT GTGCACC G ACAAGTGGCCATT GT GTT CCGG ACCCCT CCCT ACGCAG ACCCCAGCCT G CAGGCT CCT GTGCGT GT CT CCATGCAGCTGCGGCGGCCTT CCG ACCGGG AGCT CAGT G AGCCCATGG AATT CCAGT ACCTGCCAG AT ACAGACG AT CGT CACCGG ATT GAGG AG AAACGT AAAAGG ACAT AT G AG ACCTT CAAG AGCAT CAT G AAG AAG AGT CCTTT CAGCGG ACCCACCG ACCCCCGGCCT CCACCT C G ACGCATTGCT GTGCCTT CCCGCAGCT CAGCTT CT GT CCCCAAGCCAGCA CCCCAGCCCT AT CCCTTT ACGT CAT CCCT G AGCACCAT CAACT AT GAT GAG TTT CCCACCATGGT GTTT CCTT CTGGGCAG AT CAGCCAGGCCT CGGCCTT GGCCCCGGCCCCT CCCCAAGT CCTGCCCCAGGCT CCAGCCCCTGCCCCT GCT CCAGCCATGGT AT CAGCT CTGGCCCAGGCCCCAGCCCCT GT CCCAGT CCT AGCCCCAGGCCCT CCT CAGGCT GTGGCCCCACCTGCCCCCAAGCCC ACCCAGGCTGGGG AAGG AACGCT GT CAGAGGCCCTGCTGCAGCTGCAGT TT GAT GAT G AAG ACCTGGGGGCCTTGCTTGGCAACAGCACAGACCCAGCT GT GTT CACAG ACCTGGCAT CCGT CG ACAACT CCG AGTTT CAGCAGCTGCT G AACCAGGGCAT ACCT 
GTGGCCCCCCACACAACT G AGCCCATGCT G ATGG AGT ACCCT G AGGCT AT AACT CGCCT AGT G ACAGGGGCCCAG AGGCCCCC CG ACCCAGCT CCTGCT CCACTGGGGGCCCCGGGGCT CCCCAATGGCCT C CTTT CAGG AG AT G AAG ACTT CT CCT CCATTGCGG ACATGG ACTT CT CAGCC CTGCT G AGT CAG AT CAGCT CCT AAGGGGGT G ACGCCTGCCCT CCCCAG A GCACTGGGTTGCAGGGG ATT G AAGCCCT CCAAAAGCACTT ACGG ATT CT G GTGGGGTGT GTT CCAACTGCCCCCAACTTT GTGG AT GT CTT CCTTGGAGG GGGG AGCCAT ATTTT ATT CTTTT ATT GT CAGT AT CT GT AT CT CT CT CT CTTTT TGG AGGTGCTT AAGCAG AAGCATT AACTT CT CTGG AAAGGGGGG AGCTGG GGAAACT CAAACTTTT CCCCT GT CCT G ATGGT CAGCT CCCTT CT CT GT AGG G AACT CT GGGGT CCCCCAT CCCCAT CCT CCAGCTT CTGGT ACT CT CCT AG AG ACAG AAGCAGGCTGG AGGT AAGGCCTTT G AGCCCACAAAGCCTT AT CA AGT GT CTT CCAT CATGG ATT CATT ACAGCTT AAT CAAAAT AACGCCCCAG AT ACCAGCCCCT GT ATGGCACT GGCATT GT CCCT GTGCCT AACACCAGCGTT T G AGGGGCTGGCCTT CCTGCCCT ACAG AGGT CT CTGCCGGCT CTTT CCTT GCT CAACCATGGCT G AAGG AAACCAGTGCAACAGCACTGGCT CT CT CCAG GAT CCAG AAGGGGTTTGGT CTGGGACTT CCTTGCT CT CCCT CTT CT CAAGT GCCTT AAT AGT AGGGT AAGTT GTT AAG AGT GGGGG AG AGCAGGCT GGCAG CT CT CCAGT CAGG AGGCAT AGTTTTT ACT G AACAAT CAAAGCACTTGG ACT CTTGCT CTTT CT ACT CT G AACT AAT AAAT CT GTTGCCAAGCT G

[SEQ ID No: 39]

Accordingly, preferably RelA NF-KB comprises or consists of a nucleotide sequence substantially as set out in SEQ ID NO: 39, or a fragment or variant thereof.

In one embodiment, the biomarker is RORA, and a decrease in the expression, amount and/or activity of RORA when compared to a reference is indicative of an individual having a higher risk of suffering from cardiovascular disease.

In one embodiment, the biomarker is GHR, and an increase in the expression, amount and/or activity of GHR when compared to a reference is indicative of an individual having a higher risk of suffering from cardiovascular disease.

Preferably, the sample comprises a biological sample. The sample may be any material that is obtainable from the subject from which protein, RNA and/or DNA is obtainable. Furthermore, the sample may be blood, plasma, serum, spinal fluid, urine, sweat, saliva, tears, breast aspirate, prostate fluid, seminal fluid, vaginal fluid, stool, cervical scraping, cytes, amniotic fluid, intraocular fluid, mucous, moisture in breath, animal tissue, cell lysates, tumour tissue, hair, skin, buccal scrapings, lymph, interstitial fluid, nails, bone marrow, cartilage, prions, bone powder, ear wax, or combinations thereof.

Preferably, however, the sample comprises blood, urine or tissue.

In one embodiment, the sample comprises a blood sample. The blood may be venous or arterial blood. Blood samples may be assayed immediately. Alternatively, the blood sample may be stored at low temperatures, for example in a fridge or even frozen before the method is conducted. Detection may be carried out on whole blood. Preferably, however, the blood sample comprises blood serum. Preferably, the blood sample comprises blood plasma.

The blood may be further processed before the method is performed. For instance, an anticoagulant, such as citrate (such as sodium citrate), hirudin, heparin, PPACK, or sodium fluoride may be added. Thus, the sample collection container may contain an anticoagulant in order to prevent the blood sample from clotting. Alternatively, the blood sample may be centrifuged or filtered to prepare a plasma or serum fraction, which may be used for analysis. Hence, it is preferred that the method is performed in a blood plasma or a blood serum sample. It is preferred that the expression level, amount and/or activity of the biomarker is measured in vitro from a blood serum sample or a plasma sample taken from the individual.

The invention also provides for a kit for determining, diagnosing and/or prognosing CVD risk.

Accordingly, in a second aspect, there is provided a kit for determining, diagnosing and/or prognosing the risk of an individual suffering from cardiovascular disease, the kit comprising:

a. detection means for detecting, in a sample obtained from a test subject, the expression level, amount and/or activity of two or more biomarkers selected from the group consisting of TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB; and

b. a reference value from a healthy control population for expression level, amount and/or activity of two or more biomarkers selected from the group consisting of TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB,

wherein the kit is used to identify:

i) a decrease in expression, amount and/or activity of TNF-α; GSTA1; NT-proBNP; RORA and/or TNC when compared to the reference; and/or an increase in expression, amount and/or activity of GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6 and/or CHI3L1 when compared to the reference, to determine, diagnose and/or prognose that an individual has a higher risk of suffering from cardiovascular disease; and/or

ii) a decrease in expression, amount and/or activity of MET; GDF15; CCL22; TNFRSF11; ANGPT2 and/or RelA NF-KB when compared to the reference, to determine, diagnose and/or prognose that an individual has a lower risk of suffering from cardiovascular disease.

The cardiovascular disease, the biomarker, detection and the sample may be as defined in the first aspect.

Preferably, a decrease in expression, amount and/or activity of TNF-α, GSTA1, NT-proBNP, RORA and/or TNC, when compared to the reference, is indicative of an individual having a higher risk of suffering from cardiovascular disease or a negative prognosis.

Preferably, an increase in expression, amount and/or activity of GHR, A2M, IGFBP2, APOB, SEPP1, TFF3, IL6 and/or CHI3L1, when compared to the reference, is indicative of an individual having a higher risk of suffering from cardiovascular disease or a negative prognosis.

Preferably, a decrease in expression, amount and/or activity of MET, GDF15, CCL22, TNFRSF11, ANGPT2 and/or RelA NF-KB, when compared to the reference, is indicative of an individual having a lower risk of suffering from cardiovascular disease or a positive prognosis. The expression levels, amount and/or activities of the biomarkers may be as defined in the first aspect.
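By way of illustration only, the three direction rules above can be sketched as a simple comparison against reference values. The function name `flag_biomarkers`, the dictionary layout and the plain below/above comparison with a single reference value per marker are assumptions made for this sketch; the specification does not prescribe a particular threshold or statistical test.

```python
# Illustrative sketch only. Marker groupings follow the text; the comparison
# rule (strictly below or above one reference value) is an assumption.
# TNF-alpha is spelled out in ASCII here.
HIGHER_RISK_IF_DECREASED = {"TNF-alpha", "GSTA1", "NT-proBNP", "RORA", "TNC"}
HIGHER_RISK_IF_INCREASED = {
    "GHR", "A2M", "IGFBP2", "APOB", "SEPP1", "TFF3", "IL6", "CHI3L1",
}
LOWER_RISK_IF_DECREASED = {
    "MET", "GDF15", "CCL22", "TNFRSF11", "ANGPT2", "RelA NF-KB",
}


def flag_biomarkers(measured, reference):
    """Return {marker: 'higher risk' | 'lower risk'} for every marker whose
    measured level moved in the direction described in the text."""
    flags = {}
    for marker, value in measured.items():
        ref = reference.get(marker)
        if ref is None:
            continue  # no reference value available for this marker
        if marker in HIGHER_RISK_IF_DECREASED and value < ref:
            flags[marker] = "higher risk"
        elif marker in HIGHER_RISK_IF_INCREASED and value > ref:
            flags[marker] = "higher risk"
        elif marker in LOWER_RISK_IF_DECREASED and value < ref:
            flags[marker] = "lower risk"
    return flags
```

For example, a sample with RORA below its reference and GHR above its reference would be flagged "higher risk" for both markers, whereas TNC above its reference would not be flagged at all.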

The kit may comprise detection means for detecting the expression levels, amount and/or activities of at least 3 biomarkers or at least 4 biomarkers. The kit may comprise detection means for detecting the expression levels, amount and/or activities of at least 5 biomarkers. The kit may comprise detection means for detecting the expression levels, amount and/or activities of at least 6 biomarkers or at least 7 biomarkers. Alternatively, the kit may comprise detection means for detecting the expression levels, amount and/or activities of at least 8 biomarkers or at least 9 biomarkers. In another embodiment, the kit may comprise detection means for detecting the expression levels, amount and/or activities of at least 10 biomarkers, at least 11 biomarkers, at least 12 biomarkers, at least 13 biomarkers, at least 14 biomarkers or at least 15 biomarkers. In another embodiment, the kit may comprise detection means for detecting the expression levels, amount and/or activities of at least 16 biomarkers, at least 17 biomarkers or at least 18 biomarkers.

Preferably, the kit of the second aspect may be for determining, diagnosing and prognosing the risk of an individual suffering from cardiovascular disease, the kit comprising:

a. detection means for detecting, in a sample obtained from a test subject, the expression level, amount and/or activity of: TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB; and

b. a reference value from a healthy control population for expression level, amount and/or activity of TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB,

wherein the kit is used to identify:

i) a decrease in expression, amount and/or activity of TNF-α, GSTA1, NT-proBNP, RORA and TNC when compared to the reference; and an increase in expression, amount and/or activity of GHR, A2M, IGFBP2, APOB, SEPP1, TFF3, IL6 and CHI3L1 when compared to the reference, to determine, diagnose and/or prognose that an individual has a higher risk of suffering from cardiovascular disease; and

ii) a decrease in expression, amount and/or activity of MET, GDF15, CCL22, TNFRSF11, ANGPT2 and RelA NF-KB when compared to the reference, to determine, diagnose and/or prognose that an individual has a lower risk of suffering from cardiovascular disease.

The detection means may detect the expression level, for example the level or concentration, of a biomarker polynucleotide sequence, for example DNA or RNA. The DNA may be genomic DNA. The RNA may be mRNA. Alternatively, the detection means may detect polypeptide concentration and/or activity of the biomarker.

Accordingly, the detection means may include: sequencing methods (e.g., Sanger, Next Generation Sequencing, RNA-SEQ), hybridization-based methods, including those employed in biochip arrays, mass spectrometry (e.g., laser desorption/ionization mass spectrometry), fluorescence (e.g., sandwich immunoassay), surface plasmon resonance, ellipsometry and atomic force microscopy. Expression levels of markers (e.g., polynucleotides, polypeptides, or other analytes) may be compared by procedures well known in the art, such as RT-PCR, Northern blotting, Western blotting, flow cytometry, immunocytochemistry, binding to magnetic and/or antibody-coated beads, in situ hybridization, fluorescence in situ hybridization (FISH), flow chamber adhesion assay, ELISA, microarray analysis, or colorimetric assays. Methods may further include one or more of electrospray ionization mass spectrometry (ESI-MS), ESI-MS/MS, ESI-MS/(MS)n, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS), surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), desorption/ionization on silicon (DIOS), secondary ion mass spectrometry (SIMS), quadrupole time-of-flight (Q-TOF), atmospheric pressure chemical ionization mass spectrometry (APCI-MS), APCI-MS/MS, APCI-(MS)n, atmospheric pressure photoionization mass spectrometry (APPI-MS), APPI-MS/MS, and APPI-(MS)n, quadrupole mass spectrometry, Fourier transform mass spectrometry (FTMS), and ion trap mass spectrometry, where n is an integer greater than zero.

Preferably, the kit comprises detection means for detecting RORA present in a sample from a test subject, wherein a decrease in the expression, amount and/or activity of RORA is indicative of an individual having a higher risk of suffering from cardiovascular disease.

Preferably, the kit comprises detection means for detecting GHR present in a sample from a test subject, wherein an increase in the expression, amount and/or activity of GHR is indicative of an individual having a higher risk of suffering from cardiovascular disease.

Using the methods described herein, the inventors have been able to identify SNPs within RORA and GHR that may be used in diagnosis and prognosis, and in particular gene variants that are associated with CVD risk.

Accordingly, in a third aspect of the invention, there is provided a method of determining, diagnosing and/or prognosing an individual’s risk of suffering from cardiovascular disease, the method comprising detecting, in a sample obtained from an individual, a single nucleotide polymorphism (SNP) in the RORA gene, wherein the presence of the SNP is indicative of an individual having an increased risk of suffering from cardiovascular disease.

Preferably, RORA, the sample, detection and the cardiovascular disease are as defined in the first aspect.

The method may be performed in vivo, in vitro or ex vivo. Preferably, the method is performed in vitro or ex vivo. Most preferably, the method is performed in vitro. Preferably, the SNP is present in a region of chromosome 15, preferably at nucleic acid position 60542728 of the reference sequence NC_000015.10.

Preferably, the SNP comprises a substitution of Adenine (A) to Guanine (G) or Adenine (A) to Cytosine (C).

Preferably, the SNP comprises a substitution of Adenine (A) to Guanine (G).

Thus, preferably, the SNP may be referred to by the sequence variant GRCh38.p12 chr 15; NC_000015.10:g.60542728A>G.

Preferably, the SNP comprises a substitution of Adenine (A) to Cytosine (C).

Thus, preferably, the SNP may be referred to by the sequence variant GRCh38.p12 chr 15; NC_000015.10:g.60542728A>C.

Preferably, the SNP may be Reference SNP cluster ID: rs73420079.
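The sequence variant descriptions given above follow the pattern accession:g.positionREF>ALT. As a minimal sketch, such a description can be split into its parts with a regular expression. The function name `parse_substitution` is hypothetical, and the pattern deliberately covers only simple genomic substitutions on the reference sequence NC_000015.10 or similar accessions, not the full variant nomenclature.

```python
import re

# Matches only the simple form "accession:g.positionREF>ALT",
# e.g. "NC_000015.10:g.60542728A>G". Other variant types are out of scope.
_SUBSTITUTION = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+):g\.(?P<position>\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(description):
    """Split a genomic substitution description into accession, position,
    reference base and alternative base."""
    match = _SUBSTITUTION.match(description)
    if match is None:
        raise ValueError(f"not a simple genomic substitution: {description!r}")
    parts = match.groupdict()
    parts["position"] = int(parts["position"])  # position as a number
    return parts
```

Applied to the two variants of the third aspect, the sketch returns position 60542728 with reference base A and alternative base G or C respectively.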

Thus, in one embodiment, the SNP is present in the sequence represented by Reference SNP cluster ID rs73420079, referred to herein as SEQ ID No: 3, as follows:

AGGCGCACCT CACACGGCAC ACAGGCACAT CTCACACATG GCACACATGC ACACCTCACA CAGATGGCAC ACATGCACAC CTCACACACA CGGCACGCAT GCACACCTCA CACACACGGC ACGCATGCAC ACCTCACACA CGGCACACAT GCACACCTCA CACACGACAC ACGGGCACAC CTCACACACA TGGCACACGG GCACACCTCC CACACACGGC ACACGGGCAC ACCTCCCACA CACGGCACAC

V GGCACACCTC AAACGACACA CGGCACACC TCACACACAA GTCTATTCAG CTGCAAGTCC TGCCTCCACT TGCTGAGAAC

CTGCATGACT GGGCACCAAG GATACGGCAC ACACACGCAC CCACCCCACA TACATACAGT CCACACACAC ACAACACATA TACACCACAC GCACCACAGA TGCACACCAC ACATGCCACA CACACATACA CTGCACACGC ACCCTACACA CACCCCCCAC ATGCTTACAC

[SEQ ID No: 3]

where “V” represents the SNP position. Accordingly, in one embodiment, the SNP may comprise or consist of the sequence substantially as set out in SEQ ID No: 3, or a fragment or variant thereof.
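In the IUPAC nucleotide ambiguity code, “V” denotes A, C or G, which is consistent with the A to G and A to C substitutions described above. The following is a minimal sketch, using a short excerpt of the flanking sequence (ten bases either side of the marked position, spaces removed), of how the marked position may be located and expanded into the permitted alleles. The helper names `snp_offset` and `expand_alleles` are hypothetical.

```python
# IUPAC nucleotide ambiguity codes; "V" means A, C or G (not T).
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def snp_offset(flank):
    """Return the 0-based index of the single ambiguity code in `flank`."""
    hits = [i for i, base in enumerate(flank) if base in IUPAC]
    if len(hits) != 1:
        raise ValueError("expected exactly one ambiguity code")
    return hits[0]


def expand_alleles(flank):
    """Return one concrete sequence per allele allowed at the SNP position."""
    i = snp_offset(flank)
    return [flank[:i] + allele + flank[i + 1:] for allele in IUPAC[flank[i]]]


# Excerpt taken from either side of the marked position in the sequence above.
excerpt = "CACGGCACAC" + "V" + "GGCACACCTC"
```

Calling `expand_alleles(excerpt)` yields three sequences that differ only at the marked position, one each for A, C and G.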

Preferably, the SNP comprises a substitution of nucleic acid position X in SEQ ID No: 3.

Thus, preferably RORA comprises a single nucleotide polymorphism (SNP), the presence of which is associated with an individual having an increased risk of suffering from CVD.

Preferably, the method of detecting the presence of the SNP comprises a probe that is capable of hybridizing to the biomarker sequence. Preferably, the probe is capable of hybridizing to SEQ ID No 3 such that the SNP is detected.

In a fourth aspect of the invention, there is provided a method of determining, diagnosing and/or prognosing an individual’s risk of suffering from cardiovascular disease, the method comprising detecting, in a sample obtained from an individual, a single nucleotide polymorphism (SNP) in the GHR gene, wherein the presence of the SNP is indicative of an individual having an increased risk of suffering from cardiovascular disease.

Preferably, GHR, the sample, detection and the cardiovascular disease are as defined in the first aspect.

Preferably, detecting the SNP in a subject is indicative of an increased risk of suffering from cardiovascular disease.

The method may be performed in vivo, in vitro or ex vivo. Preferably, the method is performed in vitro or ex vivo. Most preferably, the method is performed in vitro.

Preferably, the SNP is present in a region of chromosome 5, preferably at nucleic acid position 42546623 of the reference sequence NC_000005.10.

Preferably, the SNP comprises a substitution of Guanine (G) to Adenine (A).

Preferably, the SNP may be Reference SNP cluster ID rs4314405.

Thus, in one embodiment, the SNP is present in the sequence represented by Reference SNP cluster ID: rs73420079, referred to herein as SEQ ID No: 40, as follows:

AGGCGCACCT CACACGGCAC ACAGGCACAT CTCACACATG GCACACATGC ACACCTCACA CAGATGGCAC ACATGCACAC CTCACACACA CGGCACGCAT GCACACCTCA CACACACGGC ACGCATGCAC ACCTCACACA CGGCACACAT GCACACCTCA CACACGACAC ACGGGCACAC CTCACACACA TGGCACACGG GCACACCTCC CACACACGGC ACACGGGCAC ACCTCCCACA CACGGCACAC

V GGCACACCTC AAACGACACA CGGCACACC TCACACACAA GTCTATTCAG CTGCAAGTCC TGCCTCCACT TGCTGAGAAC CTGCATGACT GGGCACCAAG GATACGGCAC ACACACGCAC CCACCCCACA TACATACAGT CCACACACAC ACAACACATA TACACCACAC GCACCACAGA TGCACACCAC ACATGCCACA CACACATACA CTGCACACGC ACCCTACACA CACCCCCCAC ATGCTTACAC

[SEQ ID No: 40]

where “V” represents the SNP position. Accordingly, in one embodiment, the SNP may comprise or consist of the sequence substantially as set out in SEQ ID No: 40, or a fragment or variant thereof.

Preferably, the SNP comprises a substitution of nucleic acid position X in SEQ ID No: 40.

Thus, preferably GHR comprises a single nucleotide polymorphism (SNP), the presence of which is associated with an individual having an increased risk of suffering from CVD.

In a fifth aspect, there is provided GHR and/or RORA, for use in diagnosis or prognosis.

Preferably, RORA and GHR may comprise a SNP as defined in the third and fourth aspects. The cardiovascular disease may be as defined in the first aspect.

In a sixth aspect, there is provided GHR and/or RORA, for use in diagnosing or prognosing an individual’s risk of suffering from cardiovascular disease.

Preferably, RORA and GHR may comprise a SNP as defined in the third and fourth aspects. The cardiovascular disease may be as defined in the first aspect. Preferably, the method of detecting the presence of the SNP comprises a probe that is capable of hybridizing to the biomarker sequence. Preferably, the probe is capable of hybridizing to SEQ ID No 40 such that the SNP is detected.

In a seventh aspect, there is provided a kit for determining, diagnosing and/or prognosing an individual’s risk of suffering from cardiovascular disease, the kit comprising a detection means for detecting, in a sample obtained from a test subject, a single nucleotide polymorphism (SNP) in the RORA gene and/or GHR gene, wherein the presence of the SNP is used to determine, diagnose and/or prognose that an individual has a higher risk of suffering from cardiovascular disease.

The RORA gene, the GHR gene, the sample, detection and the cardiovascular disease may be as defined in the first aspect.

The detection means may be as defined in the second aspect. The single nucleotide polymorphism (SNP) in the RORA gene and/or GHR gene may be as defined in the third and fourth aspects.

In an eighth aspect, there is provided a method of treating an individual having a higher risk of suffering from cardiovascular disease, the method comprising:-

(a) analysing, in a sample obtained from the subject, the expression level, amount and/or activity of two or more biomarkers selected from the group consisting of: TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB;

(b) comparing the expression level, amount and/or activity of the biomarker with a reference from a healthy control population, where a decrease in expression, amount and/or activity of TNF-α, GSTA1, NT-proBNP, RORA and/or TNC when compared to the reference and an increase in expression, amount and/or activity of GHR, A2M, IGFBP2, APOB, SEPP1, TFF3, IL6 and/or CHI3L1 when compared to the reference suggests that the individual has a higher risk of suffering from cardiovascular disease; and

(c) administering, or having administered, to the individual, a therapeutic agent that prevents, or reduces the likelihood of, the individual suffering from cardiovascular disease.

Preferably, the biomarkers, detection of the biomarkers, the cardiovascular disease, the expression levels, amount and/or activities of the biomarkers and the sample are as defined in the first aspect.

Preferably, the method of treatment comprises analysing and comparing the expression levels, amount and/or activities of 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or more of the biomarkers: TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB.

The method of treatment may comprise analysing and comparing the expression levels, amount and/or activities of at least 6 biomarkers or at least 7 biomarkers. Alternatively, the method of treatment may comprise analysing and comparing the expression levels, amount and/or activities of at least 8 biomarkers or at least 9 biomarkers. In another embodiment, the method of treatment may comprise analysing and comparing the expression levels, amount and/or activities of at least 10 biomarkers, at least 11 biomarkers, at least 12 biomarkers, at least 13 biomarkers, at least 14 biomarkers or at least 15 biomarkers. In another embodiment, the method of treatment may comprise analysing and comparing the expression levels, amount and/or activities of at least 16 biomarkers, at least 17 biomarkers or at least 18 biomarkers.

Preferably, the method of treatment comprises analysing and comparing the expression levels, amount and/or activities of the biomarkers: TNF-α; GSTA1; NT-proBNP; RORA; TNC; GHR; A2M; IGFBP2; APOB; SEPP1; TFF3; IL6; CHI3L1; MET; GDF15; CCL22; TNFRSF11; ANGPT2 and RelA NF-KB.

A clinician would be able to make a decision as to the preferred course of treatment required, for example, the type and dosage of the therapeutic agent to be administered according to the eighth and ninth aspects.

Suitable therapeutic agents may include: statins, including the statins selected from the group consisting of: atorvastatin; simvastatin; rosuvastatin; and pravastatin, beta blockers, blood thinning agents including the blood thinning agents selected from the group consisting of: low-dose aspirin; clopidogrel; rivaroxaban; ticagrelor and prasugrel, nitrates, angiotensin-converting enzyme (ACE) inhibitors, angiotensin II receptor antagonists, calcium channel blockers, including the calcium channel blockers selected from the group consisting of: amlodipine; verapamil and diltiazem and/or diuretics.

Treatment may include enacting lifestyle changes.

In a ninth aspect, there is also provided a method of treating an individual having a higher risk of suffering from cardiovascular disease, the method comprising:-

(a) detecting, in a sample obtained from the subject, a single nucleotide polymorphism (SNP) in the RORA gene and/or the GHR gene, wherein the presence of the SNP suggests that the individual has a higher risk of suffering from cardiovascular disease; and

(b) administering, or having administered, to the individual, a therapeutic agent that prevents, or reduces the likelihood of, the individual suffering from cardiovascular disease.

Preferably, the biomarkers, detection of the biomarkers, the cardiovascular disease, the expression levels, amount and/or activities of the biomarkers and the sample are as defined in the first aspect.

Preferably, the single nucleotide polymorphism (SNP) in RORA and/or GHR is as defined in the third and/or fourth aspect.

A clinician would be able to make a decision as to the preferred course of treatment required, for example, the type and dosage of the therapeutic agent to be administered according to the eighth and ninth aspects.

Suitable therapeutic agents may include: statins, including the statins selected from the group consisting of: atorvastatin; simvastatin; rosuvastatin; and pravastatin, beta blockers, blood thinning agents including the blood thinning agents selected from the group consisting of: low-dose aspirin; clopidogrel; rivaroxaban; ticagrelor and prasugrel, nitrates, angiotensin-converting enzyme (ACE) inhibitors, angiotensin II receptor antagonists, calcium channel blockers, including the calcium channel blockers selected from the group consisting of: amlodipine; verapamil and diltiazem and/or diuretics.

Treatment may include enacting lifestyle changes.

It will be appreciated that the invention extends to any nucleic acid or peptide or variant, derivative or analogue thereof, which comprises substantially the amino acid or nucleic acid sequences of any of the sequences referred to herein, including variants or fragments thereof. The terms “substantially the amino acid/nucleotide/peptide sequence”, “variant” and “fragment”, can be a sequence that has at least 40% sequence identity with the amino acid/nucleotide/peptide sequences of any one of the sequences referred to herein, for example 40% identity with the sequence identified as SEQ ID Nos: 1 to 40 and so on.

Amino acid/polynucleotide/polypeptide sequences with a sequence identity which is greater than 65%, more preferably greater than 70%, even more preferably greater than 75%, and still more preferably greater than 80% sequence identity to any of the sequences referred to are also envisaged. Preferably, the amino acid/polynucleotide/polypeptide sequence has at least 85% identity with any of the sequences referred to, more preferably at least 90% identity, even more preferably at least 92% identity, even more preferably at least 95% identity, even more preferably at least 97% identity, even more preferably at least 98% identity and, most preferably at least 99% identity with any of the sequences referred to herein.

The skilled technician will appreciate how to calculate the percentage identity between two amino acid/polynucleotide/polypeptide sequences. In order to calculate the percentage identity between two amino acid/polynucleotide/polypeptide sequences, an alignment of the two sequences must first be prepared, followed by calculation of the sequence identity value. The percentage identity for two sequences may take different values depending on:- (i) the method used to align the sequences, for example, ClustalW, BLAST, FASTA, Smith-Waterman (implemented in different programs), or structural alignment from 3D comparison; and (ii) the parameters used by the alignment method, for example, local vs global alignment, the pair-score matrix used (e.g. BLOSUM62, PAM250, Gonnet etc.), and gap-penalty, e.g. functional form and constants.

Having made the alignment, there are many different ways of calculating percentage identity between the two sequences. For example, one may divide the number of identities by: (i) the length of shortest sequence; (ii) the length of alignment; (iii) the mean length of sequence; (iv) the number of non-gap positions; or (v) the number of equivalenced positions excluding overhangs. Furthermore, it will be appreciated that percentage identity is also strongly length dependent. Therefore, the shorter a pair of sequences is, the higher the sequence identity one may expect to occur by chance.

Hence, it will be appreciated that the accurate alignment of protein or DNA sequences is a complex process. The popular multiple alignment program ClustalW (Thompson et al., 1994, Nucleic Acids Research, 22, 4673-4680; Thompson et al., 1997, Nucleic Acids Research, 24, 4876-4882) is a preferred way for generating multiple alignments of proteins or DNA in accordance with the invention. Suitable parameters for ClustalW may be as follows: For DNA alignments: Gap Open Penalty = 15.0, Gap Extension Penalty = 6.66, and Matrix = Identity. For protein alignments: Gap Open Penalty = 10.0, Gap Extension Penalty = 0.2, and Matrix = Gonnet. For DNA and Protein alignments: ENDGAP = -1, and GAPDIST = 4. Those skilled in the art will be aware that it may be necessary to vary these and other parameters for optimal sequence alignment.

Preferably, the percentage identity between two amino acid/polynucleotide/polypeptide sequences may then be calculated from such an alignment as (N/T) * 100, where N is the number of positions at which the sequences share an identical residue, and T is the total number of positions compared including gaps and either including or excluding overhangs. Preferably, overhangs are included in the calculation. Hence, a most preferred method for calculating percentage identity between two sequences comprises (i) preparing a sequence alignment using the ClustalW program using a suitable set of parameters, for example, as set out above; and (ii) inserting the values of N and T into the following formula:- Sequence Identity = (N/T) * 100.
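By way of illustration only, the formula Sequence Identity = (N/T) * 100 may be sketched as follows. The function name, the use of "-" as the gap character and the example alignment are assumptions made for this sketch; the alignment itself is taken as already prepared (for example with ClustalW), and the sketch does not reproduce any particular program's output format.

```python
def percent_identity(aligned_a: str, aligned_b: str, include_overhangs: bool = True) -> float:
    """Return (N / T) * 100 for the two rows of a pairwise alignment.

    N = columns in which both rows carry the same residue.
    T = columns compared, counting internal gaps and, optionally, overhangs.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")
    columns = list(zip(aligned_a, aligned_b))
    if not include_overhangs:
        # Overhangs are treated here as leading/trailing columns containing a gap.
        start, end = 0, len(columns)
        while start < end and "-" in columns[start]:
            start += 1
        while end > start and "-" in columns[end - 1]:
            end -= 1
        columns = columns[start:end]
    if not columns:
        raise ValueError("no columns left to compare")
    n = sum(1 for a, b in columns if a == b and a != "-")
    t = len(columns)
    return 100.0 * n / t


# Hypothetical alignment: one internal gap and a two-column trailing overhang.
row_a = "ACGT-ACGTA"
row_b = "ACGTTACG--"
print(percent_identity(row_a, row_b))                           # overhangs included
print(percent_identity(row_a, row_b, include_overhangs=False))  # overhangs excluded
```

In this example N is 7 in both cases, while T is 10 with overhangs and 8 without, which shows how the choice of denominator alone changes the reported identity.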

Alternative methods for identifying similar sequences will be known to those skilled in the art. For example, a substantially similar nucleotide sequence will be encoded by a sequence which hybridizes to DNA sequences or their complements under stringent conditions. By stringent conditions, the inventors mean the nucleotide hybridises to filter-bound DNA or RNA in 3x sodium chloride/sodium citrate (SSC) at approximately 45 °C followed by at least one wash in 0.2x SSC/0.1% SDS at approximately 20-65 °C. Alternatively, a substantially similar polypeptide may differ by at least 1, but less than 5, 10, 20, 50 or 100 amino acids from the sequences shown in, for example, SEQ ID Nos: 1 to 40 and so on.

Due to the degeneracy of the genetic code, it is clear that any nucleic acid sequence described herein could be varied or changed without substantially affecting the sequence of the protein encoded thereby, to provide a functional variant thereof. Suitable nucleotide variants are those having a sequence altered by the substitution of different codons that encode the same amino acid within the sequence, thus producing a silent (synonymous) change. Other suitable variants are those having homologous nucleotide sequences but comprising all, or portions of, sequence, which are altered by the substitution of different codons that encode an amino acid with a side chain of similar biophysical properties to the amino acid it substitutes, to produce a conservative change. For example, small non-polar, hydrophobic amino acids include glycine, alanine, leucine, isoleucine, valine, proline, and methionine. Large non-polar, hydrophobic amino acids include phenylalanine, tryptophan and tyrosine. The polar neutral amino acids include serine, threonine, cysteine, asparagine and glutamine. The positively charged (basic) amino acids include lysine, arginine and histidine. The negatively charged (acidic) amino acids include aspartic acid and glutamic acid. It will therefore be appreciated which amino acids may be replaced with an amino acid having similar biophysical properties, and the skilled technician will know the nucleotide sequences encoding these amino acids.
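Purely as an illustrative sketch, the amino acid groupings listed above can be encoded so that a substitution is labelled conservative when both residues fall in the same group. The one-letter codes, the group labels and the function name are assumptions of this sketch; the group membership follows the lists given in the preceding paragraph.

```python
# Biophysical groups as listed in the text, written with one-letter codes.
GROUPS = {
    "small non-polar": set("GALIVPM"),   # Gly, Ala, Leu, Ile, Val, Pro, Met
    "large non-polar": set("FWY"),       # Phe, Trp, Tyr
    "polar neutral": set("STCNQ"),       # Ser, Thr, Cys, Asn, Gln
    "basic": set("KRH"),                 # Lys, Arg, His
    "acidic": set("DE"),                 # Asp, Glu
}


def group_of(residue: str) -> str:
    """Return the name of the group containing the residue."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def classify_substitution(original: str, replacement: str) -> str:
    """Label a substitution as identical, conservative or non-conservative."""
    if original.upper() == replacement.upper():
        return "identical"
    if group_of(original) == group_of(replacement):
        return "conservative"
    return "non-conservative"


print(classify_substitution("L", "I"))  # both small non-polar
print(classify_substitution("K", "D"))  # basic replaced by acidic
```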

All of the features described herein (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined with any of the above aspects in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.

For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying Figures, in which:-

Figure 1 shows the workflow: after the disease network identification (upper left panel), single level patient data is integrated based on Bayesian probabilistic graphical models to infer patient specific pathway activity for each of the BMKs entered in the model. Calculated activities per patient and molecule are then clustered by hierarchical clustering. Survival analysis is done in an independent step to define the relevance of the identified clusters for disease progression (patient stratification). BMKs = biomarkers; TF = transcription factor.

Figure 2 a-c shows the networks of molecular entities that were prioritized in the overconnectivity analyses. To infer patient specific activity (workflow, Figure 1), the inventors used patient data from: i) 13 proteins (nodes highlighted in blue), and ii) genotypes of 2 genes (nodes with stars). The activity of 4 nodes (red circles) was inferred by Bayesian statistics based on molecular interactions (connecting edges: red = inhibition, green = activation; if bidirectional, right edge = from central node to neighbouring molecule, and left edge = from neighbouring molecule to central node) known from the literature (Supplementary Table S5).

Figure 3 shows that the molecular signatures of population sub-types relate to CVD progression. a, Two main patient clusters (highlighted in red and blue frames) were identified based on molecular signatures in Caucasians (left) and Latinos (right). b-c, Kaplan-Meier analyses comparing survival probabilities for the identified clusters (a, red and blue) related to 1st and 2nd co-primary expanded CV composites in Caucasians (b) and Latinos (c). d,e, Proportion of CVO events or all death per cluster (blue = higher and blue = lower CVO risk clusters) with the respective calculated relative risk of event for patients in the blue versus red highlighted cluster (a). There were significant differences in survival between clusters (Log-rank P < 0.0000 in b-c) for each of the outcomes adjusted for CVO risk factors (Cox regression models P < 0.0000 in b-c). A, C, B represent the main clusters of pathway entities (Table 1). PA = pathway activity; MI = myocardial infarction; Hospit. HF = hospitalization by heart failure.

Figure 4 shows Cox regression results presented as hazard ratios (HR) with confidence intervals. a-b, HR for the 2nd co-primary composite of CVO in Caucasians (a) and Latinos (b) for known CVO risk factors and cluster identity (as defined in Figure 3, panel a-b). c, HR for the 1st co-primary composite of CVO for the same factors as above, considering the entire population (Caucasians plus Latinos). Cox regression models were significant (P below 0.0000).

Figure 5 shows the list of gene names from the identified disease network. Three main clusters (A, B, C in Figure 3) of genes were identified (highlighted in red = lower activity, and shades of blue = higher activity of molecules in patients at higher risk vs. those at lower risk of CVO events). Individual patient data for these genes (G = genotype and P = protein levels) were used for the patient stratification workflow (Figure 1). Al = molecular activity was inferred by Bayesian statistics. RORA and GHR variants were not previously reported to be associated with CVD, with the effect allele being rare in Europeans, but frequent for the RORA variant in Africans (MAF = 0.41) and for the GHR variant in Asians (MAF = 0.13). Results from standard BMK analyses for CVO in the ORIGIN cohort are reported for (a) 1st and (b) 2nd co-primary cardiovascular composites, as previously published (CITE PUBLICATION). 1st co-primary endpoint = composite of CV death, or nonfatal MI or nonfatal stroke, and 2nd co-primary endpoint = composite of 1st co-primary or revascularization procedure or hospitalization for heart failure. n.s. = biomarkers that were not significantly associated with CVO, but associated with death in the ORIGIN CVO trial.

Figures 6a and 6b (s2) show locus zoom plots of the GWAs results from the ORIGIN study. SNPs are plotted on the x-axis according to their position on each chromosome against associations with CAD on the y-axis (shown as -log10 P). The loci used for patient stratification (rs73420079, rs4314405; P < 5 x 10⁻⁸) are indicated in the locus zoom plots b and c, respectively. MAF for these loci was around 1% in Europeans, except for MIR3169, which had a MAF = 18%.

Figure 7 shows the network analyses design. Summary statistics from the ORIGIN CAD GWAs were used to identify the genes most relevantly associated with CVO in this cohort. In a posterior analysis, these genes plus a list of biomarker (BMK)-encoding genes were used to produce networks of overconnected genes. This procedure was replicated using GWAs summary statistics for CAD outcome from the CARDIOGRAM consortium (CARDIoGRAMplusC4D, Nikpay et al 2015) plus the identical list of BMKs associated with CAD outcome.

Figure 8 shows networks of molecular entities that were prioritized in the overconnectivity analyses. a, b, c, First 3 top ranking networks in the discovery network analyses using candidates from GWAs and BMK analyses from the ORIGIN datasets. c, d, e, Replication network analyses showing top sub-networks (ranking 1st, 4th and 8th) identified when using the CARDIOGRAM GWAs results plus BMK analyses results from ORIGIN. These most resembled the networks identified in a-c (replicated BMK in blue). Stars = genes associated with CVO in the ORIGIN GWAs.

Figure 9 shows Kaplan-Meier survival estimates by patient clusters for the measured CV outcomes in Caucasians and Latinos. a, clusters identified; b-c, survival curves (red = cluster highlighted in red, blue = cluster highlighted in blue in the upper panels) for Caucasians (b) and Latinos (c). There were significant differences in survival between clusters in all cases (Log-rank P < 0.000 was the least significant difference).

Figure 10 shows box-plots for levels of biomarkers comprising the clusters of genes (A, B, C) identified when clustering patient specific BMK activity (Figure 2). Patients in the cluster at higher risk for CVO (left box-plot in each graphic) had higher BMK levels, except for GSTalpha, which was lower in the higher risk group.

Figure 11 shows visual inspections of cardiovascular risk factors in the high (1) versus low (2) CVO risk clusters (Caucasian plus Latino sub-populations).

Upper panels, box-plots for age, BMI and levels of BMKs routinely measured in the clinic. Patients in the cluster at higher risk for CVO (left box-plots in each graphic) were on average similar to patients in the cluster with lower CVO risk. Lower panel, categorical CVO risk variables such as sex (left lower panel), smoking (mid lower panel), and albuminuria or reported albuminuria (right lower panel) were slightly different between clusters. More details for these and other risk factors are given in Table 1. Statistical analyses for CVO risk (logistic regression) and survival (Cox regression) included these as covariates to calculate the risk and hazard ratios per cluster as presented in the MS body. NormalTC = normalized Total Cholesterol; normalSBP / DBP = normalized systolic blood pressure / diastolic blood pressure.

Figure 12 shows sub-cluster analyses. (a) Sub-clusters within clusters (cluster 2 is now divided into 2 and 3) in both Caucasians and Latinos. (b) Proportion of CVO events per clusters 1 to 3 and the corresponding N of individuals.

Figure 13 shows Kaplan-Meier survival estimates for measured outcomes in Caucasians (left panels) and Latinos (right panels) for the 3 identified sub-clusters (as shown in Figure 12). There were significant differences in survival between clusters in all cases (Log-rank P < 0.0000).

Figure 14 (S10) shows NT-proBNP biomarker levels in 3 clusters of patients obtained by either using NT-proBNP levels (left panel) or without using them (right panel) as input for the network calculations that generated these clusters in Latinos. There were no significant differences in NT-proBNP levels when using the results of the first or the latter analysis as classifier of patients (clusters 1 to 3).

Examples

The inventors set out to identify biomarkers, and combinations of biomarkers, associated with CVD progression, with the aim of developing strategies to enable identification of individuals who are at risk of suffering from a CVD event, and thus enable early intervention to prevent, or reduce the risk of, the individual suffering from a CVD event. Identification of suitable biomarkers and biomarker networks will also optimize clinical trial plans, drug efficacy, and treatment.

Example 1 - identification of biomarkers and biomarker networks.

Materials and Methods

ORIGIN cohort

The participants of the ORIGIN biomarker sub-study (N = 8,401, Supplementary Table S1) were chosen among the randomized patients who were treated with Lantus or placebo. A smaller sub-population of this sample was genotyped (5,078 samples). After quality control (see below), sample size was 4,390 individuals. Only these genotyped subjects entered the analyses described in this protocol for patient stratification (demographic characteristics in Supplementary Table S1).

Biomarkers quality control and normalization

The quality control for the protein biomarkers measured in the ORIGIN cohort was previously described (Gerstein et al., 2015). For normalization, biomarkers that were not normally distributed were log transformed. Standard biomarker analyses for CVO prediction in this cohort were described elsewhere (Gerstein et al., 2015).

Quantitative BMK measurements were also transformed to categorical variables (-1, 0, +1) based on percentiles of the distributions for each BMK, separately for Caucasians, Latinos and subjects of African origin, because the PARADIGM algorithm cannot deal with continuous traits for clustering.
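A minimal sketch of such a percentile-based coding is given below. The percentile cut-offs are not stated above, so the use of tertiles (values below the lower cut point coded -1, values above the upper cut point coded +1, the remainder coded 0) is an assumption of this sketch, as are the function names and the example values.

```python
import statistics


def discretize(values):
    """Code each value as -1, 0 or +1 relative to the tertile cut points (assumed)."""
    low, high = statistics.quantiles(values, n=3)  # two cut points
    return [-1 if v < low else 1 if v > high else 0 for v in values]


def discretize_by_group(levels, groups):
    """Discretize each population group against its own distribution."""
    coded = [0] * len(levels)
    for group in set(groups):
        idx = [i for i, g in enumerate(groups) if g == group]
        for i, code in zip(idx, discretize([levels[i] for i in idx])):
            coded[i] = code
    return coded


# Hypothetical biomarker levels for two population groups on different scales.
levels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90]
groups = ["group_1"] * 9 + ["group_2"] * 9
print(discretize_by_group(levels, groups))
```

Coding within each group, rather than across the pooled sample, keeps a systematic difference in biomarker scale between populations from dominating the categories.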

Genotyping quality control and population stratification analysis

Genotyping of the ORIGIN cohort, N = 5,078 samples, was performed using Illumina HumanCore Exome DNA Analysis Bead Chip (Illumina Omni2.5). Over 540,000 genetic variants were called, including extensive coverage of coding variants, both common and rare. Single nucleotide polymorphisms (SNPs) were excluded with call rate < 0.99, minor allele frequency < 0.01, or deviation from Hardy-Weinberg equilibrium (P < 1 x 10⁻⁶). Individuals were excluded if their self-reported sex, ethnicity and relatedness were not in concordance with their genetic information. After quality control, sample size was 4,390 individuals and 284,024 SNPs (note: SNPs excluded because of low allele frequency were accurately genotyped for the most part and will be included in future analyses). Genome-wide genotype imputation was performed using Impute v2.3.0. The inventors used the NCBI build 37 of Phase I integrated variant 1000 Genomes Haplotypes (SHAPEIT2) as the reference panel. Imputed SNPs were excluded with imputation certainty score < 0.3. Final number of imputed SNPs was 10,501,330.
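The three SNP-level exclusion criteria (call rate, minor allele frequency and Hardy-Weinberg equilibrium) can be illustrated with the following sketch. It is not the pipeline used for the ORIGIN data: the class and function names are hypothetical, and the Hardy-Weinberg test shown is a simple 1-degree-of-freedom chi-square approximation chosen for brevity, whereas genotyping software commonly uses an exact test.

```python
import math
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    hom_ref: int   # individuals homozygous for the reference allele
    het: int       # heterozygous individuals
    hom_alt: int   # individuals homozygous for the alternative allele
    missing: int   # individuals with no genotype call


def hwe_chi2_pvalue(c: GenotypeCounts) -> float:
    """Chi-square (1 d.f.) p-value for deviation from Hardy-Weinberg proportions."""
    n = c.hom_ref + c.het + c.hom_alt
    if n == 0:
        raise ValueError("no called genotypes")
    p = (2 * c.hom_ref + c.het) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (c.hom_ref, c.het, c.hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(c: GenotypeCounts, min_call_rate=0.99, min_maf=0.01, min_hwe_p=1e-6) -> bool:
    """Apply the call rate, MAF and HWE thresholds quoted in the text."""
    called = c.hom_ref + c.het + c.hom_alt
    total = called + c.missing
    if called == 0:
        return False
    alt_freq = (2 * c.hom_alt + c.het) / (2 * called)
    maf = min(alt_freq, 1.0 - alt_freq)
    if called / total < min_call_rate or maf < min_maf:
        return False
    return hwe_chi2_pvalue(c) >= min_hwe_p


print(passes_qc(GenotypeCounts(810, 180, 10, 0)))   # common SNP in HWE proportions
print(passes_qc(GenotypeCounts(500, 0, 500, 0)))    # no heterozygotes at all
```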

Principal components were generated based on whole genome genotyping separately for Caucasian and Latinos. These were used as co-variates in genetic analysis for the genome-wide association study (GWAs) from the ORIGIN study.

Due to population ethnic substructure, subpopulations were defined by ethnic groups (Caucasian and Latinos) and analyzed separately. HWE (P > 0.001) for the SNPs entering the analyses was tested as part of the quality control to define if the population subgroups met expected genotypes distributions. Subpopulations were meta-analyzed and association results were visualized in Manhattan and QQ plots (Figure 6).

Genome-wide association analyses

GWA analyses were conducted separately for Caucasians (N = 1,931) and Latinos (N = 2,216). Genotypes consisting of both directly typed and imputed SNPs (N = 4.9-9 Mio) entered the GWA analyses. To avoid over-inflation of test statistics due to population structure or relatedness, the inventors applied genomic controls for the GWA analysis. Principal components were generated based on whole genome genotyping separately for Caucasians and Latinos. These were used as covariates in the genome-wide association analyses from the ORIGIN study. Linear regression (PLINK) for associations with normalization was performed under an additive model, with SNP allele dosage as predictor and with age, gender.

Meta-analysis was performed. Corresponding to Bonferroni adjustment for one million independent tests, the inventors specified a threshold of P for genome wide significance. The CARDIOGRAM consortium dataset used was the CARDIoGRAMplusC4D 1000 Genomes-based GWAS meta-analysis summary statistics. It comprised GWAS studies of mainly European, South Asian, and East Asian descent, imputed using the 1000 Genomes phase 1 v3 training set with 38 million variants. The study interrogated 9.4 million variants and involved 60,801 CVD cases and 123,504 controls (Nikpay et al., 2015). To assess the number of independent loci associated with CVD, correlated SNPs were grouped using an LD-based result clumping procedure (PLINK, Purcell et al, 2007). This procedure was used for gene mapping of loci (Supplementary Table S2) entering the overconnectivity network analysis. Variants associated with CVD at a p-value below 10⁻⁶ with proxies at a p-value below or equal to 10⁻⁵ in the ORIGIN cohort, and at a p-value below 10⁻⁷ with proxies at a p-value below or equal to 10⁻⁶ in the CARDIOGRAM cohort, were mapped to genes, so that these could be considered in the network analyses. The inventors excluded alleles with a MAF below 1 percent and poor imputation quality (Info below 0.4) from the clumping procedure.
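For orientation, the arithmetic behind a Bonferroni adjustment for one million independent tests is sketched below. The family-wise error rate of 0.05 is an assumption of this sketch and is not stated in the passage above; the function name is likewise hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed family-wise alpha of 0.05 spread over one million independent tests.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"{threshold:.1e}")
```

Under that assumption the per-test threshold is of the order of 5 x 10⁻⁸, the value conventionally quoted for genome-wide significance.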

Workflow

All steps of the workflow (Figure 1) are as follows:

(1) Candidate molecules selection

A list of entities included in the discovery and replication studies is presented in Table S3.

(2) Gene mapping

SNPs associated with CAD in ORIGIN and in CARDIOGRAM were mapped to genes using the clumping procedure available in PLINK (Purcell et al, 2007) based on empirical estimates of linkage disequilibrium (LD) between single nucleotide polymorphisms (SNPs). The inventors used 1000 Genomes (phase 1 release v3) as the reference dataset to estimate the LD between variants; the clumping analysis was performed using LD r² > 0.8 (--clump-r2 0.8) to clump variants to an index SNP within a range of 250 kb (--clump-kb 250). To identify clumped genomic regions corresponding to genes, the inventors used the --clump-range function with a gene list (hg19) (Supplementary Table S2).
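The logic of LD-based clumping can be illustrated by the greedy sketch below. It is a simplified stand-in written for this illustration, not PLINK's implementation: the record type, the function name, the pairwise r² lookup table and the toy variants are all hypothetical, and the default thresholds merely mirror the values quoted for the ORIGIN cohort (index p-value below 10⁻⁶, proxy p-value at most 10⁻⁵, r² above 0.8, 250 kb window).

```python
from typing import NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int      # base-pair position
    p: float      # association p-value


def clump(snps, r2, p_index=1e-6, p_proxy=1e-5, r2_min=0.8, window_kb=250):
    """Greedily group SNPs around index SNPs, most significant first.

    r2 maps frozenset({rsid_a, rsid_b}) to the pairwise LD estimate.
    Returns {index_rsid: [proxy_rsid, ...]}.
    """
    clumps = {}
    assigned = set()
    for index in sorted(snps, key=lambda s: s.p):
        if index.p >= p_index:
            break  # remaining SNPs are not significant enough to seed a clump
        if index.rsid in assigned:
            continue
        assigned.add(index.rsid)
        proxies = []
        for other in snps:
            if other.rsid in assigned:
                continue
            nearby = (other.chrom == index.chrom
                      and abs(other.pos - index.pos) <= window_kb * 1000)
            ld = r2.get(frozenset((index.rsid, other.rsid)), 0.0)
            if nearby and other.p <= p_proxy and ld > r2_min:
                proxies.append(other.rsid)
                assigned.add(other.rsid)
        clumps[index.rsid] = proxies
    return clumps


# Toy data: snp_b is a close, correlated proxy of snp_a; snp_c is too far away.
snps = [
    Snp("snp_a", "1", 1_000, 1e-8),
    Snp("snp_b", "1", 50_000, 5e-6),
    Snp("snp_c", "1", 400_000, 2e-7),
    Snp("snp_d", "2", 1_000, 0.5),
]
r2 = {frozenset(("snp_a", "snp_b")): 0.9, frozenset(("snp_a", "snp_c")): 0.95}
print(clump(snps, r2))
```

Each resulting clump represents one approximately independent locus, which can then be intersected with gene coordinates.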

(3) Network analysis for disease network identification

The inventors used overconnectivity analyses to build a network representative of cardiovascular disease, as described in more detail below.

(4) Single level patient data curation

To initiate the patient stratification workflow (Figure 1), the inventors generated first the disease network and used the patient specific data for the prioritized molecules (Table 1). Patient specific data quality control is described above.

(5) Probabilistic graphical model analysis for identification of patient specific pathway activity

The method is explained in the manuscript and below in more detail.

(6) Clustering

The inventors used an R package as described below.

(7) Linking the identified patient clusters to CVO

More details about the analyses, outcomes and co-variates are described below.

(8) Single biomarkers comparisons

These were done by visual inspection of box-plots and median comparisons.

(9) Characterization of clustered populations.

Disease network identification

A disease network was identified using the overconnectivity algorithm, as implemented in the R-based Computational Biology for Drug Discovery (CBDD) package developed by Clarivate Analytics. The specificity of the network for the disease relies on the disease-linked molecules chosen to produce the network (Figure 7) and on the underlying libraries of protein-protein interactions in humans. For this purpose, the inventors extracted high-trust interactions from manually curated systems biology knowledge bases from Ingenuity (IPA from QIAGEN Inc.) and Metabase (Metacore from Clarivate Analytics) to be used as libraries for the CBDD package. To create the CVD disease network, the inventors: i) made use of topological characteristics of human protein interaction networks to identify one-step away direct regulators of the dataset that are statistically overconnected with the objects from the data set (hypergeometric distribution), and ii) used as input datasets names of genes having evidence of association to CVD (see Bayesian network analyses below and in Figure 7). As input data for the overconnectivity analyses, the inventors used in the discovery analyses names of genes corresponding to 16 protein BMKs and to 8 loci associated with CVO in the ORIGIN cohort GWAs (Table S3). To validate the initially identified network, the inventors re-ran these analyses replacing the genetically associated loci from the ORIGIN study with 90 other genes from an independent large GWAs meta-analysis (CARDIoGRAMplusC4D, Nikpay et al 2015) from the CARDIOGRAM consortium (Figure 7, Table S3).
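The hypergeometric reasoning behind an overconnectivity test can be sketched as follows. This is an illustrative approximation under stated assumptions and not the CBDD implementation: a candidate regulator with n direct neighbours, k of which belong to the input gene set, is scored by the probability of observing k or more such neighbours by chance in a network of N nodes containing K input genes. All names and the toy network are hypothetical.

```python
from math import comb


def overconnectivity_pvalue(network_size, input_size, n_neighbours, n_hits):
    """Upper-tail hypergeometric probability P(X >= n_hits)."""
    upper = min(n_neighbours, input_size)
    total = comb(network_size, n_neighbours)
    tail = sum(
        comb(input_size, i) * comb(network_size - input_size, n_neighbours - i)
        for i in range(n_hits, upper + 1)
    )
    return tail / total


def rank_regulators(adjacency, input_genes):
    """Rank candidate regulators by overconnectivity with the input gene set.

    adjacency maps each candidate regulator to the set of its direct neighbours.
    """
    universe = set(adjacency)
    for neighbours in adjacency.values():
        universe |= set(neighbours)
    n_input = len(input_genes & universe)
    scored = [
        (regulator,
         overconnectivity_pvalue(len(universe), n_input,
                                 len(neighbours), len(neighbours & input_genes)))
        for regulator, neighbours in adjacency.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Toy network: TF1 touches all three input genes, TF2 only one of them.
adjacency = {"TF1": {"A", "B", "C"}, "TF2": {"A", "X", "Y"}}
for regulator, p_value in rank_regulators(adjacency, {"A", "B", "C"}):
    print(regulator, round(p_value, 4))
```

A small p-value indicates that the candidate is connected to more input genes than expected for a node of its degree, which is the sense in which a regulator is described as overconnected.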

Bayesian network analyses for data integration

PARADIGM is a data integration approach based on probabilistic graphical models. It renders a pathway or network as a probabilistic graphical model (PGM), learning its parameters from supplied omics data sets. The model allows inference of true activity score for each node in the pathway given the different omics measurements for the nodes. PARADIGM allows prediction on the level of individual patients and is capable of accommodating such data types as gene / protein expression, copy number changes, metabolomics, direct protein activity assays such as kinase activity measurements. PARADIGM combines multiple genome-scale measurements at the sample level to infer the activities of genes, products and abstract process within a pathway or subnetwork. Edges of original network connect hidden variables of different nodes (e.g. activity hidden variable of node A affects protein or DNA hidden variable of node B, depending on mechanism of A-B link).

Each node is assigned a conditional probability distribution when the model is created. The distribution tells how likely it is to observe a node in a particular state given the states of its parents in the model. Three states are allowed for each node (activated, repressed, unchanged). The distributions for hidden variables are defined at the first step. Distributions for observed variables of each molecular level are learned by the EM algorithm using the input data. After the model is complete, inference can be made about probabilities of observing hidden nodes in a particular state - either without observed data (prior probability) or taking data into account (posterior probability).

The main output is a matrix of integrated pathway activities (IPAs) A, where A_ij represents the inferred activity of entity i in patient sample j. The values in A are signed and are non-zero if the patient data make the activation or inhibition of the hidden node more likely compared to the prior. The matrix A is intended to be used instead of the original data sets for purposes such as patient stratification or association analysis to reveal biological entities with activity associated with clinical traits.
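One way to picture a signed activity value of this kind is as a log-likelihood ratio comparing the posterior and prior odds of a node state. The sketch below does this for a single hidden node with three states and independent pieces of evidence. It is a deliberately reduced toy model under the assumptions just stated, not the PARADIGM implementation, and the function names and probabilities are hypothetical.

```python
import math

STATES = (-1, 0, 1)  # repressed, unchanged, activated


def posterior(prior, likelihoods):
    """Bayes' rule for one hidden node given independent observations.

    prior:       {state: P(state)}
    likelihoods: list of {state: P(observation | state)}, one dict per observation
    """
    unnormalised = {
        s: prior[s] * math.prod(obs[s] for obs in likelihoods) for s in STATES
    }
    z = sum(unnormalised.values())
    return {s: v / z for s, v in unnormalised.items()}


def log_odds(dist, state):
    return math.log(dist[state] / (1.0 - dist[state]))


def activity_score(prior, post):
    """Signed log-likelihood ratio: > 0 activated, < 0 repressed, 0 otherwise."""
    llr = {s: log_odds(post, s) - log_odds(prior, s) for s in STATES}
    if llr[1] > max(llr[0], llr[-1]):
        return llr[1]
    if llr[-1] > max(llr[0], llr[1]):
        return -llr[-1]
    return 0.0


prior = {-1: 1 / 3, 0: 1 / 3, 1: 1 / 3}
evidence = [{-1: 0.1, 0: 0.2, 1: 0.7}]   # one observation favouring activation
print(activity_score(prior, posterior(prior, evidence)))
```

With no evidence the posterior equals the prior and the score is zero, matching the statement that values are non-zero only when the patient data shift the node away from its prior.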

The output is a matrix of activity scores for each node in the network and each sample. The activity score represents a signed log likelihood ratio (positive when the node is predicted to be active, negative when the node is predicted to be repressed). A pathway is converted into a probabilistic graphical model that includes both hidden states for each node and observed states for the nodes which can correspond to the input data sets. There are two possible modes to assess IPA significance. Both involve permutation - calculation of IPA scores on many randomized samples.

For the 'within' permutation, a permuted data sample is created by creating new set of evidence (i.e. states for observed variables at gene expression and gene copy number) by assigning a value of the random node in pathway / subnetwork and random sample to each observed node.

For the 'any' permutation, the procedure is the same, but the random node selection step could choose a node from anywhere in the input data (regardless of whether a particular pathway / subnetwork contains such a node).

For both permutation types, a number of permuted samples (iterations) are created, and the IPA scores for each permuted sample are calculated. The distribution of scores from permuted data is used as a null distribution to estimate the significance of IPA scores in the real data set. SCRIPT: Paradigm (Vaske et al., 2010) in R with the CBDD package developed by Clarivate Analytics.
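The two permutation modes and the resulting empirical p-value can be sketched as below. The function names, the data layout (a mapping from node name to the list of observed values across samples) and the add-one correction in the p-value are assumptions of this sketch; the scoring of each permuted sample is left out, so the null scores in the example are made-up numbers.

```python
import random


def permuted_sample(data, pathway_nodes, mode, rng):
    """Build one permuted evidence set for the observed nodes of a pathway.

    'within': donor nodes are drawn from the pathway itself.
    'any':    donor nodes are drawn from anywhere in the input data.
    """
    pool = list(pathway_nodes) if mode == "within" else list(data)
    sample = {}
    for node in pathway_nodes:
        donor = rng.choice(pool)                 # random node
        sample[node] = rng.choice(data[donor])   # value from a random sample
    return sample


def permutation_pvalue(observed, null_scores):
    """Empirical two-sided p-value of an observed score against a null distribution."""
    extreme = sum(1 for s in null_scores if abs(s) >= abs(observed))
    return (extreme + 1) / (len(null_scores) + 1)


rng = random.Random(0)
data = {"node_1": [-1, 0, 1], "node_2": [1, 1, 0], "node_3": [0, 0, -1]}
print(permuted_sample(data, ["node_1", "node_2"], "within", rng))
print(permutation_pvalue(3.0, [0.1, -0.5, 2.0, -3.5]))
```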

Clusters were produced separately for the discovery (Caucasian; N = 1908) and replication (Latinos; N = 2146) sub-populations.

1. Network: Interactome (Sanofi network from IPA and Metabase) human high quality.

2. Matrix: Patient specific information from 13 protein biomarkers and genotypes from 2 GWAs genes was used to run PARADIGM. These molecular entities appeared in the first 3 subnetworks obtained from the network analysis with the overconnectivity algorithm. Proteins were coded as -1, 0, 1 (according to the distribution) per patient. A matrix containing the 3 subnetworks obtained with the overconnectivity algorithm was also used as input for PARADIGM (though it contained molecular entities for which the inventors did not provide BMK measurements as input).

3. Levels: DNA and protein

List of proteins/genes included in the analysis:

Phenotypic Matrix (proteins)

1) Alpha 2 macroglobulin (1st co-primary and death)

2) Angiopoietin 2 (1st, 2nd CVO and death)

3) Apolipoprotein B (1st, 2nd CVO and death)

4) YKL-40 (CHI3L1) (death)

5) Glutathione S transferase alpha (1st, 2nd CVO and death)

6) Tenascin c (death)

7) IGF binding protein 2 (death)

8) Hepatocyte growth factor receptor (MET) (1st, 2nd CVO and death)

9) Osteoprotegerin (TNFRSF11) (1st, 2nd CVO and death)

10) Macrophage derived chemokine (CCL22) (death)

11) Selenoprotein P (death)

12) Trefoil factor 3 (1st CVO and death)

13) GDF15 (1st, 2nd CVO and death)

Genotypes from genes:

1) RORA (rs73420079, CAD effect allele G: AA = -1, AG = 0, GG = 1),

2) GHR (rs4314405, CAD effect allele A: GG = -1, AG = 0, AA = 1).
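The genotype coding listed above amounts to counting copies of the effect allele and subtracting one. A minimal sketch, with a hypothetical function name and genotypes written as two-letter strings, is:

```python
# Effect alleles as listed in the text for the two stratification SNPs.
EFFECT_ALLELE = {"rs73420079": "G", "rs4314405": "A"}


def encode_genotype(rsid: str, genotype: str) -> int:
    """Return -1, 0 or +1 for zero, one or two copies of the effect allele."""
    alleles = genotype.upper()
    if len(alleles) != 2:
        raise ValueError("genotype must consist of two alleles, e.g. 'AG'")
    return alleles.count(EFFECT_ALLELE[rsid]) - 1


for rsid, genotypes in (("rs73420079", ("AA", "AG", "GG")),
                        ("rs4314405", ("GG", "AG", "AA"))):
    print(rsid, {g: encode_genotype(rsid, g) for g in genotypes})
```

This places the genotypes on the same three-level scale (-1, 0, +1) as the discretized protein biomarkers.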

Biomarkers to be predicted as key members of the input networks:

TNF-alpha, IL6, RelA NF-KB Subunit, NT-proBNP (NPPB)

Patient Stratification

The inventors initiated the patient stratification analyses based on the disease sub-networks identified in the ORIGIN cohort (Figure 2) and using the respective patient specific protein or genotype data (workflow Figure 1; Figure 2 and Table 1). Due to population ethnic background, two main subpopulations with protein biomarker and genotype data available were considered (Caucasian and Latinos), with similar sample sizes (N = 1,908 and N = 2,146, respectively) and sample characteristics for CVO risk factors (Supplementary Table S1). These were thereby considered as discovery and validation samples for i) Bayesian network analyses, ii) hierarchical clustering, and iii) CVO risk and survival analyses. The Bayesian network analyses were conducted using the PARADIGM algorithm (Vaske et al), implemented in R as part of the CBDD package developed by Clarivate Analytics. The data integration approach is based on probabilistic graphical models. It renders a pathway or network as a probabilistic graphical model (PGM), learning its parameters from supplied omics data sets (Figure 2). The model allows inference of a true activity score for each node (molecule) in the pathway given the different omics measurements (per patient) for the nodes (Figure 2; Table 1). Each node is assigned a conditional probability distribution when the model is created. The distribution tells how likely it is to observe a node in a particular state given the states of its parents in the model. Three states are allowed for each node (activated, repressed, unchanged). The distributions for hidden nodes are defined at the first step (e.g., a transcription factor regulating a network of genes, see Figure 1). Distributions for observed variables of each molecular level are learned by the EM algorithm using the input data. After the model is complete, inference can be made about probabilities of observing hidden nodes in a particular state - either without observed data (prior probability) or taking data into account (posterior probability). The output is a matrix of activity scores for each node in the network and each sample. The activity score represents a signed log likelihood ratio (positive when the node is predicted to be active, negative when the node is predicted to be repressed). A pathway is converted into a probabilistic graphical model that includes both hidden states for each node and observed states for the nodes which can correspond to the input data sets. A permutation approach is used to calculate p-values for each value in the matrix of activity scores. Hierarchical clustering of the calculated pathway activity per patient and molecule was done using the R hclust function. The dissimilarities between clusters were computed using squared Euclidean distance.
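The clustering step can be pictured with the small agglomerative sketch below, which groups patients by their activity profiles using squared Euclidean distance. It is written in plain Python for illustration rather than calling R: the complete-linkage rule is an assumption (it is the default method of R's hclust, but the linkage actually used is not stated above), and the function names and activity values are hypothetical.

```python
def squared_euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def hierarchical_clusters(rows, k):
    """Agglomerative clustering with complete linkage, cut at k clusters.

    rows: one activity profile (list of floats) per patient.
    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        # Complete linkage: distance between the two most dissimilar members.
        return max(squared_euclidean(rows[i], rows[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical pathway activities for four patients across three molecules.
activities = [
    [2.0, 1.5, -0.5],
    [1.8, 1.2, -0.4],
    [-1.9, -1.0, 0.6],
    [-2.1, -1.4, 0.3],
]
print(hierarchical_clusters(activities, k=2))
```

The pairwise search makes this sketch suitable only for small examples; for cohorts of thousands of patients a library implementation would be used.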

Linking the identified patient clusters to CVO

To verify the relevance of the identified patient clusters in both the Caucasian and Latino populations (Figure 3) to the CV outcomes measured in the ORIGIN cohort, the inventors used logistic regression, Cox proportional hazard models, and survival analyses. To confirm consistency of results, these analyses were repeated for sub-clusters within the initially identified clusters (Figures 12 and 13). Logistic regression was used to estimate if known CVO risk factors could alone distinguish patients from the higher versus lower CVO risk clusters identified by molecular signatures. Cox proportional hazard models were used to estimate the association between clusters and time-to-event, as variation in cumulative disease incidence could be assessed in the prospective study. Kaplan-Meier survival estimates were used to visualize the relevance of the identified clusters to disease progression and the log-rank test was used to test the equality of survivor functions. CV outcomes used for the analyses were: myocardial infarction (MI), stroke, cardiovascular death and heart failure (HF) with hospitalization, beyond death for all causes (Figures 9 and 12). Two additional MACEs were used: (i) 1st co-primary endpoint (= CV composite): composite of CV death, or nonfatal MI or nonfatal stroke, and (ii) 2nd co-primary endpoint (= expanded CV composite): composite of 1st co-primary or revascularization procedure or hospitalization for heart failure (Figure 3). Covariates used in the models were: age, sex, BMI (kg/m²), HbA1c (%), c-Peptide, HDL-C (mmol/L), LDL-C (mmol/L), TG (mmol/L), TC (mmol/L), SBP (mm Hg), DBP (mm Hg), smoking status, albuminuria or reported albuminuria. All continuous variables were normalized by inverse normal transformation. For sample characteristics for the higher versus lower CVO risk identified clusters in the Caucasian and Latino populations see Supplementary Table S1 and Figure 10.
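For readers unfamiliar with the Kaplan-Meier estimate referred to above, a minimal product-limit sketch is given here. It is not the STATA procedure used for the reported analyses; the function name and the follow-up data are hypothetical, with an event flag of 1 marking an observed event and 0 marking censoring.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time per patient
    events: 1 if the event was observed at that time, 0 if censored
    Returns [(time, survival probability)] at each time with at least one event.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up (in years) for five patients of one cluster.
print(kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
```

Computing such a curve per cluster and comparing the curves is what the log-rank test formalises.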
Box- plots of the original biomarker levels were produced for visual inspection of differences in BMK levels between clusters (Figure 9). Analyses were done using STATA Version 15. Results

ORIGIN GWAs results

In the GWAs conducted with the ORIGIN cohort, the inventors identified a few variants associated or borderline associated with CVD (Figure 11). These were mostly rare variants (MAF = 0.01 in EUR), or did not map to a gene region (Supplementary Table S2). These variants were not present in the CARDIOGRAM dataset, and the inventors could not validate these findings using the Cardiovascular Disease Knowledge Portal (http://broadcvdi.org/home/portalHome). The inventors mapped GWAs results to nearby loci (Supplementary Table S2) to inform the network analysis, identifying biological interactions of these loci with other CVD biomarkers (Supplementary Table S3). Results from the GWA analyses are shown in Table 1 for loci that were prioritized in the network analyses; all other findings are reported in Figure 6 and Supplementary Table S2.

CVD disease network prioritization

The inventors built a CVD network (Figure 7) based on proteins reported to be associated with CVO or death and the loci associated with CVD in the ORIGIN CVO trial (Supplementary Table S3), with the aim of identifying direct regulators of disease that are statistically overconnected. Of the 8 top ranking loci (GHR, RORA, CLINT1, GRM8, LOC101928784, MYH16, SPOCK1, and TMTC2) from the CVD GWAs in the ORIGIN cohort (Supplementary Table S2), only 2 (RORA and GHR; Figure 6) were prioritized in the overconnectivity analyses (Figure 2). To validate the main networks identified (Figure 2), the inventors re-ran these analyses with an independent GWAs dataset from the CARDIOGRAM consortium (Supplementary Table S2 and Figure 7 for study design). Overconnectivity analysis pointed to 14 (12 in the replication study; Figure 8) out of the 16 protein BMKs (Supplementary Table S3) that were earlier identified to be associated with CVO in the ORIGIN cohort (Table 1). These were part of the top ranking significant networks (Supplementary Table S4) in the discovery and replication analyses (Figures 7 and 8). Of note, although 90 other candidate genes were fed as input to the network calculations in the replication study (Supplementary Table S3), only 2 protein BMKs (NPPB and TFF3) and two genetically associated genes (RORA and GHR) were specific to the sub-network identified in the discovery analyses (Figure 8).
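As an illustration of what "statistically overconnected" can mean, the Python sketch below scores a candidate regulator with a hypergeometric tail probability. This is one common way to quantify enrichment of a node's interaction partners in an input list; whether it matches the algorithm actually used in the overconnectivity analyses is an assumption, and all numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n_total, n_marked, n_drawn):
    """P(X >= k) for X ~ Hypergeometric(n_total, n_marked, n_drawn).

    n_total : number of nodes in the background network
    n_marked: number of nodes in the input list (BMKs and GWAs loci)
    n_drawn : number of interaction partners of the candidate regulator
    k       : number of those partners that are in the input list
    """
    upper = min(n_marked, n_drawn)
    denom = comb(n_total, n_drawn)
    hits = sum(
        comb(n_marked, i) * comb(n_total - n_marked, n_drawn - i)
        for i in range(max(k, 0), upper + 1)
    )
    return hits / denom


# Hypothetical regulator: 4 partners, 3 of them among 3 input nodes
# in a background of 10 nodes.
p = hypergeom_tail(3, 10, 3, 4)
```

A small tail probability indicates that the regulator has more partners in the input list than expected by chance.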

Patient stratification linked to CVO progression

The prioritized sub-networks and their molecular directional interactions (Figure 2; Supplementary Table S5 for interactions with the respective supporting references) were used to inform the subsequent patient stratification analyses, as specified in the workflow (Figure 1; see Methods). The clustering of the calculated molecular activities was agnostic to CVO: 3 main gene clusters (A, B, C in Figure 3, panel a) and 2 main patient clusters were identified (Figure 3, panel a) in Caucasians (with 1,059 and 849 patients) and replicated in Latinos (with 1,078 and 1,068 patients). The pathway activity for the higher CVO risk patients was clearly repressed in gene cluster A, and activated in gene cluster B, as for most of the patients in cluster C (Figure 3 and Table 1). The protein BMKs were previously reported to be associated with CVO or death from all causes in the ORIGIN cohort (Gerstein et al., 2015).

A relation of the clusters to the 1st and 2nd co-primary composites of CVO was found in a second step through survival analyses (Kaplan-Meier survival estimates and log-rank analyses to test for equality of survival functions) and Cox-regression models adjusted for CVO risk factors (Figure 3, panels b-c; and Figure 4, panels a-c). Similarly, patient clusters were also associated with myocardial infarction, stroke, cardiovascular death, and heart failure with hospitalization, in addition to death from all causes (Figure 9 for Kaplan-Meier survival estimates), with the higher risk cluster having on average about 2 times greater chance of event occurrence than the lower risk cluster (Figure 3, panels d-e).
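For readers unfamiliar with the Kaplan-Meier estimate used in these survival analyses, a minimal Python sketch of the product-limit estimator is given below. It is a generic textbook implementation, not the STATA routine used by the inventors, and the follow-up data in the example are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times : follow-up time for each patient
    events: 1 if the CV event was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


# Hypothetical cluster of four patients; the second one is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Computing one such curve per patient cluster and comparing the curves is what the log-rank test formalizes.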

The distribution of CVO risk factors (Supplementary Table 1 and Figure 11) was mostly similar between the higher and lower CVO risk patient clusters in both Caucasians and Latinos (Supplementary Table 1). In a logistic regression model including age, sex, BMI, HbA1c, c-Peptide, HDL-C, LDL-C, TG, TC, SBP, DBP, smoking status, albuminuria or reported albuminuria, only albuminuria, total cholesterol and age were relevant to separate lower versus higher CVO risk clusters in both Caucasians and Latinos (Supplementary Table 6). Smoking habits differed only between the clusters identified in Caucasians, while gender and BMI only contributed to separating clusters in Latinos (Supplementary Table 6).

To test how relevant the clusters were to predict CVO after adjusting for these known CVO risk factors, the inventors used Cox-regression models (Figure 3 and in more detail in Figure 4). Hazard ratios (HR) for the 2nd co-primary composites of CVO were largest for gender (women had a reduced HR in relation to men) and cluster (the lower CVO risk cluster was protective) in both the Caucasian and Latino groups (Figure 4, a-b). Smoking had a large effect in Caucasians, but a modest one in Latinos. Consistently across populations, HDL, TC, SBP and albuminuria were also relevant, but contributed to a lesser extent than gender and cluster. Above 80 percent of the population was diabetic (Supplementary Table 1); HbA1c was more relevant for the HR in Latinos, while C-peptide was more relevant in Caucasians (Figure 4, a-b). The inventors repeated this analysis for all outcomes (Figure 3, d-e) for the combined Caucasian and Latino populations, as presented in Figure 4 (panel c) for the 1st co-primary composites of CVO, the results of which were similar to analyses with the 2nd co-primary composites of CVO. To test whether the association of the clustering with CVO survival was not random, the inventors repeated the analyses shown in Figure 3 for sub-clusters of the initially identified clusters, with consistent results (Supplementary Figures 12 and 13). Average levels of protein BMKs in the higher versus lower CVO risk cluster (Figure 10) were as expected, corresponding to the direction of associations of these proteins with CVO in the ORIGIN cohort (Table 1). Also, measured levels of protein BMKs in the higher versus lower CVO risk cluster mostly corresponded to higher versus lower calculated activity for the respective molecules (Figure 3, panel a; Table 1), except for NT-proBNP. Thereby, calculated activities do not necessarily reflect levels of BMKs, as activity will be inferred (e.g., how likely is it that the activity of protein x is de-activated?) based on known molecular interactions. Although NT-proBNP levels were higher in the higher CVO risk patient cluster compared to the lower CVO risk patients (Figure 10), the inferred activity for this gene in the network was lower in the higher CVO risk patient group than in the lower risk patient group (Table 1).
When recalculating the network activities using NT-proBNP levels as input for a sub-set of patients who had its levels measured, levels of the protein per cluster were equivalent to its levels in similar clusters when calculating the network activity without using NT-proBNP levels (Figure 14). Similarly, in the absence of TNFalpha measurements for the ORIGIN trial, the network activity was calculated taking into consideration only the neighboring molecules, which indicated that TNFalpha was less active in higher CVO risk patients, although levels of the actual protein could have been high. The molecular activities of IL6 and RelA NF-κB, also upstream regulators of the networks, were estimated to be higher in the higher CVO risk group (Figure 3, Table 1). These networks were less informative than that of TNFalpha, which already contained all interacting molecules (Figure 2), but as IL6 and RelA NF-κB were central to those networks, their activities could be calculated.

Discussion

Here, the inventors developed a workflow that goes from the identification of a CVD network to the proof of concept that the molecular interactions identified can be used to stratify patients with regard to disease progression. The inventors' computational biology approach has identified the group of biomarkers associated with CVD outcomes and has, for the first time, associated SNPs with CVD risk. The inventors' work has identified molecular interactions and shows interdependencies of molecular activities that associate with different stages of CVD progression.

These results may enable the identification of individuals who are at risk of suffering a CVD event, and thus enable early intervention to prevent the individual from suffering a CVD event, or to reduce the risk thereof. In particular, detection of each biomarker in isolation enables the identification of individuals who are at risk of suffering a CVD event, and detection of multiple biomarkers, i.e. the biomarker signature, provides a particularly effective means of enabling early intervention to prevent the individual from suffering a CVD event, or to reduce the risk thereof.

This is the first evidence for the reported GHR and RORA variants to be associated with CVD, and Example 2 sets out the method of determining a subject’s risk of cardiovascular disease based on detection of SNPs in the GHR and RORA genes. Though the genetic evidence of association with CVD is not strong, RORA is known to regulate a number of genes involved in lipid metabolism such as apolipoproteins AI, APOA5, CIII, CYP71 and PPARgamma, possibly working as a receptor for cholesterol or one of its derivatives (UniProt). Additionally, overexpression of RORA isoforms suppresses TNFalpha-induced expression of adhesion molecules in human umbilical vein endothelial cells, regulating the inflammatory response (Migita et al., 2004). The inventors' network analyses show that RORA and GHR interact with the main regulatory hubs of the identified CVD network of BMKs. Consistent with that, their genotypes contributed to calculations of the network activity and to the clustering of patients, which were in turn related to CVD progression.

RORA and GHR connected to the main networks through upstream regulators (TNFalpha and IL6) common to the prioritized molecules. Although TNFalpha and IL6 were not part of the input information to create the network, the inventors could uncover these as hidden main regulators of the input molecules. Links of TNFalpha and IL6 to CVD are known (Lopez-Candales et al., 2017), but the network allows the visualization of the directions of the molecular interactions (Figure 2). These were used to inform the Bayesian statistics method to calculate the patient-specific pathway activity, thereby providing a summarization of different omics layers as more robust signatures for elucidating meaningful patient subgroups and understanding mechanistic interactions.

One of the advantages of the Bayesian network analyses over single BMK analyses is that the activity of BMKs that were not measured or had poor measurement quality in the patient sample (for instance TNFalpha and IL6 in the ORIGIN trial) can be estimated, especially if these are central to the disease network. This can lead to the discovery of new BMKs of disease progression. TNFalpha and IL6 have been intensively studied in CVD models, but in humans their protein detection is somewhat cumbersome and their levels are also influenced by physical activity (Vijayaraghava et al., 2017).

Here, the inventors could infer their activities based on downstream interacting proteins, whose measured levels were fed into the Bayesian model (Figure 2). The calculated activities may be interpreted as activation of the molecule, rather than as an actual measure of the levels of the protein, i.e., a transcription factor is calculated to be active or not based on downstream targets, not only on the measured protein levels. For instance, the TNFalpha pathway was less active in patients at higher risk of CV events. This may at first appear contradictory, as in functional studies inhibition of TNFalpha had beneficial effects on cardiac function and outcome. Nevertheless, most prospective studies with TNFalpha inhibitors, which bind to TNFalpha directly, reported disappointing and inconsistent results for CVO risk. For instance, in rheumatoid arthritis (RA) patients, it is the control of the disease itself (e.g., the inflammatory pathway) rather than the reduction of TNFalpha levels per se that is necessary for CVO prevention (Coblyn et al., 2016). Thereby, specific protein interactions should be relevant to the inflammatory process and TNFalpha is just a piece of the puzzle. In this context, IL6, which is also an inflammatory marker, was predicted to be more active in the higher CVO risk cluster in the inventors' study. The transcription factor nuclear factor kappa-B, which was predicted to be more active in the higher CVO risk cluster in the inventors' study, regulates expression of many proinflammatory cytokines and is cardioprotective during acute hypoxia and reperfusion injury. As for TNFalpha, the activity of NT-proBNP, a cardiac hormone that may function as a paracrine antifibrotic factor in the heart, was predicted to be lower in patients at higher CVO risk. Here, the inventors could confirm that the majority of these patients actually had higher levels of the protein than those less prone to CVO.
BNP and NT-proBNP peptides are released into the blood circulation in response to pressure and volume overload of the cardiac chambers. Cleavage of the pre-proBNP precursor within cardiomyocytes leads to the formation of proBNP, which is subsequently cleaved into N-terminal (NT-proBNP) and C-terminal (BNP) fragments. Most biological effects of BNP are the result of its binding to the natriuretic peptide receptor (Dhingra 2002). Thereby, the calculated activity of NT-proBNP reflects molecular interactions that indicate whether it is active, rather than the actual protein levels. The inventors, without wishing to be bound to any particular theory, hypothesize that although NT-proBNP levels are high in patients at higher CVO risk, it is not as effective as it should be.

NPPB and TFF3 were the top ranking associated BMKs by conventional statistics in the ORIGIN cohort, but did not appear in the network constructed with the CARDIOGRAM gene set. As the overconnectivity algorithm identifies the shortest path connecting genes, the inventors, without wishing to be bound to any particular theory, can conclude that: i) there is an active biological network that connects GWAs-associated genes to protein biomarkers that had been associated with CVD, in which TNFalpha plays a central role; ii) though NPPB and TFF3 clearly are relevant biomarkers of CVD, these appear to be further downstream in the cascade of events related to TNFalpha than the gene set identified in the CARDIOGRAM study. Nevertheless, for the purpose of identifying a disease-related network for BMK discovery, protein BMKs should be more relevant than loci associated with CVD, as GWA results often map to genes that do not encode circulating proteins. In the patient stratification analyses, the inventors used mainly circulating proteins (N = 13 plus 2 genetic markers), making it feasible for translational application, as tissue-specific samples are often difficult or not feasible to obtain in standard clinical practice.

Though the complexity of this computational approach does not make it a straightforward process to be used in the clinic, the obtained patient strata can be further investigated to identify ideal combinations and ratios of BMKs (including known clinical parameters) that would represent different stages of disease progression.

Identifying clusters of sub-populations progressing differently towards CVO will be informative to: i) define how the molecular signatures of each cluster translate into combinations of biomarkers with prognostic value, and ii) define how these markers will respond to treatment in an additional cohort. Thereby, molecular signatures of disease progression should lead to strategies to optimize clinical trial plans and drug efficacy, and so to optimize treatment.

Supplementary TABLE S1. Study sample characteristics for the total sample, and high versus low CVO risk clusters for the Caucasian and Latino sub populations.

Supplementary TABLE S2. List of molecular entities used in the overconnectivity analyses. Results from the GWA analyses conducted in ORIGIN and in the CARDIOGRAM consortium were assigned to loci (see Methods) and protein biomarkers to their respective genes.

Supplementary TABLE S3. List of associated loci included in the overconnectivity network analyses. Identified loci at P < 10⁻⁶ in the ORIGIN cohort and P < 10⁻⁸ in the CARDIOGRAM consortium were entered in the analyses. Attached Excel file.

Supplementary TABLE S4. List of sub-networks identified in the overconnectivity analyses with their respective p-values.

Supplementary TABLE S5. Overconnectivity network result table. Nodes and relationships of entities comprising the networks used for the Bayesian statistics based network analyses.

Supplementary TABLE S6. Logistic regression model to identify standard CVO risk factors (age, sex, BMI, HbA1c, c-Peptide, HDL-C, LDL-C, TG, TC, SBP, DBP, smoking status, albuminuria or reported albuminuria) contributing to cluster separation in Caucasians and Latinos.

Example 2 - detection of SNPs associated with CVD risk

Having identified two SNPs associated with an individual having an increased risk of cardiovascular disease (CVD), i.e. Reference SNP cluster ID: rs73420079 (SEQ ID No: 3) and Reference SNP cluster ID: rs4314405 (SEQ ID No: 40), the inventors' work enables the identification of individuals with an increased risk of CVD using oligonucleotide probes designed to detect the SNPs.

Oligonucleotide probes for detecting the presence of the SNPs can be designed and synthesized using any available oligonucleotide probe design tool, based on the SNP rs number. For example, probes for the SNPs can be submitted to the Illumina® Assay Design Tool for scoring, based on the rs number format, to produce an assay-ready probe.

A sample, which can be a blood sample, may be isolated from the patient. The individual’s nucleic acid is isolated from the sample. The isolation may occur by any means convenient to the practitioner. For instance, the isolation may occur by first lysing the cells using detergents, enzymatic digestion or physical disruption. The contaminating material is then removed from the nucleic acids by use of, for example, enzymatic digestion, organic solvent extraction, or chromatographic methods. The individual’s nucleic acid may be purified and/or concentrated by any means, including precipitation with alcohol, centrifugation and/or dialysis. The individual’s nucleic acid is then assayed for the presence or absence of one or more of the SNPs using oligonucleotide probes that are capable of hybridizing to a nucleic acid sequence comprising one or more of the SNPs.

The detection of Reference SNP cluster ID: rs73420079 (SEQ ID No: 3) and/or Reference SNP cluster ID: rs4314405 (SEQ ID No: 40) in the sample indicates that the individual is at an increased risk of CVD.