Title:
USE OF A TRANSCRIPTOMIC SIGNATURE BASED ON HERVs EXPRESSION TO CHARACTERIZE NEW ACUTE MYELOID LEUKEMIA SUBTYPES
Document Type and Number:
WIPO Patent Application WO/2023/094639
Kind Code:
A1
Abstract:
The invention concerns the use of a transcriptomic signature based on HERVs expression to characterize new AML subtypes, and a method to determine to which AML subtype a patient pertains. The method comprises providing relationship between said 9 AML subtypes and HERVs characterized by their specific herv_id and their relationship with one of these AML subtypes, determining from a patient cell sample HERVs expression profile, determining which of the 9 AML subtypes is the most represented based on HERV expression in said cell sample, and attributing to the patient the most represented AML subtype among the 9 AML subtypes. The invention allows identifying patients with medium good or bad prognosis and treating the same with a cancer therapy against AML.

Inventors:
DEPIL STÉPHANE (FR)
ALCAZER VINCENT (FR)
Application Number:
PCT/EP2022/083376
Publication Date:
June 01, 2023
Filing Date:
November 25, 2022
Assignee:
ERVACCINE TECH (FR)
CENTRE LEON BERARD (FR)
UNIV CLAUDE BERNARD LYON (FR)
INST NAT SANTE RECH MED (FR)
CENTRE NAT RECH SCIENT (FR)
International Classes:
C12Q1/6886
Other References:
ALCAZER V ET AL: "Human endogenous retroviruses characterize the different subpopulations of normal and leukemic cells and represent a source of epitopes for cancer immunotherapy in acute myeloid leukemia", vol. 5, no. s2, 1 January 2021 (2021-01-01), US, pages 12 - 13, XP055921029, ISSN: 2572-9241, Retrieved from the Internet DOI: 10.1097/HS9.0000000000000566
SUNIL KUMAR SAINI ET AL: "Abstract", NATURE COMMUNICATIONS, vol. 11, no. 1, 1 November 2020 (2020-11-01), XP055750761, DOI: 10.1038/s41467-020-19464-8
A.S. ATTERMANN ET AL: "Human endogenous retroviruses and their implication for immunotherapeutics of cancer", ANNALS OF ONCOLOGY, vol. 29, no. 11, 1 November 2018 (2018-11-01), NL, pages 2183 - 2191, XP055750945, ISSN: 0923-7534, DOI: 10.1093/annonc/mdy413
BENDALL MATTHEW L ET AL., PLOS COMPUTATIONAL BIOLOGY, vol. 15, no. 9, 30 September 2019 (2019-09-30), pages e1006453
CONSORTIUM IHGS: "Initial sequencing and analysis of the human genome", NATURE, vol. 409, no. 6822, 15 February 2001 (2001-02-15), pages 35057062
JOHNSON WE: "Origins and evolutionary consequences of ancient endogenous retroviruses", NAT REV MICROBIOL., vol. 17, no. 6, June 2019 (2019-06-01), pages 355 - 70, XP036781356, DOI: 10.1038/s41579-019-0189-2
VARGIU L ET AL.: "Classification and characterization of human endogenous retroviruses; mosaic forms are common", RETROVIROLOGY, vol. 13, no. 1, 22 January 2016 (2016-01-22), pages 7
KASSIOTIS G, STOYE JP: "Immune responses to endogenous retroelements: taking the bad with the good", NAT REV IMMUNOL., vol. 16, no. 4, 2016, pages 207 - 19, XP002787793, DOI: 10.1038/nri.2016.27
ALCAZER V ET AL.: "Human Endogenous Retroviruses (HERVs): Shaping the Innate Immune Response in Cancers", CANCERS (BASEL), vol. 12, no. 3, 6 March 2020 (2020-03-06)
LAROUCHE J-D ET AL.: "Widespread and tissue-specific expression of endogenous retroelements in human somatic tissues", GENOME MED, vol. 12, no. 1, 2020, pages 1 - 16
BURNS KH: "Transposable elements in cancer", NAT REV CANCER., vol. 17, no. 7, 2017, pages 415 - 24
ATTERMANN AS ET AL.: "Human endogenous retroviruses and their implication for immunotherapeutics of cancer.", ANN ONCOL., vol. 29, no. 11, 2018, pages 2183 - 91, XP055750945, DOI: 10.1093/annonc/mdy413
DE KOUCHKOVSKY I, ABDUL-HAY M: "Acute myeloid leukemia: a comprehensive review and 2016 update", BLOOD CANCER JOURNAL, vol. 6, no. 7, 2016, pages e441 - e441, XP002793132, DOI: 10.1038/bcj.2016.50
HEROLD T ET AL.: "A 29-gene and cytogenetic score for the prediction of resistance to induction treatment in acute myeloid leukemia", HAEMATOLOGICA, vol. 103, no. 3, 2018, pages 456 - 65
ALEXANDROV LB ET AL.: "Signatures of mutational processes in human cancer", NATURE, vol. 500, no. 7463, 22 August 2013 (2013-08-22), pages 415 - 21, XP055251628, DOI: 10.1038/nature12477
SMITH CC ET AL.: "Alternative tumour-specific antigens.", NAT REV CANCER., vol. 19, no. 8, 2019, pages 465 - 78, XP037114954, DOI: 10.1038/s41568-019-0162-4
BRODSKY I ET AL.: "Expression of HERV-K proviruses in human leukocytes.", BLOOD, vol. 81, no. 9, 1 May 1993 (1993-05-01), pages 2369 - 74
DEPIL S ET AL.: "Expression of a human endogenous retrovirus, HERV-K, in the blood cells of leukemia patients", LEUKEMIA, vol. 16, no. 2, 2002, pages 254 - 9, XP037780122, DOI: 10.1038/sj.leu.2402355
TOBIASSON M ET AL.: "Comprehensive mapping of the effects of azacitidine on DNA methylation, repressive/permissive histone marks and gene expression in primary cells from patients with MDS and MDS-related disease", ONCOTARGET, vol. 8, no. 17, 25 April 2017 (2017-04-25), pages 28812 - 25
KAZACHENKA A ET AL.: "Epigenetic therapy of myelodysplastic syndromes connects to cellular differentiation independently of endogenous retroelement derepression", GENOME MEDICINE, vol. 1, 23 December 2019 (2019-12-23), pages 86
DENIZ O ET AL.: "Endogenous retroviruses are a source of enhancers with oncogenic potential in acute myeloid leukaemia", NAT COMMUN., vol. 11, no. 1, 2020, pages 3506
BENDALL ML ET AL.: "Telescope: Characterization of the retrotranscriptome by accurate estimation of transposable element expression", PLOS COMPUT BIOL., vol. 15, no. 9, 2019, pages e1006453
CORCES, M.R. ET AL.: "Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution", NAT. GENET., vol. 48, 2016, pages 1193 - 1203, XP055651307, DOI: 10.1038/ng.3646
SONDKA Z. ET AL.: "The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers", NAT. REV. CANCER, vol. 18, 2018, pages 696 - 705, XP036619382, DOI: 10.1038/s41568-018-0060-1
TYNER J.W. ET AL.: "Functional genomic landscape of acute myeloid leukaemia", NATURE, vol. 562, 2018, pages 526, XP036900261, DOI: 10.1038/s41586-018-0623-z
LANGMEAD, B., SALZBERG, S.L.: "Fast gapped-read alignment with Bowtie 2", NAT. METHODS, vol. 9, 2012, pages 357 - 359, XP002715401, DOI: 10.1038/nmeth.1923
LI, H., HANDSAKER: "The Sequence Alignment/Map format and SAMtools.", BIOINFORMATICS, vol. 25, 2009, pages 2078 - 2079, XP055229864, DOI: 10.1093/bioinformatics/btp352
ANDERS, S., PYL, P.T., HUBER, W.: "HTSeq--a Python framework to work with high-throughput sequencing data", BIOINFORMATICS, vol. 31, 2015, pages 166 - 169
LOVE, M.I., HUBER, W., ANDERS, S.: "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2", GENOME BIOL., vol. 15, 2014, pages 550, XP021210395, DOI: 10.1186/s13059-014-0550-8
LIBERZON A. ET AL.: "The Molecular Signatures Database Hallmark Gene Set Collection", CELL SYST., vol. 1, 2015, pages 417 - 425
LOEFFLER-WIRTH H ET AL.: "A modular transcriptome map of mature B cell lymphomas.", GENOME MED., vol. 11, 2019, pages 27
HUBERT, M. ET AL.: "IFN-III is selectively produced by cDC1 and predicts good clinical outcome in breast cancer", SCI. IMMUNOL., vol. 5, 2020
THORSSON ET AL.: "The Immune Landscape of Cancer", IMMUNITY, vol. 48, 2018, pages 812 - 830
ZHU, A., BIOINFORMATICS, vol. 35, pages 2084 - 2092
"The Cancer Genome Atlas Research Network (2013). Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia", N. ENGL. J. MED., vol. 368, pages 2059 - 2074
H. KANTARJIAN ET AL., BLOOD CANCER JOURNAL, vol. 11, 2021, pages 41
Attorney, Agent or Firm:
ICOSA (FR)
Claims:

1. A method for attributing an AML patient to an AML subtype among 9 AML subtypes characterized by their specific HERVs listed in Table 1 with the indication of their herv_id, the method comprising providing relationship between said 9 AML sub-types and HERVs characterized by their specific herv_id and their relationship with one of these AML subtypes, as set forth in Table 1, determining from a patient cell sample HERVs expression profile, determining which of the 9 AML subtypes is the most represented based on HERV expression in said cell sample, and attributing to the patient the most represented AML subtype among the 9 AML subtypes.

2. The method of claim 1, comprising
a. determining from a patient’s sample the expression value of the 703 HERVs listed in Table 1, or of a sub-part comprising the HERVs with a coefficient > 1.2 and those with a coefficient < 0.8, as indicated in Table 1,
b. multiplying each HERV expression value by the coefficient attributed to the corresponding HERV in Table 1,
c. for each of the 9 AML subtypes, calculate their score as the mean of each HERV expression specific to the subtype, and
d. attributing to the patient the AML subtype with the highest score among the 9 AML subtypes.

3. The method of claim 2, wherein step a. comprises performing RNA-Seq, preferably next generation sequencing (NGS), in a sample of a patient, preferably a bulk bone marrow sample, method in which RNA from the sample is fragmented and the fragments are reverse transcribed into cDNA fragments, or the RNA is reverse transcribed to cDNA and then fragmented to get cDNA fragments.

4. The method of claim 2 or 3, wherein in step a. cDNA fragments are sequenced and aligned back to a pre-sequenced reference human genome or human genome reference, using a sequence aligner, these alignments are tested for overlap with said HERVs’ sequences, and the number of overlap reads mapped to a gene is registered for each HERVs’ sequence giving its expression value.

5. The method of any one of the preceding claims, further attributing a prognosis to the patient relative to AML, say, if the patient is attributed AML subtype 1 or 9, prognosis is good, if he is attributed AML subtype 2 or 7, prognosis is medium good, and if he is attributed AML subtype 8, 4, 3, 6 or 5, prognosis is bad.

6. The method of claim 5, further comprising the recommendation of treating said patient with a cancer therapy against AML, preferably an aggressive therapy, when the patient preferably intensified chemotherapy or an alternative therapy through enrollment into a clinical trial for a novel therapy, when the patient is attributed AML subtype 2 or 7, or AML subtype 8, 4, 3, 6 or 5.

7. The method of claim 5, further comprising the recommendation of treating said patient with an aggressive therapy, preferably intensified chemotherapy or an alternative therapy through enrollment into a clinical trial for a novel therapy, if the patient is attributed AML subtype 8, 4, 3, 6 or 5, or with a less aggressive therapy, preferably standard chemotherapy, if the patient is attributed AML subtype 9, 1, 2 or 7.

8. Use of an anticancer drug for treating a subject against AML, wherein the subject had been previously identified as having an AML with a medium good or a bad prognosis by use of the method of claim 5.

Description:
USE OF A TRANSCRIPTOMIC SIGNATURE BASED ON HERVs EXPRESSION TO CHARACTERIZE NEW ACUTE MYELOID LEUKEMIA SUBTYPES

FIELD OF INVENTION

[0001] The present invention concerns the use of a transcriptomic signature based on Human endogenous retroviruses (HERVs) expression to characterize new acute myeloid leukemia (AML) subtypes, and a method to determine to which AML subtype a patient pertains.

BACKGROUND OF INVENTION

[0002] Human endogenous retroviruses (HERVs) represent 8% of the human genome (1). These sequences are remnants of ancestral germline infections by exogenous retroviruses (2). The original sequence of a HERV is that of an exogenous retrovirus, with two promoter long-terminal repeat (LTR) sequences surrounding the virus open-reading frames (ORFs): gag, pro, pol and env (3). However, after millions of years of evolution, these ORFs have been deeply altered, and there is currently no description of any autonomous fully infectious HERV (4).

[0003] The long-standing belief is that HERVs are repressed by epigenetic mechanisms and are thus not expressed, or only poorly, in normal tissues (5). However, recent studies have shown that HERV expression can be detected in a vast range of normal tissues (6). Different pathological conditions can lead to aberrant HERV expression, as it has now been largely described in auto-immune diseases (4) and in cancers (7), where HERVs have been the subject of many studies over the last years. Indeed, it was reported that HERVs could participate in oncogenesis by inducing chromosomal instability, promoting aberrant gene expression with their LTR or by impacting the immune system with their RNA and protein products (7). HERVs could thus play a prominent role in cancer immunity, increasing tumor immunogenicity by promoting (i) an innate immune response triggered by the viral defense pathway induced by their nucleic acid intermediates, and (ii) an adaptive immune response by forming a pool of tumor-associated antigens (8).

[0004] Acute Myeloid Leukemia (AML) is a heterogeneous disease characterized by the clonal expansion of myeloid progenitor and stem cells (9). While some AML subtypes are characterized by recurrent genetic translocations or mutations associated with particular prognoses, most AMLs present a normal or complex karyotype, and identifying key factors that predict treatment resistance in these patients represents a major challenge (9, 10). Aside from disease stratification, AML also belongs to malignancies with the lowest mutational burdens (11), and finding tumor-specific antigens for immunotherapeutic approaches remains very difficult as the frequency of mutations creating neoantigens is expected to be low. In this context, HERV-derived antigens could represent a unique source of non-conventional epitopes that could be exploited for the development of new immunotherapies (12).

[0005] To date, little is known about the expression of HERVs in AML and its relevance as either a biomarker or a therapeutic target. Evidence of HERV-K/HML-2 expression in AML cells was shown as early as 1993 and confirmed in the early 2000s (13, 14). Few studies then focused on HERVs in AML until the late 2010s, with the demonstration that azacytidine (Aza) activates the transcription of different HERVs, potentially contributing to its clinical effects (15). The exact role of HERVs in Aza therapy is however a matter of debate, with recent evidence arguing in favor of a HERV-independent therapeutic effect (16). More recently, a link was established between HERVs and the expression of surrounding genes in AML, suggesting a regulatory role of these retroelements (17). However, few data exist on HERV expression and its immune impact in AML, with studies relying on non-exhaustive quantification methods, such as polymerase chain reaction (PCR), or focusing only on a few HERV loci.

SUMMARY OF THE INVENTION

[0006] In this invention, we established different signatures based on HERV expression from several RNA sequencing datasets to characterize different AML subtypes. 9 clusters or AML subtypes have been characterized. Relying only on HERV expression, it was possible to define these new AML subtypes independently of karyotype or molecular alterations. These results suggest that HERVs can be used to improve risk stratification and treatment resistance prediction, in particular in patients with no genetic or molecular abnormalities associated with well-defined prognosis and resistance profiles. These 9 subtypes were associated with significant differences in overall survival (OS) and Hazard ratio (HR) among intensively treated patients, independently of established prognosis factors such as age, ELN2017 and white blood cell count, integrated in a multivariate Cox model. These clusters also presented distinct cancer hallmark profiles, as assessed by single sample gene-set variation analysis (GSVA) based on cancer hallmark signatures.

[0007] It is first provided 9 AML subtypes defined by representative HERVs as disclosed in Table 1. The first column of Table 1 indicates the cluster number (1, 2, 3, 4, 5, 6, 7, 8 and 9) to which the HERV of column 2 pertains. It is then provided a method for attributing a patient to an AML subtype among 9 AML subtypes characterized by their specific HERVs listed in Table 1 with the indication of their herv_id and of their locus in the GRCH38 version of the human genome.

[0008] The present invention thus relates to a method for attributing an AML patient to an AML subtype among 9 AML subtypes characterized by their specific HERVs listed in Table 1 with the indication of their herv_id (and of their locus in the GRCH38 version of the human genome). The method comprises considering or providing the relationship between said 9 AML sub-types and HERVs characterized by their specific herv_id, their locus in the GRCH38 version of the human genome and their relationship with one of these AML subtypes, as set forth in Table 1. The method comprises determining the HERV expression profile in a patient cell sample. The expression profile is the expression level for HERVs as set forth in Table 1. This step makes it possible to determine which of the 9 AML subtypes (or the HERVs corresponding to this subtype) is the most represented based on HERV expression in the sample cells. In particular, HERV expression is related to, and determinable by, HERV RNA in the cell sample, such as a bone marrow sample. Based on Table 1 and HERV expression, the method comprises attributing to the patient an AML subtype among the 9 AML subtypes. As explained later, each subtype may correspond to a given prognosis, so that the method may further comprise attributing a particular prognosis, which is the one corresponding to the attributed sub-type. Also, as explained later, the method may comprise treating the patient with a cancer therapy suited to the prognosis resulting from the preceding steps.

[0009] The method comprises attributing to the patient an AML subtype among the 9 AML subtypes. This subtype is the most represented in the cell sample based on HERV expression profile. It is the most represented in the cell sample as determined by identifying and quantifying the HERVs pertaining to the sub-type. In an aspect, the method may comprise a. determining from a patient’s sample the expression value of the 703 HERVs listed in Table 1, or of a sub-part of these 703 HERVs, b. multiplying each HERV expression value by the coefficient attributed to the corresponding HERV in Table 1, c. for each of the 9 AML subtypes, calculate their score as the mean of each HERV expression specific to the subtype, and d. attributing to the patient the AML subtype with the highest score among the 9 AML subtypes.

DETAILED DESCRIPTION

[0010] For each of the 4 public datasets (AMLCG, TCGA, BEAT and LEUCEGENE), DESEQ2 VST normalized expression data were independently calculated and further center-scaled for each dataset to correct the potential batch effect. The top 2,000 most variable HERVs were then selected independently for each dataset based on the scaled DESEQ2 VST normalized count. The 4 datasets were then merged, keeping only the intersection of the top 2,000 candidate HERVs of each dataset, resulting in 961 variable HERVs conserved across the 4 datasets. Unsupervised hierarchical clustering guided by the average silhouette and Bayesian Information Criterion (BIC) evolution defined 9 clusters that were not dependent on the study and presented significant differences in terms of overall survival, hazard ratio and hallmarks of cancer (Figures 1 and 2). HERVs signatures were calculated by selecting the top expressed HERVs in each cluster compared to all the others. Only features with a fold change > 1.2 or < 0.8 and an adjusted p-value < 0.05 were retained, leading to a final list of 703 HERVs (Table 1). Final signatures were calculated by weighting each HERV by its fold change for each cluster.
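As an illustration of the selection steps of paragraph [0010], the following minimal Python sketch ranks HERVs by variance within each dataset, keeps the intersection of the per-dataset top lists, and applies the fold change and adjusted p-value thresholds. The function names, the dictionary-based data layout and the toy identifiers are assumptions made for illustration only; the normalization itself (DESEQ2 VST and center-scaling) is taken as already done upstream and is not reproduced here.

```python
from statistics import pvariance


def top_variable_hervs(expression, n_top):
    """Return the ids of the n_top HERVs with the largest variance across samples.

    expression maps a herv_id to its list of normalized values (one per sample).
    """
    ranked = sorted(
        expression,
        key=lambda herv_id: pvariance(expression[herv_id]),
        reverse=True,
    )
    return set(ranked[:n_top])


def shared_variable_hervs(datasets, n_top=2000):
    """Keep only the HERVs found in the top-variable list of every dataset."""
    per_dataset = [top_variable_hervs(expr, n_top) for expr in datasets.values()]
    return set.intersection(*per_dataset)


def filter_signature(features, padj_max=0.05):
    """Retain features with fold change > 1.2 or < 0.8 and adjusted p-value < padj_max.

    features maps a herv_id to a (fold_change, adjusted_p_value) pair.
    """
    return {
        herv_id: fold_change
        for herv_id, (fold_change, padj) in features.items()
        if (fold_change > 1.2 or fold_change < 0.8) and padj < padj_max
    }


# Toy data with hypothetical identifiers and values (not taken from Table 1).
toy_datasets = {
    "dataset_A": {"h1": [0, 10], "h2": [1, 1], "h3": [0, 5]},
    "dataset_B": {"h1": [2, 8], "h2": [0, 9], "h3": [3, 3]},
}
toy_shared = shared_variable_hervs(toy_datasets, n_top=2)
```

With the actual data, `n_top` would stay at its default of 2,000 and `datasets` would hold one entry per public dataset.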

[0011] Table 1 presents the 703 HERVs with their identification herv_id (or gene_id) and their respective locus in the GRCH38 version of the human genome. Using the herv_id, the open-source tool Telescope made available online (Bendall Matthew L et al., Telescope: Characterization of the retrotranscriptome by accurate estimation of transposable element expression, PLoS Computational Biology. 2019;15(9):e1006453 (September 30, 2019); https://journals.plos.org/ploscompbiol/article/comments?id=10.1371/journal.pcbi.1006453, https://github.com/mlbendall/telescope) and the genomic reference file in the international General Feature Format 2613_703HERVs_GRCH38_genomic_ref.gtf (available online at: github.com/VincentAlcazer/hervs_ref/blob/main/2613_703HERVs_GRCH38_genomic_ref.gtf), the skilled person has access to the exact genomic coordinates of each of the 703 HERV sequences in the GRCH38 version of the human genome, and consequently to their corresponding nucleotide sequences.
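By way of example, the coordinates of each HERV could be read from such a reference file with a few lines of Python. The sketch below assumes the standard nine-column, tab-separated GTF layout with 1-based inclusive coordinates, and assumes that the identifier is stored in a `gene_id` attribute; the helper name and the example records are hypothetical and do not come from the actual file.

```python
import re


def parse_herv_gtf(lines, id_attribute="gene_id"):
    """Map each HERV id to (chromosome, start, end, strand) from GTF lines.

    When an id appears on several records, the span is widened to cover them all.
    """
    pattern = re.compile(rf'{re.escape(id_attribute)} "([^"]+)"')
    loci = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = pattern.search(fields[8])
        if match is None:
            continue  # record without the expected identifier attribute
        herv_id = match.group(1)
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        if herv_id in loci:
            _, old_start, old_end, _ = loci[herv_id]
            start, end = min(start, old_start), max(end, old_end)
        loci[herv_id] = (chrom, start, end, strand)
    return loci


# Hypothetical records, for illustration only.
example_lines = [
    "# example header",
    'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "HERV_A"; transcript_id "HERV_A";',
    'chr1\texample\texon\t300\t400\t.\t+\t.\tgene_id "HERV_A";',
    'chr2\texample\texon\t50\t80\t.\t-\t.\tgene_id "HERV_B";',
]
example_loci = parse_herv_gtf(example_lines)
```

In practice the same function can be fed an open file handle, since it only iterates over lines.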

[0012] Thus, from any cell sample from an AML patient (bone marrow sample, blood sample containing white blood cells, for example) it is possible to determine the presence of specific HERVs RNAs that pertain to said 9 AML clusters or sub-types, to determine the dominant AML cluster or sub-type for the cell sample and the patient from which the cell sample originates, and then to attribute to the patient said AML cluster or sub-type. The dominant sub-type may be determined as the one for which the HERVs RNAs profile or number is the highest or the most significant among the 9 sub-types, as disclosed herein.

[0013] In an embodiment, HERVs RNAs are recovered, cDNAs are produced from RNAs, cDNA fragments are aligned to a reference human genome, followed by the determination of the number of cDNAs aligned to a sufficient number of the 703 HERVs listed in Table 1, or to a subset of the 703 HERVs, or to all the 703 HERVs. As a sub-set, one may use the HERVs with a coefficient > 1.2 and the HERVs with a coefficient < 0.8.

[0014] In an aspect, High-throughput sequencing with RNA, commonly referred to as RNA-Seq, involves mapping sequenced fragments of cDNA. In RNA-Seq, the RNA is fragmented and then reverse transcribed to cDNA, or is reverse transcribed and then fragmented. These fragments are then sequenced, producing reads that are aligned back to a pre-sequenced reference genome or human genome reference. The number of reads mapped to a gene is used to quantify its expression.

[0015] Thus, in an aspect, the method comprises performing RNA-Seq, preferably next generation sequencing (NGS), in a sample of a patient, preferably a bulk bone marrow sample, method in which RNA from the sample is fragmented and the fragments are reverse transcribed into cDNA fragments, or the RNA is reverse transcribed to cDNA and then fragmented to get cDNA fragments.

[0016] The size of the RNA fragments may vary in large proportion as known by the skilled person. Typically, RNA fragments may have a size of from 50 to 100 base pairs, e.g. about 75 base pairs.

[0017] Thus, the method may comprise first performing RNA-Seq, preferably next generation sequencing (NGS) in an AML patient. RNA-Seq, preferably NGS may be performed from bulk bone marrow sample. NGS can be either non-targeted (total or poly A RNA-seq) or targeted.

[0018] In an aspect, said cDNA fragments are sequenced and aligned back to a pre-sequenced reference human genome or human genome reference, using a sequence aligner, these alignments are tested for overlap with said HERVs’ sequences, and the number of overlapping reads mapped to each HERV sequence is registered, giving its expression value.
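The overlap test described above can be pictured with the following simplified Python sketch, which counts, for each HERV locus, the aligned reads whose interval intersects it. It is a naive illustration under stated assumptions (closed intervals, one alignment per read, hypothetical names and coordinates); it does not reproduce the handling of multi-mapped reads performed by a dedicated tool such as Telescope.

```python
def count_overlapping_reads(reads, loci):
    """Count the reads overlapping each HERV locus.

    reads: iterable of (chromosome, start, end) aligned read intervals.
    loci: dict mapping herv_id to (chromosome, start, end).
    Intervals are treated as closed, so sharing a single base counts as overlap.
    """
    counts = {herv_id: 0 for herv_id in loci}
    for chrom, read_start, read_end in reads:
        for herv_id, (locus_chrom, locus_start, locus_end) in loci.items():
            same_chrom = chrom == locus_chrom
            if same_chrom and read_start <= locus_end and read_end >= locus_start:
                counts[herv_id] += 1
    return counts


# Hypothetical loci and reads, for illustration only.
example_loci = {"HERV_A": ("chr1", 100, 200), "HERV_B": ("chr2", 50, 80)}
example_reads = [("chr1", 90, 120), ("chr1", 201, 250), ("chr2", 60, 70)]
example_counts = count_overlapping_reads(example_reads, example_loci)
```

The nested loop is quadratic and only suitable for small examples; an interval index would be needed at genome scale.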

[0019] Quantifying HERVs is performed from RNA-Seq or NGS data. This comprises aligning raw reads to the human genome reference using any sequence aligner, preferably a fast or ultrafast sequence aligner such as Bowtie2, with conservative parameters: --no-unal --score-min L,0,1.6 -k 100 --very-sensitive-local.
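For illustration, the alignment parameters listed above (written in Bowtie2 option syntax as --no-unal, --score-min L,0,1.6, -k 100 and --very-sensitive-local) can be assembled into a command line as sketched below. The sketch assumes single-end reads passed with `-U`, an index given with `-x` and SAM output written with `-S`; the index prefix and file names are hypothetical placeholders, and the command is only built, not executed.

```python
def bowtie2_command(index_prefix, fastq_path, sam_path):
    """Build a Bowtie2 argument list using the conservative parameters of the text."""
    return [
        "bowtie2",
        "--no-unal",                # do not report reads that fail to align
        "--score-min", "L,0,1.6",   # minimum alignment score function
        "-k", "100",                # report up to 100 alignments per read
        "--very-sensitive-local",   # sensitive preset in local alignment mode
        "-x", index_prefix,         # prefix of the reference genome index (assumed)
        "-U", fastq_path,           # single-end reads (assumption of this sketch)
        "-S", sam_path,             # SAM output file
    ]


# Hypothetical paths, for illustration only.
example_command = bowtie2_command("GRCh38_index", "patient_reads.fastq", "patient.sam")
```

The resulting list could then be handed to `subprocess.run(example_command, check=True)` on a machine where Bowtie2 and the index are installed.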

[0020] A relevant description of the method is described in reference (18), the whole content of which is incorporated herein by reference.

[0021] Quantifying HERVs expression is then made using a computer implemented method or an adequate software. The open-source tool Telescope (18) is a suitable one. A relevant description of the method is described in the previously mentioned reference (18).

[0022] Quantifying genes is also made with any suitable tool such as HTSeq or featureCounts.

[0023] Normalizing expression data taking genes raw count into account may then be realized.

[0024] 9 clusters or subtypes have been identified and the HERVs pertaining to each cluster are presented in Table 1, and the coefficient of each HERVs is also indicated in this Table.

[0025] The HERV-LSC score is calculated for each cluster (i.e. AML subtype): a. multiply each HERV’s normalized expression value by its coefficient provided in Table 1, b. for each cluster, calculate its score as the mean of the weighted HERV expression values, c. the cluster with the highest score is attributed to the patient.

[0026] The subtype is easy to calculate or determine and can be applied to any patient using next generation sequencing (NGS) data:

1. Perform NGS in any AML patient at diagnosis.
   a. NGS is performed from bulk bone marrow sample at diagnosis.
      i. NGS can be either non-targeted (total or polyA RNA-seq) or targeted.

2. Quantify HERVs from NGS data.
   a. Align raw reads to the human genome reference using any aligner such as Bowtie2 with conservative parameters: --no-unal --score-min L,0,1.6 -k 100 --very-sensitive-local
   b. Quantify HERVs using the open-source tool Telescope (18).
   c. Quantify genes (the 703 HERVs of Table 1) with any tool such as HTSeq or featureCounts.
   d. Normalize expression data taking genes raw count into account.

3. For each cluster (i.e. AML subtype), calculate its score according to Table 1:
   a. Multiply each HERV’s normalized expression value by the coefficient attributed to the corresponding HERV in Table 1.
   b. For each of the 9 AML subtypes or clusters, calculate their score as the mean of the weighted expression values of the HERVs specific to the subtype.
   c. The cluster or AML subtype with the highest score among the 9 is attributed to the patient.
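Step 3 can be sketched in Python as follows. The sketch assumes that the signature is available as a mapping from herv_id to a (cluster, coefficient) pair, that expression values are already normalized, and that HERVs missing from the sample are simply skipped; these choices, like the function names and toy values, are assumptions made for illustration and are not data from Table 1.

```python
from statistics import mean


def cluster_scores(expression, signature):
    """Return the score of each cluster for one patient.

    expression: dict mapping herv_id to its normalized expression value.
    signature: dict mapping herv_id to (cluster, coefficient).
    The score of a cluster is the mean of expression * coefficient over its HERVs.
    """
    weighted = {}
    for herv_id, (cluster, coefficient) in signature.items():
        if herv_id in expression:  # HERVs absent from the sample are ignored
            weighted.setdefault(cluster, []).append(expression[herv_id] * coefficient)
    return {cluster: mean(values) for cluster, values in weighted.items()}


def assign_subtype(expression, signature):
    """Attribute the cluster (AML subtype) with the highest score."""
    scores = cluster_scores(expression, signature)
    return max(scores, key=scores.get)


# Toy signature and expression profile with hypothetical identifiers and values.
toy_signature = {"h1": (1, 1.5), "h2": (1, 1.3), "h3": (2, 2.0)}
toy_expression = {"h1": 2.0, "h2": 1.0, "h3": 1.0}
toy_scores = cluster_scores(toy_expression, toy_signature)
toy_subtype = assign_subtype(toy_expression, toy_signature)
```

In the toy example, cluster 1 scores the mean of 3.0 and 1.3 while cluster 2 scores 2.0, so cluster 1 is attributed.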

[0027] In case not all HERVs are available, core-HERVs with a coefficient > 1.2 or < 0.8 should at least be present.

[0028] The AML subtypes correspond to specific Overall survival (OS) and/or Hazard ratio (HR). Thus, attributing an AML subtype to a patient does attribute a prognosis, in particular based on OS and/or HR.

[0029] In some cases, AML subtype 1 or 9 is attributed to the patient, with a good prognosis or the best prognosis among the 9 subtypes.

[0030] In some cases, AML subtype 2 or 7 is attributed to the patient, with a medium good prognosis among the 9 subtypes.

[0031] In some cases, AML subtype 8, 4, 3, 6, 5 is attributed to the patient, with a bad or worse prognosis among the 9 subtypes.

[0032] Thus in an aspect, the method for attributing a patient to an AML subtype among the 9 AML subtypes disclosed herein is a method of attributing a prognosis to the patient relative to AML, e.g.: if the patient is attributed AML subtype 1 or 9, prognosis is good, if he is attributed AML subtype 2 or 7, prognosis is medium good, and if he is attributed AML subtype 8, 4, 3, 6 or 5, prognosis is bad.

[0033] In an aspect, the method further comprises the recommendation of treating said patient with a cancer therapy against AML, preferably an aggressive therapy, when the patient preferably intensified chemotherapy or an alternative therapy through enrollment into a clinical trial for a novel therapy, when the patient is attributed AML subtype 2 or 7, or AML subtype 8, 4, 3, 6 or 5.

[0034] In an aspect, the method further comprises the recommendation of treating said patient with an aggressive therapy, preferably intensified chemotherapy or an alternative therapy through enrollment into a clinical trial for a novel therapy, if the patient is attributed AML subtype 8, 4, 3, 6 or 5, or with a less aggressive therapy, preferably standard chemotherapy, if the patient is attributed AML subtype 9, 1, 2 or 7.
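The prognosis groups of paragraph [0032] and the recommendation of paragraph [0034] amount to two simple lookups, sketched below in Python. The names and the string labels are illustrative assumptions; the sketch only encodes the mapping stated in the text and is not a clinical decision tool.

```python
# Prognosis group of each AML subtype, as listed in paragraph [0032].
PROGNOSIS_BY_SUBTYPE = {
    1: "good", 9: "good",
    2: "medium good", 7: "medium good",
    8: "bad", 4: "bad", 3: "bad", 6: "bad", 5: "bad",
}


def prognosis_for(subtype):
    """Return the prognosis group of a subtype (raises KeyError outside 1 to 9)."""
    return PROGNOSIS_BY_SUBTYPE[subtype]


def recommended_therapy_intensity(subtype):
    """Follow paragraph [0034]: aggressive therapy for subtypes 8, 4, 3, 6 and 5,
    less aggressive therapy for subtypes 9, 1, 2 and 7."""
    return "aggressive" if prognosis_for(subtype) == "bad" else "less aggressive"
```

For instance, `prognosis_for(7)` gives "medium good" and `recommended_therapy_intensity(7)` gives "less aggressive", in line with paragraph [0034].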

[0035] In an aspect, the invention concerns the use of an anticancer drug for treating a subject against AML, wherein the subject had been previously identified as having an AML with a medium good or a bad prognosis by use of this method.

[0036] In another aspect the invention relates to a method of treating a subject against AML, comprising treating the patient with a cancer therapy against AML, in particular an aggressive cancer therapy, wherein the subject had been previously identified as being in a medium good or bad prognosis, by use of this method of attributing a patient to an AML subtype.

[0037] In an aspect, the invention relates to a method of treating a subject against AML, comprising treating a patient with medium good or bad prognosis with a cancer therapy against AML, in particular an aggressive cancer therapy, wherein the subject had been previously identified as having an AML with a medium good or a bad prognosis by use of the method as disclosed herein.

[0038] The aggressive therapy is preferably an intensified chemotherapy or an alternative therapy through enrollment into a clinical trial for a novel therapy.

[0039] In another aspect, the invention relates to a method of treating AML in a patient, comprising the steps:

(a) attributing an AML subtype 1 to 9 to said patient by using the method as disclosed herein; and

(b) treating said patient with an aggressive therapy, preferably intensified chemotherapy or an alternative therapy through enrollment into a clinical trial for a novel therapy, if the patient is attributed AML subtype 8, 4, 3, 6 or 5, or with a less aggressive therapy, preferably standard chemotherapy, if the patient is attributed AML subtype 9, 1, 2 or 7.

[0040] Therapeutics include: cytarabine, fludarabine, idarubicin, avapritinib, dasatinib, mitoxantrone, clofarabine, cladribine, azacitidine, daunorubicin, etoposide, midostaurin, sorafenib, gilteritinib, decitabine, lomustine, quizartinib, crenolanib, enasidenib, ivosidenib, venetoclax, glasdegib, antibodies such as Gemtuzumab, magrolimab, and combinations thereof (32).

[0041] The present invention will now be described in further detail, referring to the drawings.

[0042] Figure 1: Overall survival (OS) of intensively treated patients according to the 9 clusters in the whole cohort.

[0043] Figure 2: Multivariate Cox analysis of overall survival of intensively treated patients. Known risk factors (ELN2017 and WBC), study (batch) and clusters are integrated in the multivariate model.

[0044] HERV retrotranscriptome accurately defines normal hematopoietic cell populations

[0045] As a first step, we examined HERVs expression in the different normal hematopoietic cell populations, assuming that distinct HERVs profiles may characterize the main cell types. Using a custom pipeline based on Telescope (18), we quantified the expression of 14,968 HERVs loci in RNA-seq data from sorted bone-marrow and peripheral blood cell populations from 9 healthy donors (n=49 samples) (19). Unsupervised hierarchical clustering based on the top 20% most variable HERVs showed a robust classification of normal hematopoietic cell types with a cluster purity of 77.6% and a corrected Rand Index of 0.61. The same approach based on genes reached a purity of 65.3% with a corrected Rand Index of 0.47.
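Cluster purity, as reported above, is commonly computed as the fraction of samples that belong to the majority cell type of their cluster. The Python sketch below follows that usual definition, which is assumed here since the text does not spell out the formula; the labels are hypothetical and the corrected Rand Index is not covered.

```python
from collections import Counter


def cluster_purity(cluster_labels, true_labels):
    """Fraction of samples carrying the majority true label of their cluster."""
    per_cluster = {}
    for cluster, truth in zip(cluster_labels, true_labels):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    majority_total = sum(max(counter.values()) for counter in per_cluster.values())
    return majority_total / len(true_labels)


# Hypothetical clustering of six samples, for illustration only.
example_clusters = [0, 0, 0, 1, 1, 1]
example_cell_types = ["HSC", "HSC", "GMP", "GMP", "GMP", "Mono"]
example_purity = cluster_purity(example_clusters, example_cell_types)
```

Here each cluster has a majority of two samples out of three, so the purity is four sixths.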

[0046] We then sought to improve the clustering with the analysis of peaks from open chromatin regions assessed by ATAC-seq. Using the HOMER package, we applied a classic human genome annotation from GENCODE (v33) to annotate the set of 590,650 significant non-overlapping peaks from open chromatin regions previously defined in sorted healthy donors’ bone marrow and peripheral blood cells (n=80 samples) (19). As previously described, unsupervised hierarchical clustering based on promoter elements (peaks between -100 bp and 1,000 bp away from a transcription start-site (TSS)) and intergenic elements (peaks more than 1,000 bp away from any other feature) significantly improved cluster classification, with a purity reaching 81.8%. We then re-annotated these significant peaks with a custom reference consisting of the same GENCODE annotation concatenated with the previously used 14,968 HERVs loci from RepeatMasker. Overall annotation showed that 16% of the total significant peaks correspond to HERVs regions. One important previously reported finding is that classification based on intergenic elements only (the so-called “distal regulatory elements”) is sufficient to classify normal hematopoietic cell populations (19). Enhanced annotation of these distal regulatory elements revealed an enrichment in HERVs, with up to 37.6% of the top 500 variable intergenic peaks corresponding to a HERV region. Plot of the total aggregated count from these regions showed a Gaussian distribution surrounding HERVs’ TSS, confirming the good quality of the ATAC-seq signal. Clustering of samples based on active HERVs regions (AHR, defined by peaks surrounding HERVs regions +/- 1000 or 3000 bp) further improved the clustering, reaching 88.3% cluster purity.
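One possible reading of the AHR definition is that a peak is retained when it lies within a fixed window around a HERV position. The Python sketch below implements that reading under explicit assumptions: each HERV is reduced to a single reference position (for instance its TSS), intervals are closed, and all names and coordinates are hypothetical.

```python
def peaks_in_active_herv_regions(peaks, herv_positions, window=1000):
    """Select the peaks that fall within +/- window bp of any HERV position.

    peaks: list of (chromosome, start, end) open-chromatin peaks.
    herv_positions: list of (chromosome, position) reference points of HERVs.
    """
    selected = []
    for chrom, start, end in peaks:
        for herv_chrom, position in herv_positions:
            in_window = start <= position + window and end >= position - window
            if chrom == herv_chrom and in_window:
                selected.append((chrom, start, end))
                break  # a peak is kept once, even if several HERVs are nearby
    return selected


# Hypothetical peaks and HERV position, for illustration only.
example_peaks = [("chr1", 500, 700), ("chr1", 5000, 5200), ("chr2", 900, 1100)]
example_hervs = [("chr1", 1500)]
example_ahr_peaks = peaks_in_active_herv_regions(example_peaks, example_hervs)
```

Changing `window` to 3000 or 20000 gives the wider and extended regions mentioned in the text.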

[0047] Altogether, these results show that the HERV retrotranscriptome can be used to characterize normal immature and mature hematopoietic cell populations. The improved clustering obtained with AHR defined on ATAC-seq data suggests that this retrotranscriptomic signature may reflect epigenetic features associated with cell differentiation.

[0048] CLP: Common Lymphoid Progenitor, CMP: Common Myeloid Progenitor, Ery: Erythrocyte, GMP: Granulocyte-Macrophage Progenitor, HSC: Hematopoietic Stem cell, LMPP: lymphoid-primed multipotent progenitor, MEP: Megakaryocyte-Erythroid Progenitor, MPP: Multipotent Progenitor.

[0049] Acute myeloid leukemia cells show distinct HERV profiles close to their normal cell of origin

[0050] We next evaluated how the HERV retrotranscriptome may help in distinguishing AML cells. We performed the same clustering approach, this time adding the 32 RNA-seq and 45 ATAC-seq bone marrow samples from 15 AML patients at diagnosis (Corces et al., 2016). Unsupervised clustering based on the top 20% variable AHR (+/- 1,000 bp from a HERV TSS) in ATAC-seq again resulted in a good classification of normal and AML cells, with a slight increase in cluster purity compared to the top 20% most variable intergenic peaks. Clustering based on HERVs expression in RNA-seq yielded comparable results. Interestingly, leukemic blast cells (blasts) clustered with either monocytes or granulocyte-monocyte progenitor (GMP) cells, LSCs with either GMP or lymphoid-primed multipotent progenitor (LMPP) cells, and pre-leukemic hematopoietic stem cells (pHSCs) with either GMP or HSC/multipotent progenitor (MPP) cells, suggesting a clustering with their cell of origin, as already described by Corces et al. (Corces et al., 2016). Cluster purity based on the original cell categories does not take these similarities into account and is thus a poor indicator of clustering performance in this case. Differential ATAC-count analysis centered on extended AHR (+/- 20,000 bp from a HERV TSS) revealed distinct profiles between AML LSCs, blasts and pHSCs compared to their normal counterparts, with globally more open chromatin in blasts and more closed chromatin in LSCs and pHSCs. To further characterize the role of HERVs in these AHR, we computed correlations between the RNA expression of each HERV present in an AHR and that of its surrounding genes located within +/- 50,000 bp. Strikingly, we found mostly positive correlations between HERVs expression and their surrounding genes. Annotation of the genes with a pre-established list of cancer-associated genes from the Cancer Gene Census database (20) found several genes positively correlated with HERVs expressed in AHR. Of note, the highest correlation was found for GATA1 with ERVLB4_Xp11.23b (Pearson’s R: 0.74, adjusted p-value: 8.11e-14). Using TCGA LAML RNAseq data, we then explored the association between each HERV located in an AHR and gene copy number variation (CNV) on the same cytoband. We found several HERVs correlating both positively and negatively with deletions, and mostly positively with amplifications on the same cytoband. These results show that the HERVs expression profile differs according to the AML cell type and suggest that HERVs are associated with gene regulation.
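As an illustration of the HERV/gene correlation step, the following minimal Python sketch computes a Pearson coefficient and adjusts a list of p-values. It is not the pipeline used in the study: the function names are hypothetical, genes are assumed to lie on the same chromosome as the HERV, and the Benjamini-Hochberg procedure is assumed for the adjustment, since the text only refers to an adjusted p-value.

```python
import math

def nearby_genes(herv_tss, gene_tss, window=50_000):
    """Genes whose TSS lies within +/- window bp of the HERV TSS.
    gene_tss maps gene name -> TSS position on the same chromosome (assumption)."""
    return [g for g, pos in gene_tss.items() if abs(pos - herv_tss) <= window]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def benjamini_hochberg(pvalues):
    """FDR-adjusted p-values (Benjamini-Hochberg), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In practice the p-value of each correlation would come from a statistics library; only the pairing, the coefficient and the multiple-testing correction are sketched here.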

[0051] AML: Acute Myeloid Leukemia, BM: Bone Marrow, CLP: Common Lymphoid Progenitor, CMP: Common Myeloid Progenitor, CNV: Copy Number Variation, Ery: Erythrocyte, FDR: False Discovery Rate, GMP: Granulocyte-Macrophage Progenitor, HSC: Hematopoietic Stem cell, LMPP: lymphoid-primed multipotent progenitor, LSC: Leukemic Stem Cell, MEP: Megakaryocyte-Erythroid Progenitor, MPP: Multipotent Progenitor, pHSC: pre-leukemic Hematopoietic Stem Cell.

[0052] HERVs expression defines subtypes of AML with distinct cancer hallmarks and outcomes

[0053] After having established that the HERV retrotranscriptome characterizes particular cell types, we asked whether HERVs expression could distinguish distinct AML profiles in bulk RNA-seq data. We explored HERVs expression in 4 independent RNA-seq datasets (TCGA, AMLCG, LEUCEGENE and BEAT), keeping only bone marrow samples from AML patients at diagnosis (n=788). For each dataset, we selected the top 2,000 most variable HERVs based on the scaled DESEQ2 VST normalized count. We merged the 4 datasets, keeping only the intersection of the four top-2,000 candidate HERV lists, resulting in 961 variable HERVs conserved across the 4 datasets. Unsupervised hierarchical clustering guided by the evolution of the average silhouette defined 9 clusters that were not dependent on the study. These 9 clusters were associated with significant differences in overall survival among intensively treated patients (Figure 1), independently of established prognostic factors such as age, ELN2017 and white blood count, integrated in a multivariate Cox model (Figure 2). These clusters also presented distinct cancer hallmark profiles, as assessed by single sample gene-set variation analysis (GSVA) based on cancer hallmark signatures. GSVA of immune signatures revealed no clear immune subtype profiles, distinguishing clusters with globally low or high immune scores.
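The selection of HERVs conserved across datasets can be sketched in Python as follows. This is a simplified illustration, not the original pipeline: the function names are hypothetical, each dataset is assumed to be a mapping from herv_id to normalized values across samples, and variance is assumed as the variability criterion.

```python
from statistics import pvariance

def top_variable(expr, n_top):
    """expr maps herv_id -> list of normalized values across samples.
    Returns the n_top HERVs with the highest variance (assumed criterion)."""
    ranked = sorted(expr, key=lambda h: pvariance(expr[h]), reverse=True)
    return set(ranked[:n_top])

def conserved_variable_hervs(datasets, n_top=2000):
    """Intersect the per-dataset top-variable HERV sets.
    datasets maps a study name -> expression mapping as accepted by top_variable."""
    tops = [top_variable(expr, n_top) for expr in datasets.values()]
    return set.intersection(*tops)

# Toy example with two studies and n_top=2 (values are illustrative only).
toy = {
    "study_A": {"h1": [0, 10, 0, 10], "h2": [0, 8, 0, 8], "h3": [1, 1, 1, 1]},
    "study_B": {"h1": [5, -5], "h2": [1, 1], "h3": [0, 4]},
}
conserved = conserved_variable_hervs(toy, n_top=2)
```

Only HERVs ranked among the most variable in every study survive the intersection, which is what limits the influence of any single dataset on the final feature set.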

[0054] We then assessed whether these clusters were associated with known recurrent translocations or gene mutations. We found a clear enrichment of inv(16), t(8;21) and t(15;17) in clusters 1, 7 and 9, respectively. Other karyotypes (such as complex, poor or intermediate abnormalities) showed no particular association with any cluster, underlining the heterogeneous composition of these groups. Regarding gene mutations, we found an enrichment of TP53 mutation in cluster 3.

[0055] Focusing on HERVs discriminating each cluster (i.e. HERVs overexpressed in a given cluster compared to all the others), we found that clusters 8 and 9 had the highest number of different discriminating HERVs, whereas only a few HERVs discriminated clusters 3 and 5. HERVs from the HERV-H, ERV-L, MER4, HERV-L and HERV-K families were the most frequent HERVs discriminating clusters. When normalized to the total number of HERVs per family, the HERV-S, HERV-E, HERV-P, ERV1 and HARLEQUIN families were the most frequently represented.

[0056] LSC: Leukemic Stem Cell, ROC: Receiver Operating Characteristic Curve, ssGSVA: Single Sample Gene-Set Variation Analysis, WBC: White Blood Count.

[0057] METHODS

[0058] Raw RNAseq data

[0059] Raw RNA-seq data files were accessed from the NCBI Gene Expression Omnibus (GEO) portal, under the accession numbers GSE74246 for the sorted hematopoietic normal and AML cells from Corces et al. (19), GSE49642, GSE52656, GSE62190, GSE66917, GSE67039 and GSE106272 for the LEUCEGENE datasets, and GSE127825 and GSE127826 for the six mTEC samples (6). TCGA LAML (31) and BEAT-AML (21) data were accessed from the NCI Genomic Data Commons (GDC) data portal (https://portal.gdc.cancer.gov/). Raw data for the AMLCG cohort (10) were directly provided by the AMLCG group.

[0060] HERVs and genes expression quantification

[0061] HERVs expression was quantified using a custom pipeline derived from Telescope (18). Briefly, RNAseq reads were aligned to a custom transcriptome using bowtie2 v2.2.1 (22) with custom parameters to keep multimapped reads (-k 100 --very-sensitive-local --score-min "L,0,1.6"). The custom transcriptome consisted of the hg38 reference transcriptome with 14,968 HERVs transcriptional units compiled from RepeatMasker annotations (18). SAM outputs were converted to BAM files using SAMtools v1.4 (23). HERVs and genes expression was then calculated using Telescope (Bendall et al., 2019) and HTSeq 0.12.3 (24), respectively. Raw counts were then concatenated and normalized independently for each dataset using DESEQ2 v1.28.0 with variance stabilizing transformation (VST) (25).
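For illustration, the alignment step could be driven from Python by building the bowtie2 argument list as below. This is only a sketch: the index prefix and file names are hypothetical placeholders, and single-end reads (bowtie2 option -U) are assumed because the text does not state the library layout; paired-end data would use -1 and -2 instead.

```python
import shlex

def bowtie2_multimap_command(index_prefix, fastq, sam_out, max_alignments=100):
    """Build a bowtie2 command line that reports up to max_alignments
    alignments per read, so that multimapped reads are kept."""
    return [
        "bowtie2",
        "-k", str(max_alignments),      # report up to k alignments per read
        "--very-sensitive-local",       # sensitive preset, local alignment mode
        "--score-min", "L,0,1.6",       # minimum-score function given in the text
        "-x", index_prefix,             # hypothetical index of the custom transcriptome
        "-U", fastq,                    # single-end reads (assumption)
        "-S", sam_out,                  # SAM output file
    ]

cmd = bowtie2_multimap_command("hg38_herv_index", "sample.fastq.gz", "sample.sam")
print(shlex.join(cmd))  # the list can be passed as-is to subprocess.run
```

Keeping the command as a list rather than a single string avoids shell quoting issues around the score-min expression.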

[0062] ATACseq data

[0063] Significant peaks called from ATACseq data analysis were retrieved from the original paper (Corces et al., 2016). Briefly, peaks were called using MACS2 and filtered using a custom blacklist. A final set of 590,650 significant peaks was defined among a list of non-overlapping maximally significant 500 bp peaks ranked by their summit significance value. These significant peaks were re-annotated using HOMER with the command “annotatePeaks.pl” and two different references: Gencode v33 only and Gencode v33 with the previously used HERVs annotation. Regions containing a significant peak within +/- 1,000 or 3,000 bp of a HERV TSS were considered active HERVs regions (AHR).
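The definition of an active HERV region can be sketched as a simple interval test. The Python code below is illustrative only: the function name and data layout are hypothetical, and a peak is assumed to qualify when it overlaps the window around the TSS, which is one possible reading of the criterion.

```python
def active_herv_regions(herv_tss, peaks, window=1000):
    """Return the HERVs with at least one significant peak near their TSS.

    herv_tss: {herv_id: (chrom, tss_position)}
    peaks:    list of (chrom, start, end) for significant ATAC-seq peaks
    window:   half-width in bp around the TSS (1,000 or 3,000 in the text)
    """
    active = set()
    for herv_id, (chrom, tss) in herv_tss.items():
        lo, hi = tss - window, tss + window
        for p_chrom, start, end in peaks:
            # Interval overlap test on the same chromosome.
            if p_chrom == chrom and start <= hi and end >= lo:
                active.add(herv_id)
                break
    return active

# Toy coordinates, for illustration only.
hervs = {"A": ("chr1", 10_000), "B": ("chr1", 50_000), "C": ("chr2", 10_000)}
peaks = [("chr1", 10_800, 11_300), ("chr1", 52_500, 53_000)]
narrow = active_herv_regions(hervs, peaks, window=1000)
wide = active_herv_regions(hervs, peaks, window=3000)
```

For the full set of 590,650 peaks, an indexed structure (sorted positions per chromosome or an interval tree) would be preferable to this quadratic scan.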

[0064] Unsupervised hierarchical clustering

[0065] For the sorted cells, DESEQ2 VST normalized expression data were directly used for unsupervised hierarchical clustering. Cluster purity was used as an external validation criterion and was calculated by first creating a confusion matrix between assigned cluster number and annotated cell type before summing the maximum values from each row (i.e. assigned cluster) divided by the total number of samples.

[0066] A benchmark of distances (euclidean, maximum and pearson) and methods (ward.D2, single, complete, average and centroid) was performed to identify the optimal method leading to the best cluster purity using a pre-defined number of clusters according to the original annotation.
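The cluster purity calculation described in [0065] can be written in a few lines of Python. The sketch below follows the stated definition (sum of the row maxima of the confusion matrix divided by the number of samples); the function name and the toy labels are illustrative.

```python
from collections import Counter, defaultdict

def cluster_purity(assigned, annotated):
    """Cluster purity: for each assigned cluster, count the samples of its
    most frequent annotated cell type, sum over clusters, divide by n."""
    by_cluster = defaultdict(Counter)   # one confusion-matrix row per cluster
    for cluster, cell_type in zip(assigned, annotated):
        by_cluster[cluster][cell_type] += 1
    majority_total = sum(max(row.values()) for row in by_cluster.values())
    return majority_total / len(assigned)

# Toy example: 6 samples, 3 clusters (labels are illustrative only).
assigned = [1, 1, 1, 2, 2, 3]
annotated = ["HSC", "HSC", "MPP", "GMP", "GMP", "HSC"]
purity = cluster_purity(assigned, annotated)  # (2 + 2 + 1) / 6
```

Because purity rewards any cluster dominated by a single cell type, it is compared here at a fixed number of clusters, as stated in [0066].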

[0067] For the bulk datasets, DESEQ2 VST normalized expression data were independently calculated and further center-scaled for each dataset to correct the potential batch effect. Unsupervised hierarchical clustering was then performed using the average silhouette width and the Bayesian Information Criterion as internal validation markers.
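The average silhouette width used as an internal validation marker can be illustrated with the Python sketch below. It is a plain reference implementation under stated assumptions: Euclidean distance is assumed, the function name is hypothetical, and the Bayesian Information Criterion is not covered.

```python
import math

def average_silhouette_width(points, labels):
    """Mean silhouette over all samples, using Euclidean distance (assumption)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    def mean_distance(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # a singleton cluster contributes a silhouette of 0
        a = mean_distance(i, own)  # cohesion: mean distance within own cluster
        b = min(mean_distance(i, members)  # separation: nearest other cluster
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(labels)

# Two well-separated pairs of points, with a sensible and a poor labelling.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = average_silhouette_width(pts, [0, 0, 1, 1])
bad = average_silhouette_width(pts, [0, 1, 0, 1])
```

Computing this score for successive cuts of the dendrogram and following its evolution is one way to choose the number of clusters.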

[0068] Differential ATAC-count analysis

[0069] For differential ATAC-count analysis, raw ATAC-seq counts were retrieved from the original paper (Corces et al., 2016). Differential expression analysis between each AML population (LSC, pHSC and blasts) and its normal counterpart (HSC, GMP, LMPP and monocytes) was performed using DESEQ2, with cell type as a covariate. Differentially expressed regions surrounding a HERV TSS (+/- 20,000 bp) and with an FDR < 5% were retained for the final plot. The rolling mean of 1,000 sequential regions, ordered by chromosome location, was then represented.
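The smoothing applied before plotting can be sketched as a rolling mean over regions sorted by genomic position. In the Python example below, the function names and the tuple layout are hypothetical, and the averaged quantity is assumed to be the log2 fold change of each retained region.

```python
def rolling_mean(values, window):
    """Means of each run of `window` consecutive values (no padding)."""
    if window <= 0 or window > len(values):
        return []
    running = sum(values[:window])
    means = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]  # slide the window by one
        means.append(running / window)
    return means

def smoothed_profile(regions, window=1000):
    """regions: list of (chrom_index, position, log2_fold_change).
    Regions are ordered by chromosome, then position, before smoothing."""
    ordered = sorted(regions, key=lambda r: (r[0], r[1]))
    return rolling_mean([r[2] for r in ordered], window)

# Toy example with a window of 2 regions.
profile = smoothed_profile([(2, 5, 4.0), (1, 10, 0.0), (1, 3, 2.0)], window=2)
```

Chromosomes are represented by an integer index here so that sorting is numeric rather than alphabetical.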

[0070] HERVs, genes and copy number variation correlations

[0071] HERVs located in previously defined active HERVs regions (so-called AHR +/- 20,000 bp) were selected for correlation analysis. For each HERV, a list of surrounding genes located within +/- 50,000 bp of its TSS was established. Pearson’s correlations were calculated between the RNA expression of each HERV and each of its surrounding genes, independently. P-values were corrected with the FDR method. Genes were then annotated using a published list of cancer-related genes from the Cancer Gene Census (20). The same list of HERVs was then used to perform correlations with CNV from the same cytoband. TCGA LAML CNV data were retrieved from the NCI GDC portal and used as is to calculate Pearson’s correlations with HERVs from the same cytoband.

[0072] Cancer hallmark and immune signatures GSVA

[0073] For each hallmark of cancer (Hanahan and Weinberg, 2011), a unique gene signature was established based on The Molecular Signatures Database (MSigDb) Hallmark Gene Set Collection (26). When not available in MSigDb, hallmark signatures were established from Gene Ontology (GO) signatures, as previously described (27). The signature for the immune evasion hallmark was retrieved from Hubert et al. (28). Individual enrichment scores were calculated for each patient by single sample gene-set variation analysis (ssGSVA), and scaled by study. The mean score for each cluster was then calculated and shown in a radar plot.

[0074] Immune signatures were obtained from Thorsson et al. (29) and calculated by ssGSVA for each sample. Unsupervised hierarchical clustering was then performed on study-scaled ssGSVA scores in each cluster.

[0075] Differential HERVs expression analysis

[0076] Differential expression analysis was performed using DESEQ2 (25). HERVs and genes raw counts from all normal and AML datasets were merged and integrated into the same DESEQ object, using study (i.e. batch) as a covariate in the design formula. Differential expression analysis was performed for all the 4 independent bulk AML datasets and the sorted LSC and pHSC populations against each of the 42 normal tissues. Fold changes were shrunk with the apeglm method (30). Features with a fold change greater than 4 (log2FC > 2) and a base mean of at least 1 normalized count per million were considered overexpressed.
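The overexpression criterion can be expressed as a simple filter on the differential expression results. The Python sketch below uses hypothetical function names and assumes that each feature comes with a shrunken log2 fold change and a base mean already expressed in normalized counts per million.

```python
def is_overexpressed(log2_fold_change, base_mean,
                     min_log2fc=2.0, min_base_mean=1.0):
    """Fold change strictly above 4 (log2FC > 2) and base mean of at least 1."""
    return log2_fold_change > min_log2fc and base_mean >= min_base_mean

def overexpressed_features(results):
    """results: {feature_id: (shrunken log2FC, base mean)}.
    Returns the sorted identifiers passing both thresholds."""
    return sorted(f for f, (lfc, bm) in results.items()
                  if is_overexpressed(lfc, bm))

# Toy results (identifiers and values are illustrative only).
toy_results = {
    "HERV_a": (2.5, 3.0),    # passes both thresholds
    "HERV_b": (2.0, 10.0),   # fold change exactly 4: not strictly greater
    "HERV_c": (4.0, 0.5),    # base mean too low
}
selected = overexpressed_features(toy_results)
```

The base-mean threshold removes features whose large fold change rests on very few reads.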

[0077] Biological samples

[0078] Bone marrow samples were collected from AML patients at diagnosis at the Centre Hospitalier Lyon Sud in Lyon, France. Sample collection was approved by the institutional review board and ethics committee (20.01.31.72653 - 21/20_3) and performed after obtaining patients’ written informed consent, in accordance with the Declaration of Helsinki. Bone marrow mononuclear cells (BMMCs) were obtained by Ficoll density gradient centrifugation (Eurobio, FR, EU) and immediately cryopreserved in foetal bovine serum (FBS) with 10% dimethyl sulfoxide (DMSO).

[0079] MILs growth

[0080] BMMCs were rapidly thawed at 37°C and, after a 2-hour rest, put in culture in RPMI medium (Gibco, FR, EU) supplemented with 8% human AB-serum (Etablissement Français du Sang, FR, EU) and high-dose (6,000 IU/mL) IL-2 (PROLEUKIN aldesleukin, Novartis Pharma, CH, EU). Plates were then incubated for 14 days, with medium replacement when needed.

[0081] TABLE 1

[0082] TABLE 2

[0083] Table 2 summarizes the genomic coordinates (first nucleotide and last nucleotide) of each of the 703 HERV sequences in the GRCh38 version of the human genome.

[0084] For example, concerning the herv_id “ERV316A3_1q25.2b”, the “1” value corresponds to chromosome 1 of the human genome, the letter “q” corresponds to the long arm of the corresponding chromosome (alternatively, the letter “p” corresponds to the short arm) and “25.2b” corresponds to the locus on that chromosome arm.
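The naming convention explained in [0084] can be decoded programmatically. The Python sketch below uses a regular expression that is an assumption based on the single example given (family name, underscore, chromosome, arm, locus); identifiers that follow a different layout would need an adapted pattern.

```python
import re

# Assumed layout: <family>_<chromosome><arm><locus>, e.g. ERV316A3_1q25.2b
HERV_ID = re.compile(
    r"^(?P<family>[^_]+)_(?P<chrom>\d{1,2}|X|Y)(?P<arm>[pq])(?P<locus>\S+)$"
)

def parse_herv_id(herv_id):
    """Split a herv_id into family, chromosome, arm and locus."""
    match = HERV_ID.match(herv_id)
    if match is None:
        raise ValueError(f"unrecognized herv_id: {herv_id!r}")
    parts = match.groupdict()
    parts["arm_name"] = "long" if parts["arm"] == "q" else "short"
    return parts

example = parse_herv_id("ERV316A3_1q25.2b")
```

For the identifier discussed in [0084], the parser returns chromosome "1", arm "q" (long arm) and locus "25.2b".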

[0085] References

1. Consortium IHGS. Initial sequencing and analysis of the human genome. Nature. 15 Feb 2001;409(6822):35057062.

2. Johnson WE. Origins and evolutionary consequences of ancient endogenous retroviruses. Nat Rev Microbiol. June 2019;17(6):355-70.

3. Vargiu L, et al. Classification and characterization of human endogenous retroviruses; mosaic forms are common. Retrovirology. 22 Jan 2016;13(1):7.

4. Kassiotis G, Stoye JP. Immune responses to endogenous retroelements: taking the bad with the good. Nat Rev Immunol. Apr 2016;16(4):207-19.

5. Alcazer V et al. Human Endogenous Retroviruses (HERVs): Shaping the Innate Immune Response in Cancers. Cancers (Basel). 6 Mar 2020;12(3).

6. Larouche J-D et al. Widespread and tissue-specific expression of endogenous retroelements in human somatic tissues. Genome Med. Dec 2020;12(1):1-16.

7. Burns KH. Transposable elements in cancer. Nat Rev Cancer. Jul 2017;17(7):415-24.

8. Attermann AS et al. Human endogenous retroviruses and their implication for immunotherapeutics of cancer. Ann Oncol. 1 Nov 2018;29(11):2183-91.

9. De Kouchkovsky I, Abdul-Hay M. Acute myeloid leukemia: a comprehensive review and 2016 update. Blood Cancer Journal. Jul 2016;6(7):e441-e441.

10. Herold T et al. A 29-gene and cytogenetic score for the prediction of resistance to induction treatment in acute myeloid leukemia. Haematologica. 2018;103(3):456-65.

11. Alexandrov LB et al. Signatures of mutational processes in human cancer. Nature. 22 Aug 2013;500(7463):415-21.

12. Smith CC et al. Alternative tumour-specific antigens. Nat Rev Cancer. Aug 2019;19(8):465-78.

13. Brodsky I et al. Expression of HERV-K proviruses in human leukocytes. Blood. 1 May 1993;81(9):2369-74.

14. Depil S et al. Expression of a human endogenous retrovirus, HERV-K, in the blood cells of leukemia patients. Leukemia. Feb 2002;16(2):254-9.

15. Tobiasson M et al. Comprehensive mapping of the effects of azacitidine on DNA methylation, repressive/permissive histone marks and gene expression in primary cells from patients with MDS and MDS-related disease. Oncotarget. 25 Apr 2017;8(17):28812-25.

16. Kazachenka A et al. Epigenetic therapy of myelodysplastic syndromes connects to cellular differentiation independently of endogenous retroelement derepression. Genome Medicine. 23 Dec 2019;11(1):86.

17. Deniz O et al. Endogenous retroviruses are a source of enhancers with oncogenic potential in acute myeloid leukaemia. Nat Commun. 14 2020;11(1):3506.

18. Bendall ML et al. Telescope: Characterization of the retrotranscriptome by accurate estimation of transposable element expression. PLoS Comput Biol. 2019;15(9):e1006453.

19. Corces, M.R. et al. (2016). Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193-1203.

20. Sondka Z. et al. (2018). The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer 18, 696-705.

21. Tyner J.W. et al. (2018). Functional genomic landscape of acute myeloid leukaemia. Nature 562, 526.

22. Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357-359.

23. Li, H., Handsaker et al., and 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079.

24. Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq — a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166-169.

25. Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550.

26. Liberzon A. et al. (2015). The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 1, 417-425.

27. Loeffler-Wirth H et al. (2019). A modular transcriptome map of mature B cell lymphomas. Genome Med. 11, 27.

28. Hubert, M. et al. (2020). IFN-III is selectively produced by cDC1 and predicts good clinical outcome in breast cancer. Sci. Immunol. 5.

29. Thorsson et al. (2018). The Immune Landscape of Cancer. Immunity 48, 812-830.e14.

30. Zhu, A. et al. (2019). Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics 35, 2084-2092.

31. The Cancer Genome Atlas Research Network (2013). Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia. N. Engl. J. Med. 368, 2059-2074.

32. H. Kantarjian et al., Blood Cancer Journal 11, 41 (2021).