Title:
BIOMARKERS FOR PYOMETRA IN DOGS
Document Type and Number:
WIPO Patent Application WO/2022/191758
Kind Code:
A1
Abstract:
Pyometra predisposition of a dog is predicted by determining, in a sample comprising nucleic acid molecules obtained from the dog, presence or absence of at least one biomarker useful in predicting pyometra predisposition of the dog. The at least one biomarker is located in a region of from nucleotide position 45,500,000 to nucleotide position 46,500,000 on canine chromosome 22 (CFA22), preferably in the ATP-binding cassette sub-family C member 4 (ABCC4) gene. Furthermore, pyometra predisposition of a dog is predicted by determining whether a cell of the dog has altered prostaglandin transport.

Inventors:
HAGMAN RAGNVI (SE)
ARENDT MAJA (SE)
LINDBLAD-TOH KERSTIN (SE)
FALL TOVE (SE)
ANDERSSON GÖRAN (SE)
Application Number:
PCT/SE2022/050229
Publication Date:
September 15, 2022
Filing Date:
March 09, 2022
Assignee:
HAGMAN RAGNVI (SE)
ARENDT MAJA (SE)
International Classes:
C12Q1/6827; C12Q1/6883; G01N33/68
Other References:
HAGMAN, R. ; KINDAHL, H. ; FRANSSON, B.A. ; BERGSTROM, A. ; HOLST, B.S. ; LAGERSTEDT, A.S.: "Differentiation between pyometra and cystic endometrial hyperplasia/mucometra in bitches by prostaglandin F2α metabolite analysis", THERIOGENOLOGY, LOS ALTOS, CA, US, vol. 66, no. 2, 15 July 2006 (2006-07-15), US , pages 198 - 206, XP025076375, ISSN: 0093-691X, DOI: 10.1016/j.theriogenology.2005.11.002
HAGMAN R: "Molecular aspects of uterine diseases in dogs", REPRODUCTION IN DOMESTIC ANIMALS, BLACKWELL WISS. VERLAG, BERLIN, DE, vol. 52, 1 August 2017 (2017-08-01), DE , pages 37 - 42, XP055968709, ISSN: 0936-6768, DOI: 10.1111/rda.13039
HAGMAN R: "Clinical and Molecular Characteristics of Pyometra in Female Dogs", REPRODUCTION IN DOMESTIC ANIMALS, BLACKWELL WISS. VERLAG, BERLIN, DE, vol. 47, 1 December 2012 (2012-12-01), DE , pages 323 - 325, XP055968710, ISSN: 0936-6768, DOI: 10.1111/rda.12031
CHEEPALA SATISH B., BAO JU, NACHAGARI DEEPA, SUN DAXI, WANG YAO, ZHONG TAO, NAREN ANJAPARAVANDA P., ZHENG JIE, SCHUETZ JOHN D.: "Crucial Role for Phylogenetically Conserved Cytoplasmic Loop 3 in ABCC4 Protein Expression", JOURNAL OF BIOLOGICAL CHEMISTRY, AMERICAN SOCIETY FOR BIOCHEMISTRY AND MOLECULAR BIOLOGY, US, vol. 288, no. 31, 1 August 2013 (2013-08-01), US , pages 22207 - 22218, XP055968711, ISSN: 0021-9258, DOI: 10.1074/jbc.M113.476218
ARENDT MAJA, AMBROSEN AIME, FALL TOVE, KIERCZAK MARCIN, TENGVALL KATARINA, MEADOWS JENNIFER R. S., KARLSSON ÅSA, LAGERSTEDT ANNE-S: "The ABCC4 gene is associated with pyometra in golden retriever dogs", SCIENTIFIC REPORTS, vol. 11, no. 1, 1 December 2021 (2021-12-01), XP055968712, DOI: 10.1038/s41598-021-95936-1
Attorney, Agent or Firm:
BARKER BRETTELL SWEDEN AB (SE)
Claims:
CLAIMS

1. A method for predicting pyometra predisposition of a dog, the method comprising determining, in a sample comprising nucleic acid molecules obtained from the dog, presence or absence of at least one biomarker useful in predicting pyometra predisposition of the dog, wherein the at least one biomarker is located in a region of from nucleotide position 45,500,000 to nucleotide position 46,500,000 on canine chromosome 22 (CFA22).

2. The method according to claim 1, further comprising predicting pyometra predisposition of the dog based on the determined presence or absence of the at least one biomarker.

3. The method according to claim 1 or 2, wherein the at least one biomarker is located in a region of from nucleotide position 45,700,000 to nucleotide position 46,200,000 on CFA22.

4. The method according to any of the claims 1 to 3, wherein the at least one biomarker is located in the ATP-binding cassette sub-family C member 4 (ABCC4) gene on CFA22.

5. The method according to any of the claims 1 to 4, wherein the at least one biomarker is located in a region of from nucleotide position 45,815,581 to nucleotide position 45,934,522 on CFA22.

6. The method according to any of the claims 1 to 5, wherein determining presence or absence of at least one biomarker comprises determining genotype of at least one single nucleotide polymorphism (SNP) selected from the group consisting of a SNP located at position 45,815,581 on CFA22, a SNP located at position 45,823,359 on CFA22, a SNP located at position 45,875,420 on CFA22, a SNP located at position 45,882,260 on CFA22, a SNP located at position 45,886,617 on CFA22, a SNP located at position 45,892,301 on CFA22, a SNP located at position 45,893,198 on CFA22, a SNP located at position 45,893,599 on CFA22 and a SNP located at position 45,934,522 on CFA22.

7. The method according to claim 6, wherein determining genotype of the at least one SNP comprises: determining presence or absence of nucleotide A or G at position 45,815,581 on CFA22; determining presence or absence of nucleotide G or A at position 45,823,359 on CFA22; determining presence or absence of nucleotide T or G at position 45,875,420 on CFA22; determining presence or absence of nucleotide A or G at position 45,882,260 on CFA22; determining presence or absence of nucleotide T or G at position 45,886,617 on CFA22; determining presence or absence of nucleotide G or A at position 45,892,301 on CFA22; determining presence or absence of nucleotide A or G at position 45,893,198 on CFA22; determining presence or absence of nucleotide C or T at position 45,893,599 on CFA22; and/or determining presence or absence of nucleotide T or C at position 45,934,522 on CFA22.

8. The method according to claim 7, wherein determining genotype of the at least one SNP comprises determining presence or absence of nucleotide A or G at position 45,893,198 on CFA22.

9. The method according to any of the claims 1 to 8, wherein determining presence or absence of at least one biomarker comprises determining whether a codon coding for amino acid number 787 in the ABCC4 protein codes for valine (V) or methionine (M).

10. A method for predicting pyometra predisposition of a dog, the method comprising: extracting protein from a sample obtained from a dog; and determining presence or absence of valine (V) or methionine (M) at amino acid position 787 in the ATP-binding cassette sub-family C member 4 (ABCC4) protein.

11. The method according to claim 10, further comprising predicting pyometra predisposition of the dog based on the determined presence or absence of valine (V) or methionine (M) at amino acid position 787 in the ABCC4 protein.

12. The method according to claim 10 or 11, further comprising: predicting the dog to be pyometra predisposed based on presence of M or absence of V at amino acid position 787 in the ABCC4 protein; and predicting the dog not to be pyometra predisposed based on absence of M or presence of V at amino acid position 787 in the ABCC4 protein.

13. A method for predicting pyometra predisposition of a dog, the method comprising: determining whether a cell of the dog has altered prostaglandin transport; and predicting the dog to be pyometra predisposed if the cell has altered prostaglandin transport.

14. The method according to claim 13, wherein the cell is a cell obtained from the reproductive system of the dog.

15. The method according to claim 14, wherein the cell is selected from the group consisting of an endometrial cell, a myometrial cell, an epithelial cell, a glandular cell, a blood cell, an endothelial cell, a cell derived from the gonads, such as a germline cell, and a placental-originating cell.

16. The method according to any of the claims 13 to 15, wherein determining whether the cell of the dog has altered prostaglandin transport comprises determining whether the ATP-binding cassette sub-family C member 4 (ABCC4) protein in the cell has an altered prostaglandin transport as compared to a wild-type ABCC4 protein having valine (V) at amino acid position 787 in the ABCC4 protein.

17. The method according to any of the claims 13 to 16, wherein determining whether the cell of the dog has altered prostaglandin transport comprises determining whether the cell of the dog has altered transport of prostaglandin E2.

18. A method for selecting a dog for breeding, the method comprising: predicting pyometra predisposition of the dog according to any of the claims 1 to 17; and selecting the dog for breeding if the dog is predicted not to be pyometra predisposed.

19. A kit for predicting pyometra predisposition of a dog, the kit comprises: at least one oligonucleotide probe capable of forming a hybridized nucleic acid with a single nucleotide polymorphism (SNP) or a nucleic acid region flanking the SNP, wherein the SNP is located at position 45,893,198 on canine chromosome 22 (CFA22); and instructions for predicting pyometra predisposition of the dog based on whether the nucleotide at position 45,893,198 is A or G.

Description:
BIOMARKERS FOR PYOMETRA IN DOGS

TECHNICAL FIELD

The present invention generally relates to biomarkers for pyometra in dogs, and in particular to such biomarkers that can be used to predict pyometra predisposition and uses thereof.

BACKGROUND

Purulent bacterial infection of the uterus (pyometra) is one of the most common diseases of intact female dogs. On average, one in five female dogs is diagnosed with the disease before 10 years of age. The proportion of affected bitches varies greatly between breeds, i.e., some breeds develop the disease to a much larger extent and at an earlier age than others (from 3% in Finnish spitz to 66% in Bernese mountain dogs). The golden retriever is among the breeds that have an increased risk of pyometra (age-corrected risk ratio 3.3). By 10 years of age, 37% of all intact Swedish female golden retrievers will have been affected by the disease.

Pyometra is a potentially life-threatening illness that develops as a consequence of a combination of hormonal and bacterial factors. During the luteal phase of the oestrus cycle, high progesterone hormone levels make the uterus susceptible to opportunistic bacterial infections, foremost by Escherichia coli. Infection of the uterus can lead to sepsis and related endotoxemia and organ dysfunctions in severely affected individuals. In addition, circulating inflammatory mediators increase. The treatment of choice is surgical ovariohysterectomy. Non-surgical treatment alternatives are possible in less severe cases, but are frequently associated with disease relapse.

KR 102154601 discloses that procalcitonin (PCT) can be used in prognosis of canine uterine sinusitis. KR 102107942 discloses a method for diagnosing canine pyometra by measuring cell-free DNA (cfDNA). Asian Pacific Journal of Reproduction 2020; 9(4): 166-173 discloses that circulating pro-inflammatory cytokines, acute phase proteins, endotoxin, growth factors and inflammatory mediators can be monitored as prognostic biomarkers of pyometra. PLoS One 2009; 4(11): e8039 discloses that almost 800 genes are significantly upregulated in the uteri of female dogs suffering from uterine bacterial infection. These genes include chemokine and cytokine genes, as well as genes associated with inflammatory cell extravasation, anti-bacterial action, the complement system and innate immune responses, as well as proteoglycan-associated genes.

There is still a need for tests for predicting predisposition of pyometra in dogs.

SUMMARY

It is a general objective to provide biomarkers for predicting pyometra predisposition in dogs.

This and other objectives are met by embodiments as disclosed herein.

The present invention is defined in the independent claims. Further embodiments of the invention are defined in the dependent claims.

An aspect of the invention relates to a method for predicting pyometra predisposition of a dog. The method comprises determining, in a sample comprising nucleic acid molecules obtained from the dog, presence or absence of at least one biomarker useful in predicting pyometra predisposition of the dog. The at least one biomarker is located in a region of from nucleotide position 45,500,000 to nucleotide position 46,500,000 on canine chromosome 22 (CFA22).

Another aspect of the invention relates to a method for predicting pyometra predisposition of a dog. The method comprises extracting proteins from a sample obtained from a dog, and determining presence or absence of valine (V) or methionine (M) at amino acid position 787 in the ATP-binding cassette sub-family C member 4 (ABCC4) protein.

A further aspect of the invention relates to a method for predicting pyometra predisposition of a dog. The method comprises determining whether a cell of the dog has altered prostaglandin transport, and predicting the dog to be pyometra predisposed if the cell has altered prostaglandin transport.

Yet another aspect of the invention relates to a method for selecting a dog for breeding. The method comprises predicting pyometra predisposition of the dog according to above, and selecting the dog for breeding if the dog is predicted not to be pyometra predisposed.

An aspect of the invention relates to a kit for predicting pyometra predisposition of a dog. The kit comprises at least one oligonucleotide probe capable of forming a hybridized nucleic acid with a single nucleotide polymorphism (SNP) or a nucleic acid region flanking the SNP. The SNP is located at position 45,893,198 on CFA22. The kit also comprises instructions for predicting pyometra predisposition of the dog based on whether the nucleotide at position 45,893,198 is A or G.

The inventors have identified biomarkers indicative of pyometra predisposition of dogs. The identified biomarkers could thereby be used to predict pyometra predisposition of dogs and identify individuals that are predisposed to develop pyometra.

BRIEF DESCRIPTION OF THE DRAWINGS

The embodiments, together with further objects and advantages thereof, may best be understood by making reference to the following description taken together with the accompanying drawings, in which:

Fig. 1 illustrates principal component analysis of pyometra cases and controls. The graph shows the two first coordinates, PC1 (x-axis) and PC2 (y-axis), as calculated by PLINK [1]. No major stratification was seen between pyometra cases and controls and no major population structure was noted. Pyometra cases are indicated as black dots whilst controls are open circles.

Fig. 2 illustrates GWAS of pyometra in golden retrievers. (A) QQ-plot (λ=0.99) and (B) Manhattan plot for GWAS of 98 pyometra cases and 96 controls identified a genome-wide significant signal on chromosome 22. Stippled line shows Bonferroni corrected significance threshold. Black dotted line shows suggestive association at 10 . (C) QQ-plot and (D) Manhattan plot of conditional GWAS using the genotype for one of the most associated SNPs (chr22: 45,875,420) on chromosome 22 as covariate, illustrating a slightly stronger association on chromosome 18, and the disappearance of the association on chromosome 28. (E) QQ-plot and (F) Manhattan plot for age of onset analysis showed two suggestive loci on chromosomes 15 and 32.

Fig. 3 illustrates a detailed view of the associated locus on chromosome 22. (A) Zoomed in view of chromosome 22 at 44.7-47.2 Mbp. The LD R² value between the most highly associated SNPs is shown. (B) Further zoomed in view of the most associated SNPs on chromosome 22 at 45.7-46.2 Mbp. (C) The ABCC4 gene is located within the 45.7-46.2 Mbp region of chromosome 22.

Fig. 4 illustrates a detailed view of the associated locus on chromosome 22 lifted over to the human genome GRCh38. (A) Lift over from CanFam3.1 to hg38 of identified genetic variants located within the complete LD locus. Separate track shows the location of the five associated GWAS SNPs and the identified coding variant within ABCC4. (B) Zoom in on the coding variant and the two nearest associated GWAS SNPs lifted over to human hg38. Tracks below the coding track include the ENCODE cis-regulatory elements track, the gene expression track, the Multiz Alignments of 100 Vertebrates conservation track and the H3K27Ac regulatory elements track [2].

DETAILED DESCRIPTION

The present invention generally relates to biomarkers for pyometra in dogs, and in particular to such biomarkers that can be used to predict pyometra predisposition and uses thereof.

Pyometra, also referred to as pyometritis, is one of the most common diseases in female dogs, presenting as purulent inflammation and bacterial infection of the uterus. Pyometra is an important disease for any dog owner to be aware of because of the sudden nature of the disease and the deadly consequences if left untreated. Hence, there is a need to identify dogs predisposed to pyometra so that such dogs can be monitored closely and any pyometra detected as early as possible. The present invention also finds use in breeding selection by identifying dogs predisposed to pyometra and excluding them from breeding programs.

An aspect of the invention relates to a method for predicting pyometra predisposition of a dog. The method comprises determining, in a sample comprising nucleic acid molecules obtained from the dog, presence or absence of at least one biomarker useful in predicting pyometra predisposition of the dog. According to the invention, the at least one biomarker is located in a region of from nucleotide position 45,500,000 to nucleotide position 46,500,000 on canine chromosome 22 (CFA22).

In an embodiment, the method also comprises predicting pyometra predisposition of the dog based on the determined presence or absence of the at least one biomarker.

This region of CFA22 contains the ATP-binding cassette sub-family C member 4 (ABCC4) gene encoding the ABCC4 protein, also referred to as multidrug resistance-associated protein 4 (MRP4) or multi-specific organic anion transporter B (MOAT-B). ABCC4 is a protein that acts as a regulator of intracellular cyclic nucleotide levels and as a mediator of cyclic adenosine monophosphate (cAMP) dependent signal transduction to the nucleus. ABCC4 also transports, among others, prostaglandins, for example prostaglandin E2 (PGE2), out of the cell where they can bind receptors.

In an embodiment, the at least one biomarker is located in a region of from nucleotide position 45,700,000 to nucleotide position 46,200,000 on CFA22. In a particular embodiment, the at least one biomarker is located in a region of from nucleotide position 45,750,000 to nucleotide position 46,050,000 on CFA22.

The ABCC4 gene is present on CFA22 from nucleotide position 45,767,063 to nucleotide position 46,013,484. In an embodiment, the at least one biomarker is located in the ABCC4 gene on CFA22. Accordingly, in an embodiment, pyometra predisposition of a dog can be predicted based on the presence or absence of at least one biomarker located in a region from nucleotide position 45,767,063 to nucleotide position 46,013,484 on CFA22.
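The nested coordinate windows described above can be illustrated with a simple lookup. The following is a minimal Python sketch and not part of the disclosed method. The names REGIONS and regions_containing are hypothetical, and it is assumed that positions are 1-based coordinates on CFA22 with both endpoints of each window included.

```python
from typing import List

# Illustrative only: coordinate windows on CFA22 quoted in the text.
# Assumption: positions are 1-based and both endpoints are inclusive.
REGIONS = {
    "broad region": (45_500_000, 46_500_000),
    "narrower region": (45_700_000, 46_200_000),
    "particular region": (45_750_000, 46_050_000),
    "ABCC4 gene": (45_767_063, 46_013_484),
}


def regions_containing(position: int) -> List[str]:
    """Return the names of all quoted CFA22 windows that contain `position`."""
    return [
        name
        for name, (start, end) in REGIONS.items()
        if start <= position <= end
    ]
```

For example, regions_containing(45_893_198) lists all four windows, since that position lies inside the ABCC4 gene interval, which in turn lies inside each of the wider regions.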

The term nucleotide position as used herein relates to the NCBI Canis lupus familiaris (dog) reference sequence Dog10K_Boxer_Tasha having GenBank assembly accession no. GCA_000002285.4 and RefSeq assembly accession no. GCF_000002285.5 (https://www.ncbi.nlm.nih.gov/assembly/GCF_000002285.5). Canine chromosome 22 (CFA22) has GenBank accession number CM000022 and version 4 (CM000022.4) and RefSeq assembly accession no. NC_006604 version 4 (NC_006604.4).

The sample is a biological sample that comprises nucleic acid molecules and is obtained from the dog. The nucleic acid molecules could be present as free nucleic acid molecules in the sample. Alternatively, the sample could contain nucleated cells from the dog, where these nucleated cells contain a nucleus comprising the nucleic acid molecules, and in particular a copy of the genome of the dog. The sample could be a body fluid sample comprising nucleic acid molecules, such as a body fluid sample comprising nucleated cells. Illustrative, but non-limiting, examples of such body fluid samples include a blood sample, a plasma sample, a serum sample, a saliva sample, a mammary gland milk sample, a vaginal cell, smear or lubrication sample, and a semen sample. Alternatively, the sample could be a body tissue sample including, but not limited to, a hair sample, a hair root sample or a biopsy sample.

The nucleic acid molecules could be deoxyribonucleic acid (DNA) molecules or ribonucleic acid (RNA) molecules, including complementary DNA (cDNA) molecules. In a particular embodiment, the nucleic acid molecules are DNA molecules and preferably genomic DNA.

Nucleic acid molecules can be extracted, isolated and optionally purified from the sample according to well-known nucleic acid extraction, isolation and purification methods. For instance, standard protocols for the isolation of genomic DNA could be used, such as those referred to, inter alia, in [16, 17].

The term biomarker is generally defined herein as a biological indicator, such as a particular molecular feature, that may affect or be related to predicting pyometra in dogs. For example, in certain embodiments of the present invention, the biomarker is a genetic marker, such as a single nucleotide polymorphism (SNP), e.g., a particular genotype at a SNP.

A genome-wide association study (GWAS) was conducted in golden retrievers, a breed with increased risk of developing pyometra, to identify genetic risk factors associated with the disease. A disease-associated locus was thereby identified on chromosome 22. The initial GWAS identified five SNPs in complete linkage disequilibrium (LD) in the region of chromosome 22 containing the ABCC4 gene, and in particular spanning introns 18 and 19 of the ABCC4 gene. In addition, next generation sequencing and genotyping of pyometra cases and controls identified four missense SNPs within the ABCC4 gene.

Single nucleotide polymorphism or SNP as used herein refers to a variation in a single nucleotide at a particular position in the dog genome among a population of individuals. A SNP can be identified by its location within CFA22, i.e., nucleotide position on CFA22, or by its name as shown in Table 1. SNPs identified as being useful for predicting pyometra predisposition are shown in Tables 1 and 2. For example, the SNP BICF2G630333328 in Table 1 indicates that the nucleotide base (or the allele) at nucleotide position 45,875,420 on CFA22 may be either thymidine (T) or guanine (G). For this example SNP BICF2G630333328 of Table 1, the allele associated with or indicative of pyometra predisposition of a dog is thymidine (T).

SNPs in the coding region of the ABCC4 gene could be synonymous SNPs that do not affect the amino acid sequence of the ABCC4 protein or non-synonymous SNPs that change the amino acid sequence of the ABCC4 protein. A non-synonymous SNP could be a missense SNP, in which a single nucleotide change results in a codon coding for a different amino acid, or a nonsense SNP, in which a single nucleotide change results in a premature stop codon.
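The distinction between synonymous, missense and nonsense SNPs can be made concrete by comparing the amino acids encoded by a codon before and after a single nucleotide change. The following is a minimal Python sketch using the standard genetic code. The function name classify_coding_snp is hypothetical, and the codons are assumed to be given on the coding strand as DNA.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-nucleotide codon change by its effect on the protein."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three nucleotides long")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"  # amino acid sequence unchanged
    if ref_aa == "*":
        return "other"  # loss of a stop codon, not covered by the text
    if alt_aa == "*":
        return "nonsense"  # premature stop codon
    return "missense"  # codon for a different amino acid
```

With illustrative codons, classify_coding_snp("GTT", "GTC") gives "synonymous", classify_coding_snp("ATG", "GTG") gives "missense" and classify_coding_snp("TGG", "TGA") gives "nonsense".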

In an embodiment, determining presence or absence of at least one biomarker comprises determining genotype of at least one SNP selected from the group consisting of a SNP located at position 45,815,581 on CFA22, a SNP located at position 45,823,359 on CFA22, a SNP located at position 45,875,420 on CFA22, a SNP located at position 45,882,260 on CFA22, a SNP located at position 45,886,617 on CFA22, a SNP located at position 45,892,301 on CFA22, a SNP located at position 45,893,198 on CFA22, a SNP located at position 45,893,599 on CFA22 and a SNP located at position 45,934,522 on CFA22.

In a particular embodiment, determining genotype of the at least one SNP comprises determining presence or absence of nucleotide A or G at position 45,815,581 on CFA22; determining presence or absence of nucleotide G or A at position 45,823,359 on CFA22; determining presence or absence of nucleotide T or G at position 45,875,420 on CFA22; determining presence or absence of nucleotide A or G at position 45,882,260 on CFA22; determining presence or absence of nucleotide T or G at position 45,886,617 on CFA22; determining presence or absence of nucleotide G or A at position 45,892,301 on CFA22; determining presence or absence of nucleotide A or G at position 45,893,198 on CFA22; determining presence or absence of nucleotide C or T at position 45,893,599 on CFA22; and/or determining presence or absence of nucleotide T or C at position 45,934,522 on CFA22.
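The nine SNP positions and their allele pairs listed above lend themselves to a lookup table. The following is a minimal Python sketch of such a table together with a validity check for a genotype call. The names SNP_ALLELES and check_genotype are hypothetical, and a genotype is assumed to be reported as a two-letter string, one letter per allele.

```python
# Allele pairs per SNP position on CFA22, as listed in the text.
SNP_ALLELES = {
    45_815_581: ("A", "G"),
    45_823_359: ("G", "A"),
    45_875_420: ("T", "G"),
    45_882_260: ("A", "G"),
    45_886_617: ("T", "G"),
    45_892_301: ("G", "A"),
    45_893_198: ("A", "G"),
    45_893_599: ("C", "T"),
    45_934_522: ("T", "C"),
}


def check_genotype(position: int, genotype: str) -> tuple:
    """Validate a diploid genotype call and return it as a sorted allele pair."""
    if position not in SNP_ALLELES:
        raise ValueError(f"position {position} is not one of the listed SNPs")
    calls = tuple(sorted(genotype.upper()))
    if len(calls) != 2 or any(c not in SNP_ALLELES[position] for c in calls):
        raise ValueError(f"unexpected genotype {genotype!r} at {position}")
    return calls
```

For example, check_genotype(45_893_198, "ga") returns ("A", "G"), whereas a call containing a nucleotide not listed for that position raises a ValueError.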

In a particular embodiment, determining genotype of the at least one SNP comprises determining presence or absence of nucleotide A or G at position 45,893,198 on CFA22.

Determining presence or absence of a particular nucleotide at a given position on CFA22 comprises, in an embodiment, determining the identity of the nucleotide at the given position on CFA22, i.e., determining whether the nucleotide in the given position is adenosine (A), cytosine (C), guanosine (G), thymidine (T) or uracil (U).

The above mentioned embodiments could involve determining presence of a particular nucleotide (A, T, G or C in the case of DNA and A, U, G or C in the case of RNA) at a given position on CFA22 in one or both alleles. Alternatively, the above mentioned embodiments could involve determining presence of the particular nucleotide at the given position on CFA22 in one allele. In a further example, the above mentioned embodiments could involve determining presence of the particular nucleotide at the given position on CFA22 in both alleles.

Correspondingly, the above mentioned embodiments could involve determining absence of a particular nucleotide at a given position on CFA22 in one or both alleles. Alternatively, the above mentioned embodiments could involve determining absence of the particular nucleotide at the given position on CFA22 in one allele. In a further example, the above mentioned embodiments could involve determining absence of the particular nucleotide at the given position on CFA22 in both alleles.
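The presence and absence alternatives described above can be summarised by counting the copies of a nucleotide in a diploid genotype. The following is a minimal Python sketch. The function name allele_status is hypothetical, and reading "in one allele" as exactly one copy is an assumption made for illustration only.

```python
def allele_status(genotype: str, allele: str) -> dict:
    """Summarise presence/absence of `allele` in a diploid genotype such as 'AG'."""
    genotype, allele = genotype.upper(), allele.upper()
    if len(genotype) != 2 or len(allele) != 1:
        raise ValueError("expected a two-letter genotype and a one-letter allele")
    copies = genotype.count(allele)  # 0, 1 or 2 copies of the nucleotide
    return {
        "copies": copies,
        "present_in_one_or_both": copies >= 1,
        "present_in_one": copies == 1,   # assumption: exactly one allele
        "present_in_both": copies == 2,
        "absent_in_one_or_both": copies <= 1,
        "absent_in_one": copies == 1,    # assumption: exactly one allele
        "absent_in_both": copies == 0,
    }
```

A heterozygous genotype such as "AG" therefore has nucleotide A present in one allele and absent in one allele, whereas "GG" has nucleotide A absent in both alleles.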

In particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide A at position 45,815,581 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide A at position 45,815,581 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide G at position 45,815,581 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide G at position 45,815,581 on CFA22 in one or both alleles, in one allele or in both alleles.

In other particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide G at position 45,823,359 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide G at position 45,823,359 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide A at position 45,823,359 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide A at position 45,823,359 on CFA22 in one or both alleles, in one allele or in both alleles.

In further particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide T at position 45,875,420 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide T at position 45,875,420 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide G at position 45,875,420 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide G at position 45,875,420 on CFA22 in one or both alleles, in one allele or in both alleles.

In yet other particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide A at position 45,882,260 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide A at position 45,882,260 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide G at position 45,882,260 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide G at position 45,882,260 on CFA22 in one or both alleles, in one allele or in both alleles.

In particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide T at position 45,886,617 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide T at position 45,886,617 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide G at position 45,886,617 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide G at position 45,886,617 on CFA22 in one or both alleles, in one allele or in both alleles.

In other particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide G at position 45,892,301 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide G at position 45,892,301 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide A at position 45,892,301 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide A at position 45,892,301 on CFA22 in one or both alleles, in one allele or in both alleles.

In further particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide A at position 45,893,198 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide A at position 45,893,198 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide G at position 45,893,198 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide G at position 45,893,198 on CFA22 in one or both alleles, in one allele or in both alleles.

In yet other particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide C at position 45,893,599 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide C at position 45,893,599 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide T at position 45,893,599 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide T at position 45,893,599 on CFA22 in one or both alleles, in one allele or in both alleles.

In particular embodiments, determining genotype of the at least one SNP comprises determining presence of nucleotide T at position 45,934,522 on CFA22 in one or both alleles, in one allele or in both alleles, determining absence of nucleotide T at position 45,934,522 on CFA22 in one or both alleles, in one allele or in both alleles, determining presence of nucleotide C at position 45,934,522 on CFA22 in one or both alleles, in one allele or in both alleles or determining absence of nucleotide C at position 45,934,522 on CFA22 in one or both alleles, in one allele or in both alleles.

In an embodiment, the method comprises determining genotype of one SNP. In other embodiments, the method comprises determining genotype of two SNPs, of three SNPs, of four SNPs, of five SNPs, of six SNPs, of seven SNPs, of eight SNPs or of nine SNPs.

In a currently preferred embodiment, determining presence or absence of at least one biomarker comprises determining whether a codon coding for amino acid number 787 in the ABCC4 protein codes for valine (V) or methionine (M).

The missense SNP at nucleotide position 45,893,198 was in complete LD with the identified GWAS locus. The missense SNP A>G causes an amino acid substitution of p.M787V in the ABCC4 protein. This amino acid position 787 is located in the cytoplasmic loop 3 (CL3) of the ABCC4 protein, a region of the protein that is phylogenetically conserved. Across mammalian species, the most common amino acid residue at the 787 position is valine and out of 61 mammals only the dog, pig, brush tailed rat, naked mole rat and chinchilla have methionine in this position. As is shown herein, the risk allele, i.e., the allele associated with pyometra predisposition, results in keeping the less common amino acid methionine at position 787, whilst the protective allele results in a change into the more common valine at position 787.
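
The link between the A>G change and p.M787V can be sketched at the codon level. In the standard genetic code methionine is encoded only by ATG, whereas valine is encoded by GTT, GTC, GTA and GTG, so a single A>G change that converts methionine to valine corresponds to ATG becoming GTG. The sketch below assumes that the A>G change as written refers to the coding strand and to the first base of codon 787; the description states only the genomic change and the resulting substitution, and the function names are hypothetical.

```python
# Minimal codon table covering only the two residues discussed in the text.
CODON_TO_AA = {"ATG": "M", "GTT": "V", "GTC": "V", "GTA": "V", "GTG": "V"}


def residue_for_codon(codon: str) -> str:
    """Return 'M' or 'V' for a codon, or raise if it encodes another residue."""
    codon = codon.upper()
    if codon not in CODON_TO_AA:
        raise ValueError(f"codon {codon!r} encodes neither methionine nor valine")
    return CODON_TO_AA[codon]


def apply_snp(codon: str, offset: int, alt: str) -> str:
    """Replace the base at `offset` (0-based) in `codon` with `alt`."""
    return codon[:offset] + alt + codon[offset + 1:]


risk_codon = "ATG"                                 # methionine (risk allele)
protective_codon = apply_snp(risk_codon, 0, "G")   # assumed first-base A>G
print(risk_codon, residue_for_codon(risk_codon))
print(protective_codon, residue_for_codon(protective_codon))
```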

The amino acid sequence of the canine ABCC4 protein is presented here below and in SEQ ID NO: 1 with the amino acid residue at position 787 marked in bold and underlining. In this amino acid sequence, the amino acid residue at position 787 is methionine (M).

MLPVYPEVKP NPLQDANLCS RVLFWWLNPL FKIGHKRRLE EDDMYSVLPE

DRSKHLGEEL QGYWDKEIQK AEKSDARKPS LTKAIIKCYW KSYLVLGIFT

LIEEGTRVIQ PIFLGKIIKY FENQDPNDSV ALHEAYGYAT VLTACTLVLA

ILHHLYFYHV QCAGMRLRVA MCHMIYRKAL RLSNMAMGKT TTGQIVNLLS

NDVNKFDQVT VFLHFLWAGP LQAIAVTALL WMEIGISCLA GMW LIILLP

LQSCIGKLFS SLRSKTATFT DVRIRTMNEV ITGIRIIKMY AWEKSFADLI

TNLRRKEISK ILRSSYLRGM NLASFFVANK IIIFVTFTTY VLLGNVITAS

RVFVAVSLYG AVRLTVTLFF PSAIERVSES W SIQRIKNF LLLDEISQRT

PQLPSDGKMI VHIQDFTAFW DKASETPTLE GLSFTVRPGE LLAVIGPVGA

GKSSLLSAVL GELPRNQGLV SVHGRIAYVS QQPWVFPGTV RSNILFGKKY

EKERYEKVIK ACALRKDLQC LEDGDLTVIG DRGATLSGGQ KARINLARAV

YQDADIYLLD DPLSAVDAEV GRHLFELCIC QTLHEKITIL VTHQLQYLKA

ASQILILKEG KMVQKGTYTE FLKSGVDFGS LLKKENEEAD QSPAPGSPIL

RTRSFSESSL WSQQSSRHSL KDSAPEAQDI ENTQVALSEE RRSEGKVGFK

AYRNYLTAGA HWFVIVFLIL LNIASQVAYV LQDWWLSYWA NEQSALNVTV

DGKENVTEKL DLPWYLGIYS GLTVATVLFG IARSLLMFYV LVNSSQTLHN

KMFESILRAP VLFFDTNPIG RILNRFSKDI GHMDDLLPLT FLDFLQTFLQ

VLGW GVAVA VIPWIAIPLL PLAVIFFILR RYFLATSRDV KRLESTSRSP

VFSHLSSSLQ GLWTIRAYKA EERFQELFDA HQDLHSEAWF LFLTTSRWFA

VRLDAICAMF VIW AFGSLI LAKTVDAGQV GLALSYALTL MGMFQWSVRQ

SAEVENMMIS VERVMEYTDL EKEAPWEYQK RPPPTWPQEG TIVFDNVNFT

YSLDGPLVLK HLTALIKSRE KVGIVGRTGA GKSSLIAALF RLSEPEGKIW

IDRILTTEIG LHDLRKKMSI IPQEPVLFTG TMRKNLDPFN EHTDEELWNA

LTEVQLKDAV EDLPGKLDTE LAESGSNFSV GQRQLVCLAR AILRKNRILI

IDEATANVDP RTDELIQKKI REKFAHCTVL TIAHRLNTII DSDKIMVLDS

GRLKEYDEPY VLLQNEESLF YKMVQQLGKA AAAALTETAK QFVASN

There are several methods known by those skilled in the art for determining whether a particular nucleotide sequence is present in a nucleic acid molecule and for identifying the nucleotide in a given position in a nucleic acid sequence. These include amplification of a nucleic acid segment encompassing the genetic marker by means of polymerase chain reaction (PCR) or any other amplification method, and interrogation of the genetic marker by means of allele specific hybridization, 3'-exonuclease assay (TaqMan assay), fluorescent dye and quenching agent-based PCR assay, the use of allele-specific restriction enzymes (RFLP-based techniques), direct sequencing, oligonucleotide ligation assay (OLA), pyrosequencing, invader assay, mini-sequencing, denaturing high pressure liquid chromatography (DHPLC) based techniques, single strand conformational polymorphism (SSCP), allele-specific PCR, denaturing gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE), chemical mismatch cleavage (CMC), heteroduplex analysis based system, techniques based on mass spectroscopy (MS), invasive cleavage assay, polymorphism ratio sequencing (PRS), microarrays, rolling circle extension assay, high pressure liquid chromatography (HPLC) based techniques, extension based assays, amplification refractory mutation system (ARMS), amplification refractory mutation linear extension (ALEX), single base chain extension (SBCE), molecular beacon assays, invader (Third wave technologies), ligase chain reaction assays, 5'-nuclease assay-based techniques, hybridization capillary array electrophoresis (CAE), protein truncation assays (PTT), immunoassays, and solid phase hybridization (dot blot, reverse dot blot, chips). This list of methods is not meant to be exclusive, but just to illustrate the diversity of available methods. Some of these methods can be performed in microarray format (microchips) or on beads.

Another aspect of the invention relates to a method for predicting pyometra predisposition of a dog. The method comprises extracting proteins from a sample obtained from a dog. The method also comprises determining presence or absence of valine (V) or methionine (M) at amino acid position 787 in the ATP-binding cassette sub-family C member 4 (ABCC4) protein.

In an embodiment, the method also comprises predicting pyometra predisposition of the dog based on the determined presence or absence of valine (V) or methionine (M) at amino acid position 787 in the ABCC4 protein. In an embodiment, the method comprises predicting the dog to be pyometra predisposed based on presence of M or absence of V at amino acid position 787 in the ABCC4 protein. In a further embodiment, the method also comprises predicting the dog not to be pyometra predisposed based on absence of M or presence of V at amino acid position 787 in the ABCC4 protein.

The sample is a biological sample that comprises proteins and is obtained from the dog. The sample could be a body fluid sample comprising proteins, such as a body fluid sample comprising nucleated cells. Illustrative, but non-limiting, examples of such body fluid samples include a blood sample, a plasma sample, a serum sample, a saliva sample, a mammary gland milk sample, a vaginal cell, smear or lubrication sample, and a semen sample. Alternatively, the sample could be a body tissue sample including, but not limited to, a biopsy sample.

Proteins can be extracted from the sample according to techniques well known in the art including, but not limited to, centrifugation, electrophoresis, chromatography, and precipitation.

Presence or absence of valine or methionine at amino acid position 787 in the ABCC4 protein can be determined according to various techniques for single amino acid substitution (SAAS) identification. An example of such techniques is mass spectrometry (MS) analysis, such as tandem mass spectrometry (MS/MS) or liquid chromatography MS/MS (LC-MS/MS). Also immunoassays using antibodies specifically binding to an epitope on ABCC4 encompassing the amino acid position 787 and having different binding characteristics to the ABCC4 protein depending on whether the amino acid position 787 is methionine or valine could be used. Further examples include various proteomics technologies and analyses.
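
As an illustration of how an MS-based approach could distinguish the two variants, the following sketch performs an in-silico digestion of residues 781 to 810 of SEQ ID NO: 1 and reports the peptide covering position 787 together with the mass shift expected when methionine is replaced by valine. It assumes a simplified trypsin rule (cleavage after K or R, except before P); the description does not specify a protease, and the function names are hypothetical. The monoisotopic residue masses used (methionine 131.04049 Da, valine 99.06841 Da) are standard values.

```python
# Residues 781-810 of the canine ABCC4 sequence given in SEQ ID NO: 1.
FRAGMENT = "IARSLLMFYVLVNSSQTLHNKMFESILRAP"
FRAGMENT_START = 781

# Monoisotopic residue masses in Da for the two residues of interest.
MONO_RESIDUE_MASS = {"M": 131.04049, "V": 99.06841}


def tryptic_peptides(seq: str, first_pos: int):
    """Yield (start, end, peptide) using a simplified trypsin rule."""
    start = 0
    for i, aa in enumerate(seq):
        is_last = i == len(seq) - 1
        cleave = aa in "KR" and not is_last and seq[i + 1] != "P"
        if cleave or is_last:
            yield first_pos + start, first_pos + i, seq[start:i + 1]
            start = i + 1


def peptide_covering(position: int):
    """Return the (start, end, peptide) tuple that contains `position`."""
    for start, end, pep in tryptic_peptides(FRAGMENT, FRAGMENT_START):
        if start <= position <= end:
            return start, end, pep
    raise ValueError("position outside fragment")


start, end, met_peptide = peptide_covering(787)
idx = 787 - start
val_peptide = met_peptide[:idx] + "V" + met_peptide[idx + 1:]
mass_shift = MONO_RESIDUE_MASS["V"] - MONO_RESIDUE_MASS["M"]

print(start, end, met_peptide, val_peptide)
print(f"expected mass shift (Val minus Met): {mass_shift:.5f} Da")
```

A precursor or fragment ion carrying position 787 would therefore be expected to differ by roughly 32 Da between the two variants, before accounting for charge state or modifications such as methionine oxidation.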

The ABCC4 protein has a domain organization typical of eukaryotic ABC transporters. This includes a core structure of two membrane-spanning domains (MSD1 and MSD2), each consisting of six transmembrane helices, and two cytosolic nucleotide binding domains (NBD1 and NBD2) that bind and hydrolyze ATP to power substrate transport. The MSDs are composed of six long transmembrane helices that penetrate the cytoplasm and are connected by cytoplasmic loops (CL): CL1 between transmembrane helices two and three in MSD1, CL2 between transmembrane helices four and five in MSD1, CL3 between transmembrane helices two and three in MSD2 and CL4 between transmembrane helices four and five in MSD2. For ABC transporters, the location of the CLs provides an opportunity for interaction with the NBDs, and these CLs have an important role in both trafficking to the plasma membrane and in protein stability. For instance, a single amino acid substitution in human ABCC4 at position 796 in the CL3, p.T796M, reduced the expression and stability of human ABCC4 [6]. This reduced expression and stability was seen even though the p.T796M substitution was predicted to be benign and well tolerated by SIFT and PolyPhen. Hence, an amino acid substitution in CL3 of ABCC4 between a large sized methionine and a smaller sized amino acid, such as threonine or valine, most likely affects the cellular transportation capacity of the ABCC4 protein.

Hence, the cellular transportation capacity of the ABCC4 protein can be used as a biomarker to predict pyometra predisposition in dogs. Such cellular transportation capacity could, for instance, be determined by analyzing the prostaglandin transport capacity of cells, such as endometrial cells, of dogs.

A further aspect of the invention therefore relates to a method for predicting pyometra predisposition of a dog. The method comprises determining whether a cell of the dog has altered prostaglandin transport. The method also comprises predicting the dog to be pyometra predisposed if the cell has altered prostaglandin transport.

The cell is preferably a cell obtained from the reproductive system of the dog. Illustrative, but non-limiting, examples of such cells include endometrial cells, myometrial cells, epithelial cells, glandular cells, blood cells, such as red blood cells (erythrocytes) or white blood cells (leukocytes), endothelial cells, cells derived from the gonads (ovaries) including germline cells (ova), or placental-originating cells.

In an embodiment, determining whether a cell of the dog has altered prostaglandin transport comprises determining whether the ABCC4 protein in the cell has an altered prostaglandin transport as compared to a wild-type ABCC4 protein having valine (V) at amino acid position 787 in the ABCC4 protein.

For instance, cells, such as endometrial cells, from a dog could be cultured in a culture medium in vitro and the amount of prostaglandin released into the culture medium could be measured by various techniques, such as gas chromatography with electron-capture detection, strong cation exchange (SCX) LC-MS/MS, or enzyme-linked immunosorbent assay (ELISA). The determined amount of prostaglandin in the culture medium could then be compared to a control amount of prostaglandin as measured in culture medium, in which cells, such as endometrial cells, comprising a known type of the ABCC4 protein have been cultured in vitro. The known type of the ABCC4 protein is preferably the wild-type ABCC4 protein but could alternatively be the p.M787V ABCC4 protein.
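
The comparison against a control amount can be sketched as a simple ratio. The sketch below assumes replicate measurements expressed in the same unit for the test cells and for control cells expressing a known type of the ABCC4 protein. The input numbers are made-up placeholders rather than measured data, and the tolerance used to call transport "altered" is a hypothetical parameter, since the description does not define a cutoff.

```python
from statistics import mean


def relative_pg_release(sample_pg: list[float], control_pg: list[float]) -> float:
    """Ratio of mean prostaglandin in test medium to mean in control medium."""
    return mean(sample_pg) / mean(control_pg)


def is_altered(ratio: float, tolerance: float = 0.2) -> bool:
    """Flag transport as altered if the ratio deviates from 1 by more than
    `tolerance` (hypothetical cutoff; not taken from the description)."""
    return abs(ratio - 1.0) > tolerance


# Placeholder replicate values (arbitrary units), for illustration only.
test_medium = [120.0, 130.0, 125.0]
control_medium = [240.0, 260.0, 250.0]

ratio = relative_pg_release(test_medium, control_medium)
print(f"relative release: {ratio:.2f}, altered: {is_altered(ratio)}")
```

In practice any such cutoff would need to be established empirically for the chosen assay and cell type.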

As an example, the prostaglandin measured in the culture medium is PGE2.

The method for predicting pyometra predisposition of a dog according to any of the embodiments above may also comprise selecting a surveillance schedule for the dog based on the predicted pyometra predisposition. Thus, an optimal or at least suitable surveillance schedule or scheme is selected for the dog based on whether the dog is predicted to be predisposed to develop or suffer from pyometra or not. This means that dogs predicted to be pyometra predisposed could be selected for a more frequent surveillance and follow-up (first surveillance schedule) as compared to dogs predicted not to be pyometra predisposed, which instead can follow a less frequent surveillance and follow-up (second surveillance schedule), if any.

The method for predicting pyometra predisposition of a dog according to any of the embodiments above may further comprise selecting a pyometra treatment for the dog based on the predicted pyometra predisposition. Thus, an optimal or at least suitable treatment is selected for the dog based on whether the dog is predicted to be predisposed to develop or suffer from pyometra or not. As an example, if the dog is predicted to be predisposed for developing pyometra, surgical treatment may be selected, such as ovariohysterectomy, also referred to as spaying. However, if the dog is not predicted to be predisposed for developing pyometra, a medical treatment may instead be selected, such as antibiotics or other antibacterial agents.

An aspect of the invention relates to a method for selecting a dog for breeding. The method comprises predicting pyometra predisposition of the dog according to any of the embodiments as disclosed herein. The method also comprises selecting the dog for breeding if the dog is predicted not to be pyometra predisposed. Hence, the at least one biomarker of the invention indicative of pyometra predisposition of dogs can be used to select dogs for breeding.

The various embodiments described in the foregoing for determining the allele of at least one biomarker or genetic marker, such as determining presence or absence of a particular nucleotide at a given position in CFA22, can also be applied to the present aspect of selecting a dog for breeding.

For instance, the method of the invention could be used to identify and select a stud or sire that is heterozygote or preferably homozygote for the alternative allele in Table 1 or 2 for any SNP in Table 1 or 2. Alternatively, or in addition, the method could be used to identify and select a bitch that is heterozygote or preferably homozygote for the alternative allele in Table 1 or 2 for any SNP in Table 1 or 2.

In a particular embodiment, the method comprises determining whether the dog (stud/sire or bitch) is heterozygote, preferably homozygote, for the missense SNP at CFA22: 45,893,198 (p.Met787Val). In such an embodiment, the method comprises selecting the dog for breeding if the dog is heterozygote, preferably homozygote, for the missense SNP at CFA22: 45,893,198 (p.Met787Val).

In a most preferred embodiment, the method of the invention is used to identify and select a stud/sire that is homozygote for the missense SNP at CFA22: 45,893,198 (p.Met787Val) and select a bitch that is homozygote for the missense SNP at CFA22: 45,893,198 (p.Met787Val). The identified and selected stud/sire and bitch can then be mated to get an offspring that will be homozygote for the missense SNP at CFA22: 45,893,198 (p.Met787Val).
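
The expected outcome of such a mating can be illustrated with a Punnett-square sketch. It assumes simple Mendelian segregation at an autosomal locus and represents genotypes at CFA22: 45,893,198 as two-letter strings, with G denoting the allele encoding valine in line with the A>G, p.Met787Val notation used herein; the function name is hypothetical.

```python
from collections import Counter
from itertools import product


def offspring_genotypes(sire: str, dam: str) -> dict:
    """Expected offspring genotype proportions for a single autosomal SNP.

    Each parent passes on one of its two alleles with equal probability.
    """
    counts = Counter("".join(sorted(a + b)) for a, b in product(sire, dam))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


print(offspring_genotypes("GG", "GG"))  # both parents homozygous
print(offspring_genotypes("AG", "AG"))  # both parents heterozygous
```

Under these assumptions, two homozygous parents yield only homozygous offspring, whereas two heterozygous parents yield the three genotypes in a 1:2:1 ratio.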

A further aspect of the invention relates to a kit for predicting pyometra predisposition of a dog. The kit comprises at least one oligonucleotide probe capable of forming a hybridized nucleic acid with a SNP or a nucleic acid region flanking the SNP. In this aspect, the SNP is selected from the group consisting of a SNP located at position 45,815,581 on CFA22, a SNP located at position 45,823,359 on CFA22, a SNP located at position 45,875,420 on CFA22, a SNP located at position 45,882,260 on CFA22, a SNP located at position 45,886,617 on CFA22, a SNP located at position 45,892,301 on CFA22, a SNP located at position 45,893,198 on CFA22, a SNP located at position 45,893,599 on CFA22 and a SNP located at position 45,934,522 on CFA22. The kit also comprises instructions for predicting pyometra predisposition of the dog based on the dog’s genotype at the SNP.

In a particular embodiment, the kit for predicting pyometra predisposition of a dog comprises at least one oligonucleotide probe capable of forming a hybridized nucleic acid with a SNP or a nucleic acid region flanking the SNP. In this particular embodiment, the SNP is located at position 45,893,198 on CFA22. The kit also comprises instructions for predicting pyometra predisposition of the dog based on whether the nucleotide at position 45,893,198 on CFA22 is A or G.

In an embodiment, the at least one oligonucleotide probe capable of forming a hybridized nucleic acid with a SNP or a nucleic acid region flanking the SNP is at least one first oligonucleotide probe. In this embodiment, the kit further comprises a second oligonucleotide probe capable of forming a hybridized nucleic acid with a SNP defined above or a nucleic acid region flanking a SNP defined above.

In an embodiment, the first oligonucleotide probe is a first primer that hybridizes 5’ or 3’ to a SNP defined above and the second oligonucleotide probe is a second primer that hybridizes 3’ or 5’ to the SNP. In a particular embodiment, the first and second primers are capable of amplifying the SNP, i.e., the first and second primers form a primer pair. Alternatively, one of the first and second primers hybridizes to a portion or segment of the nucleic acid sequence comprising the SNP.

The present invention also relates to a nucleic acid comprising a SNP. The SNP is selected from the group consisting of a SNP located at position 45,815,581 on CFA22, a SNP located at position 45,823,359 on CFA22, a SNP located at position 45,875,420 on CFA22, a SNP located at position 45,882,260 on CFA22, a SNP located at position 45,886,617 on CFA22, a SNP located at position 45,892,301 on CFA22, a SNP located at position 45,893,198 on CFA22, a SNP located at position 45,893,599 on CFA22 and a SNP located at position 45,934,522 on CFA22.

In a particular embodiment, the nucleic acid is an isolated nucleic acid molecule.

EXAMPLE

Pyometra is one of the most common diseases in female dogs, presenting as purulent inflammation and bacterial infection of the uterus. On average 20% of intact female dogs are affected before 10 years of age, a proportion that varies greatly between breeds (3-66%). To identify genetic risk factors associated with the disease, a genome-wide association study (GWAS) was performed in golden retrievers, a breed with increased risk of developing pyometra (risk ratio: 3.3). A mixed model approach was applied comparing 98 cases and 96 healthy controls, and an associated locus was identified on chromosome 22 (p=1.2 × 10⁻⁶, passing Bonferroni corrected significance). This locus contained five significantly associated single-nucleotide polymorphisms (SNPs) positioned within introns of the ATP-binding cassette sub-family C member 4 (ABCC4) gene. This gene encodes a transmembrane transporter that is important for prostaglandin transport. Next generation sequencing and genotyping of cases and controls subsequently identified four missense SNPs within the ABCC4 gene. One missense SNP at chr22:45,893,198 (p.Met787Val) showed complete linkage disequilibrium (LD) with the associated GWAS SNPs, suggesting a potential role in disease development.

Results

Genome-wide significant locus on chromosome 22 in CanFam 3.1

To identify disease-associated loci, 194 female golden retrievers were genotyped using the 170k CanineHD BeadChip. Ninety-eight of the dogs were classified as cases and 96 as controls. The mean age of onset for the cases was 6.6 years (standard deviation (SD) 2.1 years). All controls were intact and >7 years old with a mean age of 8.6 years (SD 1.4 years). At initial quality control and filtering, 1,000 SNPs were removed for low genotyping rate (<95%) and 72,878 SNPs were removed for having a minor allele frequency of less than 5%, leaving 97,468 SNPs for further analysis. No individuals were removed for having a low genotyping rate and the average genotyping rate in the population was 99%. A multidimensional scaling plot was generated showing the first two coordinates (C1-C2) (Fig. 1). No clustering between cases versus controls was noted in the population as a whole. Calculation of relatedness showed that two control dogs were related at sibling level (PI_HAT 0.51; both individuals were kept). A GWAS was performed using EMMAX to account for cryptic relatedness between individuals and population structure [3]. One genome-wide significant locus containing five SNPs in complete LD was identified on chromosome 22 at ~45 Mb, p=1.24 × 10⁻⁶, which was below the LD corrected Bonferroni significance threshold calculated as 4.2 × 10⁻⁶ (see QQ-plot and Manhattan plot in Figs. 2A and 2B). The QQ-plot did not show evidence of inflation (lambda 0.99) with the associated SNPs on chromosome 22 above the dotted line reaching Bonferroni corrected significance.
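
The role of the LD correction in the significance threshold can be illustrated with a short calculation. The sketch assumes a family-wise error rate of 0.05, which is conventional but is not stated in the text; the other numbers are taken from the paragraph above. It compares a Bonferroni threshold based on all analysed SNPs with the reported LD corrected threshold and back-calculates the number of independent tests that the reported threshold would imply under that assumption.

```python
def bonferroni_threshold(alpha: float, n_tests: float) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05                 # assumed family-wise error rate (not in the text)
N_SNPS_ANALYSED = 97_468     # SNPs remaining after quality control (from the text)
REPORTED_THRESHOLD = 4.2e-6  # LD corrected threshold (from the text)
TOP_P_VALUE = 1.24e-6        # top chromosome 22 association (from the text)

# Threshold if every analysed SNP were treated as an independent test.
naive_threshold = bonferroni_threshold(ALPHA, N_SNPS_ANALYSED)
# Number of independent tests implied by the reported threshold.
implied_independent_tests = ALPHA / REPORTED_THRESHOLD

print(f"naive threshold: {naive_threshold:.2e}")
print(f"implied independent tests: {implied_independent_tests:.0f}")
print("passes LD corrected threshold:", TOP_P_VALUE < REPORTED_THRESHOLD)
print("passes naive threshold:", TOP_P_VALUE < naive_threshold)
```

Under the assumed error rate, the reported threshold corresponds to roughly 11,900 independent tests, considerably fewer than the number of SNPs analysed, which reflects the extensive LD between neighbouring SNPs.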

Two tentative additional loci were seen in the Manhattan plot (Fig. 2B): a locus on chromosome 18 (top SNP chr18: 51,224,157, p=5.2 × 10⁻⁵) and a locus on chromosome 28 (top SNP chr28: 8,872,257, p=6.0 × 10⁻⁵). Neither of these reached Bonferroni corrected significance.

Conditioning on the top locus

To evaluate if either of the two additional loci represented risk factors independent of the chromosome 22 locus, a conditional genome-wide analysis was performed choosing the genotype of one of the top-associated SNPs (chr22: 45,875,420) on chromosome 22 as a covariate. As seen in the QQ plot (Fig. 2C) and Manhattan plot (Fig. 2D), the locus on chromosome 18 shifted ~2 Mb but showed a mildly improved p-value leaving it as a suggestive locus (chr18: 49,198,998, p=2.8 × 10⁻⁵), whilst the association to the SNP on chromosome 28, located in intron 7 of the Sorbin and SH3 domain-containing protein 1 (SORBS1) gene, disappeared. The most significantly associated SNP on chromosome 18 identified in this analysis was located in intron 4 of the Testis Expressed Metallothionein Like Protein (TESMIN) gene (the allele frequency was 0.64 in cases and 0.44 in controls).

The risk alleles on chromosome 22 were present in 40% of the cases. 96% of the cases carried at least one risk allele from one of the two loci versus only 70% of the controls when looking at the distribution of alleles for the chromosome 22 and chromosome 18 loci (at 49 Mb).

Association to age of onset

To investigate potential loci associated with early onset of pyometra, we performed an association analysis within the cases only, using age of onset in days as a continuous variable. No SNPs reached Bonferroni corrected significance (Figs. 2E and 2F). Two loci on chromosomes 15 and 32 stood out and were considered as suggestively associated. The most associated SNP on chromosome 15 (chr15: 59,440,763, p=9.0 × 10⁻⁶) is located within intron 3 of the Nuclear Assembly Factor 1 ribonucleoprotein (NAF1) gene. The most strongly associated SNP on chromosome 32 (chr32: 22,285,412, p=4.32 × 10⁻⁵) was located in intron 1 of the Endomucin (EMCN) gene.

Investigation of top locus identifies non-synonymous SNP in the ABCC4 gene

The genome-wide significant locus on chromosome 22 was defined as an 18.2 kb haplotype block of five GWAS SNPs in complete LD (r² = 1.00, chr22: 45,875,420-45,893,599 bp) spanning introns 18 and 19 of the ATP-binding cassette sub-family C member 4 gene (ABCC4, ENSCAFG00000005433, ENSCAFT00000008769, UniProt http://www.uniprot.org/uniprot/F1PNA2) (Figs. 3A to 3C). The allele frequency for the risk haplotype was 0.21 in the cases versus 0.05 in the controls (Table 1), resulting in an odds ratio of 4.8 (95% CI 2.3-9.9). A summary of the allele frequencies and p-values for the five GWAS SNPs is shown in Table 1.
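
The relationship between the allele frequencies and the odds ratio can be sketched as follows. The calculation uses an allelic 2x2 table and a Wald-type confidence interval on the log odds ratio. The allele counts (41 of 196 case chromosomes and 10 of 192 control chromosomes) are an assumption: they are back-calculated from the rounded frequencies and the sample sizes of 98 cases and 96 controls, and are not stated in the text, nor is the method that was used to compute the reported interval.

```python
import math


def allelic_odds_ratio(case_risk, case_total, ctrl_risk, ctrl_total, z=1.96):
    """Odds ratio and Wald confidence interval from allele counts."""
    a, b = case_risk, case_total - case_risk   # risk / non-risk alleles in cases
    c, d = ctrl_risk, ctrl_total - ctrl_risk   # risk / non-risk alleles in controls
    odds_ratio = (a / b) / (c / d)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Assumed counts: 2 x 98 = 196 case alleles, 2 x 96 = 192 control alleles.
odds_ratio, lower, upper = allelic_odds_ratio(41, 196, 10, 192)
print(f"OR = {odds_ratio:.1f} (95% CI {lower:.1f}-{upper:.1f})")
```

With these assumed counts the allele frequencies round to 0.21 and 0.05, and the resulting estimate is consistent with the odds ratio and interval reported above.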

Table 1: Significantly associated GWAS SNPs on chromosome 22

The CanFam3.1 genomic coordinates are shown in Table 1 for the five associated GWAS SNPs reaching Bonferroni corrected significance. Allele frequency in cases and controls for the 98 cases and 96 controls is shown.

To further investigate the associated locus on chromosome 22, we generated whole genome sequencing data from a pool of 10 pyometra cases (22x coverage), which were all heterozygous for the associated risk haplotype on chromosome 22. In addition, sequencing data was generated from one individual homozygous for the GWAS risk haplotype (23x coverage) and 10 individually barcoded individuals homozygous for the non-risk alleles (3 cases and 7 controls; 4.4x coverage on average). A 0.25 Mb region covering the ABCC4 gene (chr22: 45,767,063-46,013,484 bp) was extracted from the sequencing data and called variants were annotated. In total 1,051 SNPs were identified in the region, out of which 627 were known variants. Four missense variants were identified within the ABCC4 gene (Table 2), one of which, chr22:45,893,198 (rs8937218), was located within the associated GWAS locus.

Table 2: Fine-mapped SNPs with potential function in the chr22 region

The four missense SNPs in the ABCC4 gene are displayed in Table 2. They were identified in the 45-46 Mb region of chromosome 22 based on next generation sequencing. The reference, risk and protective alleles for each SNP are listed as well as the associated p-values for the SNPs when incorporated into the GWAS data set for 98 cases and 96 controls. The r² value is listed in relation to LD calculations between the most associated GWAS SNP on chromosome 22 and each of the coding SNPs. An evaluation of the most common amino acid residues for the ABCC4 coding changes based on the UCSC genome browser is also noted. *Only one dog (control) in the extended dataset with 292 dogs was heterozygous for the chr22: 45,823,359 SNP.

To expand the study of the ABCC4 gene in a larger population of golden retrievers, we designed TaqMan genotyping assays for the four identified missense variants. Genotyping of these selected SNPs was carried out in 292 golden retrievers including 134 cases and 158 controls. The 292 dogs included the 97 cases and 96 controls that were part of the GWAS analysis. The additional dogs were individuals that were not chosen to be part of the GWAS analysis due to relatedness. For the dogs that did not have 170k genotyping data available, two of the GWAS SNPs (chr22: 45,882,260 and chr18: 49,198,998) were also genotyped. The TaqMan data from the original GWAS dogs was merged with the GWAS dataset.

A genome-wide association analysis was repeated on the merged GWAS dataset using a mixed model approach (EMMAX). One of the ABCC4 coding SNPs (chr22: 45,893,198) was in complete LD (r² = 1.00) with the identified GWAS locus on chromosome 22, i.e., equally associated with the disease phenotype based on the p-value (p=1.24 × 10⁻⁶). This coding sequence variant (A>G) causes an amino acid substitution p.Met787Val in the encoded ABCC4 protein. The SIFT score for the amino acid change is 1.0, indicating that this is likely to be a well-tolerated change.
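
What complete LD means numerically can be illustrated with the standard definition of r² for two biallelic SNPs. The sketch assumes that phased haplotype frequencies are available, which is a simplification, since analysis software typically estimates LD from unphased genotypes. The function name and the example frequencies are illustrative only.

```python
def ld_r_squared(p_a: float, p_b: float, p_ab: float) -> float:
    """r^2 between two biallelic SNPs.

    p_a  : frequency of allele A at the first SNP
    p_b  : frequency of allele B at the second SNP
    p_ab : frequency of haplotypes carrying both A and B
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must be strictly between 0 and 1")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Complete LD: the two alleles always occur on the same haplotypes.
complete_ld = ld_r_squared(0.21, 0.21, 0.21)
# No LD: the haplotype frequency equals the product of allele frequencies.
no_ld = ld_r_squared(0.5, 0.5, 0.25)
print(complete_ld, no_ld)
```

When r² equals 1, the genotype at one SNP fully predicts the genotype at the other, which is why the coding SNP and the GWAS SNPs yield the same association p-value.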

When performing a basic association test using PLINK 1.07 for the six TaqMan SNPs for all 292 genotyped dogs, three dogs were removed for low genotyping rate (< 0.5). For the remaining dogs, complete LD (r² = 1.00) was seen between the two SNPs chr22: 45,882,260 and chr22: 45,893,198, indicating that the risk haplotype seen in the smaller GWAS dataset was still present in this larger population. For this basic association including all TaqMan genotyped individuals, the best association was to the chr18: 49,198,998 SNP with a p-value of 7.89 × 10⁻⁵, with the candidate SNP chr22: 45,893,198 being less associated (p=1.48 × 10⁻⁴) and with a mildly reduced OR of 2.6 (95% CI 1.6-4.5).

In total 120 SNPs were identified within the ~18 kb haplotype block defined by five SNPs in complete LD. Though the chr22: 45,893,198 variant is the only coding variant within this locus, there are other variants in the locus located in genetically conserved regions or in cis-regulatory elements. Fig. 4 visualizes the SNPs lifted over to the human genome (hg38) in the UCSC browser [2]. In total 85 of 120 SNPs could be lifted over to the human genome (hg38).

Allele frequencies of the chr22: 45,893,198 SNP in other breeds

Allele frequencies for the ABCC4 candidate SNP chr22: 45,893,198 were available in five other dog breeds and in a separate pool of American golden retrievers, from a Panel Of Normal (PON) dataset [4]. This dataset was collected as part of a cancer study unrelated to the current study and, hence, the dataset contained both male and female dogs and their disease history and neutering status was unknown. The allele frequency for the chr22: 45,893,198 SNP in the population is summarized in Table 3. In this population of different dog breeds, the allele frequency for the risk variant varied from 0.37 in golden retrievers to 0.98 in Rottweilers. Interestingly, the Rottweiler, which is one of the breeds with the highest risk of developing pyometra (adjusted risk 4.4), was almost fixed for the chr22: 45,893,198 risk variant with an allele frequency of 0.98.

Table 3: The presence of the missense SNP in other breeds and American golden retrievers

The allele frequency for the chr22: 45,893,198 coding variant was evaluated from existing data. The allele frequency was compared to the adjusted risk as published by Egenvall et al. 2001 [5].

Discussion

We performed a GWAS comparing golden retrievers affected by pyometra with healthy intact female dogs older than 7 years of age. We found a genome-wide significant association to a region on chromosome 22 localized in the ABCC4 gene. Sequencing data identified four non-synonymous SNPs in the ABCC4 gene in addition to the five non-coding variants equally associated in the GWAS. Genotyping of the coding SNPs in a larger cohort of golden retriever cases and controls showed that one SNP in particular, chr22: 45,893,198, was in complete LD with the top associated GWAS SNPs, suggesting a potential causal function relating to the risk for development of pyometra in this breed.

The ABCC4 protein, also known as multidrug resistant protein 4 (MRP4), is a member of the ATP-binding cassette transporter family, a family of proteins that are important for transportation of endogenous and exogenous molecules across cell membranes. The protein has central functions in the reproductive system, and has a role in transporting prostaglandins (PGs) in the endometrium. However, ABCC4 has not previously been associated with uterine inflammation in dogs.

Prostaglandins have many roles in reproduction. The function and life-span of the corpus luteum, and the physiology of parturition, are regulated by complex interactions in which PGs participate. PGs are also important inflammatory mediators and are produced and released from neutrophils, macrophages, lymphocytes, and platelets during inflammation. Importantly, uterine endometrial cells are capable of synthesizing and releasing PGs. In dogs as well as many other mammals, circulating levels of PGs increase in uterine inflammatory conditions. In the uterine tissue during canine pyometra, Prostaglandin-endoperoxide synthase 2 (PTGS2), a gene that is responsible for PG synthesis, is among the genes whose expression is most increased. Furthermore, Prostaglandin F2alpha (PGF2α) induces myometrial contractions, and is used therapeutically in medical treatment of pyometra in dogs. Taken together, the many roles of PGs suggest that altered prostaglandin transport could contribute to the development of pyometra.

The associated coding sequence variant at chr22: 45,893,198 causes an amino acid change, p.M787V. This amino acid is located in the cytoplasmic loop 3 (CL3) of the ABCC4 protein, a region of the protein that is phylogenetically conserved [6]. Across mammalian species the most common amino acid residue at the 787 position is valine (V), and out of 61 mammals only the dog, pig, brush tailed rat, naked mole rat and chinchilla have methionine (M) in this position [2, 7]. In this study the risk allele results in keeping the less common amino acid methionine, whilst the protective allele results in a change into the more common valine. A recent paper described the importance of the cytoplasmic loop 3 and how a single amino acid substitution in ABCC4, p.T796M, could reduce the expression and stability of the human ABCC4 protein. In that study the p.T796M ABCC4 substitution was predicted to be benign and well tolerated by SIFT and PolyPhen. However, this was found to be unlikely by the authors due to the larger size of methionine [6]. In the present study, the canine p.M787V ABCC4 substitution was also predicted to be benign and well tolerated by PolyPhen and SIFT. Nevertheless, it is possible that it could influence the ABCC4 cellular transportation capacity [6].

The ABCC4 risk variant, rather than the non-risk allele, is present in the canine reference genome sequence. The CanFam 3.1 assembly is based on a single individual (female boxer) and it is plausible that this individual carries the risk allele, as the disease is common in boxers (diagnosed in 28% by ten years of age, age adjusted relative risk 2.7). Though we predict that the coding sequence variant chr22: 45,893,198 can influence the risk of developing pyometra, it is also possible that this SNP could increase the reproductive potential of the individuals, such as by contributing to increased fertility, litter size, conception, or fetal growth, and therefore it could have been under selection. The susceptibility to disease is likely not selected against, as the majority of pyometra cases develop after the main reproductive period.

The discrepancy between the association results in the smaller GWAS population and the larger TaqMan genotyping population can possibly be explained by the larger dataset including dogs that were excluded from the original GWAS dataset due to first-degree relatedness. Hence, the association analysis containing more dogs included many highly related individuals, which could falsely skew the data away from the original association. Due to the few data points in the dataset, it is not possible to correct this association using a mixed model approach.

Interestingly, the chr22: 45,893,198 SNP showed variation in allele frequency across breeds, with the Rottweiler being almost completely fixed for the risk allele. The Rottweiler is ranked as one of the three dog breeds with the highest risk of developing pyometra (61% are diagnosed with the illness by 10 years of age). This suggests that this variant could contribute to the disease risk in other breeds.

The ABCC4 transporter is known for its involvement in transporting drugs and other molecules across the cell membrane and altered function can lead to increased cellular toxicity in relation to exposure of various drugs. It is unknown whether the ABCC4 amino acid changes identified herein can affect drug transport, but differences in NSAID transportation have been linked with altered ABCC4 function [8].

In total 120 SNPs were identified in the associated GWAS locus. When lifted over to the human genome hg38, 85 SNPs could be transferred. Some of these SNPs were found to be located in areas with H3K27Ac histone enrichment and areas with a high conservation score based on data from Multiz Alignments of 100 Vertebrates (Fig. 4).

Pyometra is likely caused by multifactorial genetic and environmental factors. Hence, we investigated the potential of associated loci independent of the chromosome 22 locus and risk factors associated with age of onset. Although none of the other loci reached Bonferroni-corrected significance, we detected suggestive associations to SNPs located within introns of several genes (TESMIN, SORBS1, NAF1 and EMCN). Based on the known function of the proteins encoded by these genes, potential implications for the development of pyometra could be considered.

In conclusion, this GWAS identified an association to a locus in the ABCC4 gene and subsequently identified a non-synonymous SNP in complete LD with the most highly associated GWAS SNP. The ABCC4 protein is a known prostaglandin transporter.

Materials and methods

Ethical approval

All samples were collected with the owners’ written consent and in agreement with Ethical guidelines (Ethical approvals Dnr C12/15, D318/9, C139/9).

Sample collection

Blood samples were collected from female golden retrievers affected by or with a history of pyometra before 8 years of age. The dogs were identified through the diagnostic code for pyometra in the Agria Animal Insurance Inc. database, which has been validated for research purposes [9]. A questionnaire was filled in by each dog owner directly prior to the time of blood sample collection. Details of the dog’s Swedish Kennel Club (SKK) registry number, name, age, birth date, previous whelping, onset of signs of pyometra, whether surgical treatment (ovariohysterectomy) had been performed or not, and past or present other diseases or medications were noted in the questionnaire. In parallel, blood samples were similarly collected from control dogs (>7 years old intact female golden retrievers), identified via the SKK registry. Questionnaire and health information was also collected from the control dogs. Information regarding pedigree and number of siblings was extracted from the SKK database, based on the individual dog’s registration number. Siblings were excluded so that only one individual from each set of siblings was included in the case and control groups, respectively.

The health status of the control dogs was updated yearly by telephone contact, to assure that none of the controls had developed pyometra at the time of data analysis.

DNA extraction

Genomic DNA was extracted from ethylenediaminetetraacetic acid (EDTA) blood by a robotic method using the QIASymphony robot (Qiagen, Hilden, Germany) together with the QIAamp DNA Blood Midi Kit (Qiagen).

Genome-wide genotyping

DNA from each dog was genotyped using the Illumina 170K CanineHD BeadChip (Illumina, San Diego, CA, USA). A total of 98 cases and 96 control samples were genotyped on the NeoGen Genomics genotyping platform (NeoGen Genomics, Lincoln, NE, USA).

GWAS analysis

Data quality control and filtering were performed using the software PLINK 1.07 [1]. SNPs were removed if they had a minor allele frequency (MAF) of less than 5% or if they had failed to be genotyped in more than 5% of samples (--maf 0.05, --geno 0.05). Individuals with a genotyping rate of less than 95% were removed (--mind 0.05). To evaluate and visualize the population structure, a multidimensional scaling (MDS) plot was generated on the filtered dataset using PLINK 1.07 [1]. The first four coordinates (C1-C4) were calculated and the first two were plotted against each other to illustrate population structure. An additional control for relatedness within the cases or controls was performed using PLINK v 1.9 by calculating PI_HAT on the final dataset after LD pruning [1].
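The logic of the three filtering thresholds can be illustrated with a minimal sketch. It assumes a small in-memory genotype matrix coded as 0, 1 or 2 copies of one allele, with None for a missing call. The function names, the data layout and the order of operations (samples first, then SNPs) are assumptions made for illustration only; this is not PLINK code and does not reproduce the PLINK file formats.

```python
from typing import List, Optional, Tuple

Genotype = Optional[int]  # 0, 1 or 2 copies of one allele; None = missing call


def missing_rate(calls: List[Genotype]) -> float:
    """Fraction of missing calls in a list of genotypes."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency computed from the non-missing genotypes."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1.0 - freq)


def qc_filter(
    matrix: List[List[Genotype]],
    maf: float = 0.05,   # mirrors --maf 0.05
    geno: float = 0.05,  # mirrors --geno 0.05
    mind: float = 0.05,  # mirrors --mind 0.05
) -> Tuple[List[int], List[int]]:
    """Return (kept sample indices, kept SNP indices).

    matrix[i][j] is the genotype of sample i at SNP j.
    """
    # Remove samples whose missing-call rate exceeds the threshold.
    kept_samples = [
        i for i, row in enumerate(matrix) if missing_rate(row) <= mind
    ]
    if not kept_samples:
        return [], []
    kept_snps = []
    for j in range(len(matrix[0])):
        column = [matrix[i][j] for i in kept_samples]
        # Keep SNPs that are well genotyped and sufficiently polymorphic.
        if missing_rate(column) <= geno and minor_allele_frequency(column) >= maf:
            kept_snps.append(j)
    return kept_samples, kept_snps
```

In this sketch a sample with any missing call in a three-SNP toy matrix is dropped, and a SNP that is monomorphic among the remaining samples is removed by the MAF threshold.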

To account for cryptic relatedness within the population and population structure, the GWAS analysis was performed using the efficient mixed model association expedited software (EMMAX) [3]. The software was used with an identity by state (IBS) matrix. QQ-plots and Manhattan plots were generated in R using the software package Lattice [15]. The significance threshold was an LD-corrected Bonferroni threshold based on 11,897 SNPs that were not in complete or near-complete LD, as calculated by PLINK --indep 100 10 10, as previously described [10].
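As a worked illustration of the LD-corrected Bonferroni threshold, the sketch below divides a nominal significance level by the number of approximately independent SNPs. The count of 11,897 SNPs is taken from the text; the nominal level of 0.05 is an assumption, since the text does not state it, and the function name is hypothetical.

```python
import math


def bonferroni_threshold(alpha: float, n_independent_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_independent_tests <= 0:
        raise ValueError("number of tests must be positive")
    return alpha / n_independent_tests


# alpha = 0.05 is an assumed nominal level; 11897 is the SNP count in the text.
threshold = bonferroni_threshold(0.05, 11897)
# Same threshold on the -log10(p) scale used on a Manhattan plot axis.
minus_log10_threshold = -math.log10(threshold)

print(f"p-value threshold: {threshold:.3g}")
print(f"-log10 threshold:  {minus_log10_threshold:.2f}")
```

Under that assumption the threshold is approximately 4.2 x 10^-6, which corresponds to about 5.38 on the -log10(p) scale.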

A conditional association analysis to look for risk factors independent of the top associated loci was performed using the genotype for the associated SNP (chr22: 45,875,420) as a covariate in the GWAS analysis using EMMAX.

To search for risk factors associated with age of onset, age was calculated in days for all the cases. A GWAS analysis was then performed including only the cases and using the age of onset in days as a continuous variable using the EMMAX software.
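The age-of-onset phenotype can be sketched as a simple date difference, assuming the birth date and the date of onset of signs are both available from the questionnaire data. The function name and the example dates are hypothetical.

```python
from datetime import date


def age_at_onset_days(birth_date: date, onset_date: date) -> int:
    """Age at onset of pyometra, in days, for use as a continuous phenotype."""
    if onset_date < birth_date:
        raise ValueError("onset date precedes birth date")
    return (onset_date - birth_date).days


# Hypothetical dog born on 1 March 2010 with onset on 1 March 2016.
example_age = age_at_onset_days(date(2010, 3, 1), date(2016, 3, 1))
```

Working in days avoids rounding ages to whole years, which would discard information in a case-only analysis with a modest number of dogs.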

Whole genome sequencing

Whole genome sequencing data was generated from one pool of 10 golden retrievers, all heterozygous for the risk haplotype, and from one dog homozygous for the risk haplotype. In addition, individually barcoded sequencing data was generated from 10 individuals homozygous for the non-risk haplotype. Samples were sequenced as paired-end libraries with 100 bp read length on the Illumina HiSeq2000 system. Data was aligned to the CanFam3.1 reference genome sequence (https://www.ncbi.nlm.nih.gov/assembly/GCF_000002285.5) according to the Genome Analysis Toolkit (GATK) best practices workflow [11]. Variants in the regions of interest were annotated using the Ensembl Variant Effect Predictor [12], including annotation with SIFT [13] and PolyPhen-2 [14].

TaqMan genotyping of SNPs in candidate region

TaqMan custom arrays (Table 4) were designed for genotyping of four non-synonymous coding SNPs in the ABCC4 gene using the Custom TaqMan® Assay Design Tool (ThermoFisher Scientific, Waltham, MA). Assays were also designed for two of the top GWAS SNPs as controls and to evaluate genotypes for additional samples, which were not genotyped on the 170K CanineHD BeadChip.

Table 4: Primers and reporter used for the TaqMan genotyping

Reporter 1 dye was VIC for forward primers and FAM for reverse primers; the reporter quencher was NFQ.

Genotyping was performed in 292 golden retrievers including 134 cases and 158 controls. This included 97 cases and 96 controls genotyped for the GWAS analysis and additional novel cases and controls. Unfortunately, one of the cases from the GWAS study could not be genotyped for the additional candidate SNPs due to lack of DNA. Manual imputation of the ABCC4 coding SNPs with an r² above 0.99 to the original GWAS locus was performed for this individual. The basic association on the TaqMan genotyping data was carried out using PLINK 1.07 [1].
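A basic case/control association of the kind mentioned above can be sketched as an allelic chi-square test with one degree of freedom on a 2 x 2 table of allele counts. This assumes that "basic association" refers to such an allelic test; the function name is hypothetical, the allele counts in the usage lines are invented for illustration and are not study data, and the sketch does not reproduce PLINK output.

```python
import math
from typing import Tuple


def allelic_chi_square(
    case_a1: int, case_a2: int, control_a1: int, control_a2: int
) -> Tuple[float, float, float]:
    """Return (chi-square statistic, p-value, odds ratio) for allele counts."""
    table = [[case_a1, case_a2], [control_a1, control_a2]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][k] + table[1][k] for k in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")

    chi2 = 0.0
    for i in range(2):
        for k in range(2):
            expected = row_totals[i] * col_totals[k] / total
            chi2 += (table[i][k] - expected) ** 2 / expected

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    denominator = case_a2 * control_a1
    odds_ratio = (case_a1 * control_a2) / denominator if denominator else math.inf
    return chi2, p_value, odds_ratio


# Invented allele counts, for illustration only.
chi2, p_value, odds_ratio = allelic_chi_square(30, 10, 20, 20)
```

Because this test treats alleles as independent observations, it does not account for relatedness between dogs, which is the limitation discussed for the larger TaqMan dataset.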

Allele frequencies in Panel of Normal

Genotypes for the four coding SNPs in the ABCC4 gene were extracted from a Panel of Normals (PON) database generated for a separate study, which is displayed as a track on the UCSC genome browser [4].

The embodiments described above are to be understood as a few illustrative examples of the present invention. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the scope of the present invention. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.

REFERENCES

[1] Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559-575, doi:10.1086/519795 (2007).

[2] http://genome.ucsc.edu.

[3] Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet 42, 348-354, doi:10.1038/ng.548 (2010).

[4] Elvers, I. et al. Exome sequencing of lymphomas from three dog breeds reveals somatic mutation patterns reflecting genetic background. Genome Res 25, 1634-1645, doi:10.1101/gr.194449.115 (2015).

[5] Egenvall, A. et al. Breed risk of pyometra in insured dogs in Sweden. J Vet Intern Med 15, 530-538 (2001).

[6] Cheepala, S. B. et al. Crucial role for phylogenetically conserved cytoplasmic loop 3 in ABCC4 protein expression. J Biol Chem 288, 22207-22218, doi:10.1074/jbc.M113.476218 (2013).

[7] Rosenbloom, K. R. et al. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res 43, D670-681, doi:10.1093/nar/gku1177 (2015).

[8] Reid, G. et al. The human multidrug resistance protein MRP4 functions as a prostaglandin efflux transporter and is inhibited by nonsteroidal antiinflammatory drugs. Proc Natl Acad Sci U S A 100, 9244-9249, doi:10.1073/pnas.1033060100 (2003).

[9] Egenvall, A., Bonnett, B. N., Olson, P. & Hedhammar, A. Validation of computerized Swedish dog and cat insurance data against veterinary practice records. Prev Vet Med 36, 51-65 (1998).

[10] Hayward, J. J. et al. Complex disease and phenotype mapping in the domestic dog. Nat Commun 7, 10460, doi:10.1038/ncomms10460 (2016).

[11] McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297-1303, doi:10.1101/gr.107524.110 (2010).

[12] McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol 17, 122, doi:10.1186/s13059-016-0974-4 (2016).

[13] Sim, N. L. et al. SIFT web server: predicting effects of amino acid substitutions on proteins. Nucleic Acids Res 40, W452-457, doi:10.1093/nar/gks539 (2012).

[14] Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat Methods 7, 248-249, doi:10.1038/nmeth0410-248 (2010).

[15] Sarkar, D. in Use R! (eds R. Gentleman, K. Hornik, & G. Parmigiani) (Springer, 2008).

[16] Sambrook, J. & Russell, D. W. Molecular Cloning: A Laboratory Manual, third edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1.31-1.38 (2001).

[17] Sharma, R. C. et al. A rapid procedure for isolation of RNA-free genomic DNA from mammalian cells. BioTechniques 14, 176-178 (1993).