Title:
MILK BIOACTIVE
Document Type and Number:
WIPO Patent Application WO/2019/075525
Kind Code:
A1
Abstract:
The present invention relates to methods of selecting, identifying or breeding mammals, particularly bovines, to improve milk composition or milk production traits. In particular, the methods involve determining the genotypic state of said mammals at SNPs associated with oligosaccharide biosynthesis. The present invention also relates to methods of identifying mammals which possess a genotype indicative of an altered milk production trait. The invention also encompasses mammals and milk products produced using these methods.

Inventors:
LIU ZHIQIAN (AU)
ROCHFORT SIMONE JANE (AU)
PRYCE JENNIE ELIZABETH (AU)
COCKS BENJAMIN (AU)
WANG TINGTING (AU)
ZAWADZKI JODY (AU)
SPANGENBERG GERMAN CARLOS (AU)
Application Number:
PCT/AU2018/051138
Publication Date:
April 25, 2019
Filing Date:
October 19, 2018
Assignee:
AGRICULTURE VICTORIA SERV PTY (AU)
International Classes:
C12Q1/6827; C12Q1/6888
Other References:
ZE, X. ET AL.: "Personalized nutrition kit development to determine secretotype of Chinese mother by detecting a specific single nucleotide polymorphism (rs1047781)", JOURNAL OF PEDIATRIC GASTROENTEROLOGY AND NUTRITION, vol. 64, April 2017 (2017-04-01)
BUITENHUIS, A.J. ET AL.: "Estimation of genetic parameters and detection of quantitative trait loci for metabolites in Danish Holstein milk", JOURNAL OF DAIRY SCIENCE, vol. 96, 2013, pages 3285 - 3295, XP055594984
SUNDEKILDE, U.K. ET AL.: "Natural Variability in Bovine Milk Oligosaccharides from Danish Jersey and Holstein-Friesian Breeds", JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY, vol. 60, 2012, pages 6188 - 6196, XP055243299, DOI: doi:10.1021/jf300015j
MEYRAND, M. ET AL.: "Comparison of milk oligosaccharides between goats with and without the genetic ability to synthesize alphas1-casein", SMALL RUMINANT RESEARCH, vol. 113, 2013, pages 411 - 420, XP028573167, DOI: doi:10.1016/j.smallrumres.2013.03.014
NEWBURG, D.S. ET AL.: "Innate protection conferred by fucosylated oligosaccharides of human milk against diarrhea in breastfed infants", GLYCOBIOLOGY, vol. 14, 2004, pages 253 - 263, XP002471649, DOI: doi:10.1093/glycob/cwh020
LARSEN, L.B. ET AL.: "Enestående oligosakkarider: Oligosakkaridprofiler af komælk for optimering af sundhedsgavnlige egenskaber", SLUTRAPPORT NR. 2017-148, November 2017 (2017-11-01), XP055594986, Retrieved from the Internet [retrieved on 20181123]
WICKRAMASINGHE, S. ET AL.: "Transcriptome Profiling of Bovine Milk Oligosaccharide Metabolism Genes Using RNA-Sequencing", PLOS ONE, vol. 6, no. 4, 2011, pages 1 - 10, XP055594989
LIU, Z ET AL.: "Bovine Milk Oligosaccharide Contents Show Remarkable Seasonal Variation and Intercow Variation", JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY, vol. 65, January 2017 (2017-01-01), pages 1307 - 1313, XP055594991
Attorney, Agent or Firm:
JONES TULLOCH (AU)
Claims:
THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:

1. A method of selecting a mammal which is capable of producing milk that has an improved composition, said method including the step of determining the genotypic state of said mammal at one or more SNPs associated with oligosaccharide biosynthesis.

2. A method according to claim 1, wherein said method includes the further step of subjecting said mammal to marker associated selection.

3. The method according to claim 1 or 2, wherein the genotypic state of said SNP or SNPs is determined directly.

4. The method according to claim 1 or 2, wherein the genotypic state of said SNP or SNPs is determined using a SNP array or whole genome sequence data.

5. A method according to claims 1 to 4, wherein said mammal is a bovine.

6. A method according to any one of claims 1 to 5, wherein the milk produced has an improved oligosaccharide composition.

7. A method according to any one of claims 1 to 6, wherein the oligosaccharide is selected from the group consisting of GNL, OS-C, 3'SL, OS-A and OS-B.

8. A method according to any one of claims 1 to 7, wherein the genotypic state is determined by genome-wide association analysis and/or gene expression level.

9. A method according to any one of claims 1 to 8, wherein the SNP is selected from the group consisting of SNPs listed in Tables 6-8.

10. A method of identifying a mammal which possesses a genotype indicative of an altered milk production trait, said method including:

providing a nucleic acid sample from said mammal;

determining the presence of one or more SNPs associated with oligosaccharide biosynthesis in said nucleic acid sample;

wherein the presence of said SNP is associated with said altered milk production trait.

11. A method according to claim 10, wherein the method includes the further step of subjecting the mammal to marker associated selection.

12. A method according to claim 9 or 10, wherein said altered milk production trait is oligosaccharide composition.

13. A method according to any one of claims 10 to 12, wherein the mammal is a bovine.

14. A method according to any one of claims 10 to 13, wherein the oligosaccharide is selected from the group consisting of GNL, OS-C, 3'SL, OS-A and OS-B.

15. A method according to any one of claims 10 to 14, wherein the SNP is selected from the group consisting of SNPs listed in Tables 6-8.

16. A method of selectively breeding a mammal for altered milk composition, said method including using marker assisted selection to breed said mammal to carry one or more SNPs associated with oligosaccharide biosynthesis.

17. A method according to claim 16, wherein the mammal is a bovine.

18. A method according to claim 16 or 17, wherein the oligosaccharide is selected from the group consisting of GNL, OS-C, 3'SL, OS-A and OS-B.

19. A method according to any one of claims 16 to 18, wherein the SNP is selected from the group consisting of SNPs listed in Tables 6-8.

20. A method according to any one of claims 16 to 19, wherein said mammal carries two copies of one or more of the SNPs.

21. A mammal selected, identified or bred by a method according to any one of claims 1 to 20.

22. A mammal according to claim 21, wherein the mammal is a bovine.

23. A milk product from a mammal selected, identified or bred by a method according to any one of claims 1 to 20.

Description:
MILK BIOACTIVE

Field of the Invention

The present invention provides methods of selecting, identifying or breeding mammals, particularly bovines, to improve milk composition or milk production traits. In particular, the methods involve determining or breeding for SNPs associated with oligosaccharide biosynthesis. The invention also encompasses mammals and milk products produced using these methods.

Background of the Invention

The genetic basis of bovine milk production is of immense significance to the dairy industry. An ability to modulate milk volumes and content has the potential to alter farming practices and to produce products which are tailored to meet a range of requirements. In particular, a method of genetically evaluating bovines to select those which express desirable traits, such as increased milk production and improved milk composition, would be desirable.

Milk is a complex fluid comprising proteins, lipids, lactose and oligosaccharides, and vitamins and minerals. The biochemical components in milk determine its nutritional value, physical characteristics, e.g. milk fat globule size, and product characteristics, e.g. curd stability, renneting time. Such functional properties can be predicted from the biochemical profile of the milk. Oligosaccharides (OS) are a class of carbohydrates containing 3-15 monomer units. The most frequent monomers are glucose, fructose, galactose, and sialic acid. They can be concentrated from whey.

The role of OS in promoting human health is widely known. Acting as prebiotics, OS stimulate the growth of beneficial bifidobacteria in the colon. OS can also prevent infection by inhibiting the adhesion of pathogens to the intestinal mucosal surface. Furthermore, sialic acid, a component of milk OS, is essential for brain development and cognitive function of neonates. Indeed, OS are one of the major components of human milk. Bovine milk is a staple drink and also the most common ingredient in infant formulas. Bovine milk OS composition and content have been the subject of numerous studies in the past decade. Over 40 OS have been identified thus far in bovine milk, but their overall concentrations are much lower than those of human milk OS (see Table 1). As a result, fructooligosaccharides (FOS) extracted from plants and enzymatically synthesised galactooligosaccharides (GOS) are frequently used in infant formulas to mimic the functions of human milk OS.

Table 1: Milk oligosaccharides

Bovine milk OS are structurally much closer to human milk OS than FOS and GOS are, so bovine milk OS would be a better replacement for human milk OS if their concentrations could be substantially increased. In addition, as bovine milk is consumed by the majority of the population in many countries, increasing its OS content could also increase the uptake of such dietary fibres by a large number of people worldwide. Many of the health benefits that milk OS provide for infants are expected to be equally applicable to humans of all ages.

Bovine milk OS concentration has been investigated in relation to cow breed, animal diets, stage of lactation and seasons. Jersey milk contains higher levels of sialylated and also complex neutral fucosylated OS, while Holstein milk contains higher levels of the less complex neutral OS. However, the overall inter-breed difference in OS content is rather modest. Information on OS content in relation to animal diets is scarce and very limited data found in the literature appear to suggest that in contrast with milk fat and protein content, milk OS level is not influenced by animal diets. So increasing milk OS through diet manipulation is unlikely to be a feasible option. As for seasonal variation of milk OS, a recent systematic survey conducted in Victoria, Australia revealed a remarkable increase of most OS over the milking season, with the highest concentrations obtained in autumn. The most extensively investigated factor in relation to milk OS content is probably the stage of lactation. A large number of studies conducted in different countries showed that OS content is much higher in colostrum and declines gradually with the progression of lactation. Significant cow to cow variation in colostrum and milk OS content has been observed in a number of studies. As external factors such as diets and stage of lactation were the same for all cows in these experiments, genetic variation was proposed as a possible cause for the difference in OS production across individual cows, but information on the inheritance of this trait is still lacking.

It is an object of the present invention to overcome, or at least alleviate, one or more of the difficulties or deficiencies associated with the prior art.

Summary of the Invention

In one aspect, the present invention provides a method of selecting a mammal which is capable of producing milk that has an improved composition, said method including the step of determining the genotypic state of said mammal at one or more SNPs associated with oligosaccharide biosynthesis.

The mammal may be of any suitable type, preferably a mammal that is used for commercial milk production, such as sheep, goats or cattle. In a preferred embodiment, the mammal is bovine.

By 'improved milk composition' as used herein is meant that the milk composition is altered to change the amount of one or more components, preferably to increase the amount of one or more components. However, in certain circumstances it may also be desirable to decrease the amount of one or more components in the milk. In a preferred embodiment, the components are oligosaccharides, in which case the milk has an 'improved oligosaccharide composition'. In this way the milk may be 'humanised' to make it closer in composition to human milk. This may be desirable for producing infant formula. Alternatively, the composition of the milk may be tailored for producing food supplements, for example supplement powders, for example for the unwell or elderly.

In a particularly preferred embodiment, the oligosaccharide is selected from the group consisting of triose, 3'-sialyllactose (3'-SL), 6'-sialyllactose (6'-SL), 6'-sialyl-N-acetyllactosamine (6'-SLN), Disialyllactose (DSL), N-acetylgalactosaminyllactose (GNL), 3'-sialylgalactosyllactose (OS-A), lacto-N-pentaose (OS-B), lacto-N-tetraose (OS-C), Di-N-acetylhexosaminyltriose (OS-D), 3'-glycolylneuraminyllactose (OS-E), and 3'-sialyl-N-acetylglucosaminyllactose (OS-F). In a particularly preferred embodiment, the oligosaccharide is selected from the group consisting of GNL, OS-C, 3'SL, OS-A and OS-B.

By 'one or more SNPs associated with oligosaccharide biosynthesis' is meant variations in single nucleotides in the genome of the mammal that are associated with oligosaccharide biosynthetic genes. Preferably, the SNPs are located in the coding, non-coding or regulatory regions of such oligosaccharide biosynthetic genes. Preferably, the genes encode enzymes that are required for biosynthesis of an oligosaccharide selected from the group consisting of triose, 3'-SL, 6'-SL, 6'-SLN, DSL, GNL, OS-A, OS-B, OS-C, OS-D, OS-E, and OS-F. In a particularly preferred embodiment, the genes encode enzymes that are required for biosynthesis of an oligosaccharide selected from the group consisting of GNL, OS-C, 3'SL, OS-A and OS-B.

In a particularly preferred embodiment, the one or more SNPs are selected from the group consisting of SNPs listed in Tables 6-8.

The genotypic state of the mammal may be determined by any suitable method. Such methods may involve sequence analysis of the genome, or a section of the genome of the mammal, for example a chromosome showing strong association with desired OS traits. In a preferred embodiment, when the mammal is a bovine, the chromosome may be chromosome 11. In this way it is possible to determine the milk phenotypic state of individuals based on their genotypes.

In a particularly preferred embodiment, the method may include the step of subjecting the sample to a genome-wide association analysis (GWAS). This may involve genotyping the mammal against a panel of SNPs to generate a low density genotype. Selected mammals may then be genotyped, or their genotypes imputed to a larger SNP array or to whole genome sequence, preferably using a reference population genotyped at higher density or with whole genome sequence data.

In a preferred embodiment, this step may be incorporated into the routine genetic evaluation of such mammals.

Determining the genotypic state of the mammal may alternatively or in addition involve an assessment of gene expression level. In a preferred embodiment, RNA transcript levels of candidate genes may be determined, for example by RNAseq analysis. While applicant does not wish to be restricted by theory, SNPs of interest may be highly associated with transcript levels. RNA transcript levels may be determined in any suitable sample from the mammal which contains nucleic acids, for example a blood sample.

In a further aspect, the present invention provides a method of identifying a mammal which possesses a genotype indicative of an altered milk production trait, said method including:

providing a nucleic acid sample from said mammal; and

determining the presence of one or more SNPs associated with oligosaccharide biosynthesis in said nucleic acid sample;

wherein the presence of said SNP is associated with said altered milk production trait.

The mammal may be of any suitable type, preferably a mammal that is used for commercial milk production such as sheep, goat or cattle. In a more preferred embodiment, the mammal is bovine.

In a preferred embodiment, the altered milk production trait may be oligosaccharide composition of the milk. Thus, the milk composition may be altered to change the amount of one or more components, preferably to increase the amount of one or more components. However, in certain circumstances it may also be desirable to decrease the amount of one or more components in the milk.

In a preferred embodiment, the components are oligosaccharides. In this way, bovines may be identified that produce milk that is closer in composition to human milk. This may be desirable for producing infant formula. Alternatively, the composition of the milk may be tailored for producing food supplements, for example supplement powders, for example for the unwell or elderly.

The nucleic acid sample may be of any suitable type. In a preferred embodiment, the nucleic acid sample may be a blood sample.

By 'one or more SNPs associated with oligosaccharide biosynthesis' is meant variations in single nucleotides in the genome of the mammal that are associated with oligosaccharide biosynthetic genes. Preferably, the SNPs are located in the coding, non-coding or regulatory regions of such oligosaccharide biosynthetic genes. Preferably, the genes encode enzymes that are required for biosynthesis of an oligosaccharide selected from the group consisting of triose, 3'-SL, 6'-SL, 6'-SLN, DSL, GNL, OS-A, OS-B, OS-C, OS-D, OS-E, and OS-F. In a particularly preferred embodiment, the genes encode enzymes that are required for biosynthesis of an oligosaccharide selected from the group consisting of GNL, OS-C, 3'SL, OS-A and OS-B.

In a particularly preferred embodiment, the one or more SNPs are selected from the group consisting of SNPs listed in Tables 6-8.

The presence of the SNPs in the nucleic acid sample may be determined by any suitable method. Such methods may involve sequence analysis of the genome, or a section of the genome of the mammal, for example a chromosome showing strong association with desired OS traits. In a preferred embodiment, when the mammal is a bovine, the chromosome may be chromosome 11.

In a particularly preferred embodiment, the method may include the step of subjecting the sample to a GWAS. This may involve genotyping the mammal against a panel of SNPs to generate a low density genotype. Selected mammals may then be genotyped against a larger reference population and/or a SNP array. In a preferred embodiment, this step may be incorporated into the routine genetic evaluation of such mammals.

Determining the presence of the SNPs may alternatively or in addition involve an assessment of gene expression level.

In a preferred embodiment, RNA transcript levels of candidate genes may be determined, for example by RNAseq analysis. While applicant does not wish to be restricted by theory, SNPs of interest may be highly associated with transcript levels.

RNA transcript levels may be determined in any suitable nucleic acid sample from the mammal, for example a blood sample.

In a further aspect of the present invention, there is provided a method of selectively breeding a mammal for altered milk composition, said method including using marker assisted selection to breed said mammal to carry one or more SNPs associated with oligosaccharide biosynthesis.

The mammal may be of any suitable type, preferably a mammal that is used for commercial milk production such as sheep, goat or cattle. In a more preferred embodiment, the mammal is bovine.

By 'one or more SNPs associated with oligosaccharide biosynthesis' is meant variations in single nucleotides in the genome of the mammal that are associated with oligosaccharide biosynthetic genes. Preferably, the SNPs are located in the coding, non-coding or regulatory regions of such oligosaccharide biosynthetic genes. Preferably, the genes encode enzymes that are required for biosynthesis of an oligosaccharide selected from the group consisting of triose, 3'-SL, 6'-SL, 6'-SLN, DSL, GNL, OS-A, OS-B, OS-C, OS-D, OS-E, and OS-F. In a particularly preferred embodiment, the genes encode enzymes that are required for biosynthesis of an oligosaccharide selected from the group consisting of GNL, OS-C, 3'SL, OS-A and OS-B.

In a particularly preferred embodiment, the one or more SNPs are selected from the group consisting of SNPs listed in Tables 6-8.

The presence of the SNPs may be determined by analysing a nucleic acid sample from the mammal. The presence of the SNPs may be determined by any suitable method. Such methods may involve sequence analysis of the genome, or a section of the genome of the mammal, for example a chromosome showing strong association with desired OS traits. In a preferred embodiment, when the mammal is a bovine, the chromosome may be chromosome 11.

In a preferred embodiment, the mammal may be bred to carry two copies of one or more of the SNPs.

In a further aspect of the present invention, there is provided a mammal selected, identified or bred by a method as hereinbefore described.

The mammal may be of any suitable type, preferably a mammal that is used for commercial milk production such as sheep, goat or cattle. In a more preferred embodiment, the mammal is bovine.

In a further aspect of the present invention, there is provided a milk product from a mammal selected, identified or bred by a method as hereinbefore described. In a preferred embodiment the milk product may be a bovine milk product that is closer in composition to human milk. In another preferred embodiment, the milk product may be an infant formula. In another preferred embodiment, the milk product may be a food supplement, for example a supplement powder, for example for the unwell or elderly.

The present invention will now be more fully described with reference to the accompanying Examples and drawings. It should be understood, however, that the description following is illustrative only and should not be taken in any way as a restriction on the generality of the invention described above.

Brief Description of the Drawings/Figures

Figure 1. PCA analysis on 360 animals across three years (Y1, Y2, and Y3) using the 632,003 SNP panel.

Figure 2. Quantile-quantile (QQ) plots of GWAS results for the traits 3'-SL, GNL, OS-A, OS-B, and OS-C, calculated by plotting the observed -log[p-values] (curve P) and the expected -log[p-values] (line D; null hypothesis).

Figure 3. Manhattan plots of -log10 p-values of the traits 3'-SL, GNL, OS-A, OS-B, and OS-C. The stars (indicated by arrows) indicate strong QTL signals (p-value < 0.00001).

Figure 4. GWAS results with sequence variants showing candidate gene regions for GNL (a), OS-C (b), 3'-SL (c), OS-A (d) and OS-B (e, f and g). GNL and OS-C share a major QTL effect around the light grey highlighted region which also overlaps with the most significant eQTL variants affecting ABO gene expression. In each plot the variant with the top -log10 p-value is shown by a diamond (with bp position label). The strength of LD (r²) between this top variant and all others is colour coded.

Figure 5. LC-MS analysis of milk oligosaccharides.

Figure 6. Inter-cow variation in accumulation pattern of milk oligosaccharides.

Figure 7. Seasonal variation in accumulation pattern of milk oligosaccharides.

Figure 8. Structure of milk oligosaccharides.

Figure 9. Relative abundance of milk oligosaccharides.

Figure 10. Evidence of cow-specific accumulation of oligosaccharides.

Detailed Description of the Embodiments

Example 1: Genomic regions associated with bovine milk oligosaccharide content

Materials and methods

Cows, herd management and milk sample collection

All experimental cows were maintained in the research herd at the Department of Economic Development, Jobs, Transport and Resources' Ellinbank Centre in Victoria, Australia, and the experimentation was conducted in accordance with the Australian Code of Practice for the Care and Use of Animals for Scientific Purposes (Anonymous 2013). Cow diet varied through the milking season but the majority of the cows' nutrient intake was derived from grazed pasture, supplemented with bought-in feedstuff as required.

A total of 360 multiparous Holstein cows that calved in late winter/early spring were used in this study. The experiment was conducted over three years (2013, 2014 and 2015), with 120 cows participating each year. Milk samples were collected each year in three batches (40 animals per batch) over the period of mid-October to late-November. A total of 9 batches of samples (B1-B3 for year 2013, B4-B6 for year 2014 and B7-B9 for year 2015) were therefore collected in this study. On each sampling occasion, the total milk from the afternoon and morning milking was collected into test buckets, pooled for each cow and a subsample taken for analysis. Milk samples were transported to the laboratory on ice and kept at -80 °C before analysis.

Phenotyping

OS fraction was isolated from diluted raw milk using an ultra-filtration method and the filtrate used directly for LC-MS analysis. The detailed sample preparation procedure was as previously described (Liu et al. 2014). A total of 12 major OS present in mature milk was surveyed in this study; their structure and accurate mass are summarized in Table 2.

Table 2: Major OS species investigated in this study

Name (code)                                   m/z (calculated)
Triose                                        503.1612
3'-sialyllactose (3'-SL)                      632.2038
6'-sialyllactose (6'-SL)                      632.2038
6'-sialyl-N-acetyllactosamine (6'-SLN)        673.2304
Disialyllactose (DSL)                         923.2992
N-acetylgalactosaminyllactose (GNL)           544.1878
3'-sialylgalactosyllactose (OS-A)             794.2566
lacto-N-pentaose (OS-B)                       868.2934
lacto-N-tetraose (OS-C)                       706.2406
OS-D                                          909.3200
3'-glycolylneuraminyllactose (OS-E)           648.1987
OS-F                                          835.2832

Structures are based on Lee et al. (2016). Hex: Glc/Gal; HexNAc: N-acetylglucosamine (GlcNAc)/N-acetylgalactosamine (GalNAc); NeuAc: N-acetylneuraminic acid (sialic acid); NeuGc: N-glycolylneuraminic acid. Symbols: Glc (black circle), Gal (white circle), GlcNAc (black square), GalNAc (hatched square), HexNAc (white square), Neu5Ac (black diamond), and Neu5Gc (white diamond). Calculated m/z values are for deprotonated molecular ions (detected in negative mode).
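As an illustration of how the calculated m/z values in Table 2 relate to composition, the following minimal sketch sums monoisotopic atomic masses and subtracts one hydrogen. The elemental compositions used here (for example C18H32O16 for a trihexose) are assumptions of this sketch and are not stated in the text. Subtracting a hydrogen atom mass matches the tabulated values for these three species; a stricter treatment would subtract a proton mass (1.00727647 u), which shifts each value by about 0.0005.

```python
# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Assumed elemental compositions (hypothetical inputs, not given in the text).
COMPOSITION = {
    "Triose": {"C": 18, "H": 32, "O": 16},
    "3'-SL": {"C": 23, "H": 39, "N": 1, "O": 19},
    "GNL": {"C": 20, "H": 35, "N": 1, "O": 16},
}


def neutral_mass(formula):
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_deprotonated(formula):
    """m/z of the singly charged [M-H]- ion, taken as M minus one hydrogen atom."""
    return neutral_mass(formula) - MONOISOTOPIC["H"]


for name, formula in COMPOSITION.items():
    print(f"{name}: {mz_deprotonated(formula):.4f}")
```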

An Agilent 1290 UPLC system coupled to an LTQ-Orbitrap MS (Thermo Scientific) was used for OS quantification. Chromatographic separation of OS was achieved using a HILIC Kinetex column (150 x 4.6 mm, 2.6 μm, Phenomenex) maintained at 30 °C. The mobile phase was composed of 5 mM aqueous ammonium formate (A) and acetonitrile containing 0.1% formic acid (B). The flow rate was 0.8 mL/min and the elution started with 5% A for the first 3 min and then increased to 30% A from 3 to 17 min. The total run time was 26 min for each analysis. MS instrumental settings for OS analysis were as previously described (Liu et al. 2014). All OS were detected in negative ion mode as their deprotonated ions. Due to the lack of standards for most OS, relative quantification was carried out for all the major OS. Peak area (after normalisation by the internal standard) was used as a measure for the relative abundance of each OS across all samples.
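The relative quantification step can be sketched as a ratio of each OS peak area to the internal standard peak area in the same sample. The function name and the peak areas below are hypothetical and serve only to illustrate the arithmetic; the text does not specify how the normalisation was implemented.

```python
def relative_abundance(peak_areas, internal_standard_area):
    """Divide each OS peak area by the internal standard area of the same sample."""
    if internal_standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {name: area / internal_standard_area for name, area in peak_areas.items()}


# Hypothetical peak areas for one milk sample (arbitrary units).
sample = {"3'-SL": 4.0e6, "GNL": 1.0e6}
normalised = relative_abundance(sample, internal_standard_area=2.0e6)
print(normalised)
```

Because every OS in a sample is divided by the same internal standard area, the ratios can be compared across samples even though absolute concentrations are not known.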

Genotyping

The 360 cows were originally genotyped using a low density custom in-house panel of 8309 genome-wide SNPs, of which approximately 5300 were common to SNPs on the Illumina BovineSNP50 BeadChip (~50,000 SNP array: http://www.illumina.com/products/by-type/microarray-kits.html). The low density genotypes for these cows were then imputed by DataGene Ltd (Melbourne, Australia), as part of their routine genetic evaluations in Australia, to the standard Illumina BovineSNP50 BeadChip using a reference population of more than 50,000 animals. The imputed BovineSNP50 genotypes comprised 39,756 SNP that passed quality control. These BovineSNP50 BeadChip genotypes were then imputed to the high density BovineHD BeadChip (800,000 SNP array). The reference population used for this imputation totalled 2155 animals with real genotypes for 632,003 SNP on the BovineHD BeadChip that passed a range of quality control filters following Erbe et al. (2012). After the initial genome-wide association analysis, the same Holstein cows were imputed to whole genome sequence variants on chromosomes showing strong associations with some OS traits. For imputation to sequence, we used a reference set of 645 sequenced dairy bulls from Run 5 of the 1000 Bull Genomes Project (Daetwyler et al. 2014). These were mainly of the Holstein breed (450), with several minor breeds including Jersey, Scandinavian Red and Guernsey. Sequence variants were only imputed if there were 4 or more copies of the minor allele among the reference bulls.

The software used for each imputation step was FImpute with default parameters (Sargolzaei et al. 2014). To check the genomic similarities between the animals across the three years, a Principal Component Analysis (PCA) was conducted on the 360 animals with the 632,003 HD SNP, as shown in Figure 1. To improve the quality of the subsequent association studies, the outliers in Figure 1 were removed, leaving 332 animals. For the association analysis, HD SNP and sequence variants were only included if their minor allele frequency in the study cows was above 0.05.
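The two allele-count filters described above (at least 4 copies of the minor allele in the reference set, and minor allele frequency above 0.05 in the study cows) can be sketched as follows. This is a minimal illustration assuming genotypes are stored as a NumPy array coded 0/1/2; the function names are hypothetical and it is not the filtering code used by FImpute or GCTA.

```python
import numpy as np


def minor_allele_stats(genotypes):
    """Return (minor allele count, minor allele frequency) per SNP.

    genotypes: array of shape (n_animals, n_snps), coded as 0, 1 or 2
    copies of one allele.
    """
    g = np.asarray(genotypes, dtype=float)
    total_alleles = 2 * g.shape[0]
    allele_count = g.sum(axis=0)
    minor_count = np.minimum(allele_count, total_alleles - allele_count)
    return minor_count, minor_count / total_alleles


def keep_for_imputation(reference_genotypes, min_copies=4):
    """Variants with at least `min_copies` minor alleles in the reference set."""
    minor_count, _ = minor_allele_stats(reference_genotypes)
    return minor_count >= min_copies


def keep_for_association(study_genotypes, maf_threshold=0.05):
    """Variants whose minor allele frequency exceeds the threshold."""
    _, maf = minor_allele_stats(study_genotypes)
    return maf > maf_threshold


# Toy example: 4 animals, 3 SNPs.
toy = np.array([[0, 2, 1], [0, 2, 1], [1, 2, 0], [0, 2, 1]])
print(keep_for_association(toy))
```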

Genome-wide association analysis (GWAS)

The GWAS model assumes that $y$, the phenotypic records of $n$ individuals, is a linear function of fixed effects ($\beta$), the effect of each SNP ($\alpha_i$), and environmental errors ($e$):

$$y = X\beta + Z_i\alpha_i + u + e, \qquad (1)$$

where $\beta$ is the vector of fixed effects, here the batch number ranging from 1 to 9; $X$ is a design matrix relating phenotypes to their fixed effects; $u \sim N(0, G\sigma_g^2)$ is the polygenic effect, so that the phenotypic covariance is $V = G\sigma_g^2 + I\sigma_e^2$; $G = ZZ'/m$ is the animal relationship matrix; and $m$ is the number of SNPs. $Z_i$ is the genotype data for each SNP $i$, coded as 0, 1 and 2 (representing the genotypes aa, Aa and AA). The association analysis was conducted with the software GCTA (Yang et al. 2011).
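To make the relationship matrix concrete, the sketch below builds G = ZZ'/m from a 0/1/2 genotype matrix. Centring and scaling each SNP by its allele frequency before forming the product is an assumption of this sketch (the text only gives G = ZZ'/m), and the function name is hypothetical; it is not GCTA's own code.

```python
import numpy as np


def genomic_relationship_matrix(genotypes):
    """Build G = Z Z' / m from genotypes coded 0/1/2.

    Assumption: each SNP column is centred by 2p and scaled by
    sqrt(2p(1-p)), where p is the sample allele frequency.
    Monomorphic SNPs are dropped because they cannot be scaled.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0
    polymorphic = (p > 0) & (p < 1)
    p = p[polymorphic]
    z = (g[:, polymorphic] - 2 * p) / np.sqrt(2 * p * (1 - p))
    m = z.shape[1]
    return z @ z.T / m


# Toy data: 10 animals, 50 SNPs with random genotypes.
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(10, 50))
G = genomic_relationship_matrix(toy_genotypes)
print(G.shape)
```

In a mixed-model GWAS of this kind, G enters the covariance of the polygenic term, so that each SNP is tested while accounting for the relatedness among animals.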

In addition, to assess the precision of the GWAS, two statistical criteria, the False Discovery Rate (FDR) and the quantile-quantile (Q-Q) plot, were calculated based on the GWAS p-values. The Q-Q plot is a graphical representation of the degree to which the observed GWAS p-values for each SNP deviate from the values expected under the null hypothesis, based on a theoretical χ² distribution.
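The coordinates of a Q-Q plot can be sketched as below: the sorted observed p-values are paired with the p-values expected under the null hypothesis, both on the -log10 scale. Using the plotting position (i - 0.5)/n for the expected values is an assumption of this sketch; other conventions such as i/(n + 1) are also common.

```python
import numpy as np


def qq_points(p_values):
    """Return (expected, observed) -log10 p-values for a Q-Q plot."""
    observed_p = np.sort(np.asarray(p_values, dtype=float))
    n = observed_p.size
    # Under the null hypothesis p-values are uniform on (0, 1).
    expected_p = (np.arange(1, n + 1) - 0.5) / n
    return -np.log10(expected_p), -np.log10(observed_p)


# Toy input: p-values that follow the null distribution exactly.
null_p = (np.arange(1, 101) - 0.5) / 100
expected, observed = qq_points(null_p)
print(np.allclose(expected, observed))
```

Points lying on the diagonal indicate agreement with the null hypothesis, while points rising above it at the upper end suggest true associations or inflation of the test statistics.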

The FDR evaluates the rate of type I errors in null hypothesis testing. For the GWAS analysis, it estimates the proportion of significant SNPs (rejected null hypotheses) that are likely to be false (incorrect rejections) at a given threshold. In this research four thresholds were defined (p < 0.000001, p < 0.00001, p < 0.0001, and p < 0.001), and the FDR at each threshold was calculated by applying the following equation (Bolormaa et al. 2011):

$$\mathrm{FDR} = \frac{T\,(1 - s/N)}{(s/N)\,(1 - T)}$$

where T is the threshold p-value from GWAS, s is the number of significant SNPs with p-values smaller than T, and N is the total number of SNPs in the data.
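A minimal sketch of the FDR calculation follows, using the symbols T, s and N as defined above. The form of the expression, T(1 - s/N) / ((s/N)(1 - T)), is taken here as the formula of Bolormaa et al. (2011) and should be treated as an assumption of this sketch; the SNP counts in the example are hypothetical.

```python
def false_discovery_rate(threshold, n_significant, n_total):
    """FDR = T(1 - s/N) / ((s/N)(1 - T)).

    threshold: p-value threshold T
    n_significant: number of SNPs with p < T (s)
    n_total: total number of SNPs tested (N)
    """
    if n_significant <= 0:
        raise ValueError("FDR is undefined when no SNP is significant")
    fraction = n_significant / n_total
    return threshold * (1 - fraction) / (fraction * (1 - threshold))


# Hypothetical example: 100 of 10,000 SNPs pass p < 0.001.
fdr = false_discovery_rate(0.001, 100, 10_000)
print(round(fdr, 4))
```

In this example about 10 SNPs would be expected to pass the threshold by chance alone, so roughly one in ten of the 100 significant SNPs is likely to be a false positive.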

Gene expression QTL (eQTL) study

As part of a larger study, we undertook an RNAseq analysis to quantify RNA transcript levels of candidate genes and to determine whether specific variants were highly associated with transcript levels. Blood was sampled from a subset of 110 of the animals that were measured for OS, with ethics approval from the Department of Economic Development, Jobs, Transport and Resources Animal Ethics Committee (2013-14). Blood was collected by venipuncture of the coccygeal vein after routine morning milking and was processed according to the blood fractionation and white blood cell stabilisation procedure in the RiboPure™ blood kit (Ambion by Life Technologies) protocol. Samples were then stored at -20 °C. RNA was extracted from white blood cells using the RiboPure Blood Kit (Ambion) according to the manufacturer's instructions. RNAseq libraries were prepared using the SureSelect Strand Specific RNA Library Prep Kit (Agilent) according to the manufacturer's instructions. Each library was uniquely barcoded, randomly assigned to one of four pools and sequenced on a HiSeq™ 3000 (Illumina) in a 150 cycle paired end run. One hundred and fifty base paired end reads were called with bcltofastq and output in fastq format. Sequence quality was assessed using FastQC. QualityTrim (https://bitbucket.org/arobinson/qualitytrim) was used to trim and filter poor quality bases and sequence reads. Adaptor sequences and bases with a quality score less than 20 were trimmed from the end of reads. Reads were discarded if they had a mean quality score less than 20, more than 3 N bases, more than three consecutive bases with a quality score less than 15, or a final length less than 50 bases. Only paired reads were retained for alignment.
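The read discard rules can be expressed as a small filter. This is a sketch of the stated criteria only, assuming each read is given as a base string plus a list of Phred quality scores after trimming; the function names are hypothetical and this is not the QualityTrim implementation.

```python
def longest_low_quality_run(quals, cutoff=15):
    """Length of the longest run of consecutive bases with quality below cutoff."""
    longest = run = 0
    for q in quals:
        run = run + 1 if q < cutoff else 0
        longest = max(longest, run)
    return longest


def keep_read(seq, quals):
    """Apply the four discard rules to one trimmed read."""
    if len(seq) < 50:                        # final length under 50 bases
        return False
    if sum(quals) / len(quals) < 20:         # mean quality below 20
        return False
    if seq.upper().count("N") > 3:           # more than 3 N bases
        return False
    if longest_low_quality_run(quals) > 3:   # more than 3 consecutive bases below Q15
        return False
    return True


def keep_pair(read1, read2):
    """Only pairs in which both mates pass are retained for alignment."""
    return keep_read(*read1) and keep_read(*read2)


good = ("A" * 60, [30] * 60)
short = ("A" * 40, [30] * 40)
print(keep_pair(good, good), keep_pair(good, short))
```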

Paired RNA reads for each sample were aligned to the Ensembl UMD3.1 bovine genome assembly using TopHat2 (Kim et al. 2013), allowing for two mismatches. Custom computer scripts were used to assess sequencing performance, library quality and alignment quality. Alignment files (.bam) for blood libraries with >12.5 million read pairs (after quality control filtering) and a >80% mapping rate were retained for gene count matrix generation. Gene counts were evaluated using the Python package HTSeq (Anders et al. 2014) and were combined to form a gene by sample count matrix. This count matrix was then normalised to take into account library size using the R software package DESeq (Anders & Huber 2010). On average, 60 million reads were generated per white blood cell library; an average of 89% of reads passed quality control, of which an average of 92% mapped to the reference genome. After quality control, RNAseq data for 107 cows were included in the eQTL study. Only genes that were expressed in more than 25 cows were included for further analysis, to avoid spurious associations due to very low read counts. A GWAS was then undertaken where the normalised counts of RNA transcripts for candidate genes were the 'phenotypes' (y in equation 1). As in equation 1, each sequence variant was tested for association with the gene expression level ("eQTL"), testing only variants on the same chromosome as the gene under test.
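The library and gene retention rules can be sketched as follows. This is an illustration only: the function names and data layout are hypothetical, and treating a gene as "expressed" in a cow when its count is greater than zero is an assumption, since the text does not define the expression cut-off.

```python
def keep_library(read_pairs, mapping_rate):
    """Retain libraries with >12.5 million read pairs and >80% mapping rate."""
    return read_pairs > 12_500_000 and mapping_rate > 0.80


def expressed_genes(counts, min_cows=25):
    """Return genes expressed in more than `min_cows` cows.

    counts: dict mapping gene name -> list of per-cow read counts.
    Assumption: a count greater than zero means the gene is expressed.
    """
    return [gene for gene, per_cow in counts.items()
            if sum(1 for c in per_cow if c > 0) > min_cows]


if __name__ == "__main__":
    # Hypothetical counts for two genes across 107 cows.
    demo = {"GENE_A": [12] * 107, "GENE_B": [0] * 100 + [3] * 7}
    print(expressed_genes(demo))
```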

Results

Phenotypic correlation between different OS species

A pairwise correlation analysis was conducted using the raw dataset that contains the relative abundance of the 12 major OS for 360 cows. A number of strong correlations in relative abundance (r > 0.6) were found across these OS (Table 3, bold).

Table 3: Pairwise correlation in abundance across the major OS species (n = 360)

Trait     Triose   3'-SL   6'-SL     GNL  6'-SLN     DSL    OS-A    OS-B    OS-C    OS-D    OS-E
Triose
3'-SL       0.15
6'-SL       0.22    0.42
GNL         0.06   -0.26    0.00
6'-SLN      0.34    0.19    0.63    0.16
DSL         0.11    0.18    0.31    0.22    0.69
OS-A        0.03    0.15    0.20    0.15    0.38    0.57
OS-B        0.32    0.13    0.15    0.03    0.19    0.26    0.65
OS-C        0.12   -0.10    0.05    0.84    0.31    0.34    0.15    0.02
OS-D        0.46    0.01    0.05   -0.15    0.06    0.32    0.44    0.95   -0.09
OS-E        0.05    0.14    0.15    0.13    0.52    0.57    0.33    0.15    0.20    0.04
OS-F        0.33    0.06    0.20    0.26    0.63    0.78    0.44    0.03    0.39    0.02    0.53

Genetic basis of OS traits

We first investigated the heritability of the OS traits to determine the proportion of the observed trait variation that is due to genetic factors rather than environmental variation or other biological factors. Heritability is the ratio of genetic variance to total phenotypic variance (Table 4). Most of the OS traits show heritability between 50% and 84%, demonstrating that they are highly heritable; that is, differences between cows are in large part due to genetic factors.

Table 4. Heritability, genetic and phenotypic variance for 12 bovine milk OS

Trait     Genetic variance   Phenotypic variance   Heritability (s.e.)
Triose    1.82E+11           2.97E+11              0.61 (0.14)
3'-SL     2.4E+13            4.67E+13              0.51 (0.15)
6'-SL     4.05E+12           6.07E+12              0.67 (0.14)
GNL       6.63E+11           9.82E+11              0.67 (0.14)
6'-SLN    5.74E+10           8.36E+10              0.69 (0.15)
DSL       4.79E+11           6.93E+11              0.69 (0.14)
OS-A      1.15E+12           1.7E+12               0.68 (0.14)
OS-B      1.94E+10           2.31E+10              0.84 (0.14)
OS-C      3.34E+10           6.06E+10              0.55 (0.15)
OS-D      9.69E+08           1.26E+09              0.77 (0.32)
OS-E      5.53E+10           1.41E+11              0.39 (0.15)
OS-F      2.97E+09           3.58E+09              0.83 (0.14)
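As a worked check of the definition used here (heritability as the ratio of genetic variance to phenotypic variance), the short Python sketch below recomputes the ratio for three traits from the variances listed in Table 4. The function name is hypothetical. Because the tabulated variances are themselves rounded, a recomputed ratio can differ from the tabulated heritability in the second decimal place for some traits.

```python
def heritability(genetic_variance, phenotypic_variance):
    """Ratio of genetic variance to total phenotypic variance."""
    return genetic_variance / phenotypic_variance


# (genetic variance, phenotypic variance) as listed in Table 4
table4 = {
    "Triose": (1.82e11, 2.97e11),
    "3'-SL": (2.4e13, 4.67e13),
    "OS-B": (1.94e10, 2.31e10),
}

if __name__ == "__main__":
    for trait, (vg, vp) in table4.items():
        print(f"{trait}: h2 = {heritability(vg, vp):.2f}")
```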

Power of association studies to detect QTLs

False Discovery Rates. GWAS produces two outcomes: p-values and estimated SNP effects. To assess the performance of GWAS, we first calculated the FDR for all the traits based on the estimated p-values (Table 5). Four p-value thresholds (p < 0.000001, p < 0.00001, p < 0.0001, and p < 0.001) were tested. For traits Triose, 6'-SL, 6'-SLN, DSL, OS-D, OS-E, and OS-F, the error rates are relatively high. For example, under the threshold p < 0.00001, the false discovery rate reached almost 100%, which means the GWAS results for these traits lack power. In contrast, the FDRs for five traits (3'-SL, GNL, OS-A, OS-B and OS-C) are relatively low. In particular, under the threshold p < 0.000001, the FDRs of GNL and OS-C are only 0.2-0.3%, which provides strong evidence that the GWAS for these traits has ample power to detect real QTL. We therefore report details of QTL discovery only for these five traits with low FDR.

Table 5. Number of significant SNPs and false discovery rate (FDR) for 12 traits under four GWAS thresholds: p < 0.000001, p < 0.00001, p < 0.0001, and p < 0.001.

          Number of significant SNPs                  FDR (%)
Trait     <0.000001  <0.00001  <0.0001  <0.001        <0.000001  <0.00001  <0.0001  <0.001
Triose    0          21        75       505           -          27        75       >100
3'-SL     6          12        127      591           9          47        45       96
6'-SL     0          0         46       583           -          -         >100     97
GNL       235        325       436      859           0.2        2         13       66
6'-SLN    0          0         36       357           -          -         >100     >100
DSL       0          0         58       493           -          -         97       >100
OS-A      4          13        56       410           14         43        >100     >100
OS-B      12         30        78       585           5          19        72       97
OS-C      171        274       435      880           0.3        2         13       64
OS-D      0          4         22       334           -          >100      >100     >100
OS-E      1          5         50       477           57         >100      >100     >100
OS-F      0          11        54       556           -          51        >100     >100

QQ plot. Quantile-quantile (QQ) plots were used to further investigate the quality of the above GWAS results for the five top traits (3'-SL, GNL, OS-A, OS-B and OS-C). Figure 2 illustrates that the highest observed -log p-values for each of the five traits are higher than expected under the null hypothesis of no true association. In particular, for traits GNL and OS-C, the observed -log p-values deviate considerably from the diagonal line, implying many moderately to highly significant p values are clearly more significant than expected under the null hypothesis.
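The comparison underlying a QQ plot can be sketched in a few lines of Python. The sketch assumes base-10 logarithms and the common convention that the i-th smallest of n null p-values is expected at i/(n + 1); neither detail is stated in this document, and the function name is hypothetical.

```python
from math import log10


def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs, largest values first.

    Under the null hypothesis p-values are uniform on (0, 1), so the i-th
    smallest of n p-values is expected near i / (n + 1) (assumed convention).
    """
    n = len(pvalues)
    observed = sorted((-log10(p) for p in pvalues), reverse=True)
    expected = [-log10(i / (n + 1)) for i in range(1, n + 1)]
    return list(zip(expected, observed))


if __name__ == "__main__":
    # Hypothetical p-values; points with observed > expected lie above the diagonal.
    for exp, obs in qq_points([1e-8, 0.002, 0.2, 0.5, 0.9]):
        print(f"expected {exp:.2f}  observed {obs:.2f}")
```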

QTL discovery from HD SNP GWAS

The GWAS results using HD SNP genotypes suggest the presence of several major QTL regions for the five OS traits with the lowest FDR (Figure 3). There are some obvious QTL peaks, which are likely to be near the region of causal mutations. Notably, a region on Chromosome 11 has a strong QTL signal affecting both GNL and OS-C (Figure 3).

Candidate gene and causal mutation discovery with sequence variants.

For QTL regions in Figure 3 with the lowest FDR, we undertook a further GWAS using imputed sequence variants on the relevant chromosomes. In theory, the causal mutations should be present in this sequence data, but it is difficult to pinpoint causal mutations in a GWAS because there are often strong associations between neighbouring alleles (linkage disequilibrium - LD). We therefore investigated LD between the most significant SNP and the remaining SNPs in the region to help identify likely candidate genes and putative causal mutations. The LD statistic (r²) provides a basis for more precisely defining the most likely region for the causal mutation. The LD r² was estimated as the squared correlation between pairwise genotype allele counts using PLINK software (Purcell et al. 2007).
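The LD statistic described above, the squared correlation between genotype allele counts at two SNPs, can be sketched directly. This illustrates the definition only and is not PLINK's implementation; the function name and the coding of genotypes as 0, 1 or 2 copies of an allele are assumptions.

```python
def ld_r2(g1, g2):
    """Squared Pearson correlation between allele counts at two SNPs.

    g1, g2: equal-length lists of genotypes coded 0/1/2 (copies of one allele).
    Returns None if either SNP is monomorphic in the sample.
    """
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return None
    return cov * cov / (v1 * v2)


if __name__ == "__main__":
    top_snp = [0, 1, 2, 1]      # hypothetical genotypes at the top SNP
    neighbour = [0, 1, 2, 1]    # identical counts give r2 = 1 (perfect LD)
    print(ld_r2(top_snp, neighbour))
```

Variants with r² = 1 relative to the top SNP cannot be separated statistically, which is why several variants can share an identical p-value.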

The results in Figure 4a and b demonstrate that GNL and OS-C share the same major QTL effect on Chromosome 11. The most significant SNP (104,229,609 bp) for both traits is just 1908 bp downstream of the ABO gene, which codes for an enzyme involved in oligosaccharide biosynthesis. The -log p-value was 44 and 38 for this top sequence variant for GNL and OS-C respectively, while in GWAS using HD SNP genotypes the most significant SNP had a lower -log p-value of 37 (GNL) and 31 (OS-C). The top sequence variant accounted for 78% and 84% of the genetic variance in GNL and OS-C respectively (Table 7), indicating that this, or another variant in strong LD, is responsible for most of the genetic variation in both traits. The "eQTL" analysis of sequence variants associated with ABO gene RNA transcript expression revealed a tight cluster of 14 sequence variants (between 104,227,111 and 104,229,385 bp) with the highest -log p-value (7.45) for this gene (Figure 4 and Table 6). This is very close to the top SNP in Figure 4, which strengthens the evidence that the causal variant may be a regulatory variant in this intergenic region controlling ABO gene expression.

Table 6. RNAseq eQTL analysis of ABO gene expression, testing all sequence variants on chromosome 11. The position of the most significant SNP associated with the ABO RNA transcript abundance is listed with the -log p-value.

Chromosome and position (bp) -log p-Value

Chr11:104225654 6.42

Chr11:104226169 6.42

Chr11:104226184 6.42

Chr11:104226396 6.42

Chr11:104227111 7.06

Chr11:104228091 7.06

Chr11:104228120 7.06

Chr11:104228217 7.06

Chr11:104228291 7.06

Chr11:104228607 6.71

Chr11:104228721 7.06

Chr11:104228726 7.06

Chr11:104228735 7.06

Chr11:104228842 7.06

Chr11:104228949 7.06

Chr11:104228983 7.06

Chr11:104229223 7.06

Chr11:104229261 7.06

Chr11:104229385 7.06

Chr11:104232298 6.42

Chr11:104232312 6.42

Chr11:104232319 6.42

Chr11:104232354 6.42

Chr11:104232725 6.19

Chr11:104232763 6.19

Chr11:104235463 6.42

Chr11:104235480 6.42

Chr11:104237486 6.42

Chr11:104237602 6.42

Chr11:104237645 6.42

Chr11:104237825 6.42

Chr11:104237889 6.42

Chr11:104238052 6.42

Chr11:104238257 6.42

Chr11:104238465 6.42

Chr11:104238576 6.42

Chr11:104251713 6.04

The most significant sequence variants for 3'-SL on Chromosome 1 (Figure 4c) were upstream of genes ST3GAL6 and CPOX and close to a small nucleolar RNA (SNORA68), suggesting that a causal variant in this intergenic region could be involved in regulating gene expression. Furthermore, the enzyme produced by the ST3GAL6 gene (β-galactoside α-2,3-sialyltransferase) is the key enzyme for the production of 3'-SL, and the most significant SNP is in strong LD with other SNPs around this genic region (Figure 4c). The most significantly associated variant explains 33% of the genetic variance, indicating a major effect on 3'-SL abundance. No strong eQTL effects were detected for either the ST3GAL6 or CPOX genes.

The most significant sequence variants for OS-A and OS-B (Figure 4: d, e, f & g) each explained between 10% and 12% of the genetic variance (Table 7). For the OS-B trait there were three main QTL peaks, on chromosomes 10, 16 and 26, and together they explain 30% of the genetic variance. The most significant sequence variant on chromosome 10 is an intronic SNP in the ANKRD31 gene but is also very close to the GCNT4 gene (glucosaminyl (N-acetyl) transferase 4), which is also involved in milk oligosaccharide biosynthesis. For OS-A it is difficult to pinpoint a particular candidate gene: there are six genes in the chromosome region around the most significant variant, including those in very high LD (r² > 0.8). Again, no eQTL effects were observed for these genes.

Table 7. Genomic information for the most significant GWAS sequence variants (multiple variants listed where they had equally significant p-values).

Closest genes in QTL region (a)   Most significant     Sequence variant       OS         Direction of   Genetic variation
(Chromosome)                      sequence variant     annotation             trait(s)   minor allele   explained (%)
                                  position (bp) (b)                                      effect

RSPH10B, CCZ1, PMS2, AIMP2,       38529260             Missense (RSPH10B)     OS-A       -              12%
ANKRD61, EIF2AK1 (25)             38541187             Intronic (RSPH10B)
                                  38541547             Synonymous (RSPH10B)
                                  38544511             Intronic (RSPH10B)

KAZN (16)                         53653341             Intronic               OS-B       +              11%
                                  53653496             Intronic
                                  53653712             Synonymous
                                  53653953             Intronic
                                  53654074             Intronic
                                  53654125             Intronic
                                  53654187             Intronic
                                  53654618             Intronic
                                  53654713             Intronic
                                  53654869             Intronic

ANKRD31, GCNT4 (10)               6491671              Intronic (ANKRD31)     OS-B       +              10%

ATRNL1, GFRA1 (26)                36764962             Intergenic             OS-B       +              9%

(a) Genes in bold are known to be involved in OS metabolic pathway.
(b) Multiple SNP listed for cases where more than one variant had equal p-values due to perfect LD between variants (i.e. r² = 1).

ABO: transferase A, alpha 1-3-N-acetylgalactosaminyltransferase; transferase B, alpha 1-3-galactosyltransferase. ANKRD61: ankyrin repeat domain 61. ANKRD31: ankyrin repeat domain 31. AIMP2: aminoacyl tRNA synthetase complex-interacting multifunctional protein 2. ATRNL1: attractin like 1. CCZ1: vacuolar protein trafficking and biogenesis associated homolog. CPOX: coproporphyrinogen oxidase. EIF2AK1: eukaryotic translation initiation factor 2-alpha kinase 1. GCNT4: glucosaminyl (N-acetyl) transferase 4. GFRA1: GDNF family receptor alpha 1. KAZN: kazrin, periplakin interacting protein. PMS2: postmeiotic segregation increased 2. RSPH10B: radial spoke head 10 homolog B. ST3GAL6: ST3, beta-galactoside alpha-2,3-sialyltransferase 6.

The presence of very large QTL effects for several traits suggests that a simple strategy of marker assisted selection (MAS) could be implemented. The sizes of the major QTL effects were estimated and used to determine the potential for genetic improvement if animals were selectively bred to carry two copies of the favourable QTL alleles (Table 8). The QTL allele frequencies in the 332 experimental cows were found to be very similar to those of the Australian Holstein population generally (obtained from a large sample of industry animals). Naturally, the less common the favourable allele, the higher the potential for genetic improvement in these traits and, with the exception of OS-A, the minor allele showed the favourable effect.

Table 8. Predicted QTL effects and potential genetic improvement from marker assisted selection (MAS) for traits GNL, OS-C, 3'-SL, OS-B and OS-A.

MAF=Minor Allele Frequency

b Genetic average based on the marker effect and Hardy-Weinberg equilibrium genotype frequencies in the general Holstein population. c Potential average based on selection for the entire herd carrying only the favourable alleles.

d The difference between the current genetic average due to the favourable mutation, and the potential genetic average if all animals were selected to carry 2 copies of the favourable alleles.
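The quantities defined in footnotes b to d can be sketched as follows, assuming a purely additive QTL in which each copy of the favourable allele adds a fixed effect a to the trait. The additive model, the function name and the numbers in the illustration are assumptions and are not values from Table 8.

```python
def mas_potential(p_fav, a):
    """Current and potential genetic averages at one additive QTL.

    p_fav: frequency of the favourable allele in the population.
    a:     effect of one copy of the favourable allele (assumed additive).
    Returns (current average, potential average, difference).
    """
    q = 1.0 - p_fav
    # Hardy-Weinberg genotype frequencies keyed by copies of the favourable allele
    genotype_freq = {2: p_fav ** 2, 1: 2 * p_fav * q, 0: q ** 2}
    current = sum(copies * a * freq for copies, freq in genotype_freq.items())
    potential = 2 * a  # every animal carries two favourable copies
    return current, potential, potential - current


if __name__ == "__main__":
    # Hypothetical favourable-allele frequency of 0.25 and unit allele effect.
    current, potential, gain = mas_potential(0.25, 1.0)
    print(current, potential, gain)
```

Under these assumptions the current average simplifies to 2 × p × a and the gain to 2 × (1 - p) × a, so a rarer favourable allele leaves more room for improvement, as the text notes.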

Discussion

Although over 40 OS have been identified in bovine milk, the majority of them are present at trace levels. Only the 12 most abundant species that can be reliably quantified without enrichment were surveyed in the first instance. These species are composed of 3-6 monomers with a molecular mass ranging from 500 to 1200 Da. In addition, half of them contain a sialic acid unit and thus are anionic. A strong correlation was observed between the abundance of some of the species, suggesting they are likely to share common steps in the biosynthesis pathway.

Nearly all OS are synthesised from lactose by successively adding various monomer units at different positions mediated by specific transferases. The large difference in abundance across OS species of the same monomer number suggests a remarkable difference in the activity of various transferases involved in OS synthesis. In addition, given the low level of OS in mature milk, the transformation from lactose to OS appears to be very inefficient in the mammary gland. Although some simple OS could be produced in vitro with the use of appropriate transferases, it is also possible to increase the level of intrinsic OS in milk through herd management and/or genetic selection of cows.

We have investigated the genetic architecture of OS accumulation in bovine milk. This study exploited sequence variants to fine-map six major candidate gene regions and putative causal mutations for five OS species. These OS included one high-abundance species (3'-SL), two intermediate-abundance species (GNL and OS-A) and two low-abundance species (OS-B and OS-C). It is worth noting that more QTL of minor influence would likely be detected by increasing the size of the mapping population and/or by refining the phenotype data. Therefore the list of QTLs found in this study is by no means exhaustive, but highlights some major gene effects. The sequence GWAS fine-mapped a major QTL effect for GNL and OS-C which also overlapped a strong eQTL region that affected the expression of the ABO gene. The most significant SNP for OS-C and GNL was not among the top SNPs in the eQTL region (Table 6) but was within 224 to 2498 bp of the top eQTL SNPs. The RNAseq analysis was done on a subset of 105 cows, so the LD between a SNP and a causal mutation could differ from that in the 332 cows measured for OS. While the applicant does not wish to be restricted by theory, our results suggest that the causal mutation may be a variant in a regulatory intergenic region that controls the expression of the ABO gene.

The ABO gene codes for α1-3-N-acetylgalactosaminyltransferase and α1-3-galactosyltransferase, the former being the key enzyme for the synthesis of GNL from lactose. OS-C contains one extra Gal unit compared to GNL, and this structural similarity implies that GNL is likely to be the precursor of OS-C. This may explain the co-localisation of QTLs detected for these two species. The most significant sequence SNP at 104,229,609 bp was previously reported as being a putative causal mutation affecting overall milk protein yield in dairy cattle. The allele that increased the GNL and OS-C abundance also increased milk protein yield (a desirable quality). Additionally, the ABO gene was most highly expressed in lactating bovine mammary tissue and blood when compared to 17 other bovine tissues.

The most significant SNP for 3'-SL was fine-mapped close to a strong candidate gene (ST3GAL6) that codes for α2-3-sialyltransferase, the key enzyme for the production of 3'-SL from lactose. It is interesting to note that no QTL was identified for 6'-SL, an isomer of 3'-SL, but this may be due to lack of power because 6'-SL is at a lower abundance than 3'-SL. In the case of OS-A and OS-B, the functions of the candidate genes that encompass the most significant SNPs are not known to be directly related to OS synthesis, except for GCNT4. GCNT4 codes for glucosaminyl (N-acetyl) transferase 4 and is one of the key enzymes involved in biosynthesis of milk OS. For the remaining 7 major OS, no large QTLs were identified in this study. QTLs with moderate to large effects were detected for five of the OS species (GNL, OS-C, 3'-SL, OS-A and OS-B), accounting for 30 to 84% of genetic variance. This indicates that a simple marker assisted selection (MAS) strategy based on the described variants could more than double the OS abundance in milk (Table 8). We have also developed genomic predictions using all HD genome-wide markers in a single model. It is likely that the MAS approach will be equally accurate at this stage due to the presence of major QTL effects.

In conclusion, this is the first study on the genetic architecture of bovine milk OS abundance using sequence variants. A total of six genomic regions were identified on five chromosomes, affecting five of the 12 major OS. Among the major OS species detected, the accumulation of GNL and OS-C was found to be largely controlled by a single QTL; a dramatic increase in the content of these OS by marker assisted selection can thus be expected. QTLs accounting for 33% and 21% of variation were detected for 3'-SL and OS-B respectively, suggesting that genetic selection should also be effective in improving the concentration of these two species in bovine milk.

Example 2: High value milk components and targets for genetic improvement

Oligosaccharide structures are shown in Table 8.

Table 8: Oligosaccharide structures (symbol key: galactose, glucose, GlcNAc, NeuAc, fucose, fructose)

A total of 13 major oligosaccharides were identified and quantified in milk (see Figure 5 and Table 9).

Hex: glucose or galactose; HexNAc: N-acetylglucosamine or N-acetylgalactosamine;

NeuAc: N-acetylneuraminic acid; NeuGc: N-glycolylneuraminic acid

N: neutral; A: acidic

Table 9: 13 milk oligosaccharides

The accumulation pattern of milk oligosaccharides was determined, showing inter-cow variation (Figure 6) and seasonal variation (Figure 7).

The structure of various milk oligosaccharides is shown in Figure 8 and their relative abundance is shown in Figure 9. Evidence of cow-specific accumulation of oligosaccharides is shown in Figure 10.

Finally, it is to be understood that various alterations, modifications and/or additions may be made without departing from the spirit of the present invention as outlined herein.

References

Anders et al. (2014) HTSeq - a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166-9.

Anders & Huber (2010) Differential expression analysis for sequence count data. Genome Biology 11, R106.

Bolormaa et al. (2011) Genome-wide association studies for feedlot and growth traits in cattle. Journal of Animal Science 89, 1684-97.

Daetwyler et al. (2014) Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet 46, 858-65.

Erbe et al. (2012) Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels. Journal of Dairy Science 95, 4114-29.

Lee et al. (2016) Rapid screening of bovine milk oligosaccharides in a whey permeate product and domestic animal milks by accurate mass database and tandem mass spectral library. J. Agric. Food Chem. 64, 6364-6374.

Liu et al. (2014) Simple Liquid Chromatography-Mass Spectrometry Method for Quantification of Major Free Oligosaccharides in Bovine Milk. Journal of Agricultural and Food Chemistry 62, 11568-74.

Sargolzaei et al. (2014) A new approach for efficient genotype imputation using information from relatives. BMC Genomics 15, 1-12.