ANIMAL GENETICS




* University of California, Davis 95616;
and
University of Missouri, Columbia 65211;
and
Cornell University, Ithaca, NY 14850; and
Colorado State University, Fort Collins 80523
| Abstract |
|---|
Key Words: genetic evaluation, genetic marker, microsatellite, on-farm expected progeny difference, paternity, single nucleotide polymorphism
| INTRODUCTION |
|---|
A set of marker loci can be characterized by its cumulative parentage exclusion probability (PE), i.e., the probability that a random individual other than a true parent from a population in Hardy-Weinberg equilibrium can be proven not to be the true parent of another randomly chosen individual. For unrelated sires, the probability of unambiguous parentage assignment is equal to PE raised to the power of the number of nonparent candidate bulls in the breeding group (Sherman et al., 2004). Although the number of bulls in a breeding group does not directly affect PE, the likelihood of unambiguously identifying the true sire by excluding every nonparent candidate decreases when more candidate sires are present. In herds with many natural service sires in a breeding group, panels with low PE may result in multiple bulls qualifying as possible sires for a single calf, i.e., not being excluded as a sire (Sherman et al., 2004). Rather than discarding information from such individuals from sire evaluations, a calf's performance can be fractionally assigned to all qualifying bulls using likelihood scores derived from their genotypes (Weaber, 2005).
Here, we report on field data using a 28 SNP panel to assign paternity to calves in a commercial herd that employed a multiple-sire breeding pasture. A comparison was made between genetic evaluations obtained when using a powerful STR "gold standard" marker panel to assign paternity and those obtained when using the 28 SNP panel with a comparatively low PE in combination with software designed to fractionally assign the performance of calves to multiple qualifying bulls. Additionally, simulations were performed to model the effect of loci number, minor allele frequency (MAF), and number of offspring per bull on the accuracy of genetic evaluations based on parentage determinations derived from SNP data.
| MATERIALS AND METHODS |
|---|
Cattle Population and DNA Collection
Blood or semen samples were obtained from 8 Angus and Hereford AI sires and 27 natural-service herd sire candidates run as a single cohort in a multiple-sire breeding pasture, and from their 625 yearling progeny. Artificial insemination was performed on a subset of cows before their exposure to the herd sires. The DNA was isolated from semen using a standard phenol-chloroform extraction method, and from FTA cards (Whatman Inc., Florham Park, NJ) according to the manufacturer's instructions. The herd sires included 4 paternal half-sib groups: 3 groups consisting of 2 sires, and 1 group consisting of 5 sires. Herd sires ranged from 1 to 8 yr of age at the time of breeding. All bulls had passed a breeding soundness examination by a licensed veterinarian before the breeding season.
Genetic Testing and Parentage Assignment
The STR genotyping was based on 23 STR loci (AD-CYC, BM203, BM888, BM1818, BM1824, BM2113, BM4107, BM4208, BRN, CYP21, ETH10, ETH152, ETH225, INRA23, OarFCB5, OarFCB193, RM006, RM067, SPS115, TGLA94, TGLA122, TGLA126, and TGLA227) and was performed at the Veterinary Genetics Laboratory (VGL), University of California, Davis. These markers are routinely used at VGL for parentage verification of cattle. Primer sequences for the STR are available in public databases and can be obtained through marker name queries using "Marker Search" at the US Meat Animal Research Center's cattle genome Web site (http://www.marc.usda.gov/genome/genome.html; last accessed August 2007) and the "Request on Loci" at the BOVMAP database (http://locus.jouy.inra.fr/cgi-bin/bovmap/intro2.pl; last accessed August 2007). The PCR reactions with fluorescence-labeled primers were carried out according to VGL standard protocols, and amplicons were resolved by capillary electrophoresis on ABI 3730 sequencers (Applied Biosystems, Foster City, CA). The fragment size analysis software STRand (http://www.vgl.ucdavis.edu/informatics/STRand/; last accessed August 2007) was used for genotyping.
The SNP-based genotyping using 28 SNP (GenBank accessions AY761135, AY773474, AY776154, AY841151, AY842472, AY842473, AY842474, AY842475, AY849380, AY850194, AY851162, AY851163, AY853302, AY853303, AY856094, AY857620, AY858890, AY860426, AY863214, AY914316, AY916666, AY919868, AY929334, AY937242, AY939849, AY941204, AY942198, AY943841; http://www.ncbi.nlm.nih.gov/sites/entrez?sdb=Nucleotide; last accessed August 2007) derived from Heaton et al. (2002) was performed by a commercial genotyping company (Genaissance, Duluth, GA).
Genotyping results from these 2 sets of analyses were run through Cervus (http://www.fieldgenetics.com/pagesaboutCervus_Overview.jsp; last accessed August 2007; Marshall et al., 1998) to determine the number of alleles per locus, or the MAF in the case of the biallelic SNP panel. Samples that contained DNA from more than 1 animal were removed from STR analysis before the data were run through the program. This program was also used to estimate observed and expected heterozygosity (assuming Hardy-Weinberg equilibrium), polymorphic information content (PIC; Botstein et al., 1980), goodness of fit to Hardy-Weinberg equilibrium, and per-locus exclusion probabilities for 2 situations: genotypes were available for putative parents of one sex but the other parent was unknown [Excl(1)], or both parental genotypes were available and one of the parents was known with certainty [Excl(2)]. The cumulative parentage exclusion probability (PE) for the 2 marker sets was calculated according to Jamieson and Taylor (1997).
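The Excl(1) case for a biallelic SNP can be illustrated with a short calculation. The sketch below is not taken from Cervus or from Jamieson and Taylor (1997); it assumes Hardy-Weinberg proportions and independent loci, and uses the fact that, with the other parent unknown, a candidate is excluded at a SNP only when candidate and offspring are opposite homozygotes. The function names are hypothetical.

```python
def snp_exclusion_one_parent(maf: float) -> float:
    """Exclusion probability at one biallelic locus when only the candidate
    parent and the offspring are genotyped (other parent unknown).
    Assumes Hardy-Weinberg proportions."""
    p, q = maf, 1.0 - maf
    # Exclusion needs opposite homozygotes: candidate AA with offspring aa,
    # or candidate aa with offspring AA; each has probability p^2 * q^2.
    return 2.0 * (p * q) ** 2


def cumulative_pe(mafs) -> float:
    """Cumulative exclusion probability over independent loci: a nonparent
    escapes exclusion only if no single locus excludes it."""
    not_excluded = 1.0
    for maf in mafs:
        not_excluded *= 1.0 - snp_exclusion_one_parent(maf)
    return 1.0 - not_excluded


# Idealized panel: 28 independent SNP, each with MAF = 0.5
ideal_panel = [0.5] * 28
print(round(cumulative_pe(ideal_panel), 3))
```

Under these assumptions, each locus with MAF = 0.5 excludes a random nonparent with probability 0.125, and 28 such loci give a cumulative PE of about 0.976. Loci with lower MAF contribute less, so any panel containing them falls below this value.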
Paternity based on STR was determined by comparing the genotypes of all 35 potential sires against each calf's genotype. An exclusion was recorded when a bull and a calf had no allele in common at a locus. Sire assignments based on STR were made in 2 rounds of analyses. First, any bull with no exclusions was assigned to the calf as a sire. Second, for the remaining calves, a sire was assigned if he was the only bull with a 1-locus exclusion. Paternity was denoted unknown if no bull met either criterion.
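The 2-round exclusion rule can be sketched as follows. This is an illustration only, not the VGL procedure; the genotype layout (a dictionary from locus name to a pair of allele sizes), the function names, the allele sizes, and the decision to skip loci missing in the bull are all assumptions.

```python
def count_exclusions(bull: dict, calf: dict) -> int:
    """Number of loci at which bull and calf share no allele.
    Genotypes map locus name -> (allele, allele); loci missing in the
    bull are skipped (an assumption, not stated in the text)."""
    exclusions = 0
    for locus, calf_alleles in calf.items():
        bull_alleles = bull.get(locus)
        if bull_alleles is None:
            continue
        if not set(bull_alleles) & set(calf_alleles):
            exclusions += 1
    return exclusions


def assign_sires(calf: dict, bulls: dict) -> list:
    """Round 1: every bull with zero exclusions qualifies.
    Round 2: otherwise, the only bull with exactly one exclusion.
    Returns an empty list when paternity is unknown."""
    counts = {name: count_exclusions(geno, calf) for name, geno in bulls.items()}
    round1 = [name for name, c in counts.items() if c == 0]
    if round1:
        return round1
    round2 = [name for name, c in counts.items() if c == 1]
    return round2 if len(round2) == 1 else []


# Hypothetical allele sizes at 3 of the STR loci named in the text
bulls = {
    "bull_A": {"BM1824": (180, 182), "ETH10": (217, 219), "TGLA227": (81, 89)},
    "bull_B": {"BM1824": (178, 188), "ETH10": (213, 221), "TGLA227": (83, 91)},
}
calf = {"BM1824": (182, 188), "ETH10": (219, 223), "TGLA227": (89, 93)}
print(assign_sires(calf, bulls))
```

In this example bull_A shares an allele with the calf at every locus, whereas bull_B is excluded at 2 loci, so only bull_A is returned.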
Paternity based on SNP was assigned with the Sire-Match software (E. J. Pollak, Cornell University), which uses a likelihood-based method to compute the probability that a putative sire is the true sire given the genotypes of the calf, the dam, and all putative sires. In the current study, dam genotypes were not collected, so population genotype frequencies computed from the genotypes of all bulls and calves were used instead. Genotype mismatches were not permitted; one excluding locus disqualified a bull. In cases where 2 or more bulls qualified with no exclusions as potential sires for a given calf (for both STR and SNP panels), probabilities from Sire-Match were used either to categorically assign each calf to the single most probable sire or to fractionally assign calves to all compatible (no exclusions) bulls for the genetic evaluations.
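The general idea of likelihood-based fractional assignment without dam genotypes can be sketched as below. This is not the Sire-Match algorithm, whose details are not given here; it is a minimal illustration that assumes biallelic SNP coded as the count of one allele (0, 1, or 2), a maternal allele drawn from population frequencies, independent loci, and equal prior probability for every candidate bull. All names and example values are hypothetical.

```python
def locus_likelihood(calf: int, sire: int, q: float) -> float:
    """P(calf genotype | sire genotype) at one SNP, dam unknown.
    Genotypes are counts of allele 'a' (0, 1, 2); q is the population
    frequency of 'a', used for the maternal allele."""
    t = sire / 2.0  # probability the sire transmits allele 'a'
    if calf == 0:
        return (1.0 - t) * (1.0 - q)
    if calf == 2:
        return t * q
    return t * (1.0 - q) + (1.0 - t) * q


def paternity_probabilities(calf: list, sires: dict, freqs: list) -> dict:
    """Normalized likelihoods over candidate sires (equal priors assumed).
    A bull excluded at any locus has likelihood 0 and is dropped."""
    likelihoods = {}
    for name, sire in sires.items():
        lik = 1.0
        for c, s, q in zip(calf, sire, freqs):
            lik *= locus_likelihood(c, s, q)
        likelihoods[name] = lik
    total = sum(likelihoods.values())
    if total == 0.0:
        return {}  # every candidate excluded: paternity unknown
    return {name: lik / total for name, lik in likelihoods.items() if lik > 0.0}


freqs = [0.3, 0.4, 0.5]
calf = [0, 1, 2]
sires = {"bull_A": [0, 1, 2], "bull_B": [1, 1, 1], "bull_C": [2, 0, 1]}
probs = paternity_probabilities(calf, sires, freqs)
print(probs)
```

Here bull_C is an opposite homozygote to the calf at the first locus and is therefore excluded, and the remaining probability is split between bull_A and bull_B in proportion to their likelihoods. Categorical assignment would take the bull with the largest probability; fractional assignment would use all of the returned values.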
Genetic Evaluation
A genetic evaluation of 583 weaning weight records from progeny of 27 herd bulls and 8 AI sires was carried out using the sire model equation y = Xb + Zu + e, where y represented adjusted weaning weights observed from a single cohort, b was a vector of sex effects (bull, steer, heifer), u was a vector of direct sire progeny differences, X was an incidence matrix relating weaning weight observations to their sex, Z was an incidence matrix relating calves with weaning weights to their potential sires using parentage probabilities ascertained from either SNP or STR markers, and e was a vector of residuals. The usual Z matrix contains a single nonzero element of unity in each row, in the column corresponding to the sire of the calf represented by that row. In contrast, this incidence matrix included as many nonzero elements in each row as there were potential sires for the calf. The sum of all the nonzero elements in any row was always 1. The number of nonzero elements in any column was the number of progeny that were assigned to that sire with any nonzero probability, whereas the total of any column was the equivalent number of progeny assigned to that sire (i.e., the sum of the fractional probabilities). The vector u included all known male ancestors of the calves (i.e., sires and paternal grandsires). This enabled straightforward computation of the inverse of the numerator relationship matrix and accounted for the half-sib relationships that existed between some sires. Weaning weights were adjusted for age at weaning and dam age according to BIF guidelines (BIF, 2002) using the computer software CattlePro (Bowman Farm Systems Inc., Cynthiana, KY), and the heritability of direct weaning weight was assumed to be 0.25. Records from calves with uncertain paternity have reduced genetic variation, so the residual variance was inflated for these animals such that the assumed phenotypic variance was identical regardless of paternity probabilities. The resulting mixed model equations were solved directly, and BIF accuracies (BIF, 2002) were computed from diagonal elements of the inverse coefficient matrix as if the assumed paternity was exact. Identical procedures were used in the analysis of simulated data.
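The structure of the fractional incidence matrix can be made concrete with a small sketch. It builds the rows of Z from per-calf paternity probabilities, reports the number of progeny with nonzero probability and the equivalent progeny for each sire, and shows one way the residual variance could be inflated so that the phenotypic variance stays constant. The inflation formula assumes unrelated candidate sires and a phenotypic variance scaled to 1; it is an assumption for illustration, not the exact computation used in the evaluation, and the mixed model solution itself is not shown.

```python
H2 = 0.25                        # heritability assumed in the text
SIGMA_P2 = 1.0                   # phenotypic variance, scaled to 1 (assumption)
SIGMA_S2 = H2 * SIGMA_P2 / 4.0   # sire variance: one quarter of additive variance


def build_z(assignments: list, sires: list) -> list:
    """One row per calf; each row holds the paternity probability for
    every sire column and must sum to 1."""
    rows = []
    for probs in assignments:
        if abs(sum(probs.values()) - 1.0) > 1e-9:
            raise ValueError("paternity probabilities must sum to 1")
        rows.append([probs.get(sire, 0.0) for sire in sires])
    return rows


def column_summaries(z: list, sires: list) -> dict:
    """For each sire: (progeny with nonzero probability, equivalent progeny)."""
    summary = {}
    for j, sire in enumerate(sires):
        column = [row[j] for row in z]
        summary[sire] = (sum(1 for v in column if v > 0.0), sum(column))
    return summary


def residual_variance(row: list) -> float:
    """Residual variance that keeps Var(y) = SIGMA_P2 for this calf,
    assuming unrelated candidate sires: Var(sum p_j u_j) = SIGMA_S2 * sum p_j^2."""
    return SIGMA_P2 - SIGMA_S2 * sum(p * p for p in row)


sires = ["bull_A", "bull_B", "bull_C"]
assignments = [
    {"bull_A": 1.0},                 # unique sire
    {"bull_A": 0.8, "bull_B": 0.2},  # two qualifying bulls
    {"bull_B": 0.5, "bull_C": 0.5},  # two equally likely bulls
]
z = build_z(assignments, sires)
print(column_summaries(z, sires))
print([residual_variance(row) for row in z])
```

In this example bull_B has 2 progeny with nonzero probability but only 0.7 equivalent progeny, and the calf split evenly between 2 bulls carries a larger residual variance than the calf with a unique sire.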
Simulation Studies
Two studies were undertaken using simulated markers to investigate the influence of number of markers, MAF, and the number of offspring per bull. The first study quantified the probabilities a calf in this experiment would be assigned a single sire on the basis of its genotype. The second study determined the impact of uncertain paternity on the accuracy of EPD estimated from field data.
A more comprehensive SNP panel representative of the field information was simulated by creating 2 additional SNP markers with identical MAF to each of the 28 actual SNP. This generated a set of 84 realistic loci that was used to investigate PE for marker panels ranging in size from 4 to 84 loci. An ideal panel with maximal exclusion rate was also created with the same number of markers but with a MAF of 0.5 at each locus. The probability of a calf having a unique sire pedigree assignment [P(unique sire assignment)] was computed for various panel sizes in increments of 4 loci, for both the realistic and maximal hypothesized panels. This probability was computed as $P(\text{unique sire assignment}) = \mathrm{PE}^{\,n-1}$, where $n$ is the number of possible sires, set at 27 to correspond to the number of natural service sires in the field study.
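A worked example shows how quickly the probability of a unique assignment falls with cohort size. The sketch below applies the formula to the 2 PE values reported for 28 SNP loci in this study (0.956 observed, 0.976 with equal MAF), assuming 27 unrelated candidate sires; the function name is hypothetical.

```python
def p_unique_sire(pe: float, n_sires: int) -> float:
    """Probability that all n_sires - 1 nonparent candidates are excluded,
    assuming unrelated candidates and a common exclusion probability pe."""
    return pe ** (n_sires - 1)


for pe in (0.956, 0.976):
    print(pe, round(p_unique_sire(pe, 27), 2))
```

With 27 candidate sires, a PE of 0.956 corresponds to a unique-assignment probability of roughly 0.31, and a PE of 0.976 to roughly 0.53, so even the idealized 28-locus panel would leave close to half of the calves with more than 1 qualifying bull.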
The second study simulated individual phenotypes and genotypes for 4 to 40 markers corresponding to a known pedigree involving 20 unrelated sires, with 5 or 30 progeny for every sire. The markers had realistic MAF based on the field study or maximal exclusion probabilities by assuming a MAF of 0.5 at all loci. The 40 markers were assumed to assort independently. Exclusion probabilities for each marker set and the theoretical maximum power of exclusion, assuming equal MAF, were computed as suggested by Jamieson and Taylor (1997).
Simulation of Sire Breeding Values and Offspring Phenotype. The true breeding values were simulated for the 20 sires using a normal distribution and were then used for the simulation of progeny phenotypes. Progeny phenotypes were simulated by adding half the breeding value of the sire to a normal deviate chosen to reflect a trait with $h^2 = 0.25$, the value assumed in analyzing the field data.
Simulation of Sire and Offspring Genotypes. Alleles at each locus were simulated for each sire by sampling a random number from the Uniform (0, 1) distribution. If the realization was less than or equal to the MAF then the first allele was assigned, otherwise the alternate allele was assigned. Genotypes for progeny were generated by sampling a single allele from the sire pair, and a second from an unknown dam population with equal allele frequencies.
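The genotype sampling scheme can be sketched as follows. The allele labels, function names, and seed are hypothetical, and the dam population is assumed to have the same allele frequencies as the sires, which is one reading of "equal allele frequencies."

```python
import random


def sample_allele(maf: float, rng: random.Random) -> str:
    """Draw from Uniform(0, 1); a value at or below the MAF gives the
    first (minor) allele 'a', otherwise the alternate allele 'A'."""
    return "a" if rng.random() <= maf else "A"


def simulate_sire(mafs: list, rng: random.Random) -> list:
    """Two independently sampled alleles per locus."""
    return [(sample_allele(m, rng), sample_allele(m, rng)) for m in mafs]


def simulate_progeny(sire: list, mafs: list, rng: random.Random) -> list:
    """One allele sampled from the sire's pair and one from the dam
    population (assumed to share the sire population's frequencies)."""
    genotype = []
    for (a1, a2), maf in zip(sire, mafs):
        paternal = rng.choice((a1, a2))
        maternal = sample_allele(maf, rng)
        genotype.append((paternal, maternal))
    return genotype


rng = random.Random(2007)   # arbitrary seed for reproducibility
mafs = [0.5] * 8            # idealized panel; field MAF values could be substituted
sire = simulate_sire(mafs, rng)
calf = simulate_progeny(sire, mafs, rng)
print(sire)
print(calf)
```

Because the paternal allele is always drawn from the sire's own pair, a simulated calf can never be excluded from its true sire, which matches the no-mismatch rule used in the field analysis.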
Paternity Probabilities. Paternity probabilities were assigned using a likelihood-based approach analogous to the algorithm used by Sire-Match. Genetic evaluations were undertaken using the same procedures described for the field data, except that an additional evaluation could be undertaken using the actual relationships that were used in simulating the data.
Statistical Analysis and Data Sampling. Accuracy of evaluation, traditionally defined as the correlation between true and estimated breeding value, was computed as a Pearson product-moment correlation.
Replicates. Two hundred samplings of sire breeding values and progeny genotypes and phenotypes were created for each scenario (number of markers, MAF, progeny number). For each scenario, paternity probabilities were computed, followed by genetic evaluation and computation of the accuracy of evaluation.
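The replicate structure and the accuracy calculation can be sketched for the simplest scenario, in which the true sire of every calf is known (the "actual relationships" evaluation). This is a simplified illustration, not the evaluation used in the study: it assumes unrelated sires, a phenotypic variance of 1, no fixed effects, and estimates each sire's progeny difference by shrinking the progeny mean, which is the sire-model solution under those assumptions. Names and the seed are hypothetical.

```python
import math
import random

H2 = 0.25
SIGMA_A = math.sqrt(H2)               # additive SD with phenotypic variance 1
SIGMA_E = math.sqrt(1.0 - H2 / 4.0)   # residual SD in a sire model
LAMBDA = (4.0 - H2) / H2              # residual-to-sire variance ratio (15)


def pearson(x: list, y: list) -> float:
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_replicate(n_sires: int, n_progeny: int, rng: random.Random) -> float:
    """Simulate sires and progeny, estimate progeny differences, and
    return the correlation between true and estimated values."""
    true_bv = [rng.gauss(0.0, SIGMA_A) for _ in range(n_sires)]
    estimates = []
    for bv in true_bv:
        # Phenotype = half the sire breeding value + normal deviate
        records = [0.5 * bv + rng.gauss(0.0, SIGMA_E) for _ in range(n_progeny)]
        mean = sum(records) / n_progeny
        # Shrink the progeny mean toward 0 according to the amount of data
        estimates.append(n_progeny / (n_progeny + LAMBDA) * mean)
    return pearson(true_bv, estimates)


def mean_accuracy(n_progeny: int, n_reps: int = 200, n_sires: int = 20,
                  seed: int = 0) -> float:
    """Average accuracy over replicates for one scenario."""
    rng = random.Random(seed)
    total = sum(one_replicate(n_sires, n_progeny, rng) for _ in range(n_reps))
    return total / n_reps


for n in (5, 30):
    print(n, round(mean_accuracy(n), 2))
```

Under these assumptions the expected accuracy with n progeny per sire is sqrt(n / (n + 15)), i.e., 0.5 for 5 progeny and about 0.82 for 30 progeny, which provides an upper reference against which evaluations based on uncertain paternity can be compared.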
| RESULTS AND DISCUSSION |
|---|
The 23-loci STR panel used in this study had higher exclusion power than the panels typically used in commercial genotyping laboratories. The median number of ISAG-recommended STR microsatellites used in cattle genotyping is 12 loci (Baumung et al., 2004), and such panels would have lower PE. The 23-loci STR panel helped ensure sufficient power to determine correct paternity for each calf in the sample, enabling a valid evaluation of the SNP assignment. Figure 2 shows the results of the first simulation study, which examined the number of SNP loci that would be required to achieve a given PE assuming 1 unknown parent, given equal or simulated (based on field study frequencies) MAF at each SNP locus, and independently assorting loci. The observed PE (0.956) in this study with 28 SNP loci was lower than the theoretical maximal exclusion rate with equal MAF at the 28 SNP loci (0.976). This was due to some SNP loci having low MAF in the population examined (Table 1).
The calculation of PE based on allelic frequencies assumes all candidate bulls are unrelated. In the context of a commercial ranch setting where half-sib sires are common, it is important to recognize that as the number or relatedness of putative sires in a multiple-sire breeding group increases, additional numbers of marker loci would be required to maintain single sire assignments at a fixed rate (Pollak, 2005).
A comparison of the sire evaluations for weaning weight derived from pedigree assignments for the 2 panels used in the field study is shown in Table 3. In this analysis, calves with multiple qualifying bulls were fractionally assigned to all qualifying bulls according to the likelihood score computed by Sire-Match. The STR pedigree included 24 sires, 12 paternal grandsires, and 503 calf records, whereas the SNP pedigree included 35 herd sires, 14 paternal grandsires, and 558 calf records. The SNP-based evaluation included more calf records because it retained records from some calves that were excluded from all known sires in the STR panel analysis. In the case of the SNP panel with a PE of 0.956, bulls that actually sired no progeny according to the STR panel were incorrectly assigned fractional probabilities of calves summing to as many as 9 equivalent progeny (the sum of the fractional probabilities for each sire), which resulted in them receiving an erroneous EPD. Although the correlation between the STR- and SNP-based EPD was 0.94 for the 24 bulls that had progeny according to the STR panel, the generation of EPD for bulls that actually sired no progeny would create obvious problems from the perspective of genetic improvement. In some cases, erroneous assignments based on the SNP panel results placed bulls among the top 10 bulls for weaning weight EPD, although in this particular example the BIF accuracy of such EPD was never higher than 0.14, and the number of equivalent progeny was less than 10. Ideally, the resolution of the marker panel would be sufficient to minimize the assignment of equivalent progeny to such sires.
There was large variability in calf output, reflecting variation in mating success, to the extent that a large proportion of young bulls did not sire any offspring. Other studies have reported similar variability in calf output among herd sires, and further found that success is moderately repeatable provided the composition of the bull mating groups remains similar (DeNise, 1999; Holroyd et al., 2002). The unexpectedly large number of young bulls that did not produce any progeny in this particular trial would present an obvious problem for sire evaluation programs based on progeny tests, especially when considering that the years of service subsequent to genetic evaluation have a significant effect on the return on investment for progeny testing (Weaber, 2005). A separate multiple-sire breeding pasture for yearling bulls would be advantageous.
Using DNA testing to generate on-farm EPD for sires in multiple-sire breeding groups represents a promising application of biotechnology. The simulation and field data presented in this study suggest that SNP panels for some commercial applications may need to have a higher PE than can be achieved with the panels of 32 to 37 SNP loci that have been proposed for use in bovine parentage analysis (Heaton et al., 2002; Werner et al., 2004). Variable calf output and MAF, large sire cohorts, relatedness among sires, and missing data can all negatively impact the accuracy of on-farm EPD. In field situations where several of these variables occur concurrently, the use of marker panels with high PE values will be required to obtain accurate EPD.
Single nucleotide polymorphism discovery is ongoing, and already bovine SNP panels for parentage determination that use over 100 loci are commercially available. The use of progeny testing to develop within-herd EPD for herd sires on economically relevant traits has the potential to generate value by improving the response to selection for targeted traits. The return on investment that results from such progeny testing was found to be greatly influenced by the cost of parentage determination (Weaber, 2005). New SNP genotyping platforms continue to drive down the cost to generate SNP genotypes, and the future will undoubtedly see the introduction of inexpensive genotyping assays using high resolution SNP parentage panels. This will improve the accuracy of sire assignments and on-farm genetic evaluations, and may result in progeny testing becoming an economically viable option for commercial ranchers.
This case study illustrated some problems that may be encountered in paternity testing in large commercial herds. Field data are likely to include both missing sires and sires that did not produce any progeny. Low resolution marker panels and large cohorts of potential herd sires are particularly problematic and may result in sire-assignment errors and imprecise genetic evaluations. The frequency of sire misassignment can be minimized by using a high resolution panel or by simple management practices that include dividing large herds into smaller multiple-sire breeding groups with fewer sires while maintaining the same bull:female ratio, genotyping all potential bulls before breeding, sorting bulls into sire groups with divergent genotypes, keeping young bulls in separate breeding groups, and minimizing relatedness among bulls.
| Footnotes |
|---|
2 Corresponding author: alvaneenennaam{at}ucdavis.edu
Received for publication May 18, 2007. Accepted for publication July 26, 2007.
| LITERATURE CITED |
|---|