ANIMAL GENETICS


* Facultad de Agronomía, Universidad de la República, 12900 Montevideo, Uruguay; and
Department of Animal and Dairy Science, University of Georgia, Athens 30602-2771
| Abstract |
|---|
Key Words: beef cattle, calving day, calving success, censored record, model comparison, threshold-linear model
| INTRODUCTION |
|---|
An appropriate model for days to calving or the equivalent calving day (CD) should consider the best approach for handling animals that fail to calve. Johnston and Bunter (1996) generated censored records by adding a penalty to the largest record within a contemporary group. Donoghue et al. (2004a) made random draws from truncated normal distributions to generate a specific censored record for each cow. Forni and Albuquerque (2005) simply ignored the information from noncalving cows.
Urioste et al. (2007) considered CD records as missing and used a threshold-linear (TL) approach (Foulley et al., 1983; Janss and Foulley, 1993), including the binary trait calving success (CS) as a correlated trait. In regular censored models, the censoring is by the same trait (e.g., Hughes, 1999). In 2-trait models, it is assumed that one trait is censored by another correlated trait (Arnason, 1999; Foulley, 2004). The latter is often used in econometrics and is called a type II Tobit model (see Amemiya, 1984, for a general reference). The possible superiority of TL models over multiple trait (MT) linear models with censoring for evaluating fertility traits in beef cattle has not been quantified. Following our previous study on fertility traits in beef cattle, the objective of this study was to compare genetic models for fertility traits, investigating the possible superiority of TL models over the MT censored models.
| MATERIALS AND METHODS |
|---|
Data
The data consisted of 6,763 records on CD and CS from 3,442 spring-calving cows, born between 1975 and 2000 in 19 herds, obtained from the Uruguayan Aberdeen Angus Recording System. The CD was defined as the number of days from the beginning of a herd's calving season to a cow's calving date. The CS was defined as a binary trait: females that calved were coded as 1, and cows without a recorded calving in a specific year, but appearing in subsequent year(s), were assigned a 0 (failure) in the corresponding year(s) between 2 identified calvings. The data included cows with a clearly identified first calving at the age of 2 or 3 yr, and 2 subsequent calvings. Cows with first calving record at 4 yr of age were assumed to have failed at age 3 and were added, provided they had a calving interval of at least 10 mo with respect to the last recorded date in the same herd in the previous year. The pedigree file had 7,748 animals, with 72% of them having both sire and dam information, and a further 10% having an identified sire.
Editing procedures are fully described in Urioste et al. (2007). Briefly, calving records were removed if they came from cows with a missing birth date or born in the fall, with an age less than 600 d at calving, used as embryo transfer donors or recipients, with missing sire, or with calving interval less than 280 d, or if they came from herds with too few records. Data from year 2004 (last year of recording) were eliminated because no inferences could be made about the presence or absence of cows in the herd. Records from contemporary groups with fewer than 5 records, or that included fewer than 2 records from cows that actually calved, were also deleted.
Statistical Models
Three approaches were defined to handle CD observations on animals that failed to calve, and specific data sets were created: i) cows were assigned a penalized CD value corresponding to the last observed CD plus a fixed penalty of 21 d within contemporary group (PEN); ii) estimation of censored records was simultaneously obtained from a truncated normal distribution, using data for animals in the same contemporary group (CEN); and iii) records were regarded as missing, and parameters were estimated in a TL analysis including the observed binary traits CS in different parities (TLMISS).
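The first two record-handling rules can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names, the use of `None` for a noncalving cow, and the choice of the truncation point passed as `lower_bound` are assumptions made for illustration only. Only the 21-d penalty and the use of a truncated normal distribution come from the text.

```python
import random
from statistics import NormalDist

PENALTY_DAYS = 21  # fixed penalty stated in the text


def penalized_cd(group_cd):
    """PEN sketch: a noncalving cow (None) receives the largest observed CD
    in her contemporary group plus the fixed penalty."""
    observed = [cd for cd in group_cd if cd is not None]
    if not observed:
        raise ValueError("contemporary group has no observed calving day")
    penalty_value = max(observed) + PENALTY_DAYS
    return [penalty_value if cd is None else cd for cd in group_cd]


def draw_censored_cd(mean, sd, lower_bound, rng=random):
    """CEN-style sketch: one draw from a normal(mean, sd) distribution
    truncated below at lower_bound, using the inverse-CDF method.
    mean, sd, and lower_bound are hypothetical inputs."""
    dist = NormalDist(mean, sd)
    p_low = dist.cdf(lower_bound)
    u = p_low + (1.0 - p_low) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf strictly inside (0, 1)
    return max(dist.inv_cdf(u), lower_bound)


# One contemporary group with a single noncalving cow.
print(penalized_cd([10, None, 35, 20]))  # [10, 56, 35, 20]
```

In the CEN analysis the censored values were estimated jointly with the other unknowns, so a draw such as `draw_censored_cd` would be repeated within the sampling scheme rather than performed once.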
Trivariate statistical models were fitted for the PEN and CEN data sets, treating calving day at first, second, and third calving opportunity (CD1, CD2, and CD3, respectively) as separate traits. The general model, in matrix notation, can be written as
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{a} + \mathbf{e} \qquad [1]$$
where y is a vector of observed and predicted CD; β is a vector of systematic effects; a is a vector of animal additive genetic effects; e is the vector of residual effects; and X and Z are the corresponding incidence matrices. The β vector included the contemporary group effect (herd × year × mating management group) in each of the first 3 calving opportunities (173, 135, and 100 levels, respectively) and the effects of age at calving and physiological status at mating (lactating or nonlactating cow). Age at calving had 3 levels within calving opportunity (within ±1 SD of the mean, more than 1 SD below the mean, or more than 1 SD above the mean). Residuals were assumed to be correlated and to follow a normal distribution. Animal additive genetic effects followed the multivariate normal distribution, a ~ N(0, G0 ⊗ A), where G0 is the (co)variance matrix between animal effects, A is the matrix of known additive relationships between animals, and ⊗ denotes the Kronecker product.
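To make the covariance structure G0 ⊗ A concrete, the following sketch builds the Kronecker product for a toy case. The 2-trait G0 and the 2-animal A (a parent and its offspring) are hypothetical values chosen for illustration; the study itself used 3 traits and 7,748 animals.

```python
def kronecker(g0, a):
    """Return the Kronecker product g0 (x) a of two matrices given as
    lists of rows. Effects are ordered trait by trait, animals within trait."""
    rows_a, cols_a = len(a), len(a[0])
    out = []
    for g_row in g0:
        for i in range(rows_a):
            out.append([g * a[i][j] for g in g_row for j in range(cols_a)])
    return out


# Hypothetical genetic (co)variances for 2 traits.
G0 = [[4.0, 1.0],
      [1.0, 9.0]]
# Additive relationships between a non-inbred parent and its offspring.
A = [[1.0, 0.5],
     [0.5, 1.0]]

V = kronecker(G0, A)
for row in V:
    print(row)
```

Each 2 × 2 block of `V` is one element of `G0` multiplied by `A`, so the covariance between trait 1 in the parent and trait 2 in the offspring is 1.0 × 0.5 = 0.5.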
For the TLMISS data set, a 6-trait model was adopted, adding the binary traits of calving success at the first 3 calving opportunities (CS1, CS2, and CS3, respectively) to the above-mentioned linear CD traits (CD1, CD2, and CD3). The model is
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{W}\mathbf{c} + \mathbf{Z}\mathbf{a} + \mathbf{e} \qquad [2]$$
with definitions as in Eq. [1], but y now includes observed and missing CD records and liabilities of CS; β is a vector with corresponding levels of age at calving and physiological status at mating; c is a vector of uncorrelated contemporary group effects, with W denoting its incidence matrix; and residuals are also uncorrelated.
Genetic parameters used in the respective models were those obtained by Urioste et al. (2007). Breeding values were drawn from the posterior distributions using Gibbs sampling, as implemented in the programs TM, kindly provided by A. Legarra, INRA-Castanet Tolosan, France, and Thrgibbs2f90 kindly provided by S. Tsuruta, University of Georgia, Athens. Based on the preGibbs diagnosis used (program Postgibbsf90 by S. Tsuruta) and on visual inspection of trace plots, a chain of 20,000 iterations was run for all models, with a burn-in of 4,000 rounds, keeping a sample every 50th iteration for inference of posterior features.
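The chain settings imply the number of samples available for posterior inference. The short sketch below works this out; the function name and the convention that the first retained sample is the 50th iteration after burn-in are assumptions, and the actual bookkeeping inside TM or Thrgibbs2f90 may differ.

```python
def retained_iterations(total, burn_in, thin):
    """1-based iteration numbers kept after discarding the burn-in
    and keeping every `thin`-th iteration of the remainder."""
    return [it for it in range(burn_in + 1, total + 1)
            if (it - burn_in) % thin == 0]


kept = retained_iterations(total=20_000, burn_in=4_000, thin=50)
print(len(kept))  # 320 samples per chain for posterior summaries
```

Under these settings, posterior means of breeding values would be averages over 320 thinned samples.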
Comparison of Models
The alternative models were compared in 3 ways. First, the models were contrasted using a simple data splitting technique and Pearson product moment correlations between predicted breeding values (PBV) for a pair of subsamples. Second, rank correlations between PBV obtained with the complete record data set were calculated to compare differences in the ranking of animals by models. Third, the percentage of sires selected in common using the different approaches was inspected at 2 hypothetical percentages of animals selected.
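The second comparison can be sketched as follows. The text does not name the rank correlation coefficient used, so Spearman's coefficient is assumed here, computed with the shortcut formula that is valid when there are no tied values (a reasonable assumption for continuous PBV). The function names and the example values are hypothetical.

```python
def ranks(values):
    """Rank 1 = smallest value. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_no_ties(x, y):
    """Spearman rank correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)),
    where d is the difference in rank of the same animal under two models."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical PBV of the same 4 animals under two models.
pbv_model_1 = [0.3, -1.2, 2.0, 0.9]
pbv_model_2 = [0.1, -0.5, 1.1, 1.5]
rho = spearman_no_ties(pbv_model_1, pbv_model_2)
```

A coefficient of 1 means both models rank the animals identically; lower values indicate reranking between models.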
Data splitting or cross-validation is a method of model selection according to the predictive ability of a class of models (Shao, 1993). The application usually involves omitting a portion of the available data, fitting a prediction model to the remainder of the data (training set), and then testing the model fit on the omitted portion (prediction set). In that way, one obtains an estimate of the deterioration in quality (McCarthy, 1976). If data are split randomly, the second set imitates a sample of future observations (Picard and Cook, 1984).
The splitting technique used in this study involved a random partitioning of the complete data set into 2 subsets, with approximately one-half of the records in each subset. Solutions for each of the models were obtained for both subsets, and the correlation between predicted breeding values from the 2 subsets was calculated. For each model, the estimated correlation coefficients between the 2 samples provide an informative comparative assessment of model prediction performance, useful for ranking several candidate models. Greater correlation estimates between complementary subsets implied a greater stability of the model for predicting breeding value solutions in deleted records of animals.
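The splitting procedure can be outlined in a few lines of Python. This is a sketch under stated assumptions: records are identified by hypothetical integer ids, the genetic evaluation of each subset is done elsewhere (in the study, by the Gibbs sampling programs), and PBV are supplied as dictionaries keyed by animal id. None of these names come from the original analysis.

```python
import random


def split_records(record_ids, seed=0):
    """Randomly partition record ids into 2 subsets of about equal size."""
    rng = random.Random(seed)
    shuffled = list(record_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def pearson(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def subset_correlation(pbv_a, pbv_b):
    """Correlate PBV of animals that have a solution in both subsets."""
    common = sorted(set(pbv_a) & set(pbv_b))
    return pearson([pbv_a[k] for k in common], [pbv_b[k] for k in common])


subset_1, subset_2 = split_records(range(6763), seed=1)
print(len(subset_1), len(subset_2))  # 3381 3382
```

Because both subsets are evaluated with the full pedigree, most animals receive a PBV from each half, and `subset_correlation` would then be computed once per model and trait.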
| RESULTS AND DISCUSSION |
|---|
Correlations between subsamples for CS can only be reported for the TLMISS approach. They were somewhat lower than for the CD estimations presented in Table 1, with values of 0.65, 0.62, and 0.52 for first, second, and third calving opportunities, respectively.
The results for CD traits are in accordance with our preliminary observations in a previous study (Urioste et al., 2007), where TLMISS provided a clearer and more informative picture of relationships between CD and CS, with more consistent estimation of genetic (co)variation between the 2 traits and good agreement with results from other studies (Johnston and Bunter, 1996; Donoghue et al., 2004b). A continuous measure of fertility, such as CD, allows for the identification of animals that conceive early in the breeding season but has the disadvantage that it is not observed in noncalving cows. On the other hand, the binary trait CS is readily observable and presents high genetic correlations with CD.
Arnason (1999) developed methodology for this type of data structure and applied it to genetic evaluation of Swedish Standardbred trotters for racing performance (a linear trait) and racing status (a binary trait, starting and nonstarting horses, where the latter do not produce a record). He also conducted a simulation showing that including an all-or-none variable as a correlated trait resulted in greater correlations between true and estimated breeding values, compared with a single-trait AM-BLUP model based on observed performance records. The methodology was further used by Thuneberg-Selonen et al. (2001) for racing performance and racing status in Finnish horses. For computational convenience, they treated the categorical trait as linear.
Correlations of PBV for CD from the 3 models from the first sample with PBV for CS using TLMISS from the second sample, and vice versa, are presented in Table 2 (average value of 2 comparisons). The TLMISS approach is provided as a reference because breeding values derived using this model implicitly considered the genetic correlations estimated by Urioste et al. (2007) and, therefore, should provide the greatest correlations between PBV for CD and CS. The performance of CEN was generally better than that of PEN in providing PBV for CD that were more highly correlated with CS. The greatest differences between the censored models and TLMISS were found in the third calving opportunity, where the number of observations was the lowest, suggesting that TLMISS may have reduced the error in estimation of CD through the use of the genetic relationship with CS.
Table 4 displays the number of bulls that were selected in common, for scenarios with 10 or 25% of the bulls selected for breeding, when the aim was to improve CD or CS. Although the TLMISS model does not provide true breeding values, it was adopted as a reference model, given the results presented in Tables 1 to 3. Consequently, the other models were compared with it to determine their ability to identify the same top-ranked sires.
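The comparison of selected sires against a reference model can be sketched as below. The function names, the rounding rule for the number of sires selected, and the example PBV are assumptions for illustration; the text only states that 10 or 25% of bulls were selected and that TLMISS served as the reference.

```python
import math


def top_fraction(pbv, fraction, higher_is_better=True):
    """Ids of the best `fraction` of sires by PBV (at least 1 sire).
    For CD, where fewer days are desirable, pass higher_is_better=False."""
    n_selected = max(1, math.ceil(fraction * len(pbv)))
    ranked = sorted(pbv, key=pbv.get, reverse=higher_is_better)
    return set(ranked[:n_selected])


def percent_in_common(pbv_ref, pbv_alt, fraction, higher_is_better=True):
    """Percentage of sires selected under the reference model that are
    also selected under the alternative model."""
    ref = top_fraction(pbv_ref, fraction, higher_is_better)
    alt = top_fraction(pbv_alt, fraction, higher_is_better)
    return 100.0 * len(ref & alt) / len(ref)


# Hypothetical PBV for CS of 4 sires under a reference and an alternative model.
pbv_reference = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0}
pbv_alternative = {"a": 5.0, "b": 1.0, "c": 4.0, "d": 2.0}
print(percent_in_common(pbv_reference, pbv_alternative, 0.5))  # 50.0
```

With real data, the function would be called with `fraction=0.10` and `fraction=0.25` for each alternative model and each trait.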
On the whole, TLMISS seemed to be more accurate than CEN or PEN as a model for genetic evaluation of fertility in beef cattle. This suggests that the assumption of CD being fully censored by the length of the breeding season is less accurate than the assumption of CD being censored by many factors, where the length of the current breeding season is just one of them. For example, in the study by Urioste et al. (2007), CD at the second opportunity was strongly influenced by CD in the first opportunity.
Improved information is especially important in countries where data are scarce and costly. For small national data sets, efficient use of information is vital to genetic evaluation. This perspective becomes even more important in countries without well-established fertility recording schemes. Models that take these circumstances into account are of great value. The evidence produced by this study suggests the need to include records under the TLMISS assumptions to estimate genetic differences in fertility for sires more accurately. Donoghue et al. (2004b) used a TL approach to combine measures of the trait calving to first insemination obtained from both natural and artificial matings. Ponzoni (1992), in an article reviewing calving rate (equivalent to CS in the current study) and CD, stated that calving rate could provide greater genetic gains in reproduction, given the genetic and economic assumptions in his study. He concluded, however, that from a genetic point of view, the difference between using one or the other trait would be small compared with the effect of ignoring reproduction altogether.
The PEN and CEN models are useful alternatives for genetic evaluations; however, application of the TLMISS model highlights the benefits of incorporating all records collected, either on CS scores or on both traits simultaneously, to improve accuracy of evaluation and to adjust for potential bias that may occur as a result of missing CD records. Use of the TL animal model for genetic evaluation is attractive because it gave more consistent predictions of breeding values across different data structures, and because missing records do not need to be predicted. If the genetic relationships are sufficiently complete, as in our case, it can be useful for practical genetic evaluations of fertility. In practice, it may be possible to use a multitrait approach for CD and CS, but report only 1 genetic value that could be used for selection purposes. Other approaches could include use of the simpler and more parsimonious repeatability models.
For a measure to be useful in a national genetic evaluation scheme, it must be heritable and cost effective to measure and record. The potential of using a TL model approach for fertility traits has been established. Calving success allows the identification of animals that have missing CD records and genetic ties are used to adjust for potential bias. The genetic correlations reported in an earlier study indicate a strong, favorable relationship between the 2 traits. Selecting for reduced CD will lead to correlated increases in calving success.
| Footnotes |
|---|
2 Corresponding author: jurioste{at}fagro.edu.uy
Received for publication August 10, 2007. Accepted for publication July 17, 2007.