J. Anim. Sci. 2005. 83:62-67
© 2005 American Society of Animal Science
Studies on multiple trait and random regression models for genetic evaluation of beef cattle for growth
J. Bohmanova1,
I. Misztal and
J. K. Bertrand
Department of Animal and Dairy Science, University of Georgia, Athens 30602-2771
Abstract
A simulation study examined issues important for genetic evaluation of growth in beef cattle by random regression models with cubic Legendre polynomials (RRML) and linear splines with three knots (RRMS) compared with multiple-trait models (MTM). Parameters for RRML were obtained by conversion from covariance functions. Parameters for MTM and RRMS were extracted from RRML at 1, 205, and 365 d; parameters for RRMS were the same as MTM for all effects except the permanent environment and the residual. Four data sets were generated assuming RRML included records at 1, 205, and 365 d; at 1, 160 to 250, and 320 to 410 d; at 1, 100, 205, 300, and 365 d; and at 1, 55 to 145, 160 to 250, 275 to 325, and 320 to 410 d. Accuracies were computed as correlations between the true (simulated) and predicted breeding values. With the first data set, excellent agreement in accuracy was obtained for all models. With the second data set, the accuracy of MTM dropped by up to 1.5% compared with the first data set, but accuracy was unchanged for both RRML and RRMS. With the third (fourth) data set, accuracies of RRML were up to 2.4% (2.5%) higher than with the first (second) data set. Small differences in accuracy between RRML and RRMS were found with the third and fourth data sets, which were traced to inflated correlations especially between 1 and 205 d in RRMS; inflation could be decreased by adding one extra knot at 100 d to RRMS. Diagonalization of random coefficients was crucial for RRML but not for RRMS, resulting in approximately six (two) times faster convergence with RRML (RRMS). Reduction of dimensionality in RRML associated with small eigenvalues caused a less accurate evaluation for birth weight. Genetic evaluation of growth by RRM requires careful implementation. The RRMS is simpler to implement than the RRML.
Key Words: Growth, Random Regression Model, Splines
Introduction
One alternative to genetic evaluation for growth by a multiple-trait model (MTM) is a random regression model (RRM). In a study by Meyer (2004), RRM was up to 5.9% more accurate than MTM. The gain was a result of more appropriate modeling of genetic parameters, avoidance of age preadjustment, and utilization of a larger amount of data.
In practice, the superiority of RRM over MTM depends on the quality of implementation. Nobre et al. (2003a) compared MTM and RRM using Legendre polynomials (RRML) for genetic evaluations of Nellore cattle. The RRML had poor numerical properties until reparameterization by diagonalization. After diagonalization, evaluations with RRML were still less accurate than with MTM because the parameters previously estimated with RRML contained many artifacts and were inaccurate.
Parameters estimated with RRML are prone to contain artifacts due to data distribution (Misztal et al., 2000; Nobre et al., 2003b), especially when direct-maternal covariance is estimated. To avoid such problems, Legarra et al. (2004) constructed "artifact free" parameters for RRML using literature estimates for MTM, RRM, and heuristics.
One alternative to RRML is RRM with splines (RRMS). White et al. (1999) used natural cubic splines to model lactation curves. The RRMS have good flexibility, are smooth, and have limited sensitivity to the data (Druet et al., 2003). The RRMS using linear splines that have knots at standard ages of MTM can use parameters of MTM directly for almost all effects, greatly simplifying the model; however, it is not clear whether such RRMS match the accuracy of RRML.
The main objective of this study was to examine issues important for an implementation of RRM in a genetic evaluation of beef cattle for growth by simulation. The secondary objective was to compare RRM based on Legendre polynomials and linear splines.
Materials and Methods
Simulation
Three nonoverlapping generations of animals were simulated. The base population consisted of 8,400 dams and 600 sires. The total number of animals in the pedigree file was 38,400, of which 29,400 had records. In each generation, males and females were mated at random. The simulated pedigree and data did not aim to reflect the beef population accurately, but were designed to facilitate comparisons and testing.
Parameters used in simulation were those constructed by Legarra et al. (2004). In that study, existing estimates from MTM and RRM were combined, corrected for smoothness, converted to multiple-trait parameters with traits defined every 30 d, and again converted to parameters of RRM with cubic Legendre polynomials.
Data were simulated assuming cubic regression on Legendre polynomials of age for direct genetic, direct permanent environmental, maternal genetic, and maternal permanent environmental effects. The average growth curve was modeled by a fixed linear regression on days of age nested within contemporary group. Residual variance was fitted with a linear spline on days of age. Use of RRML rather than MTM enabled data simulation for records at any age.
Four data sets were simulated. The first data set (3EXACT) consisted of three records per animal at exactly 1, 205, and 365 d of age. With this data set, a properly designed RRM should be in perfect agreement with MTM. The second data set (3SPREAD) also contained three records per animal, located at 1 d and within ±45 d of 205 and 365 d of age (i.e., at 160 to 250 d and 320 to 410 d). The distribution of the spread in this and later cases was uniform. With the second data set, RRM would be expected to maintain accuracy because it accounts for changes in variances with age, whereas accuracy of MTM would be expected to be lower. The third data set (5EXACT) was formed by adding records at 100 and 300 d of age to the first data set. The fourth data set (5SPREAD) was created by adding two extra records to the second data set, one within ±45 d of 100 d (55 to 145 d) and one within ±25 d of 300 d (275 to 325 d). With extra records, RRM was expected to be more accurate than MTM, and its accuracy was expected to be similar for the third and fourth data sets.
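The record-age designs of the four data sets can be sketched in code. The snippet below is only an illustration: the `DESIGNS` table and the `sample_ages` helper are hypothetical names, ages are assumed to be whole days drawn uniformly within the ranges given in the abstract, and the random seed is arbitrary.

```python
import numpy as np

# Hypothetical design table: (centre age in days, half-width of the uniform spread)
# for every record of an animal. Ranges follow the abstract (e.g. 205 +/- 45 = 160..250).
DESIGNS = {
    "3EXACT": [(1, 0), (205, 0), (365, 0)],
    "3SPREAD": [(1, 0), (205, 45), (365, 45)],
    "5EXACT": [(1, 0), (100, 0), (205, 0), (300, 0), (365, 0)],
    "5SPREAD": [(1, 0), (100, 45), (205, 45), (300, 25), (365, 45)],
}


def sample_ages(design, n_animals, rng):
    """Return an (n_animals, n_records) integer array of ages at recording."""
    columns = []
    for centre, half_width in DESIGNS[design]:
        # endpoint=True makes the upper bound inclusive; whole days are an assumption.
        columns.append(
            rng.integers(centre - half_width, centre + half_width,
                         size=n_animals, endpoint=True)
        )
    return np.column_stack(columns)


rng = np.random.default_rng(0)       # arbitrary seed
ages = sample_ages("5SPREAD", 29400, rng)  # 29,400 animals had records
```

Each row of `ages` holds the ages of one animal; the phenotypes themselves would then be generated from the RRML at those ages, which this sketch does not attempt.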
Models
Multiple-Trait Model.
The MTM with traits defined at 1, 205, and 365 d of age was as follows:

$$y_{ijkt} = CG_{it} + b_t\,\mathit{df}_t + d_{jt} + m_{kt} + mp_{kt} + e_{ijkt}$$

where $CG_{it}$ is a fixed effect of contemporary group i for trait t; $b_t$ is a covariable for a preadjustment to constant age; $\mathit{df}_t$ is a deviation from constant age; $d_{jt}$ is a direct genetic effect of animal j for trait t; and $m_{kt}$ and $mp_{kt}$ are the maternal genetic and maternal permanent environmental effects for dam k, respectively. Random measurement error is denoted by $e_{ijkt}$. The variances and covariances were:

$$\operatorname{var}\begin{bmatrix} d \\ m \end{bmatrix} = \begin{bmatrix} G_d & G_{dm} \\ G_{dm}' & G_m \end{bmatrix} \otimes A, \qquad \operatorname{var}(mp) = G_{mp} \otimes I_l, \qquad \operatorname{var}(e) = R \otimes I_n$$

where $G_d$ and $G_m$ are covariance matrices of direct and maternal genetic effects; $G_{dm}$ is a matrix of genetic covariances between direct and maternal effects; A is an additive genetic relationship matrix; $G_{mp}$ is a covariance matrix of maternal permanent environmental effects; R is a covariance matrix of residual effects; $I_l$ and $I_n$ are identity matrices of size l and n; and l denotes the number of dams and n the number of animals with records. The (co)variance parameters were transformed from the RRM to be equivalent at standard weights.
Random Regression Model.
Two RRM were fitted: RRML and RRMS. In the RRMS, the spline function used knots located at 1, 205, and 365 d. The location of knots was chosen to correspond to traits defined in the MTM.
The RRM were defined as follows:

$$y_{ijkt} = \sum_{m=0}^{1} CG_{mi}\,a_t^{m} + \sum_{l}\phi_l(a_t)\,d_{jl} + \sum_{l}\phi_l(a_t)\,p_{jl} + \sum_{l}\phi_l(a_t)\,m_{kl} + \sum_{l}\phi_l(a_t)\,mp_{kl} + e_{ijkt}$$

where $y_{ijkt}$ is the tth observation of animal j of dam k; $CG_{mi}$ is the mth fixed regression coefficient of contemporary group i; $d_{jl}$ and $p_{jl}$ are the lth random regression coefficients of direct genetic and permanent environmental effects of animal j; $m_{kl}$ and $mp_{kl}$ are the lth random regression coefficients of maternal genetic and permanent environmental effects of dam k; and $e_{ijkt}$ is the random measurement error. The term $\phi_l(a_t)$ represents either the lth value of the Legendre polynomial at age $a_t$ (RRML) or the spline coefficient of the (l + 1)th knot at age $a_t$ (RRMS), so that l runs over four covariables in RRML and three in RRMS. The variances and covariances were as follows:

$$\operatorname{var}\begin{bmatrix} d \\ m \end{bmatrix} = \begin{bmatrix} K_d & K_{dm} \\ K_{dm}' & K_m \end{bmatrix} \otimes A, \qquad \operatorname{var}(p) = K_p \otimes I_n, \qquad \operatorname{var}(mp) = K_{mp} \otimes I_l$$

where $K_d$ and $K_m$ are covariance matrices of random regression coefficients for direct and maternal genetic effects, respectively; $K_{dm}$ is a matrix of genetic covariances between direct and maternal regression coefficients; A is an additive genetic relationship matrix; $K_p$ is a covariance matrix of random regression coefficients for the permanent environmental effect; $K_{mp}$ is a covariance matrix of random regression coefficients for the maternal permanent environmental effect; $I_n$ and $I_l$ are identity matrices of size n and l; and n denotes the number of animals with records and l denotes the number of dams. A diagonal matrix of residual variances ($K_r$) was modeled by linear splines. Although the structure of the variances is the same for RRML and RRMS, the values are different.
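For RRML, the covariables are Legendre polynomials evaluated at age standardized to the interval from -1 to 1. The following is a minimal sketch of how such covariables could be computed; the function name `legendre_covariates`, the age range of 1 to 410 d, and the use of normalized polynomials (scaled by the square root of (2l + 1)/2) are assumptions, because the text does not state the standardization that was used.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_covariates(age, a_min=1.0, a_max=410.0, order=3, normalized=True):
    """Return an (n_ages, order + 1) matrix of Legendre covariables.

    a_min and a_max (assumed here) define the mapping of age onto [-1, 1].
    """
    age = np.atleast_1d(np.asarray(age, dtype=float))
    x = 2.0 * (age - a_min) / (a_max - a_min) - 1.0
    # legvander returns the columns P_0(x), P_1(x), ..., P_order(x).
    phi = legendre.legvander(x, order)
    if normalized:
        # Normalization constants sqrt((2l + 1) / 2); an assumed convention.
        phi = phi * np.sqrt((2.0 * np.arange(order + 1) + 1.0) / 2.0)
    return phi


# Covariables at the three standard ages (one row per age, four columns).
phi_standard = legendre_covariates([1, 205, 365])
```

With a coefficient covariance matrix K, the (co)variance between two ages is obtained as the product of the covariable row for the first age, K, and the transposed covariable row for the second age.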
Linear Splines
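A vector of spline coefficients $\phi(a_t)$ at age $a_t$ for knots $q_1$, $q_2$, and $q_3$ can be defined as:

$$\phi(a_t) = \begin{cases} \left[\dfrac{q_2 - a_t}{q_2 - q_1},\ \dfrac{a_t - q_1}{q_2 - q_1},\ 0\right]' & \text{if } q_1 \le a_t \le q_2 \\[2ex] \left[0,\ \dfrac{q_3 - a_t}{q_3 - q_2},\ \dfrac{a_t - q_2}{q_3 - q_2}\right]' & \text{if } q_2 < a_t \le q_3 \end{cases}$$

The choice of linear splines was due to two factors. First, each spline coefficient has localized effects (Wold, 1974; Green and Silverman, 1994) and thus would result in fewer artifacts than Legendre polynomials. Second, parameters for models with linear splines are very easy to derive from parameters of MTM. This can be illustrated by listing the RRMS for the direct effect only:

$$y_{jt} = \ldots + \phi_0(a_t)\,d_{j0} + \phi_1(a_t)\,d_{j1} + \phi_2(a_t)\,d_{j2} + \ldots$$

The spline coefficients and the model for specific weights corresponding to standard weights in MTM are:

Birth weight: $\phi(1) = [1\ \ 0\ \ 0]'$, so $y_{j,1} = \ldots + d_{j0} + \ldots$

Weaning weight: $\phi(205) = [0\ \ 1\ \ 0]'$, so $y_{j,205} = \ldots + d_{j1} + \ldots$

Yearling weight: $\phi(365) = [0\ \ 0\ \ 1]'$, so $y_{j,365} = \ldots + d_{j2} + \ldots$

Thus, the direct effects in RRMS for standard weights are the same as in MTM, and subsequently, the variances are identical. Generalizing, when the knots in RRMS correspond to traits in MTM, the variances of the corresponding effects, except the residual, are the same. However, the residual effect in MTM is split in RRMS into the permanent environmental effect plus the residual effect. In this paper, these variances in RRMS were converted from RRML parameters for agreement at standard points.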
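The relation between knots, spline coefficients, and variances described above can be checked numerically. The sketch below assumes the usual linear-interpolation form of a linear spline; the helper names, the clipping of ages outside the knot range to the nearest outer knot, and the covariance matrix `G_direct` are hypothetical choices made for illustration only and are not taken from the study.

```python
import numpy as np


def linear_spline_coefficients(age, knots=(1.0, 205.0, 365.0)):
    """Coefficients of a linear spline at one age: two adjacent nonzero values."""
    knots = np.asarray(knots, dtype=float)
    # Assumption: ages outside the knot range are clipped to the nearest outer knot.
    a = float(np.clip(age, knots[0], knots[-1]))
    # Index of the left knot of the segment that contains the age.
    i = min(int(np.searchsorted(knots, a, side="right")) - 1, len(knots) - 2)
    w = (a - knots[i]) / (knots[i + 1] - knots[i])
    phi = np.zeros(len(knots))
    phi[i] = 1.0 - w
    phi[i + 1] = w
    return phi


def spline_variance(age, G, knots=(1.0, 205.0, 365.0)):
    """Variance implied at a given age: phi' G phi."""
    phi = linear_spline_coefficients(age, knots)
    return float(phi @ G @ phi)


# Hypothetical direct genetic (co)variances at 1, 205, and 365 d (kg^2).
G_direct = np.array([[8.0, 20.0, 25.0],
                     [20.0, 400.0, 450.0],
                     [25.0, 450.0, 900.0]])

var_at_knots = [spline_variance(a, G_direct) for a in (1, 205, 365)]
var_midway = spline_variance(103, G_direct)  # halfway between the first two knots
```

At the knots the coefficient vector is a unit vector, so `var_at_knots` reproduces the diagonal of `G_direct`, which is the sense in which MTM parameters can be reused directly. Between knots the implied variance depends on the covariance between the adjacent knots.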
Diagonalization of RRM
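Reparameterization of regression coefficients that results in diagonal (co)variances for random effects has been proposed by van der Werf et al. (1998). It was found essential for decreasing computational costs of RRM when Legendre polynomials were used (Lidauer and Mantysaari, 1999; Nobre et al., 2003a). However, in the case of the maternal model, the simultaneous diagonalization of both direct ($G_d$) and maternal ($G_m$) variances is impossible. Let the model for animal i with dam j be:

$$y_{ij} = \ldots + \phi' d_i + \phi' m_j + \ldots$$

where $\phi$ is the vector of Legendre or spline coefficients, and d and m are vectors of the regression coefficients for direct and maternal effects, respectively.

The variances and covariances are as follows:

$$\operatorname{var}\begin{bmatrix} d_i \\ m_i \end{bmatrix} = \begin{bmatrix} G_d & G_{dm} \\ G_{md} & G_m \end{bmatrix}$$

Decompose $G_d$ and $G_m$ into matrices with eigenvalues (D) and eigenvectors (V):

$$G_d = V_d D_d V_d', \qquad G_m = V_m D_m V_m'$$

The model can be transformed to:

$$y_{ij} = \ldots + \phi_d^{*\prime} d_i^{*} + \phi_m^{*\prime} m_j^{*} + \ldots$$

where $d_i^{*} = V_d' d_i$, $\phi_d^{*} = V_d' \phi$, $m_j^{*} = V_m' m_j$, and $\phi_m^{*} = V_m' \phi$.

The (co)variance matrix of direct and maternal effects for an animal on the transformed scale is:

$$\operatorname{var}\begin{bmatrix} d_i^{*} \\ m_i^{*} \end{bmatrix} = \begin{bmatrix} D_d & V_d' G_{dm} V_m \\ V_m' G_{md} V_d & D_m \end{bmatrix}$$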
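A small numerical sketch of this reparameterization follows. The three covariance matrices are hypothetical values chosen only to make the example runnable, and the variable names are not from the study. The code diagonalizes the direct and maternal blocks separately and shows that the direct-maternal block is carried along as a rotated, generally nondiagonal, matrix.

```python
import numpy as np

# Hypothetical 3 x 3 covariance matrices of regression coefficients (kg^2);
# illustrative values only, not parameters from the study.
G_d = np.array([[8.0, 20.0, 25.0], [20.0, 400.0, 450.0], [25.0, 450.0, 900.0]])
G_m = np.array([[4.0, 8.0, 6.0], [8.0, 150.0, 120.0], [6.0, 120.0, 200.0]])
G_dm = np.array([[-1.0, -3.0, -2.0], [-3.0, -40.0, -30.0], [-2.0, -30.0, -50.0]])

# eigh returns eigenvalues and orthonormal eigenvectors (as columns)
# of a symmetric matrix.
D_d, V_d = np.linalg.eigh(G_d)
D_m, V_m = np.linalg.eigh(G_m)

# Full (co)variance matrix of [d; m] and the block-diagonal transformation.
G = np.block([[G_d, G_dm], [G_dm.T, G_m]])
T = np.block([[V_d, np.zeros((3, 3))], [np.zeros((3, 3)), V_m]])
G_star = T.T @ G @ T  # (co)variances on the transformed scale

# The covariables are rotated with the same eigenvectors, so the variance
# implied at any age is unchanged by the transformation.
phi = np.array([0.5, 0.5, 0.0])       # example covariable vector
phi_d_star = V_d.T @ phi
var_original = phi @ G_d @ phi
var_transformed = phi_d_star @ (D_d * phi_d_star)
```

The diagonal blocks of `G_star` equal the eigenvalue matrices, whereas the off-diagonal block stays nonzero, which illustrates why direct and maternal variances cannot be diagonalized simultaneously in this way.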
Results and Discussion
Table 1 presents accuracies of breeding values computed as the correlation between the true (simulated) and predicted breeding values. The accuracies for all three methods using 3EXACT were the same. The accuracies with 3SPREAD were the same (at least to one decimal place) for RRML and RRMS but lower for MTM, as expected. With 5EXACT and 5SPREAD, the accuracies of RRML and RRMS increased compared with 3EXACT and 3SPREAD, also as expected. However, the increase in RRMS was slightly smaller than in RRML, indicating differences between these methods.
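The accuracy measure used in Table 1 is a plain correlation expressed in percent. A minimal sketch, with simulated vectors standing in for true and predicted breeding values (the helper name, the sample size, and the noise level are arbitrary assumptions):

```python
import numpy as np


def accuracy_percent(true_bv, predicted_bv):
    """Accuracy as 100 x Pearson correlation of true and predicted breeding values."""
    return 100.0 * np.corrcoef(true_bv, predicted_bv)[0, 1]


rng = np.random.default_rng(0)                       # arbitrary seed
true_bv = rng.normal(size=1000)                      # stand-in for simulated breeding values
predicted_bv = true_bv + rng.normal(size=1000)       # stand-in for predictions with error
accuracy = accuracy_percent(true_bv, predicted_bv)
```

Because a correlation is invariant to location and scale, this measure compares the ranking and relative spacing of animals rather than the scale of the predictions.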
Table 1. Accuracies (%) of breeding values in a multiple-trait model (MTM), a random regression model with Legendre polynomials (RRML), and a random regression model with splines (RRMS)
(Co)variances of RRML and RRMS are equivalent at standard ages but are different at other ages. Figure 1 shows the direct variance as a function of age for MTM, RRML, and RRMS. The variance for RRMS is concave between the knots; the concavity increases with a decrease of genetic correlation between the adjacent knots. Figure 2 shows genetic correlations for the direct effect between birth weight and weight at other ages. The correlations with RRMS are inflated, especially around 100 d. The inflated correlation resulted in too large a contribution of records, especially those around 100 d, to the prediction of birth weight. Figures 1 and 2 also contain graphs of RRMS obtained when an extra knot was added at 100 d. In this case, the variances of RRML and RRMS are very similar. It is worth noting that despite the differences in variances, the accuracies of RRML and RRMS were very similar and higher than those with MTM. This agrees with the opinion of Schaeffer and Wilton (1981) that mixed-model equations are robust with respect to slightly inaccurate parameters. Selection of the number and location of knots will be a topic of a separate study.

Figure 1. Direct genetic variance of random regression model with Legendre polynomials (RRML), random regression model with splines and knots located at 1, 205, and 365 d (RRMS 3 knots), random regression model with splines and knots located at 1, 100, 205, and 365 d (RRMS 4 knots), and multiple-trait model (MTM).
Figure 2. Direct genetic correlation between weight at birth and other ages in random regression model with Legendre polynomials (RRML), random regression model with splines and knots located at 1, 205, and 365 d (RRMS 3 knots), random regression model with splines and knots located at 1, 100, 205, and 365 d (RRMS 4 knots), and multiple-trait model (MTM).
The rank of RRM was decreased by dropping random regression coefficients with eigenvalues that explained less than 1% of variance. Although the computation costs were decreased, this affected accuracy. Correlations between the rank-reduced and nonreduced RRML predictions were 0.89, 0.91, and 0.95 for the direct genetic effect and 0.80, 0.98, and 0.97 for the maternal genetic effect at 1, 205, and 365 d of age, respectively. This was because even though the eigenvalue corresponding to the eliminated direct genetic variance component accounted for 0.102% of the total variance, it explained a large portion of the variance at birth (Figure 3). The elimination of this eigenvalue decreased direct genetic variance at birth by 5.13 kg² (65.2%). The change in variance due to the rank reduction was close to zero after 120 d of age. Similarly, the reduction of the maternal effect decreased variance at early ages and caused almost no change at later ages. Foulley and Robert-Granié (2002) stated that the rank of RRM should be reduced with caution.
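The effect of dropping a small eigenvalue on the variance at individual ages can be explored with a short sketch. The coefficient covariance matrix `K` below is constructed from arbitrary eigenvalues and a random orthogonal basis, so it is purely hypothetical; the function names, the age range of 1 to 410 d, and the use of unnormalized Legendre polynomials are likewise assumptions.

```python
import numpy as np
from numpy.polynomial import legendre


def reduce_rank(K, min_share=0.01):
    """Drop eigenvalues explaining less than min_share of the total variance."""
    values, vectors = np.linalg.eigh(K)
    share = values / values.sum()
    keep = share >= min_share
    K_reduced = (vectors[:, keep] * values[keep]) @ vectors[:, keep].T
    return K_reduced, share


def variance_at(ages, K, a_min=1.0, a_max=410.0):
    """Variance phi(a)' K phi(a) at each age, with Legendre covariables."""
    x = 2.0 * (np.asarray(ages, dtype=float) - a_min) / (a_max - a_min) - 1.0
    phi = legendre.legvander(x, K.shape[0] - 1)
    return np.einsum("ij,jk,ik->i", phi, K, phi)


# Hypothetical 4 x 4 covariance matrix with one very small eigenvalue.
rng = np.random.default_rng(1)                        # arbitrary seed
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))          # random orthogonal basis
K = (Q * np.array([0.5, 20.0, 80.0, 900.0])) @ Q.T
K = (K + K.T) / 2.0                                   # enforce exact symmetry

K_reduced, share = reduce_rank(K)
ages = np.array([1.0, 100.0, 205.0, 300.0, 365.0])
loss = variance_at(ages, K) - variance_at(ages, K_reduced)
relative_loss = loss / variance_at(ages, K)
```

Comparing `share` with `relative_loss` shows the general point: an eigenvalue that is negligible as a share of the total variance of the coefficients can still account for a much larger share of the variance at a particular age, depending on where its eigenfunction is concentrated.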

Figure 3. Differences in direct and maternal genetic variance between original and rank-reduced random regression model with Legendre polynomials (RRML).
Table 2 shows the number of rounds and computing time for the original, diagonalized, and reduced models. The solution method was a preconditioned conjugate gradient with a diagonal preconditioner. After diagonalization, the number of rounds required to converge decreased from 571 to 101 in RRML, and from 184 to 81 in RRMS. The RRM with splines were faster than RRML due not only to a better convergence rate but also to having only three covariables per effect rather than four in RRML.
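For reference, a preconditioned conjugate gradient solver with a diagonal (Jacobi) preconditioner can be sketched as follows. This is a generic dense-matrix illustration, not the software used in the study; the function name, the relative-residual stopping rule, and the small random test system are assumptions.

```python
import numpy as np


def pcg_diag(A, b, tol=1e-10, max_rounds=1000):
    """Solve A x = b for symmetric positive definite A by PCG.

    The preconditioner is the inverse of the diagonal of A. Returns the
    solution and the number of rounds performed.
    """
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    m_inv = 1.0 / np.diag(A)          # diagonal preconditioner
    r = b - A @ x                     # initial residual
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    for rounds in range(1, max_rounds + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        # Assumed convergence criterion: relative norm of the residual.
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, rounds
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_rounds


# Small hypothetical symmetric positive definite system.
rng = np.random.default_rng(0)        # arbitrary seed
M = rng.normal(size=(50, 50))
A = M @ M.T + 50.0 * np.eye(50)
b = rng.normal(size=50)
solution, rounds = pcg_diag(A, b)
```

The number of rounds depends on the conditioning of the preconditioned system, which is why a reparameterization that makes the random-effect (co)variances diagonal can change the convergence rate so markedly.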
Implications
The implementation of random regression models for growth requires testing to ensure that numerical problems or inaccurate parameters do not decrease the accuracy of breeding values. Solutions by multiple-trait models and random regression models for growth are equal when records occur at standard points. Multiple-trait models lose, but random regression models retain, accuracy when records occur at nonstandard points. The accuracy of random regression models increases when additional records are incorporated, although the increase may be small. In random regression models with Legendre polynomials, diagonalization is crucial for adequate performance; however, rank reduction should be done with caution. The random regression models with linear splines are a simple alternative to random regression models with Legendre polynomials because the parameters are almost the same as in multiple-trait models, and the convergence rate is satisfactory without diagonalization.
1 Correspondence e-mail: jarmila{at}uga.edu.
Received for publication August 3, 2004.
Accepted for publication October 1, 2004.
Literature Cited
Druet, T., F. Jaffrézic, D. Boichard, and V. Ducrocq. 2003. Modeling lactation curves and estimation of genetic parameters for first lactation test-day records of French Holstein cows. J. Dairy Sci. 86:2480-2490.
Foulley, J. L., and C. Robert-Granié. 2002. Basic statistical methods for longitudinal data. Course Notes. Montpellier, France.
Green, P. J., and B. W. Silverman. 1994. Nonparametric Regression and Generalized Linear Models. Chapman & Hall, London, U.K.
Legarra, A., I. Misztal, and J. K. Bertrand. 2004. Constructing covariance functions for random regression models for growth in Gelbvieh beef cattle. J. Anim. Sci. 82:1564-1571.
Lidauer, M., and E. A. Mantysaari. 1999. Multiple trait reduced rank random regression test-day model for production traits. Proc. Annu. Interbull Mtg., Zurich, Switzerland. Interbull Bull. 22:74-80.
Meyer, K. 2004. Scope for a random regression model in genetic evaluation of beef cattle for growth. Livest. Prod. Sci. 86:69-83.
Misztal, I., T. Strabel, J. Jamrozik, and E. A. Mantysaari. 2000. Strategies for estimating the parameters needed for different test-day models. J. Dairy Sci. 83:1125-1134.
Nobre, P. R. C., I. Misztal, S. Tsuruta, and J. K. Bertrand. 2003a. Genetic evaluation of growth in Nellore cattle by multiple-trait and random regression models. J. Anim. Sci. 81:927-932.
Nobre, P. R. C., I. Misztal, S. Tsuruta, J. K. Bertrand, L. O. C. Silva, and P. S. Lopes. 2003b. Analyses of growth curves of Nellore cattle by multiple-trait and random regression models. J. Anim. Sci. 81:918-926.
Schaeffer, L. R., and J. W. Wilton. 1981. Comparison of single and multiple trait beef sire evaluation. Can. J. Anim. Sci. 61:565-573.
van der Werf, J., M. E. Goddard, and K. Meyer. 1998. The use of covariance functions and random regressions for genetic evaluation of milk production based on test day records. J. Dairy Sci. 81:3300-3308.
White, I. M., R. Thompson, and S. Brotherstone. 1999. Genetic and environmental smoothing of lactation curves with cubic splines. J. Dairy Sci. 82:632-638.
Wold, S. 1974. Spline functions in data analysis. Technometrics 16:1-11.