J. Anim Sci.
J. Anim. Sci. 2005. 83:2736-2742
© 2005 American Society of Animal Science


ANIMAL GENETICS

Joint longitudinal modeling of age of dam and age of animal for growth traits in beef cattle

K. R. Robbins1, I. Misztal and J. K. Bertrand

Animal & Dairy Science Department, The University of Georgia, Athens 30602-2771


    Abstract
Two methods to jointly model age of dam (AOD) and age of animal in random regression analyses of growth in Gelbvieh cattle were examined. The first method (M1) was analogous to the multiple-trait analysis and consisted of AOD as a nested class variable and a cubic polynomial regression on age nested within birth, weaning, and yearling weights. The second method (M2) used two-dimensional splines, with age knots at 150, 205, 270, 340, and 390 d. The AOD knots were placed at 725, 1,464, and 2,189 d. These selected knots were used to form a two-dimensional grid containing 15 knots, each representing a specific age and AOD combination. A data set containing Gelbvieh growth records was split along contemporary groups into two data sets. Data set 1 contained 316,078 records and was used for prediction by mixed-model equations. Data set 2 contained 164,167 records and was used for cross validation. In the complete data set, only 90 and 30% of animals with birth weight had records on weaning and yearling weights, respectively. Models were evaluated based on R², average squared error (ASE), percent bias, and plots of solutions. For M1, the ASE for birth weight, weaning weight, and yearling weight were 15, 505, and 703 kg², respectively. With M2, large jumps in fixed-effect estimates were observed outside the two-dimensional grid. To eliminate this problem, weighted one-dimensional splines were used for extrapolation beyond the two-dimensional grid. For M2 with weighted spline extrapolation, the ASE were 15, 542, and 777 kg² for birth weight, weaning weight, and yearling weight, respectively. Creation of optimal two-dimensional splines is difficult when data are clustered. Despite such difficulties, the two-dimensional spline was capable of jointly and continuously modeling AOD and age of animal.

Key Words: Cross Validation • Polynomial Regression • Two-Dimensional Spline


    Introduction
Although much work has been done on the modeling of random effects in random regression models for growth (Bohmanova et al., 2005; Robbins et al., 2005), relatively little attention has been paid to the modeling of fixed effects in this context. When evaluating growth, two continuous covariates are commonly modeled: age of dam (AOD) effect and the animal’s age. Producers currently record three growth traits for beef cattle that are analyzed using multiple-trait models, and as a result, records tend to be clustered around birth, weaning, and yearling ages. With such age distributions, polynomial functions nested within each trait seem to be an obvious choice for the modeling of these effects. If random regression models become more widely accepted, however, larger and more continuous age ranges could be recorded for evaluation. This could make nesting more difficult, and outlying age records could make polynomials vulnerable to artifacts.

One alternative to polynomial regressions is splines. Splines are a series of polynomial functions fit through control points, referred to as knots. It has been shown that spline functions are resistant to artifacts (Druet et al., 2003; Aarons et al., 2004). Unlike polynomial regressions, in which a small subset of data can affect the entire function, splines are defined by a series of polynomials that are affected only by their bounding knots (Molinari et al., 2002). Although one-dimensional splines provide a more robust model, they still require nesting when modeling AOD and age of animal. Two-dimensional splines can provide a generalized and robust model for fixed effects. The selected two-dimensional knots provide automatic nesting and implicit modeling of interactions and decrease the effects of outlying records.

The purpose of this cross-validation study was to compare the fit of two methods for evaluating growth in beef cattle and to provide a basic methodology for fitting two-dimensional splines to biological data.


    Materials and Methods
Data Sets

All evaluations were performed on the data set used by Robbins et al. (2005). The data set contained records on three weights collected at birth, weaning (approximately 205 d), and yearling (approximately 365 d). There was an average of 2.07 records per animal in the complete data set; 90 and 30% of animals with birth records (Bwt) also had weaning (Wwt) and yearling (Ywt) records, respectively. All contemporary groups (CG) with <15 animals were eliminated to ensure accurate estimation of CG by the prediction data set. Weight records were then split within CG; two-thirds went to a prediction data set, and one-third went to a validation data set. Data set 1 contained 316,078 growth records on 216,603 Gelbvieh cattle and was used for mixed-model evaluation. Data set 2 contained 164,167 records on 135,356 animals and was used for cross validation. A summary of the distribution of records across AOD and traits can be found in Table 1.
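The editing and splitting steps described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the record layout `(animal_id, cg_id, weight)`, the function name `split_within_cg`, the random shuffling, and the seed are all assumptions. The source states only that CG with fewer than 15 animals were dropped and that records were split two-thirds to one-third within CG.

```python
import random
from collections import defaultdict


def split_within_cg(records, min_cg_size=15, seed=0):
    """Split records within contemporary group (CG) into prediction/validation.

    records: list of (animal_id, cg_id, weight) tuples (hypothetical layout).
    CG with fewer than min_cg_size distinct animals are dropped entirely.
    """
    records_by_cg = defaultdict(list)
    animals_by_cg = defaultdict(set)
    for record in records:
        animal_id, cg_id, _weight = record
        records_by_cg[cg_id].append(record)
        animals_by_cg[cg_id].add(animal_id)

    rng = random.Random(seed)  # ASSUMPTION: random assignment within CG
    prediction, validation = [], []
    for cg_id, cg_records in records_by_cg.items():
        if len(animals_by_cg[cg_id]) < min_cg_size:
            continue  # CG too small to estimate reliably
        shuffled = cg_records[:]
        rng.shuffle(shuffled)
        n_validation = len(shuffled) // 3  # one-third to validation
        validation.extend(shuffled[:n_validation])
        prediction.extend(shuffled[n_validation:])
    return prediction, validation
```

Splitting within CG, rather than assigning whole CG to one set, means every CG in the validation set also has records in the prediction set, so its effect can be estimated.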


Table 1. Distribution of records across traits for each data set
 
Methods

Two methods were employed to model AOD and age of animal. Method 1 (M1) modeled AOD as a within-trait nested class variable and used a within-trait nested cubic polynomial regression to model age of animal. Method 2 (M2) jointly modeled AOD and age of animal with a two-dimensional spline. All random effects were modeled using linear splines.

The equation for random effects in scalar notation was

$$random_{hijkm} = \sum_{d} s_{dh}\,(dir_{dk} + pe_{dk} + mat_{dm} + mpe_{dm}) + e_{hijkm}$$

where $random_{hijkm}$ = sum of random effects for trait t (growth records at birth, weaning, or yearling ages) and AOD for group j, $dir_{dk}$ and $pe_{dk}$ = spline coefficients d for additive direct (dir) and permanent environmental (pe) effects for animal k, $mat_{dm}$ and $mpe_{dm}$ = spline coefficients d for maternal (mat) and maternal permanent (mpe) environmental effects for dam m, $e_{hijkm}$ = weighted heterogeneous random residual modeled by linear splines and implemented by weighting each observation, and $s_{dh}$ = coefficient d of the linear spline function for an observation taken at age h. This is the same equation used by Robbins et al. (2005).

The fixed-effect model using within-trait nested cubic polynomial regressions on age and within-trait nested AOD classes was


where $cg_i$ = contemporary group i, consisting of animals of the same sex, percent Gelbvieh, and from the same breeder-defined management groups; $\alpha_{ht}$ = linear, quadratic, and cubic regression coefficients at age h and nested in trait t; $age_h$ = age h of animal; $age_t$ = reference age of trait t; and $AOD_j$ = AOD class j nested within trait t. Age of dam classes were renumbered for Wwt and Ywt traits for nesting purposes.
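A small sketch may help make the idea of within-trait nesting concrete. It assumes the fixed part of a record is the CG effect plus a cubic polynomial in the deviation of age from the trait's reference age plus a trait-specific AOD class effect. That form is inferred from the definitions above, not copied from the source. The reference ages, the function names, and the dictionary layouts are hypothetical.

```python
# ASSUMPTION: reference ages (d) per trait; the source gives only
# "approximately 205 d" for weaning and "approximately 365 d" for yearling.
REFERENCE_AGE = {"Bwt": 0.0, "Wwt": 205.0, "Ywt": 365.0}


def nested_cubic_covariates(age, trait):
    """Return linear, quadratic, and cubic covariates keyed by (trait, power).

    Keying by trait is what makes the regression nested: a record contributes
    only to the coefficients of its own trait.
    """
    deviation = age - REFERENCE_AGE[trait]
    return {(trait, power): deviation ** power for power in (1, 2, 3)}


def m1_fixed_part(age, trait, aod_class, cg_effect, alpha, aod_effect):
    """Fixed part of a record under the assumed M1 form.

    alpha:      {(trait, power): regression coefficient}
    aod_effect: {(trait, aod_class): class effect}
    """
    covariates = nested_cubic_covariates(age, trait)
    regression = sum(alpha[key] * value for key, value in covariates.items())
    return cg_effect + regression + aod_effect[(trait, aod_class)]
```

Because each trait has its own polynomial and its own AOD classes, nothing ties the curve for Wwt to the curve for Ywt, which is why the nested functions can be disjoint between the two age ranges.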

A second model that contained the same within-trait fixed effects plus an additional AOD by age of animal interaction was fit and is described here:


where $AOD_j \times age_h$ = the interaction of AOD class j by age of animal h.

The two-dimensional spline model can be written as


where cf = the coefficient of an animal with age and AOD such that


$\alpha_{hj}$ = the estimated knot value for age h and $AOD_j$.

The coefficients for the two-dimensional splines were determined as

$$cf = x \cdot y$$

where x is 1 minus the distance of the age of animal from the knot for age of animal, and y is 1 minus the distance of the AOD from the knot for AOD, when 0 ≤ distance ≤ 1.
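The coefficient calculation can be illustrated with a short sketch that uses the age and AOD knots reported for M2. It assumes that "distance" is expressed as a fraction of the interval between the two bounding knots (so that it lies between 0 and 1) and that each coefficient is the product of the age and AOD terms. The function names are hypothetical, and the sketch covers only points inside the grid.

```python
from bisect import bisect_right

AGE_KNOTS = [150, 205, 270, 340, 390]   # d, from the text
AOD_KNOTS = [725, 1464, 2190]           # d, from the text


def linear_weights(value, knots):
    """Return {knot_index: weight} for the two knots bounding `value`.

    ASSUMPTION: weight = 1 - (distance to knot) / (knot interval width).
    Raises ValueError outside the knot range (no extrapolation here).
    """
    if not knots[0] <= value <= knots[-1]:
        raise ValueError("value lies outside the knot range")
    upper = min(bisect_right(knots, value), len(knots) - 1)
    lower = upper - 1
    distance = (value - knots[lower]) / (knots[upper] - knots[lower])
    return {lower: 1.0 - distance, upper: distance}


def spline2d_coefficients(age, aod):
    """Coefficients on the (up to) four grid knots surrounding (age, aod)."""
    age_weights = linear_weights(age, AGE_KNOTS)
    aod_weights = linear_weights(aod, AOD_KNOTS)
    return {
        (AGE_KNOTS[i], AOD_KNOTS[j]): x * y
        for i, x in age_weights.items()
        for j, y in aod_weights.items()
    }
```

Inside the grid the four coefficients of any record sum to one, so the fitted value is a weighted average of the surrounding knot values.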

Because the two-dimensional spline was poor at extrapolation beyond the two-dimensional grid, a model was run that used the weighted sum of one-dimensional splines for extrapolation beyond the grid knots. The equation was


where $w_i$ = weighting factor for one-dimensional spline i; lcf = linear spline coefficient for one-dimensional spline extrapolation; $knot_{ad}$ = two-dimensional spline knot for age of animal a and age of dam d; and nk = number of one-dimensional spline functions.

Evaluation Methods

Solutions obtained by program BLUP90IOD (Tsuruta et al., 2001) from the analysis of data set 1 were used to predict the records of animals in data set 2. Using actual and predicted records from data set 2, the R², average squared errors (ASE), and percent bias were computed for each model at each trait (Bwt, Wwt, and Ywt). The ASE, percent bias, R², and plots of fixed-effect solutions were used to evaluate each model.

Because the evaluation models were overparameterized, no df remained and mean squared error could not be used. Therefore, the ASE was used to evaluate the fixed-effect models. The ASE was computed as

$$\mathrm{ASE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ = the weight of animal i, $\hat{y}_i$ = the predicted record of animal i, and n = the number of records contained in the test data set. In addition to ASE, percent bias was calculated for each trait.
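As a rough illustration, the ASE can be computed from paired actual and predicted records as in the sketch below. The percent bias function is an assumption: its defining equation is not available in this copy, so the form used here (100 times the summed prediction error divided by the summed actual weights) was chosen only because it matches the sign convention in the text, where negative values indicate overprediction. Function names are hypothetical.

```python
def average_squared_error(actual, predicted):
    """ASE: mean of squared differences between actual and predicted records."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = ((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return sum(squared) / len(actual)


def percent_bias(actual, predicted):
    """ASSUMED form: 100 * sum(actual - predicted) / sum(actual).

    Negative values mean predictions exceed observations on average.
    """
    error_sum = sum(y - y_hat for y, y_hat in zip(actual, predicted))
    return 100.0 * error_sum / sum(actual)


# Toy birth weights (kg), invented for illustration only.
observed = [30.0, 40.0, 50.0]
fitted = [32.0, 39.0, 53.0]
print(average_squared_error(observed, fitted))  # (4 + 1 + 9) / 3
print(percent_bias(observed, fitted))           # negative: overprediction
```

Unlike mean squared error, the ASE divides by the number of validation records rather than by residual df, so it remains defined for heavily parameterized models.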



    Results and Discussion
The steps involved in model selection differ greatly between M1 and M2. These differences represent an important distinction between model types and should be taken into consideration when formulating a fixed-effect model. To construct a polynomial model, one needs to consider the order of the polynomial as well as the need and location for nesting. Because AOD was modeled as categorical, class boundaries had to be defined. Such considerations make model selection a relatively simple process for M1.

Creating an optimal two-dimensional spline model can be much more time-consuming than polynomial regressions. Splines are approximations that depend heavily on the location of the knots. There must be enough knots to adequately model the shape of the function, and there must be enough records in each interval between knots to accurately estimate knot values. Unfortunately, there is no automatic procedure for the selection of knots, which can result in much trial and error; however, there are some general rules that can aid in this process. Wold (1974) suggested the use of as few knots as possible (no more than one extremum and one inflection point per interval) and the location of knots close to inflection points. In the case of the two-dimensional spline, the application of these rules to each variable separately can provide a good starting point. In addition, the inability of the two-dimensional spline to model data outside the two-dimensional grid necessitates the placement of knots at extreme values. However, if data are sparse around the extrema, the use of weighted spline extrapolation may give the best results.

Once a base model has been established, there are some generalized procedures for the addition of knots to the model. One procedure is to place an additional knot at the median of the existing interval (Rosenberg et al., 2003). This process could be useful when the data are continuously distributed. In the case of growth data in beef cattle, both age and AOD are clustered, thereby limiting the areas in which knots can be placed. In such a case, the median may not be the best place to add additional knots; however, the general principles of this procedure can be useful in expanding the base model. When dealing with disjointed data, placing knots at the end points of each cluster may be a good idea; however, if data are sparse at the endpoints, placing the knots closer to the center can provide better results.
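The median rule can be sketched in a few lines. The sketch assumes that "median of the existing interval" means the median of the observed covariate values that fall strictly between two adjacent knots; the function name and arguments are hypothetical.

```python
from statistics import median


def add_median_knot(knots, observations, interval_index):
    """Return a new sorted knot list with one knot added inside an interval.

    knots:          sorted list of current knot positions
    observations:   observed covariate values (e.g., ages in d)
    interval_index: the interval is (knots[i], knots[i + 1])
    ASSUMPTION: the new knot is the median of observations inside the interval.
    """
    lower, upper = knots[interval_index], knots[interval_index + 1]
    inside = [value for value in observations if lower < value < upper]
    if not inside:
        return list(knots)  # no data in the interval, so no knot is added
    return sorted(set(knots) | {median(inside)})


# Toy ages clustered near weaning, invented for illustration only.
print(add_median_knot([150, 205, 270], [210, 220, 260, 300], 1))
```

With clustered data the median tends to fall inside a cluster, which keeps the new knot where records exist; whether that is the best position still has to be checked by cross validation.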

When using two-dimensional splines, variables may behave differently depending on the value of another variable. In such a case, as with the nesting of AOD within age, placing knots based on each variable’s curve alone may not be optimal. To account for possible interactions or nesting effects, conditional plots can be of value. Plotting a variable by each interval of the other variable can help in determining how the two variables interact. In such a case, it is best to place as few knots as possible that allow enough flexibility to model possible interactions and nesting effects. It is important to remember that although an optimized spline is robust against artifacts, the flexibility of the spline model makes it highly susceptible to artifacts when knots are poorly placed (Wold, 1974).
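The data behind a conditional plot can be prepared with a simple two-way summary, as in the sketch below: mean weight is tabulated for each combination of age interval and AOD interval, and each row of the result can then be plotted as one curve. The record layout `(age, aod, weight)`, the break points, and the function name are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def conditional_means(records, age_breaks, aod_breaks):
    """Mean weight per (age interval, AOD interval).

    records: iterable of (age, aod, weight) tuples (hypothetical layout).
    Interval k holds values from break k-1 (inclusive) up to break k (exclusive).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for age, aod, weight in records:
        key = (bisect_right(age_breaks, age), bisect_right(aod_breaks, aod))
        totals[key][0] += weight
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Toy records, invented for illustration only.
toy = [(200, 800, 250.0), (210, 900, 260.0), (360, 800, 400.0)]
print(conditional_means(toy, age_breaks=[270], aod_breaks=[1464]))
```

If the AOD curves differ in shape between age intervals, that suggests an interaction or nesting effect that the knot grid needs enough flexibility to capture.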

Cross-validation results in Table 2 show that M1 performed well. The model containing AOD by age of animal interactions had lower R² values than models without the interaction effect, suggesting no interaction is present in the data. The interaction model showed increases in ASE and negative biases for Wwt and Ywt. The relatively large and negative percent bias values show that the interaction model is overpredicting records; this is likely due to overfitting of the model to data set 1. It was expected that the nested polynomials would perform well given the disjointed nature of the age distributions. The distinct Bwt, Wwt, and Ywt groups, coupled with the high density of records within each group, make nested polynomials an appealing model choice for this particular data set.


Table 2. Cross-validation results
 
The fixed-effect plot for M1 can be seen in Figure 1. The function on age shows a period of steep linear growth between 75 and 275 d, followed by a period of decreasing slope between 275 and 325 d of age. The decreasing slope is the result of the discontinuity of the nested regressions. The point at 275 d is calculated with the regression nested in Wwt, whereas the point at 325 d is calculated by the regression nested in Ywt. Clearly, the two nested functions are disjointed. Beyond 325 d, another period of steep linear growth can be seen. When looking at the AOD functions at birth and weaning in Figure 2, there is a period of linear increase until 1,500 d followed by a shallow incline to a relatively flat plateau at 2,200 d. At yearling age, the AOD function shows a linear incline to a plateau at 1,500 d.



Figure 1. Three-dimensional plot of weight × age × age of dam (AOD) for the cubic polynomial regression model nested within birth, weaning, and yearling growth traits for beef cattle. There were no observations plotted between 275 and 325 d of age.

 


Figure 2. Plot of age of dam (AOD) × age of animal estimated from beef cattle data. The nested polynomial model (M1) is represented by the dashed line. The two-dimensional spline with weighted spline extrapolation model (M2) is represented by the solid line.

 
For this application, the best fitting two-dimensional spline models contained five age knots at 150, 205, 270, 340, and 390 d of age; birth was analyzed separately. The function below 150 d was modeled as a decreasing function from 150 d toward zero. The function above 390 d was modeled as an increasing function. Models that contained more age knots were erratic and seemed to be influenced by artifacts, whereas models with fewer knots performed poorly in cross validation. For AOD, three knots were placed at 725, 1,464, and 2,190 d. The AOD function was much more sensitive to artifacts than age of animal; thus, only a few knots could be used. As AOD increased, data became considerably sparser. It was found that forcing a flat function through later dam ages at birth, weaning, and yearling yielded the best results. It should be noted that there is some flexibility in choosing knots, as several combinations of knots yielded comparable results to the previously noted model. The exclusion of Bwt in the two-dimensional spline models had only a small effect on ASE and percent bias but provided the best model.

As seen in Table 2, M2 performed well with the extended grid and weighted spline extrapolation methods. The parity of M1 and M2 at birth would be expected, as there is no age variation; the modeling of AOD effects was the only difference between models. Although there are some differences in R², ASE, and percent bias at Wwt and Ywt, M1 and M2 had similar fits for these traits. These results suggest that M2 is capable of automatically nesting AOD within age of animal but does not provide a superior fit to the data. Whereas M1 has seven additional fixed-effect parameters compared with M2, their effect on model complexity is negligible when weighted against more than 16,000 CG.

When looking at the graph of M2 in Figure 3a, it seems that using two-dimensional functions to extrapolate beyond the grid can result in large jumps in the estimated effects. This results from the fact that, unlike one-dimensional splines, two-dimensional coefficients must be forced to sum to a constant. Once outside the grid, this restriction is removed, and knot coefficients suddenly jump to values that no longer sum to this constant. As seen in Figure 3b, the use of weighted spline extrapolation greatly alleviates this problem. The weighted spline function allows the sum of knot values to increase or decrease gradually from one. As well as giving smoother graphs, the use of weighted spline extrapolation gives lower ASE and percent bias, as shown in Table 2. Another solution to this problem is the extension of the two-dimensional grid to encompass all data. This method performs well in terms of ASE and percent bias, but graphs of solutions in Figure 3c show that it can be subject to artifacts. Some alternatives to the previously described M2 methods could involve the creation of a function with an asymptote such that knot_ad = one-dimensional spline knot at age "a," where age "a" is 1 d beyond the bound of the two-dimensional grid, and age of dam "d," where age of dam "d" is 1 d beyond the bound of the two-dimensional grid. This results in a discontinuous model with an increased number of knots. Additionally, the elimination of the bounding knot farthest from any given data point would leave only three knots for each observation, allowing for a linear formulation of knot coefficients. In addition to this simplified triangular methodology, the inclusion of both fixed and random interaction effects could be effective for modeling of data sets containing multiple growth curves.



Figure 3. Three-dimensional plots of weight × age × age of dam (AOD) for the two-dimensional spline (M2) modeling of beef cattle data. A = M2, B = M2 with weighted spline extrapolation, and C = M2 with extended grid.

 
The graph of M2, found in Figure 3b, shows an almost linear growth throughout the function with some curvature present. This linear growth is also observed in each of the two polynomial regressions found in Figure 1; however, with M2, the function is continuous. This property of M2 could make it a better choice for modeling data distributed continuously across all ages. The graph of the AOD effect at birth (Figure 2) shows a linear incline to a flat plateau. When looking at AOD curves for weaning and yearling, slowly increasing functions can be seen before 1,300 d followed by steeper linear inclines that reached a plateau. For birth and weaning weights, these plateaus are reached at 2,200 d as with the M1 AOD functions. For yearling weight, M2 curves reach a plateau at approximately 1,500 d, much like M1. This result is not surprising, as AOD has little effect after weaning, and AOD curves tend to be flatter. The AOD curves seemed particularly sensitive to artifacts when more than three or four knots were used. Such curves were very erratic and yielded poor results in cross validation. As can be seen in Figure 2, estimates of AOD effects obtained from M1 are of a larger magnitude than estimates obtained from M2. This is due to the location of the overall mean in each model. In M2, the overall mean was contained in the contemporary group; as a result, it is not present in the AOD and age graphs. With the polynomial regression, some of the overall mean is present in the cross-classified AOD effect and, therefore, is present in the AOD and age graphs. The presence and absence of this mean has an effect on the magnitude of age and AOD estimates, but it has no effect on breeding value prediction. Putting scale issues aside, however, the graphs of M1 and M2 are similar for age of animal and AOD.

The clustering of data around birth, 205 d, and 365 d makes the use of nested polynomials a relatively simple and effective way to model fixed effects in this application. However, if random regression models become standard practice and the collection of data across ages becomes more continuous, the nesting of polynomials will become increasingly difficult. Because of the disjoint nature of the nested polynomials, evaluation of animals with records located between Wwt and Ywt age ranges could be problematic.

Given the current state of the industry in which records are clustered within predefined age ranges, M2 does not have an advantage over traditional polynomial regressions. Previous applications of two-dimensional splines have been in the form of thin-plate splines used in the context of engineering and graphical applications (Meinguet, 1979). In such instances, data are collected in a grid-like manner or such that observations are located at key points given a known three-dimensional shape (Bookstein, 1989). Under such conditions, thin-plate splines are very effective; however, in the present application, neither of these conditions is met. This does not mean that two-dimensional splines cannot be effective, but they might not provide optimal performance. Despite this, the potential susceptibility of polynomial regression to artifacts could make two-dimensional splines a more attractive choice.

In this study, only one set of curves was fit for all animals. In fact, curves for different management systems and different regions may vary as a function of year of birth. Modeling of these different curves could be done using combinations of fixed and random effects, as done in the random regression modeling of herd by year of calving (Druet et al., 2003). If these differences are ignored, the curves are averages over environments. If recording for Ywt is selective, as in this study where only 30% of animals with Bwt had records for Ywt, the curves for Ywt may be averages over mostly selected environments and may cause imperfect curves for Wwt when the age effect is not nested. Thus, it is possible that M2 could be closer or superior to M1 if recording for Ywt were more complete.


    Implications
Although nested polynomials perform well, their sensitivity to artifacts and need for nesting could create problems if records are measured for increasingly wide age ranges. Two-dimensional splines did not have superior performance with clustered data, but their automatic nesting and robustness could make them an appealing choice for data sets in which nesting is difficult and in areas where sparse outlying data are present. When data are collected for longitudinal analysis, the continuous nature of the two-dimensional spline may yield superior performance relative to the disjointed, nested polynomial model.


    Appendix 1
This appendix gives an example of the calculations for the weights (wi) and spline coefficients (cf) used for extrapolation with weighted splines. The wi and cf are computed for an animal’s record taken at 216 d of age with a 3,000-d-old dam. Assuming the model used in this study, the bounding age knots for this record are at 205 and 270 d; the last age of dam knot is placed at 2,190 d. If the extrapolation function is modeled as decreasing beyond age of dam at 2,190 d, the wi and cf would be calculated as follows:




Weighted spline extrapolation = 0.83 × 0.73 × knot(205, 2,190) + 0.17 × 0.73 × knot(270, 2,190), where knot(205, 2,190) is the two-dimensional spline knot, as estimated by the mixed-model equations, at 205 d of age and a dam age of 2,190 d, and knot(270, 2,190) is the two-dimensional spline knot at 270 d of age and a dam age of 2,190 d.
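The numbers in this example can be reproduced with a short sketch. The spline coefficients follow from the position of 216 d between the age knots at 205 and 270 d. The weight is an assumption: the ratio of the last AOD knot to the dam's age (2,190 / 3,000) reproduces the value 0.73, but the source equation is not available in this copy, so the true weighting function may differ. Function and argument names are hypothetical.

```python
def extrapolation_coefficients(age, aod, lower_age_knot, upper_age_knot,
                               last_aod_knot):
    """Return (cf_lower, cf_upper, w) for a record beyond the last AOD knot."""
    span = upper_age_knot - lower_age_knot
    cf_lower = 1 - (age - lower_age_knot) / span   # coefficient on lower age knot
    cf_upper = 1 - (upper_age_knot - age) / span   # coefficient on upper age knot
    # ASSUMPTION: decreasing weight = last AOD knot / actual dam age.
    w = last_aod_knot / aod
    return cf_lower, cf_upper, w


def weighted_extrapolation(cf_lower, cf_upper, w, knot_lower, knot_upper):
    """Combine the two boundary knot values, as in the expression above."""
    return cf_lower * w * knot_lower + cf_upper * w * knot_upper


cf_lower, cf_upper, w = extrapolation_coefficients(216, 3000, 205, 270, 2190)
print(round(cf_lower, 2), round(cf_upper, 2), round(w, 2))  # 0.83 0.17 0.73
```

Because w is below one, the coefficients for this record sum to 0.73 rather than one, which is how the weighted extrapolation lets the fitted effect change gradually outside the grid.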

1 Correspondence: Rhodes Center for Animal and Dairy Science (phone: 706-542-0965; fax: 706-583-0274; e-mail: krobbin1@uga.edu).

Received for publication May 12, 2005. Accepted for publication August 1, 2005.


    Literature Cited


Aarons, L., C. Baxter, and S. Gupta. 2004. Pharmacodynamics of controlled release Verapamil in patients with hypertension: An analysis using spline functions. Biopharm. Drug Dispos. 25:219–225.

Bohmanova, J., I. Misztal, and J. K. Bertrand. 2005. Studies on multiple trait and random regression models for genetic evaluation of beef cattle for growth. J. Anim. Sci. 83:62–67.

Bookstein, F. L. 1989. Principal warps: Thin-plate splines and the decomposition of deformations. IEEE Trans. Pattern Anal. Mach. Intell. 11:567–585.

Druet, T., F. Jaffrezic, D. Boichard, and V. Ducrocq. 2003. Modeling lactation curves and estimation of genetic parameters for first lactation test-day records of French Holstein cows. J. Dairy Sci. 86:2480–2490.

Meinguet, J. 1979. Multivariate interpolation at arbitrary points made simple. J. Appl. Math. Phys. 30:292–304.

Molinari, N., M. Morena, J. P. Cristol, and J. P. Daures. 2002. Free knot splines for biochemical data. Comp. Meth. Prog. Biomed. 67:163–167.

Robbins, K. R., I. Misztal, and J. K. Bertrand. 2005. A practical longitudinal model for evaluating growth in Gelbvieh cattle. J. Anim. Sci. 83:29–33.

Rosenberg, P. S., H. Katki, C. A. Swanson, L. M. Brown, S. Wacholder, and R. N. Hoover. 2003. Quantifying epidemiologic risk factors using non-parametric regression: Model selection remains the greatest challenge. Stat. Med. 22:3369–3381.

Tsuruta, S., I. Misztal, and I. Stranden. 2001. Use of the preconditioned conjugate gradient algorithm as a generic solver for mixed model equations in animal breeding applications. J. Anim. Sci. 79:1166–1172.

Wold, S. 1974. Spline functions in data analysis. Technometrics 16:1–11.


