J. Anim Sci.
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Damgaard, L. H.
Right arrow Articles by Andersen, A. H.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Damgaard, L. H.
Right arrow Articles by Andersen, A. H.
J. Anim. Sci. 2006. 84:1338-1350
© 2006 American Society of Animal Science


ANIMAL GENETICS

The effect of ignoring individual heterogeneity in Weibull log-normal sire frailty models1

L. H. Damgaard*,{dagger},2, I. R. Korsgaard{dagger}, J. Simonsen{ddagger}, O. Dalsgaard{ddagger} and A. H. Andersen{ddagger}

* Department of Animal Science and Animal Health, Royal Veterinary and Agricultural University, Grønnegårdsvej 2, 1870 Frederiksberg C, Denmark; and {dagger} Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, P.O. Box 50, 8830 Tjele, Denmark; and {ddagger} Department of Theoretical Statistics, University of Aarhus, 8000 Aarhus C, Denmark


    Abstract
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
The objective of this study was, by means of simulation, to quantify the effect of ignoring individual heterogeneity in Weibull sire frailty models on parameter estimates and to address the consequences for genetic inferences. Three simulation studies were evaluated, which included 3 levels of individual heterogeneity combined with 4 levels of censoring (0, 25, 50, or 75%). Data were simulated according to balanced half-sib designs using Weibull log-normal animal frailty models with a normally distributed residual effect on the log-frailty scale. The 12 data sets were analyzed with 2 models: the sire model, equivalent to the animal model used to generate the data (complete sire model), and a corresponding model in which individual heterogeneity in log-frailty was neglected (incomplete sire model). Parameter estimates were obtained from a Bayesian analysis using Gibbs sampling, and also from the software Survival Kit for the incomplete sire model. For the incomplete sire model, the Monte Carlo and Survival Kit parameter estimates were similar. This study established that when unobserved individual heterogeneity was ignored, the parameter estimates that included sire effects were biased toward zero by an amount that depended in magnitude on the level of censoring and the size of the ignored individual heterogeneity. Despite the biased parameter estimates, the ranking of sires, measured by the rank correlations between true and estimated sire effects, was unaffected. In comparison, parameter estimates obtained using complete sire models were consistent with the true values used to simulate the data. Thus, in this study, several issues of concern were demonstrated for the incomplete sire model.

Key Words: genetics • individual heterogeneity • survival frailty model


    INTRODUCTION
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
The Weibull log-normal sire frailty model used to perform genetic evaluation of sires for longevity of dairy cows implicitly neglects unobserved individual heterogeneity not accounted for by the sire effect and the covariates included in the model (Ducrocq and Casella, 1996Go). In effect, the model does not have a genetic interpretation in the sense of the additive genetic infinitesimal model (Fisher, 1918Go; Bulmer, 1980Go; Andersen et al., 2000Go). Korsgaard et al. (1998)Go proposed a Cox frailty model in which unobserved individual heterogeneity was modeled by a normal distributed residual effect on the log-frailty scale. In a subsequent study, Andersen et al. (2000)Go proved that the residual effect on the log-frailty scale is a sufficient condition in order for the Weibull sire frailty models to satisfy the assumptions of the additive genetic infinitesimal model.

For univariate and independent survival times, Vaupel et al. (1979)Go and Manton et al. (1981)Go were the first to recognize that unobserved individual heterogeneity, if ignored, could lead to bias in parameter estimates and to incorrect inferences about the lifetime distribution. For count data, Tempelman and Gianola (1996)Go addressed the consequence of ignoring unobserved individual heterogeneity in the mixed Poisson sire model suggested by Foulley et al. (1987)Go. As in the Weibull sire model without an error term, the model of Foulley et al. (1987)Go also ignores three-quarters of the genetic variance and the environmental variance not otherwise accounted for by Poisson sampling. By means of simulation, Tempelman and Gianola (1996)Go showed that it is necessary to account for unobserved individual heterogeneity to make correct inferences about model parameters.

The objective of this study was by means of simulation to quantify the effect on parameter estimates and address consequences for genetic inferences and the distribution of lifetimes, when individual heterogeneity is ignored in Weibull sire frailty models.


    MATERIALS AND METHODS
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
In this study, we consider the Weibull log-normal sire frailty model (Andersen et al., 2000Go), in which unobserved individual heterogeneity is accounted for by a normally distributed residual effect on the log-frailty scale. The effect of ignoring individual heterogeneity was studied by simulating data from this complete model (hereafter denoted the complete sire model), and thereafter analyzing the data generated with the complete model and also with a corresponding model that did not account for individual heterogeneity in the log-frailty. The latter model was denoted the incomplete sire model.

A Bayesian analysis using Gibbs sampling was evaluated for both models and is described later. For comparison, we also estimated the parameters of the incomplete sire model by using the Survival Kit software (Ducrocq and Sölkner, 1998Go). Note that the 2 models specify different distributions, and that we used the notation T, Formula, and Formula to uniquely define parameters and random variables associated with the models. Three simulation studies were conducted, including 3 levels of heritability on the log-frailty scale (0.15, 0.5, or 1) combined with 4 levels of censoring (0, 25, 50, and 75%).

Complete Sire Model
Let Formula denote a vector with elements (sg(i) + i)i = 1, ...n, and let the random variable Formulai for i = 1,...,n be a lifetime of animal i. In the complete sire model, the hazard function conditional on model parameters Formula = (Formula, Formula, Formula, Formula, {sigma}Formula2 is given by


Formula 1[1]

in which Formula 1 and Formula 1 are parameters characterizing the baseline hazard function. The element sg(i) is a sire transmitting ability, with g(i) = j being an index function relating animal i to its sire j, in which j = 1,...,q, and q is the number of sires. The distribution of sire transmitting abilities is assumed to be s | {sigma}s2 ~ Nq(0,I{sigma}s2). Here, Formula 1i is a residual effect, with Formula 1|{sigma}Formula 12 ~ Nn(0,I{sigma}Formula 12). The 2 vectors s and Formula 1 are assumed to be independent and conditional on Formula 1; the lifetimes are assumed to be independent. With this specification of the model, we assume a balanced half-sib design with unrelated dams and sires. The model is consistent with the additive genetic infinitesimal model if {sigma}Formula 12 > 3{sigma}s2

The complete sire model is a log-linear model for Formula 1i


Formula 2[2]

given by in which {varepsilon}i follows an extreme value distribution, with E({varepsilon}i) = –{gamma}E ({gamma}E is Euler’s constant), and Var({varepsilon}i) = {pi}2/6. All of the {varepsilon}i’s are independent, and also independent of s and Formula 2. The parameters specifying the complete sire model in Equation [1] are identifiable if, and only if, at least 1 sire has more than 1 offspring (Andersen et al., 2000Go). The expected value, variance, third central moment of Formula 2i, covariance (Cov), and correlation (corr) between 2 half sibs, say Formula 2i and Formula 2j, (i j) are given by:


Formula 3[3]


Formula 4[4]


Formula 5[5]


Formula 6[6]


Formula 7[7]

in which {Psi}(2) is the second derivative of the digamma function {Psi}(·). From these equations, it follows that model parameters are identifiable. First, Formula 7 can be identified from Equation [5], Formula 7 can be identified from Equation [3], {sigma}s2 can be identified from Equation [6], and, finally, {sigma}Formula 72 can be identified from Equation [4]. The sire model in Equation [1] can also be parameterized by its equivalent animal model with hazard function given by


Formula 8[8]

in which model parameters {theta} = ({rho}, ß, a, e, {sigma}a2 {sigma}e2) are given in terms of the sire model by {rho} = Formula 8, ß = Formula 8, ({sigma}a2 + {sigma}e2) = ({sigma}s2 + {sigma}Formula 82), and {sigma}a2 = 4{sigma}s2. The vector with elements (ai + ei)i = 1,... n and the vector Formula 8 are identically multivariate normally distributed, and (Ti)i = 1,... n and (Formula 8i)i = 1,... n are identically distributed (Andersen et al., 2000Go).

Incomplete Sire Model
The incomplete sire model is defined exactly as the complete sire model, except that the residual effect on the normally distributed scale is omitted. That is, conditional on model parameters Formula 8 = (Formula 8, Formula 8, Formula 8, {sigma}Formula 82), the hazard function is given by


Formula 9[9]

in which Formula 9 is a vector with elements (Formula 9g(i))j=1,...,q. Here, Formula 9g(i) is a random effect, specific for each sire, with Formula 9|Formula 9 ~ Nq(0,I{sigma}Formula 92). Like the complete sire model, the incomplete sire model is a log-linear model for Formula 9i. However, because this model ignores three-quarters of the additive genetic variance in log-frailty, it is impossible to define an equivalent animal model (Andersen et al., 2000Go; Korsgaard et al., 2000Go). In effect, the model cannot be interpreted in terms of the additive genetic infinitesimal model (Fisher, 1918Go; Bulmer, 1980Go). The model can at best be used as an approximation to an additive genetic survival model. From a genetic point of view, the problems with the incomplete sire model are related to the fact that the covariance structure on the log-frailty scale is different from what is expected from additive genetic theory. That is, the sire effect measures similarities between half-sibs that do not necessarily have an additive genetic background.

Simulation of Data
Three data sets were generated from the Weibull log-normal animal frailty model (Equation [8]). Each data set embedded lifetimes of 10,000 animals from 100 unrelated sires each having 100 offspring, corresponding to a balanced half-sib design. The models used to simulate and analyze the data were parameterized with their equivalent sire models (Equation [1]), for which parameter values used in the simulation of data are specified as follows. The Weibull baseline parameters were the same for the 3 data sets and equal to Formula 9 = 2.4 and Formula 9 = –15.5, whereas 3 sets of variance components ({sigma}s2 = 0.15, and {sigma}Formula 92 = 2.85), ({sigma}s2 = 0.375, and {sigma} Formula 92 = 2.625), and ({sigma}s2 = 0.75, and {sigma}Formula 92 = 2.25) were used to simulate the data. The corresponding heritabilities on the log-frailty scale (hnor2 = 4{sigma}s2/[{sigma}s2 + {sigma}Formula 92]) were equal to 0.2, 0.5, and 1, respectively. With a heritability equal to 1, all of the residual variance is additive genetic. Following Hougaard (2000)Go, the mean and variance of a log-lifetime (lifetime) for all data sets were equal to E(log[Formula 9]) = 6.22, (E[Formula 9] = 734), Var(log[Formula 9]) = 0.81, and (Var[Formula 9] = [739]2). For each simulated data set, 4 levels of censoring (0, 25, 50 and 75%) were introduced by right censoring (Type II censoring). Type II censoring arises in experiments in which n items are placed on test and terminates the experiment upon observing a prespecified number, r ≤ n, of events; hence, nr individuals have censored observations. All together, 12 data sets were available for analysis, 4 data sets for each set of the parameter values. Descriptive statistics of the data sets are given in Table 1Go.


View this table:
[in this window]
[in a new window]
 
Table 1. Empirical statistics of simulated lifetimes, sire effects, residual effects, and censoring times
 
Data Analyses
All 12 data sets were analyzed with the complete sire model (Equation [1]) and the incomplete sire model (Equation [9]). By analyzing the data with the models used for generating the data, we demonstrate that parameters can be estimated satisfactorily. This has not previously been shown for the Weibull log-normal frailty model including a residual effect on the log-frailty scale.

In this study, marginal posterior distributions of the parameters were obtained from a Bayesian analysis using Gibbs sampling (Gelfand and Smith, 1990Go) combined with adaptive rejection sampling (Gilks and Wild, 1992Go; see Appendix A). The Gibbs sampler is only outlined for the complete sire model because the Gibbs sampler for the incomplete sire model was found using the same principles. The burn-in period and the chain length of the Gibbs sampler were the same across models and were established by visual inspection of trace plots of all parameters. The first 20,000 rounds of the Gibbs sampler were considered as burn-in; thereafter 49,000 samples of models parameters were saved with a sampling interval of 20 (i.e., a total of 1,000,000 rounds were run). Samples of derived quantities were obtained by calculating these for each round of sampled values of the parameters saved.

In addition to the Bayesian analysis using Gibbs sampling, parameters of the incomplete sire model were also estimated using the Survival Kit software (Ducrocq and Sölkner, 1998Go). In Survival Kit, the sire variance is estimated as the mode of its marginal posterior distribution approximated by Laplacian integration, whereas the other model parameters are estimated as the mode of the joint posterior distribution conditional on the estimated sire variance and data (Ducrocq and Casella, 1996Go).

Prior Specification.
A priori, Formula 9, Formula 9, and (s, Formula 9, {sigma}s2, {sigma}Formula 92 were assumed to be mutually independent, and with prior distribution given by: p(Formula 9) = p(Formula 9)p(Formula 9)p(s| {sigma}s2, {sigma}Formula 92)p(Formula 9 | {sigma}s2, {sigma}Formula 92)p({sigma}s2, {sigma}Formula 92), in which s |{sigma}s2, {sigma}Formula 92 ~ Nq(0,IFormula 9) and Formula 9| {sigma}Formula 92 Nn(0,I{sigma}Formula 92). Improper uniform priors were assigned to Formula 9, Formula 9, and ({sigma}s2, {sigma}Formula 92) over their range of positive support; i.e., p({rho}) {infty} I({rho} > 0), p(ß) {infty} 1 and p({sigma}s2, {sigma}Formula 92) {infty} I({sigma}Formula 92> 3{sigma}s2). Note that this prior specification guaranties that the model is consistent with the assumptions of the additive genetic infinitesimal model.

Joint Posterior Distribution.
Suppose data of animal i are represented by (yi, {delta}i), with yi being an observed lifetime ({delta}i = 1) or a censored lifetime ({delta}i = 0). Conditional on Formula 9, the type II censoring mechanism is noninformative (Kalbfleisch and Prentice, 1980Go), and by combining the conditional likelihood with the prior specification we obtain the joint posterior distribution up to proportionality given by


Formula 10[10]

The fully conditional posterior distributions of single parameters were derived up to the proportionality by retaining the terms depending on the parameter of interest from the posterior distribution (see Appendix A). The derived expressions for the fully conditional distributions of Formula 10, Formula 10, (sj)j = 1,...q, and (Formula 10i)i = 1,...,n could not be recognized as kernels of known distributions, but the kernels were all shown to be log-concave, and adaptive rejection sampling (see Appendix B) was used to draw samples from these distributions.

Criteria for Evaluation of the Models
The 2 models and the associated methods for estimation of the parameters were evaluated by comparing true and estimated parameters, and by calculating Spearman rank correlations between the true and estimated sire effects. However, in this evaluation of the models, we must keep in mind that the parameters of the complete sire model have different interpretations than the parameters of the incomplete sire model.

To focus on the model misspecification more directly, we compared the lifetime distribution specified directly by the incomplete sire model with the corresponding true one obtained after averaging over the residual effect (marginalized distribution). That is, for a sire, say sj, the true marginalized survival function for an animal i with g(i) = j is given by


Formula 11[11]

which is to be compared with the one specified directly by the incomplete sire model


Formula 12[12]

The expectation in Equation [11] does not have a closed form and was solved numerically by methods of Monte Carlo by sampling Formula 12i from N(0, {sigma}Formula 122) 5,000 times. It is important to note that Equation [12] does not have the form of a survival function of a proportional hazards model, which implies that the conditional and marginal interpretation of the model parameters is different. Generally, in the marginal model the ratio between hazard functions of 2 individuals with different covariates is no longer constant. For t isin (1,2,...,2,000), we performed the following steps:


Formula 13[13]

Formula 13i(t | Formula 13, Formula 13, Formula 13, {sigma}s2) for the incomplete sire model was for t isin = (1,2,...,2,000) estimated by its posterior mean. For comparison, we also compared the estimated marginalized survival function with the true marginalized survival function.


    RESULTS
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
Parameter Estimates Obtained Using a Complete Sire Model
Mode and mean values of the marginal posterior distributions of model parameters obtained using a complete sire model were in close agreement with the true values used to simulate the data. All the true values used to simulate the data were within the 95% central posterior density (CPD) regions (Gelman et al., 1995Go) defined by the 2.5 and 97.5% percentiles (Tables 2Go to Table 4GoGo). With increasing level of censoring, the 95% CPD regions were increased. Similar results were observed for the marginal posterior distributions of the mean and variance of a log-lifetime.


View this table:
[in this window]
[in a new window]
 
Table 2. True values, and marginal posterior mode, mean, and 2.5 and 97.5% percentiles of the model parameters, and of E(Formula 19) and Var(Formula 19) with Formula 19 = log(Formula 19), obtained using the complete sire model to analyze data simulated with hnor2 = 0.2
 

View this table:
[in this window]
[in a new window]
 
Table 3. True values, and marginal posterior mode, mean, and 2.5 and 97.5% percentiles of the model parameters, and of E(Formula 19) and Var(Formula 19) with Formula 19 = log(Formula 19), obtained using the complete sire model to analyze data simulated with hnor2 = 0.5
 

View this table:
[in this window]
[in a new window]
 
Table 4. True values, and marginal posterior mode, mean, and 2.5 and 97.5% percentiles of the model parameters, and of E(Formula 19) and Var(Formula 19) with Formula 19 = log(Formula 19), obtained using the complete sire model to analyze data simulated with hnor2 = 1
 
Parameter Estimates Obtained Using an Incomplete Sire Model
Parameter estimates obtained from a Bayesian analysis using Gibbs sampling were very similar to estimates obtained using the Survival Kit (Table 5Go to Table 7GoGo). Mode and mean values of marginal posterior distributions of model parameters estimated using a incomplete sire model were numerically lower than the corresponding true values, and the true values were far outside the corresponding 95% CPD regions (Table 5Go to Table 7GoGo). In the incomplete sire model, censoring not only increased the 95% CPD regions, but also mode and mean values of the marginal posterior distributions of Formula 13 and {sigma}Formula 132 were increased and of Formula 13 decreased with increasing level of censoring. The only exception was for hnor2 = 0.2 with censoring increasing from 0 to 25%, where the sire variance was slightly decreased. For the Weibull baseline parameters, the 95% CPD regions did not overlap for any of the 4 different levels of censoring. The true expected log-lifetime was within the 95% CPD regions at 0 and 25% censoring, but outside for the 2 remaining levels of censoring. The estimated variance of log-lifetime decreased with increasing level of censoring, and the 95% CPD regions did not overlap for any levels of censoring. The true variance of log-lifetime was only within the 95% CPD regions at 25% censoring.


View this table:
[in this window]
[in a new window]
 
Table 5. True values, parameter estimates with associated SD obtained from Survival Kit (S.Kit), and marginal posterior mode, mean, and 2.5 and 97.5% percentiles of model parameters, and of E(Formula 19) and Var(Formula 19) with Formula 19= log(Formula 19), obtained using the incomplete sire model to analyze data simulated with hnor2 = 0.2
 

View this table:
[in this window]
[in a new window]
 
Table 6. True values, parameter estimates with associated SD obtained from Survival Kit (S.Kit), and marginal posterior mode, mean, and 2.5 and 97.5% percentiles of model parameters, and of E(Formula 19) and Var(Formula 19) with Formula 19 = log(Formula 19), obtained using the incomplete sire model to analyze data simulated with hnor2 = 0.5
 

View this table:
[in this window]
[in a new window]
 
Table 7. True values, parameter estimates with associated SD obtained from Survival Kit (S.Kit), and marginal posterior mode, mean, and 2.5 and 97.5% percentiles of model parameters, and of E(Formula 19) and Var(Formula 19) with Formula 19 = log(Formula 19), obtained using the incomplete sire model to analyze data simulated with hnor2 = 1
 
Survival Functions
The estimated marginalized survival functions obtained from the complete sire model after having integrated out the residual effect were in close agreement with the corresponding true ones. For each of the 3 simulated data sets these results are illustrated for 0 and 75% censoring and for a sire with negative transmitting ability; sj = –0.73 for data simulated with hnor2 = 0.2, sj = –0.98 for data simulated with hnor2 = 0.5, and sj = –1.55 for data simulated with hnor2 = 1 (Figures 1Go to 3GoGo). Up to about 500 time units, the survival curves estimated from the incomplete sire models were close to the true marginalized survival curves. After 500 time units, this approach resulted in a slight overestimation of the survival curve at 0% censoring and a substantially underestimation at 75% censoring. It is clear that for the level of neglected individual heterogeneity addressed in this study, the incomplete sire model provides a poor fit of the true survival curve that furthermore depends heavily on the level of censoring.


Figure 1
View larger version (11K):
[in this window]
[in a new window]
 
Figure 1. Marginalized survival curves for hnor2 = 0.2 and a true sire effect, sj = –0.73; true model (-); estimated from complete sire model, 0% censoring (o) and 75% censoring (•); estimated from incomplete sire model, 0% censoring ({circ}) and 75% censoring ({blacksquare}).

 

Figure 2
View larger version (13K):
[in this window]
[in a new window]
 
Figure 2. Marginalized survival curves for hnor2 = 0.5 and a true sire effect, sj = –0.98; true model (-); estimated from complete sire model, 0% censoring (o) and 75% censoring (•); estimated from incomplete sire model, 0% censoring ({square}) and 75% censoring ({blacksquare}).

 

Figure 3
View larger version (11K):
[in this window]
[in a new window]
 
Figure 3. Marginalized survival curves for hnor2 = 1 and a true sire effect, sj = –1.55; true model (-); estimated from complete sire model, 0% censoring (o) and 75% censoring (•); estimated from incomplete sire model, 0% censoring ({square}) and 75% censoring ({blacksquare}).

 
Ranking of Sires
The Spearman rank correlations between estimated and true sire effects were of the same order of magnitude for the complete sire model and the incomplete sire model (Table 8Go). The only exception was for data not censored, where Spearman rank correlations were slightly greater for the complete sire model. With increasing level of censoring, the correlations between true and estimated sire effects decreased, and with increasing heritability on the normally distributed scale the correlations increased. The complete sire model and the incomplete sire model did not rank sires exactly the same way, which is illustrated for hnor2 = 0.2 and 0% censoring in Figure 4Go. From this figure it is also clear that for both models the ranking of estimated sire effects varies randomly around the true ranking of sires independent of the magnitude of true sire effect.


View this table:
[in this window]
[in a new window]
 
Table 8. Spearman rank correlations between the true sire effects and marginal posterior means of the sire effects
 

Figure 4
View larger version (14K):
[in this window]
[in a new window]
 
Figure 4. Ranking of estimated sire effects plotted against ranking of true sire effects for data with hnor2 = 0.2 and 0% censoring; complete sire model (*) and incomplete sire model ({square}).

 

    DISCUSSION
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
This study established that if individual heterogeneity is present, the Weibull log-normal sire frailty necessarily needs to take account of this variation in order to draw correct conclusions about the population investigated. If not accounted for, the estimated model parameters will be scaled toward zero by an amount, which depends in magnitude on the level of censoring. As a result, the distribution of lifetimes will not be correctly estimated, and parameter estimates obtained using an incomplete sire model may provide limited information about the quantitative genetic basis of the survival trait investigated. On the other hand, for the purpose of ranking sires for selection, the incomplete sire model performed as good as the complete sire model when measured as the correlation between true and estimated sire effects.

Various issues related to the effect of ignoring unobserved heterogeneity in proportional hazards models for individual and independent survival times were already addressed in the late 1970s (Vaupel et al., 1979Go; Manton et al., 1981Go). These and several later studies have established that if individual heterogeneity is ignored, regression parameters will be biased toward zero by an amount that depends in magnitude on the ignored individual heterogeneity (Struthers and Kalbfleisch, 1986Go; Keiding and Andersen, 1997Go) and on the level of censoring (Solomon, 1984Go; Henderson and Oman, 1999Go). These results agree well with the ones found in this study for the Weibull sire frailty model in which lifetimes are correlated.

Ducrocq and Casella (1996)Go and Meuwissen et al. (2002)Go also studied the validity of using incomplete sire models for drawing genetic inferences. In contrast to this study, they found that the level of bias in estimated sire variance was negligible. No results were reported for the other model parameters, and these studies only considered very high heritabilities on the log-frailty scale; hnor2 = 1 combined with 0% censoring (Ducrocq and Casella, 1996Go) and hnor2 = 0.8 combined with about 50% censoring (Meuwissen et al., 2002Go). The reason for the different results between this and their studies is possibly because they used smaller sire and residual variances in their simulation of data compared with the ones used in this study. In combination with our results, this suggests that the magnitude of bias depends on the level of ignored unobserved individual heterogeneity. Similarly, Tempelman and Gianola (1996)Go found that the size of bias increased with increasing amount of neglected individual heterogeneity in the mixed Poisson sire model of Foulley et al. (1987)Go for count data.

An obvious question is whether the amount of unobserved individual heterogeneity, which might be present in applications, is sufficiently large to cause concern. Korsgaard et al. (1998)Go estimated the residual variance to 0.49 and the heritability on the linear log-frailty scale to 0.14 in a semiparametric sire model for time until occurrence of respiratory disease in bulls. Studies not related to animal breeding have reported residual variances in the range 0.8 to 1.2 (Hougaard, 1986Go; Hougaard et al., 1992Go; Pickles and Crouchley, 1994Go; Aalen et al., 1995Go). Henderson and Oman (1999)Go concluded that if individual heterogeneity of this size is ignored in a proportional hazards model for individual and independent survival times, the estimated fixed effects will be dependent on the level of censoring and take values that are about 70% of the true values. In this study, the estimates of the Weibull parameter Formula 13 took values that were between 50 and 80% of the true value depending on the level of censoring. In applications it therefore seems important always as a first step to use survival models, which accounts for unobserved individual heterogeneity in log-frailty to avoid misleading conclusions.

In accordance with this study, Vukasinovic et al. (1999)Go found that the correlation between true and estimated sire effects decreases with increasing level of censoring. In contrast to this study, they did not report any results indicating a dependence between parameter estimates and level of censoring probably because they fixed the 2 Weibull parameters, a log-gamma parameter associated with a random herd-year-season effect and the sire variance, when they estimated the sire effects and the remaining model parameters using an incomplete sire model.

In this study we found that the correlations between estimated and true sire effects were of the same order of magnitude for the incomplete sire model and the complete sire model. The ability of the incomplete sire model to provide nearly optimal ranking of sires has also been reported in previous studies (Ducrocq and Casella, 1996Go; Vukasinovic et al., 1999Go; Ducrocq, 2001Go). This result is not unexpected. In studies of univariate and independent survival time, it has been shown that relative importance of explanatory variables remains unchanged in the absence of censoring, when proportional hazards models ignoring individual heterogeneity are used (Solomon, 1984Go; Struthers and Kalbfleisch, 1986Go). Empirical evidence suggests that this result is also valid in the presence of censoring (Hougaard et al., 1994Go; Keiding and Andersen, 1997Go). Results from this study provide evidence that a similar consistency also is valid for random effects.

Therefore, if the major purpose of a genetic analysis is to rank sires for selection, the incomplete sire model seems to be a satisfactory and computational efficient approximation. Although this approximation seems valid under the assumption of balanced half-sib designs, Damgaard et al. (2003)Go found by means of simulation that estimated sire effects varied with simultaneous changes in daughter group size and level of censoring. This result agrees with expectation from the dependence between parameter estimates and level of censoring found in this study. Therefore, in situations in which daughter group characteristics vary between sires and within sires in course of time, this result advocates that cautions should be exercised when sire estimates are obtained from an incomplete sire model. An interesting observation from practice shows that problems with unstable sire effects across time exist in genetic evaluation for longevity of dairy cows based on an incomplete sire model (VanRaden and Powell, 2002Go). Clearly these properties of the incomplete sire model would decrease the efficiency by which sires with different daughter group characteristics can be ranked for selection.

The Bayesian analysis using Gibbs sampling derived and implemented for the complete sire models in this study showed satisfactory results and can easily be extended to more general Weibull log-normal frailty models including time-dependent covariates, random environmental effects, etc. The method successfully separated variation of additive genetic and environmental origin and resulted in parameter estimates that were consistent with the true values used to simulate data. Except for genetic analysis of dairy cows, it is generally necessary to apply an animal model to account satisfactorily for the relationships between animals. However, from a practical point of view, the Gibbs sampling approach is computational-time-demanding. Therefore, despite the undesirable properties demonstrated for the incomplete sire model in this study, the complete sire or equivalent animal model is unlikely to generate much enthusiasm in genetic applications unless an estimation procedure can be established, which is applicable also in large applications.

For estimation of parameters in the overdispersed Poisson animal model for count data, Tempelman and Gianola (1996)Go suggested a 2-step procedure, which is conceptually similar to the estimation procedure implemented in the Survival-Kit for mixed survival models (Ducrocq and Sölkner, 1998Go). That is, first variance components are estimated as the joint mode of their posterior distribution approximated using Laplacian integration. In the second step, regression parameters and random effects are estimated as the joint mode of the posterior distribution conditional on the estimated variance components. Although this method works satisfactorily for Poisson animal models (Tempelman and Gianola, 1996Go), it does not seem to work even for Weibull log-normal animal models without a residual effect in log-frailty (Ducrocq and Casella, 1996Go). The development of efficient estimation routines applicable also in large applications for the complete sire model and the equivalent animal model is an important subject for further research.


    IMPLICATIONS
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
Assessing additive genetic knowledge of survival traits based on Weibull log-normal frailty models that ignore individual heterogeneity is not an approach to be recommended for developing and running breeding programs. This is because if individual heterogeneity is present the model will generally not lead to correct genetic inferences. We therefore suggest that future research aimed at attaining genetic knowledge of survival traits should rest on survival models, which model unobserved individual heterogeneity.


    APPENDIX A
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
Fully Conditional Distributions
To derive the fully conditional distribution of model parameters from the joint posterior distribution in Equation [10], we used the notation in which Formula 13{varphi} denotes the parameter vector except the parameter {varphi}. The fully conditional distribution of Formula 13 and Formula 13 are up to proportionality given by:


Formula 14[14]


Formula 15[15]

The fully conditional posterior distribution of the jth sire transmitting ability, and the ith residual effect are given by:


Formula 16[16]


Formula 17[17]

The fully conditional posterior distribution of variance components can be recognized as truncated inverse Gamma distributions, according to:


Formula 18[18]


Formula 19[19]


    APPENDIX B
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 
Adaptive Rejection Sampling
Adaptive rejection sampling (Gilks and Wild, 1992Go) is a method for rejection sampling from a univariate log-concave probability density function f(x), in which the density only need be specified up to proportionality, say g(x) = cf(x), and D denotes the domain of f(x). To sample a point from f(x) using adaptive rejection sampling, first an initialization step is performed, in which the initial rejection envelope and squeezing function are defined. Let h(x) = log(g[x]). In this study, the same initialization step was used for all parameters Formula 19, Formula 19, (sj)j=1,...q, and (Formula 19i)i=1,...n, with an extra step for Formula 19, because its fully conditional distribution is bounded on the left at 0. In the initialization step, 3 abscissas, say x1, x2, and x3, were specified by first locating the mode, xm, using Newton-Raphson and thereafter setting x1 = xm – (h'' (x))–1, x2 = xm + 0.1(h'' [x])–1, and x3 = xm + (h''[x])–1. If x1 ≤ 0 for Formula 19, x1 was set to xm/2. Next, sampling and updating steps were performed alternately until a sample was accepted.

Implementation of the Gibbs Sampler
The following beginning values were assigned to the parameters: Formula 19 = 2.4, Formula 19 = –15, (sj)j=1,...q = 0, (Formula 19i)i=1,...n = 0, {sigma}s2 = 0.4, and {sigma}Formula 192 = 2.6. Sampling was carried out from the respective fully conditional distributions in the following order, describing 1 round of the Gibbs sampler of the complete sire model: step 1, sample Formula 19 from Equation [14]; step 2, sample (sj)j=1,...q from Equation [16]; step 3, sample {sigma}s2 from Equation [18]; step 4, sample (Formula 19i)i=1,...n from Equation [17]; step 5, sample {sigma}Formula 192 from Equation [19]; and step 6, sample Formula 19 from Equation [15].


    Footnotes
 
1 Acknowledgments: The first author would like to thank Ph.D. students A. C. Sørensen and M. Hansen for introducing me to programming in Fortran. Back

2 Corresponding author: lars.damgaard{at}agrsci.dk

Received for publication April 18, 2005. Accepted for publication December 13, 2005.


    LITERATURE CITED
 Top
 Abstract
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 IMPLICATIONS
 APPENDIX A
 APPENDIX B
 LITERATURE CITED
 


Aalen, O., E. Bjertness, and T. Sonju. 1995. Analysis of dependent survival data applied to lifetimes of amalgam fillings. Stat. Med. 14:1819–1829.[Medline]

Andersen, A. H., I. R. Korsgaard, and J. Jensen. 2000. Identifiability of parameters in and equivalence of animal and sire models for Gaussian and threshold characters, traits following a Poisson mixed model and survival traits. Research reports 417:1–36. Dep. Theor. Stat., Univ., Aahus, Denmark.

Bulmer, M. G. 1980. The mathematical theory of quantitative genetics. Clarendon Press, Oxford, England.

Damgaard, L. H., I. R. Korsgaard, J. Simonsen, O. Dalsgaard, and A. H. Andersen. 2003. Genetic versus non-genetic Weibull log-normal sire frailty models: A simulation study. Proc. 30th Interbull Tech. Workshop, Beltsville, MD 30:35–43.

Ducrocq, V., and G. Casella. 1996. A Bayesian analysis of mixed survival models. Genet. Sel. Evol. 28:505–529.

Ducrocq, V., and J. Sölkner. 1998. "The Survival Kit V3.0", a package for large analyses of survival data. Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, Australia 27:447–450.

Ducrocq, V. 2001. A two-step procedure to get animal model solutions in Weibull survival models used for genetic evaluations on length of productive life. Proc. 27th Interbull Meet. Budapest, Hungary 27:147–151.

Fisher, B. A. 1918. The correlation between relatives on the supposition of Mendelian I inheritance. Trans. Royal Soc. Edinb. 52:399–433.

Foulley, D. S., D. Gianola, and S. Im. 1987. Genetic evaluation of traits distributed as Poisson-binomial with reference to reproductive characters. Theor. Appl. Genet. 73:870–877.

Gelfand, A. E., and A. F. M. Smith. 1990. Sampling based approaches to calculating marginal densities. J. Am. Stat. Assoc. 85:398–409.

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 1995. Bayesian Data Analysis. Chapman & Hall, Boca Raton, FL.

Gilks, W. R., and P. Wild. 1992. Adaptive rejection sampling for Gibbs sampling. Appl. Statist. 41:337–348.

Henderson, R., and P. Oman. 1999. Effect of frailty on marginal regression estimates in survival analysis. J. R. Stat. Soc. B 61:367–379.

Hougaard, P. 1986. Survival models for heterogeneous populations derived from stable distributions. Biometrika 73:387–396.[Abstract/Free Full Text]

Hougaard, P. 2000. Analysis of multivariate survival data. Springer, New York, NY.

Hougaard, P., B. Harvald, and N. V. Holm. 1992. Measuring the similarities between the lifetimes of adult Danish twins born between 1881–1931. J. Am. Stat. Assoc. 87:17–25.

Hougaard, P., P. Myglegaard, and K. Borch-Johnsen. 1994. Heterogeneity models of disease susceptibility, with application to diabetic nephropathy. Biometrics 50:1178–1188.[Medline]

Kalbfleisch, J. D., and R. L. Prentice. 1980. The statistical analysis of failure time data. John Wiley and Sons, New York, NY.

Keiding, N., and P. K. Andersen. 1997. The role of frailty models and accelerated failure time models in describing heterogeneity due to omitted covariates. Stat. Med. 16:215–224.[Medline]

Korsgaard, I. R., A. H. Andersen, and J. Jensen. 2000. On different models, on heritability, reliability and related quantities of survival traits. Page 80 in Animal Genetics. EAAP Publ. No. 51. Hague, The Netherlands.

Korsgaard, I. R., P. Madsen, and J. Jensen. 1998. Bayesian inference in the semi-parametric log normal frailty model using Gibbs sampling. Genet. Sel. Evol. 30:241–256.

Manton, K. G., E. Stallard, and J. W. Vaupel. 1981. Methods for comparing the mortality experience of heterogeneous populations. Demography 18:389–410.[Medline]

Meuwissen, T. H. E., R. F. Veerkamp, B. Engel, and S. Brotherstone. 2002. Single and multitrait estimates of breeding values for survival using sire and animal models. Anim. Sci. 75:15–24.

Pickles, A., and R. Crouchley. 1994. Generalizations and applications of frailty models for survival and event data. Stat. Meth. Med. Res. 3:263–278.[Medline]

Solomon, P. J. 1984. Effect of misspecification of regression models in the analysis of survival data. Biometrika 71:291–298.[Abstract/Free Full Text]

Struthers, C. A., and J. D. Kalbfleisch. 1986. Misspecified proportional hazard models. Biometrika 73:363–369.[Abstract/Free Full Text]

Tempelman, R. J., and D. Gianola. 1996. A mixed effects model for overdispersed count data in animal breeding. Biometrics 52:265–279.

VanRaden, P. M., and R. L. Powell. 2002. Properties of international longevity evaluation and correlations with other traits. Proc. 29th Interbull Meet., Interlaken, Switzerland 29:61–65.

Vukasinovic, N., J. Moll, and N. Künze. 1999. Genetic evaluation for length of productive life with censored records. J. Dairy Sci. 82:2178–2185.[Abstract]

Vaupel, J. W., K. G. Manton, and E. Stallard. 1979. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography 16:439–454.[Medline]


This article has been cited by other articles:


Home page
J DAIRY SCIHome page
M. Rodrigues-Motta, D. Gianola, B. Heringstad, G. J. M. Rosa, and Y. M. Chang
A Zero-Inflated Poisson Model for Genetic Analysis of the Number of Mastitis Cases in Norwegian Red Cows
J Dairy Sci, November 1, 2007; 90(11): 5306 - 5315.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Damgaard, L. H.
Right arrow Articles by Andersen, A. H.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Damgaard, L. H.
Right arrow Articles by Andersen, A. H.


HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS