J. Anim Sci.
J. Anim. Sci. 2002. 80:2528-2539
© 2002 American Society of Animal Science

Linkage analysis using direct and indirect counting and relative efficiencies for codominant and dominant loci

Y. Da2, J. Garbe, N. London and J. Xu

Department of Animal Science, University of Minnesota, Saint Paul 55108

2 Correspondence: 265D Haecker Hall (phone: (612) 625-7780; fax: (612) 625-1283; E-mail: yda@tc.umn.edu).


    Abstract
A method based on direct and indirect counting is developed for rapid and accurate linkage analysis for codominant and dominant loci. Methods for estimating gender-specific recombination frequencies are available for cases where at least one of the two loci is multiallelic and for biallelic loci with mixed parental linkage phases where at least one locus is codominant. Most of the estimates of gender-average and gender-specific recombination frequencies require iterative solutions. The new method makes use of the full data set, yields exact estimates of the recombination frequencies when the observed and expected genotypic frequencies are equal, and is computationally efficient. Relative efficiency of various data types is affected by the inheritance mode and by parental linkage phases of biallelic loci, but is unaffected by the locus polymorphism when the full data set is used for linkage analysis. The ability to determine parental linkage phases is affected by the locus polymorphism as well as the inheritance mode. The intercross (or F-2 design) is more efficient for mapping codominant loci, whereas the backcross is more efficient if dominance is involved. Mixed parental linkage phases of biallelic loci are less efficient than coupling or repulsion linkage phases. Ignoring noninformative offspring results in biased estimates of recombination frequency for biallelic loci only and in reduced LOD scores for all cases.

Key Words: Codominance • Dominance • Linkage Analysis • Loci


    Introduction
While computationally efficient methods are available for large-scale linkage analysis for codominant loci (Green et al., 1990), rapid methods are unavailable for mapping dominant loci and for the map integration of dominant and codominant loci. Most computer programs that provide linkage analysis for dominant loci, such as LINKAGE (Lathrop et al., 1984), implement computationally intensive likelihood analysis and generally limit the number of loci that can be analyzed jointly. A computationally efficient method for linkage analysis with codominant and dominant inheritance would be a valuable tool for mapping dominant genes and for the map integration of codominant and dominant loci, because the dominant inheritance mode is typical of many disease genes and many dominant markers exist (Ajmone-Marsan et al., 1997; Cushwa and Medrano, 1996; Knorr et al., 1999). Knapp et al. (1995) derived an analytical formula for maximum likelihood estimation of recombination frequency between two dominant loci in repulsion linkage phase. The mathematical simplicity of such an analytical formula makes it computationally efficient for large-scale linkage analysis. However, many other cases of linkage analysis do not have a simple analytical formula for estimating recombination frequencies based on likelihood functions. An understanding of the relative efficiencies of various types of genotypic data is useful for planning mapping experiments. Most results on relative efficiencies of genotypic data (Allard, 1956; Green, 1981) were based on the approximate variances and covariances of estimated recombination frequencies, but the accuracy of such an approximation is unclear. The purpose of this article is to develop simple solutions for linkage analysis to facilitate large-scale joint linkage analysis with codominant and dominant loci, and to evaluate the relative efficiencies of various types of genotypic data to provide insights for designing mapping experiments.


    Material and Methods
General Strategy.
Families with linkage information will be divided into two categories: families that can be analyzed using the direct counting method (Ott, 1999) for all offspring (Category I), and families that cannot be analyzed using the direct counting method for all offspring (Category II). Recombination frequencies will be estimated using the direct counting method (Ott, 1999) for Category I, and using "direct and indirect counting" for Category II. Then, the two estimates will be combined to obtain the overall estimate of recombination frequency. The focus of this article is on the new method of direct and indirect counting for Category II.

The Method of Indirect Counting.
The purpose of using indirect counting is to develop a method for linkage analysis that uses the full data set including noninformative offspring with minimal mathematical complexity and computational difficulty to facilitate large-scale applications. Noninformative offspring do not have information to determine parental allele transmission unequivocally (Da and Lewin, 1995) and cannot be used for linkage analysis using the direct counting method. However, noninformative offspring are expected to contain a percentage of unobservable recombinants and such unobservable recombinants could be estimated using the method of indirect counting to be described below. Therefore, "noninformative" offspring for direct counting in fact are at least "partially" informative for indirect counting. "Underlying genotype" is used to refer to a phase-known genotype, whereas "genotype" or "observed genotype" is used to refer to a genotype with known allele contents only. For example, AaBb is an observed genotype with two possible underlying genotypes: AB/ab and Ab/aB. "Phenotype" of a locus refers to the fact that the allele content of the locus is unknown and is used to describe observations of dominant loci. Based on the gene counting method of Smith (1957), the indirect counting method calculates the expected number of recombinants contained in the noninformative offspring using the following formula


$$k_{ei} = k_i \sum_j m_{ij} v_{ij} \qquad [1]$$

where k_ei = expected number of recombinants contained in genotype (or phenotype) i, m_ij = number of recombinants in the underlying genotype j of genotype (or phenotype) i, v_ij = conditional probability of recombinants in noninformative offspring for a given two-locus genotype (for codominant loci) or phenotype (for dominant loci), and k_i = the total number of noninformative offspring with the given genotype (or phenotype). The general formula for calculating v_ij is


$$v_{ij} = \frac{p_j}{q_i} \qquad [2]$$

where p_j = probability of underlying genotype j with recombinant(s), and q_i = probability of the observed genotype or phenotype. Note that p_j and q_i can be equal in some cases. The observed number of recombinants in the same category of families is obtained by direct counting from informative offspring for which parental allele transmission can be determined unequivocally. Adding the numbers of expected and observed recombinants yields the estimated total number of recombinants. Dividing this estimated number of recombinants by the total number of meioses yields the estimate of recombination frequency from families where noninformative offspring exist. If gender-average (sex-average) recombination frequency is assumed,


$$\theta = \frac{n_r}{T} \qquad [3]$$

where θ = gender-average recombination frequency, n_r = total number of expected and observed recombinants, and T = total number of meioses. Since n_r is a function of θ, Eq. [3] generally is a polynomial function of θ. In this article, an analytical solution for θ is provided if Eq. [3] is a polynomial function of degree 3 or less, and an iterative solution is used if Eq. [3] is a higher order polynomial function. As shown in Da and Lewin (1995), a cross between heterozygous genotypes, referred to as an "intercross," is the only situation where noninformative offspring may exist if the genotypes of both parents are known. Therefore, the method of indirect counting will consider various situations of intercross, including multiallelic, biallelic, codominant, and dominant loci.
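The counting step can be illustrated with a minimal Python sketch for the simplest intercross, AB/ab x AB/ab with two biallelic codominant loci, where AaBb is the only noninformative genotype. The sketch assumes that the underlying genotypes AB/ab and Ab/aB of AaBb have probabilities (1 - θ)^2/2 and θ^2/2, and it repeats the update θ = n_r/T until successive values agree. The function names, the plain fixed-point update, and the example counts are illustrative assumptions, not the iterative equations of this article.

```python
def expected_hidden_recombinants(k_hidden, theta):
    """Expected recombinant gametes among k_hidden AaBb offspring.

    Assumes both parents are AB/ab, so AaBb is either AB/ab (no recombinant)
    or Ab/aB (two recombinants).
    """
    p_recombinant = theta ** 2 / 2.0                      # P(underlying Ab/aB)
    q_observed = ((1.0 - theta) ** 2 + theta ** 2) / 2.0  # P(observed AaBb)
    v = p_recombinant / q_observed                        # conditional probability
    m = 2                                                 # recombinants in Ab/aB
    return k_hidden * m * v


def estimate_theta(n_one, n_two, n_hidden, n_offspring,
                   start=0.01, tol=1e-9, max_iter=1000):
    """Iterate theta = n_r / T, with T = 2 meioses per offspring.

    n_one / n_two: informative offspring carrying one / two recombinants.
    n_hidden: noninformative AaBb offspring.
    Returns (estimate, number of iterations used).
    """
    total_meioses = 2 * n_offspring
    theta = start
    for iteration in range(1, max_iter + 1):
        observed = n_one + 2 * n_two                      # direct counting
        expected = expected_hidden_recombinants(n_hidden, theta)  # indirect counting
        new_theta = (observed + expected) / total_meioses
        if abs(new_theta - theta) < tol:
            return new_theta, iteration
        theta = new_theta
    raise RuntimeError("iteration did not converge")


# Hypothetical counts: 200 offspring distributed exactly as expected for
# theta = 0.20 (64 with no recombinant, 64 with one, 4 with two, 68 AaBb).
for start in (0.01, 0.45):
    estimate, iterations = estimate_theta(64, 4, 68, 200, start=start)
    print(f"start={start:.2f} estimate={estimate:.6f} iterations={iterations}")
```

For these counts both starting values should settle on 0.20, because the counts were constructed to equal their expectations at that value.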

Gender-Average and Gender-Specific Recombination Frequencies.
Gender-average (sex-average) recombination frequency refers to the recombination frequency estimated from meioses of both genders, and gender-specific (sex-specific) recombination frequencies refer to two recombination frequencies estimated from male and female meioses separately. Gender-average recombination frequency is always estimable as long as linkage information exists. However, gender-specific recombination frequencies are not always estimable. When the two loci are biallelic and the heterozygous parents have the same linkage phase (coupling or repulsion), gender-specific recombination frequencies are nonestimable regardless of whether the loci are codominant or dominant, because two independent equations cannot be established to estimate two separate recombination frequencies. For two dominant loci, the case with mixed parental linkage phases (one parent is in coupling phase and the other in repulsion phase) is the only situation where two equations could be established to estimate gender-specific recombination frequencies. However, neither our method nor the maximum likelihood method would yield reliable estimates. Therefore, estimating gender-specific recombination frequencies using dominant loci is deemed impractical and will not be considered in this article. Methods to estimate gender-specific recombination frequencies will be developed for cases where at least one locus has multiple alleles or the parents have mixed linkage phases with at least one codominant locus. In analogy to Eq. [3], gender-specific (sex-specific) recombination frequencies can be estimated using the following equations simultaneously:


$$x = \frac{n_x}{T_x} \qquad [4]$$

$$y = \frac{n_y}{T_y} \qquad [5]$$

where x = female recombination frequency, n_x = total number of expected and observed female recombinants, T_x = total number of female meioses, y = male recombination frequency, n_y = total number of expected and observed male recombinants, and T_y = total number of male meioses. In all cases covered by this article, Eqs. [4] and [5] will be solved by iterative methods.

Pooling of Estimates.
For families using direct and indirect counting, estimates of a recombination frequency from all s families can be pooled to obtain the overall estimate from all families using the following formula:


$$\theta_1 = \frac{\sum_{i=1}^{s} n_{ri}}{\sum_{i=1}^{s} T_i} \qquad [6]$$

where θ_1 = the overall estimate of the recombination frequency from families where noninformative offspring exist, n_ri = expected number of recombinants in family i, and T_i = number of gametes in family i. Equation [6] can be used to obtain the pooled estimates of x and y except that θ is replaced with x or y, and n_ri and T_i are replaced with the corresponding gender-specific numbers defined in Eqs. [4] and [5]. When gender-specific recombination frequencies are available, the gender-average recombination frequency will be obtained as:


$$\theta = a_1 x + a_2 y \qquad [7]$$

where a_1 = T_x/(T_x + T_y) and a_2 = T_y/(T_x + T_y). As usual, the LOD score for a gender-average recombination frequency is defined as


$$Z_\theta = \log_{10}\frac{L(\theta)}{L(\theta = 1/2)} \qquad [8]$$

where L(θ) = likelihood function under the hypothesis of linkage, and L(θ = 1/2) = likelihood function under the hypothesis of no linkage. The LOD score for testing the significance of gender-specific recombination frequencies in the literature (e.g., Ott, 1999) is:


[9]

The LOD score given by Eq. [9] indicates how much the gender-specific model is favored over the gender-average model, but is not a test for the significance of each gender-specific recombination frequency. The following LOD scores could be defined to test the significance of gender-specific recombination frequencies:


[10]


[11]

As will be shown in this article, a family with noninformative offspring is not as informative as a family without noninformative offspring, even when noninformative offspring are used in the linkage analysis. To account for this type of unequal information, estimates of a recombination frequency from the two categories of families should be weighted differently. Since the LOD score is a summary statistic of the number of observations and informativeness of a family type, the LOD of each family type should be a logical choice as the weight. Let θ_2 = estimate of recombination frequency from families without noninformative offspring, and let Z_1 and Z_2 be the LOD scores for θ_1 and θ_2, respectively. Then, the overall estimate from all families can be obtained as θ = c_1θ_1 + c_2θ_2, where c_1 = Z_1/(Z_1 + Z_2) and c_2 = Z_2/(Z_1 + Z_2), with c_1 + c_2 = 1. A gender-specific recombination frequency (x or y) over families can be obtained similarly using the LOD scores defined by Eqs. [10] and [11]. However, this article does not include families that do not have separate estimates of x and y in the calculation of x or y across families. We have observed that "forced" estimates of x and y from those families tend to yield the same x and y values, so that including such families without separate x and y estimates would tend to diminish the difference between x and y.
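The three combination steps of this section can be sketched in a few lines of Python. The sketch assumes that pooling over families divides the summed recombinants by the summed meioses, that the gender-average value weights x and y by their shares of meioses, and that the two family categories are weighted by their LOD scores as described above. All function names and the numbers in the usage lines are hypothetical.

```python
def pooled_estimate(recombinants_per_family, meioses_per_family):
    """Pool families: total (observed + expected) recombinants over total meioses."""
    return sum(recombinants_per_family) / sum(meioses_per_family)


def gender_average(x, y, female_meioses, male_meioses):
    """Weight female (x) and male (y) estimates by their share of meioses."""
    total = female_meioses + male_meioses
    a1 = female_meioses / total
    a2 = male_meioses / total
    return a1 * x + a2 * y


def lod_weighted_estimate(theta_1, lod_1, theta_2, lod_2):
    """Combine the two family categories using LOD scores as weights."""
    c1 = lod_1 / (lod_1 + lod_2)
    c2 = lod_2 / (lod_1 + lod_2)
    return c1 * theta_1 + c2 * theta_2


# Hypothetical numbers, for illustration only.
theta_1 = pooled_estimate([10.0, 30.0], [100, 100])      # Category II families
theta_avg = gender_average(0.10, 0.20, 200, 200)         # equal meioses per gender
theta_all = lod_weighted_estimate(theta_1, 3.0, 0.10, 9.0)
print(theta_1, theta_avg, theta_all)
```

With equal numbers of male and female meioses the gender-average value reduces to (x + y)/2.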

Relative Efficiency of Different Genotypic Data.
Relative efficiency of different genotypic data, including multiallelic, biallelic, codominant and dominant genotypes, will be compared using the unit LOD score and the likelihood ratio for testing parental linkage phases. The unit LOD score (u) will be defined as the expected LOD score per offspring assuming gender-average recombination frequency, i.e.,


$$u = \frac{E(Z_\theta)}{N} \qquad [12]$$

where N is the number of offspring, and Z_θ is defined by Eq. [8]. The definition of the unit LOD score is the same as the ELOD in Lander and Botstein (1989). Here, "unit LOD" rather than "ELOD" is used to avoid potential confusion with the ELOD defined differently in Ott (1999). The type of genotypic data with higher unit LOD score is considered more efficient for linkage analysis. An advantage of the unit LOD score over the overall LOD score (Z_θ) is that the unit LOD score can be expressed in terms of the recombination frequencies so that the numbers of observations are no longer involved. This is convenient for studying the relative efficiencies without having to assume a specific set of numbers of observations. It can be shown that the unit LOD score can be obtained by replacing the numbers of genotypes in Z_θ by the corresponding genotypic probabilities. Unit LOD scores for specific cases are defined in Appendix 1. The backcross design (backcross to the recessive line is assumed if dominance is involved) is included for comparing relative efficiencies with the intercross or F-2 design. The information available for testing parental linkage phases is a measure of data efficiency. When parental linkage phases are unknown, such as when the grandparents have missing genotypes and the allele transmission from the grandparents to the parent cannot be determined, the likelihood ratio test based on the offspring genotypic distribution can be used to determine the parental linkage phases. The type of genotypic data that yields more statistical confidence for determining parental linkage phases is more efficient for linkage analysis. Given two loci, four combinations of parental linkage phases are possible. The likelihood ratio for the two highest likelihood functions will be used for comparing efficiency for inferring parental linkage phases. Likelihood functions for testing parental linkage phases are given in Appendix 2.
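As a small illustration of the unit LOD idea, the Python sketch below uses the simplest possible case, fully informative phase-known meioses, for which the likelihood is the standard binomial form θ^r (1 - θ)^(T - r). This is an assumption made for illustration; the case-specific expressions of Appendix 1 are not reproduced here, and the function names are hypothetical. The sketch also checks the statement that the unit LOD follows from the overall LOD when counts are replaced by their probabilities.

```python
import math


def lod_phase_known(recombinants, meioses, theta):
    """LOD = log10[L(theta) / L(1/2)] for phase-known, fully informative meioses."""
    nonrecombinants = meioses - recombinants
    log_likelihood = 0.0
    if recombinants:                     # skip the term to avoid log10(0) * 0
        log_likelihood += recombinants * math.log10(theta)
    if nonrecombinants:
        log_likelihood += nonrecombinants * math.log10(1.0 - theta)
    return log_likelihood - meioses * math.log10(0.5)


def unit_lod_phase_known(theta, meioses_per_offspring=1):
    """Expected LOD per offspring: counts replaced by their probabilities."""
    return meioses_per_offspring * lod_phase_known(theta, 1.0, theta)


# One informative meiosis per offspring (e.g., a backcross) versus two
# (both parents fully informative); the factor of two is an assumption
# about which meioses are scored.
for theta in (0.05, 0.20, 0.50):
    print(theta, unit_lod_phase_known(theta), unit_lod_phase_known(theta, 2))

# With observed counts equal to expected counts, LOD / N equals the unit LOD.
n_offspring = 200
overall = lod_phase_known(0.2 * n_offspring, n_offspring, 0.2)
print(overall / n_offspring, unit_lod_phase_known(0.2))
```

The unit LOD in this simple case falls from log10(2) at θ = 0 to zero at θ = 0.5, which matches the intuition that loosely linked loci carry little linkage information per offspring.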

Bias and Reduction in LOD Score Due to Ignoring Noninformative Offspring.
To quantify the benefit of including noninformative offspring in linkage analysis, bias in estimates of recombination frequency due to ignoring noninformative offspring and the reduction in unit LOD score are evaluated under the assumption that the observed offspring distribution equals the expected. Bias is defined as the difference between the estimate using informative offspring only and the estimate using full data. The reduction in LOD score is defined as the difference in the unit LOD scores between using the full data and using informative offspring only. As will be shown in this article, the methods of using full data developed in this article yield estimates that are exactly the same as the true parameters under the assumption that the observed offspring distribution equals the expected. Therefore, the bias in recombination frequency due to ignoring noninformative offspring can be expressed as θ_d - θ, where θ_d = estimate of recombination frequency using direct counting, and θ = the true recombination frequency.


    Results and Discussion
Results of the new method for estimating recombination frequencies will be presented for eight cases, ordered from the most informative loci (both loci have multiple codominant alleles) to the least informative loci (both loci are dominant with mixed parental linkage phases).

Both Loci Are Multiallelic and Codominant (MM Data Type).
"Multiallelic" in this article refers to three or more alleles per locus for the two heterozygous parents. Such a definition is used because three alleles per locus for the parents result in 100% informative offspring for the locus. The direct counting method is used for this type of data. The purpose of describing this type of data is not to develop a new method, but to use as a comparison to less informative types of data where noninformative offspring exist. By the direct counting method, the gender-specific recombination frequencies are estimated by Eqs. [4]Go and [5]Go. When x and y are available, gender-average recombination frequency can be obtained by Eq. [7]Go. Likelihood functions and LOD scores are given in Appendices 1 and 2.

Multiallelic and Biallelic Codominant Loci (MB Data Type).
From Table 1, gender-specific recombination frequencies can be obtained by the following iterative solutions:


Table 1. Genotypic frequency, number of observations, and the number of recombinants in the offspring from the intercross of A1B/A2b (male) x A3B/A4b (female)
 

[13]


[14]

where x = female recombination frequency, y = male recombination frequency, superscript i = iteration number, a = (k_2 + k_3 + k_6 + k_7)/n, b = (k_9 + k_12)/n, c = (k_10 + k_11)/n, and d = (k_2 + k_4 + k_5 + k_7)/n, and where k_1 through k_12 are defined in Table 1. Then the gender-average recombination frequency can be estimated as θ = (x + y)/2, noting that the male and female parents have the same number of meioses. This method of estimating gender-average recombination frequency will also be used for other cases where gender-specific recombination frequencies are available.

Biallelic Codominant Loci with Coupling or Repulsion Parental Linkage Phases (BB Data Type).
For this case, gender-specific recombination frequencies are unavailable and gender-average recombination frequency can be estimated based on Table 2. Substituting the n_r in Table 2 into Eq. [3], Eq. [3] can be written as a third degree polynomial function of θ, and the solution for θ is


Table 2. Genotypic frequency, number of observations, and the number of recombinants in the offspring from the intercross of AB/ab x AB/ab
 

[15]

where s = (1/2)[a_1 a_2/3 - (2/27)a_1^3 - c], t = (1/3)(a_2 - a_1^2/3), a_1 = (T + c_1 + n_4)/T, a_2 = 0.5 + c_1/T, and where T = 2n, c_1 = 2n_3 + n_2, c = c_1/(2T), n_1 = k_1 + k_9, n_2 = k_2 + k_4 + k_6 + k_8, n_3 = k_3 + k_7, and n_4 = k_5. Note that Eq. [15] is derived under the assumption of coupling parental linkage phases but is applicable to the repulsion linkage phases by reversing the allele definitions for one of the two loci.
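The coefficients defined above are consistent with the cubic θ^3 - a_1 θ^2 + a_2 θ - c = 0, which reduces to z^3 + 3tz + 2s = 0 after the substitution θ = z + a_1/3. That cubic form is inferred here from the definitions of s and t, not quoted from Eq. [15], so the Python sketch below should be read as one possible way to obtain the root under that assumption. It uses the textbook Cardano and trigonometric formulas, and the function names are hypothetical.

```python
import math


def _cbrt(value):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(value) ** (1.0 / 3.0), value)


def depressed_cubic_real_roots(s, t):
    """Real roots of z**3 + 3*t*z + 2*s = 0.

    Repeated roots (discriminant exactly zero) are not enumerated separately.
    """
    discriminant = s * s + t ** 3
    if discriminant >= 0.0:
        root = math.sqrt(discriminant)
        return [_cbrt(-s + root) + _cbrt(-s - root)]   # single real root
    r = math.sqrt(-t)                                  # t < 0 in this branch
    phi = math.acos(max(-1.0, min(1.0, -s / r ** 3)))
    return [2.0 * r * math.cos((phi - 2.0 * math.pi * k) / 3.0) for k in range(3)]


def theta_candidates(n1, n2, n3, n4):
    """Real roots of the assumed cubic for the counts n1..n4 defined in the text."""
    n = n1 + n2 + n3 + n4
    T = 2 * n                          # two meioses per offspring
    c1 = 2 * n3 + n2
    a1 = (T + c1 + n4) / T
    a2 = 0.5 + c1 / T
    c = c1 / (2 * T)
    s = 0.5 * (a1 * a2 / 3.0 - (2.0 / 27.0) * a1 ** 3 - c)
    t = (a2 - a1 ** 2 / 3.0) / 3.0
    return sorted(z + a1 / 3.0 for z in depressed_cubic_real_roots(s, t))


# Hypothetical counts: 200 offspring distributed as expected for theta = 0.20.
print(theta_candidates(64, 64, 4, 68))
```

When more than one real root exists, choosing the root inside the parameter space (0 to 0.5) is left to the caller, since the selection rule is not stated in the surrounding text.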

Biallelic Codominant Loci with Mixed Parental Linkage Phases (BB-CR).
From Table 3, gender-specific recombination frequencies can be obtained by the following iterative solutions:


Table 3. Offspring phenotypes and recombinants from the mating of AB/ab (male) x Ab/aB (female)
 

[16]


[17]

where x = female recombination frequency, y = male recombination frequency, a = (k_1 + k_9)/n, b = k_5/n, c = (k_2 + k_4 + k_6 + k_8)/n, and d = (k_3 + k_7)/n.

Multiallelic Codominant Locus and Dominant Locus (MD Data Type).
From Table 4, gender-specific recombination frequencies can be obtained by the following iterative solutions:


Table 4. Genotypic frequency, number of observations, and the number of recombinants in the offspring from the intercross of A1B/A2b (male) x A3B/A4b (female) with B being dominant over b
 

[18]


[19]

where a = k_5/n, b = k_6/n, c = k_7/n, d = k_8/n, e = (k_1 + k_2)/n, and f = (k_1 + k_3)/n.

Biallelic Codominant Locus and Dominant Locus with Coupling or Repulsion Parental Linkage Phases (BD Data Type).
Gender-specific recombination frequencies are nonestimable for this case. From Table 5, the gender-average recombination frequency can be obtained using the following iterative solution:


Table 5. Genotypic frequency, number of observations, and the number of recombinants in the offspring from the intercross of AB/ab x AB/ab with B being dominant over b
 

[20]

where a = (2k_2 + k_4)/(2n), b = k_1/n, c = k_3/(2n), and d = k_5/n.

Biallelic Codominant Locus and Dominant Locus with Mixed Parental Linkage Phases (BD-CR Data Type).
From Table 6, gender-specific recombination frequencies can be obtained by the following iterative solutions:


Table 6. Offspring phenotypes and recombinants from the mating of AB/ab (male) x Ab/aB (female)
 

[21]


[22]

where a = k_1/n, b = k_2/n, c = k_3/n, d = k_4/n, e = k_5/n, f = k_6/n, v_1 = [x(1 - y) + xy]/(1 - y + xy), v_2 = xy/(1 - y + xy), v_3 = 2[x(1 - y) + xy]/(1 + x + y - 2xy), v_4 = 2[(1 - x)y + xy]/(1 + x + y - 2xy), v_5 = xy/[(1 - x)(1 - y) + xy], v_6 = [x + (1 - x)y]/[(1 - x)(1 - y) + xy], v_7 = xy/(1 - x + xy), and v_8 = [(1 - x)y + xy]/(1 - x + xy).

Two Dominant Loci with Coupling Linkage Phases (DD-CC Data Type).
In this case, both parents are assumed to have coupling linkage phase (Table 7). The gender-average recombination frequency can be obtained from the following iterative solution:


Table 7. Genotypic frequency, number of observations, and the number of recombinants in the offspring from the intercross of AB/ab x AB/ab with allele A being dominant over a and B being dominant over b
 

[23]

where a = k_1/(2n), and b = (k_2 + k_3)/(2n).

Two Dominant Loci with Mixed Linkage Phases (DD-CR Data Type).
In this case, one parent is assumed to have coupling phase and the other repulsion phase (Table 8). The gender-average recombination frequency can be obtained from the following iterative solution:


Table 8. Genotypic frequency, number of observations, and the number of recombinants in the offspring from the intercross of AB/ab x Ab/aB with A being dominant over a and B being dominant over b
 

[24]

where a = k_1/(2n), b = (k_2 + k_3)/(2n), and c = k_4/(2n). For the case when the two loci are dominant and both parents have repulsion linkage phase (DD-RR data type), the analytical formula for maximum likelihood estimation of recombination frequency is available from Knapp et al. (1995).

Numerical Results for Estimating Recombination Frequencies.
Equations [13] through [24] were validated and tested using 200 offspring genotypes generated with the requirement that the observed genotypic frequencies equal the expected. The true parameters used to generate the offspring genotypes were θ = 0.20, x = 0.10, and y = 0.20. Two sets of extreme starting values, θ_0 = 0.01, x_0 = 0.01, and y_0 = 0.01, and θ_0 = 0.45, x_0 = 0.45, and y_0 = 0.45, were used to test the robustness of the iterative solutions to starting values. Equations [13] through [24] all yielded estimates of recombination frequencies that are exactly the same as the assumed true parameters. The iterative solutions required fewer than 55 iterations to converge with a tolerance level of 10^-9 except for the case of dominance with mixed linkage phases (DD-CR), which required 235 to 284 iterations to converge. In terms of CPU time, all the iterative solutions required less than 1 s to converge on an 800-MHz laptop computer. The two different sets of extreme starting values did not have a significant effect on the number of iterations or computing time. The case of dominance with mixed parental linkage phases not only required more iterations, but also was the least efficient data type, as discussed below. For all the cases, direct and indirect counting yielded exactly the same results as maximum likelihood analysis. Because of its mathematical simplicity and computational efficiency, the method of direct and indirect counting should be a useful addition or alternative to current methods available for linkage analysis, including complex maximum likelihood analysis. When combined with the strategy of two-point analysis for linkage detection, the method of direct and indirect counting should allow rapid large-scale joint linkage analysis of codominant and dominant loci, which is useful to facilitate mapping dominant loci using codominant markers and the map integration of codominant and dominant loci.
The estimates of recombination frequencies from direct and indirect counting are the expected fraction of recombinants whether the estimates are within or outside the parameter space. This is helpful in situations where the estimates are not easily interpretable. For example, if a maximum likelihood analysis using numerical maximization yielded an estimate outside the parameter space, the estimate itself could not tell whether the problem was due to the algorithm of numerical maximization or due to a wrong model or sampling. As shown in London et al. (2002) and Xu et al. (2002), a wrong inheritance model could result in a serious bias in estimating recombination frequencies (including estimates out of the parameter space) and such a bias could be evaluated conveniently using the method of direct and indirect counting.

Relative Efficiencies.
Figure 1 shows that the unit LOD scores are affected by the inheritance mode of each locus and the parental linkage phases but unaffected by the polymorphism of the locus for all cases where noninformative offspring exist. Genotypic data with 100% informative offspring (both loci are multiallelic and codominant; MM in Figure 1) is the most efficient data type for linkage analysis even though offspring noninformative for direct counting in other types of data are used by indirect counting. This implies that an offspring noninformative for direct counting is only partially informative for indirect counting and is never as good as an informative offspring. Mixed linkage phases (BB-CR, BD-CR, DD-CR in Figure 1) are less efficient than coupling and repulsion phases. For dominant loci, coupling linkage phases (DD-CC in Figure 1) are strikingly more efficient than the mixed and repulsion phases (DD-CR and DD-RR in Figure 1). For example, assuming θ = 0.05, the unit LOD for the repulsion phases is only 22% of that for the coupling phases, whereas the unit LOD for the mixed phases is a mere 12% of that for the coupling phases. The backcross design is better than the intercross or F-2 design for mapping dominant loci but is worse for mapping codominant loci. Compared to results of relative efficiencies in the literature, the results in this article have new information regarding the effect of marker polymorphism on the unit LOD scores and the ability to determine parental linkage phases, and have essentially the same conclusion regarding the effect of inheritance mode; that is, dominance has less linkage information than codominance and backcross is more efficient for dominant loci (Allard, 1956; Green, 1981; Knapp et al., 1995).
However, the result regarding dominant loci with mixed linkage phases (DD-CR) is somewhat different from that in Green (1981), where DD-CR is found to be more efficient than repulsion linkage phases for small recombination frequencies. This difference could be attributable to different methods for evaluating relative efficiencies; that is, this article used the unit LOD whereas Green (1981) used the information matrix, which is an approximation of the second moments of parameter estimates based on the second derivatives of the log-likelihood function. It is worth noting that the same author recommended avoiding the experimental design using mixed linkage phases (DD-CR in Figure 1) for linkage analysis (p. 85, Green, 1981). For inference about parental linkage phases, both the locus polymorphism and inheritance mode affect the statistical power; that is, the power for detecting parental linkage phases decreases as locus polymorphism decreases and as the number of dominant loci increases (Figure 2). Note that the BB-CR data type (Table 3) does not have the ability to distinguish between the two possible cases of mixed linkage phases, Phase II and Phase III in Table 9, using the likelihood ratio test in Appendix 2. Therefore, knowing parental linkage phases is a necessary condition to estimate gender-specific recombination frequencies using Eqs. [16] and [17]. If mixed parental linkage phases are identified as the most likely parental phases but cannot be identified as Phase II or III, Eqs. [16] and [17] still can be used but cannot identify which estimate is for the male and which is for the female recombination frequency.



Figure 1. Unit LOD scores for various types of data for linkage analysis. MM: two multiallelic codominant loci; MB: one multiallelic codominant locus and one biallelic codominant locus; BB: both loci are biallelic and codominant with coupling or repulsion linkage phases; BB-CR: both loci are biallelic and codominant with mixed linkage phases; MD: one multiallelic codominant locus and one dominant locus; BD: one biallelic codominant locus and one dominant locus with coupling or repulsion linkage phases; BD-CR: one bi-allelic codominant locus and one dominant locus with mixed parental linkage phases; DD-CC: two dominant loci in coupling linkage phase; DD-CR: two dominant loci with mixed parental linkage phase; DD-RR: two dominant loci in repulsion linkage phase.

 


Figure 2. Likelihood ratios for identifying parental linkage phase. The likelihood ratio is based on the largest two likelihood functions for each case and is in log10 scale calculated from 200 F-2 offspring. MM: two multiallelic codominant loci; MB: one multiallelic codominant locus and one biallelic codominant locus; BB: both loci are biallelic and codominant; MD: one multiallelic codominant locus and one dominant locus; BD: one biallelic codominant locus and one dominant locus; DD: two dominant loci.

 

Table 9. Association between genotypic probabilities and numbers of observations for testing parental linkage phases for an intercross of A1A2B1B2 x A3A4B3B4
 
Bias and Reduction in LOD Score Due to Ignoring Noninformative Offspring.
The direct counting method does not apply when both loci are dominant, because only one genotype is informative (the aabb genotype in Tables 7 and 8) and the recombination frequency therefore cannot be estimated. This section thus applies only to the cases where at least one locus is codominant. For biallelic loci with coupling or repulsion linkage phases where at least one locus is codominant (BB, BD), gender-specific recombination frequencies are unavailable and the bias in the gender-average recombination frequency is $d_1 = \theta_d - \theta = -\theta(1 - \theta)(1 - 2\theta)/[(1 - \theta)^2 + \theta^2]$, where $\theta_d$ = the gender-average recombination frequency estimated by the direct counting method. For biallelic loci with mixed parental linkage phases (BB-CR, BD-CR), $\theta_d = 0.5$ irrespective of the true parameter value, and the bias is $d_2 = \theta_d - \theta = 0.5 - \theta$. For the same data types, bias in the female recombination frequency is $d_3 = x_d - x = x(1 - x)(1 - 2y)/(x + y - xy)$, where $x_d$ = the female recombination frequency estimated by the direct counting method. Bias in the male recombination frequency can be obtained using the same formula except that $x$ and $y$ are switched. The formulas of $d_1$, $d_2$, and $d_3$ show that ignoring noninformative offspring yields underestimates of recombination frequencies for coupling or repulsion parental linkage phases and overestimates for mixed parental linkage phases. As shown in Figure 3, the absolute bias reaches the maximum at $\theta = 0.257$ for the coupling or repulsion phases, where the bias is 0.151, whereas the bias is an increasing function of $\theta$ and decreasing function of $x$ or $y$ for the mixed phases. These results indicate that ignoring noninformative offspring could result in a serious bias for biallelic loci. Bias due to ignoring noninformative offspring was also reported for half-sib designs with biallelic codominant loci (Gomez-Raya, 2001).
Our analytical and numerical results show that ignoring noninformative offspring does not result in a bias in the estimate of the recombination frequency when at least one locus is multiallelic. As expected, ignoring noninformative offspring results in a reduction in the LOD score for all cases (Figure 4).
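
The bias formulas d1, d2, and d3 given above are simple enough to evaluate numerically. The following is a minimal Python sketch, not the authors' program: the function names and the grid step are illustrative assumptions, and x and y are taken to be the female and male recombination frequencies as in the text.

```python
# Sketch only: evaluates the bias formulas d1, d2, d3 quoted in the text.
# Function names and the grid resolution are illustrative assumptions.


def bias_coupling_or_repulsion(theta):
    """d1: bias in the gender-average estimate for BB and BD data."""
    return -theta * (1 - theta) * (1 - 2 * theta) / ((1 - theta) ** 2 + theta ** 2)


def bias_mixed_average(theta):
    """d2: bias for BB-CR and BD-CR data, where the direct estimate is always 0.5."""
    return 0.5 - theta


def bias_mixed_female(x, y):
    """d3: bias in the female estimate (x = female, y = male frequency).

    Undefined at x = y = 0, where the denominator vanishes.
    """
    return x * (1 - x) * (1 - 2 * y) / (x + y - x * y)


def largest_absolute_bias(step=0.001):
    """Grid search over 0 <= theta <= 0.5 for the theta that maximizes |d1|."""
    n = round(0.5 / step)
    grid = [i * step for i in range(n + 1)]
    theta_max = max(grid, key=lambda t: abs(bias_coupling_or_repulsion(t)))
    return theta_max, bias_coupling_or_repulsion(theta_max)


if __name__ == "__main__":
    theta_max, d1 = largest_absolute_bias()
    print(f"|d1| is largest near theta = {theta_max:.3f} (d1 = {d1:.3f})")
    print(f"d2 at theta = 0.1: {bias_mixed_average(0.1):.3f}")
    print(f"d3 at x = 0.1, y = 0.2: {bias_mixed_female(0.1, 0.2):.3f}")
```

The grid search gives a quick check on where the absolute bias for the coupling or repulsion phases peaks, which can be compared with the values quoted above. Swapping the arguments of `bias_mixed_female` gives the corresponding bias for the male recombination frequency.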



Figure 3. Bias in estimates of recombination frequency due to ignoring noninformative offspring. BB: both loci are biallelic and codominant; BB-CR: two codominant loci with mixed parental linkage phases; BD: one biallelic codominant locus and one dominant locus with coupling or repulsion linkage phases; BD-CR: one biallelic codominant locus and one dominant locus with mixed parental linkage phases.

 


Figure 4. Reduction in LOD scores due to ignoring noninformative offspring. The reduction in LOD score for a data type is defined as the difference between the unit LOD in Figure 1 and the unit LOD score when noninformative offspring for direct counting are ignored.

 

    Implications
 
Results from this study indicate that direct and indirect counting can be an effective method for large-scale joint linkage analysis of codominant and dominant loci. Such analysis is useful for mapping dominant genes using codominant markers or for integrating linkage maps of codominant and dominant loci.


    Appendix 1. LOD Scores
 
MM data type: both loci are codominant and multiallelic (Table 9).


MB data type: both loci are codominant, but one locus is multiallelic and one locus is biallelic (Table 1).


where N1 = k1 + k4 + k5 + k8, N2 = k2 + k3 + k6 + k7, N3 = k9 + k12, N4 = k10 + k11, N5 = k1 + k3 + k6 + k8, N6 = k2 + k4 + k5 + k7, N7 = k1 + k8, N8 = k2 + k7, N9 = k3 + k4 + k5 + k6 + k9 + k12, and N10 = k10 + k11.

BB data type: both loci are codominant and biallelic with coupling or repulsion parental linkage phases (Table 2).


where N1 = k1 + k9, N2 = k2 + k4 + k6 + k8, N3 = k3 + k7, N4 = k5. The unit LOD score is the same as that for the MB data type.

BB-CR data type: both loci are codominant and biallelic with mixed parental linkage phases (Table 3).


MD data type: one locus is codominant and multiallelic and one locus is dominant (Table 4).


BD data type: one locus is codominant and biallelic and one locus is dominant with coupling or repulsion parental linkage phases (Table 5).


The unit LOD score is the same as for the MD data type.

BD-CR data type: one locus is codominant and biallelic and one locus is dominant with mixed parental linkage phases (Table 6).


DD-CC data type: both loci are dominant with coupling linkage phase (Table 7).


DD-CR data type: both loci are dominant with mixed parental linkage phases (Table 8).


DD-RR data type: both loci are dominant with repulsion linkage phase (Knapp et al., 1995).


where k1, k2, k3, and k4 are numbers of observations for A-B-, A-bb, aaB-, and aabb genotypes, respectively.
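
As a rough illustration of how a LOD score can be obtained from the four phenotype counts k1 to k4 defined above, the following Python sketch uses the textbook F-2 genotype probabilities for two dominant loci with a single gender-average recombination frequency. These probabilities, the grid search, and all names in the code are assumptions made for illustration; they are not copied from the formulas of this appendix.

```python
import math

# Sketch only. Assumes the textbook F-2 probabilities for two dominant loci
# with a gender-average recombination frequency theta:
#   P(A-B-) = (2 + psi)/4, P(A-bb) = P(aaB-) = (1 - psi)/4, P(aabb) = psi/4,
# where psi depends on the parental linkage phases as listed below.
PSI = {
    "DD-CC": lambda t: (1 - t) ** 2,   # both parents in coupling phase
    "DD-RR": lambda t: t ** 2,         # both parents in repulsion phase
    "DD-CR": lambda t: t * (1 - t),    # mixed parental linkage phases
}


def log10_likelihood(counts, theta, data_type):
    """log10 likelihood (up to a constant) of counts (k1, k2, k3, k4)."""
    psi = PSI[data_type](theta)
    probs = ((2 + psi) / 4, (1 - psi) / 4, (1 - psi) / 4, psi / 4)
    # Skip empty classes so that 0 * log(p) never has to be evaluated.
    return sum(k * math.log10(p) for k, p in zip(counts, probs) if k > 0)


def lod_score(counts, data_type, step=0.001):
    """Grid-search estimate of theta on (0, 0.5] and the LOD against theta = 0.5."""
    n = round(0.5 / step)
    grid = [i * step for i in range(1, n + 1)]  # start above 0 to avoid log(0)
    theta_hat = max(grid, key=lambda t: log10_likelihood(counts, t, data_type))
    lod = (log10_likelihood(counts, theta_hat, data_type)
           - log10_likelihood(counts, 0.5, data_type))
    return theta_hat, lod


if __name__ == "__main__":
    # Counts equal to their expectations for 100 offspring, coupling, theta = 0.2.
    print(lod_score((66, 9, 9, 16), "DD-CC"))
```

A grid search is used here only because it is easy to verify; an iterative or closed-form solution would normally be preferred for large data sets.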


    Appendix 2: Likelihood Functions for Testing Parental Linkage Phases
 
When the linkage phase of each parent is unknown, the most likely linkage phase of each parent can be identified using likelihood ratios. Table 9 shows the association between the genotypic probabilities and the numbers of genotypic observations. As the assumption about the parental linkage phases changes, the association between the underlying probabilities and the observations changes. For an intercross with two loci, four combinations of parental linkage phases are possible. For example, for the mating of A1A2B1B2 x A3A4B3B4 in Table 9, the following four combinations of parental linkage phases are possible: 1) A1B1/A2B2 x A3B3/A4B4, 2) A1B2/A2B1 x A3B3/A4B4, 3) A1B1/A2B2 x A3B4/A4B3, and 4) A1B2/A2B1 x A3B4/A4B3. Then, the corresponding log-likelihood functions (except for a common constant) are:


The log-likelihood functions for the other cases can be derived in a similar manner. The resulting formulae are similar, except that the definitions for ki and qi are different and the numbers of terms in the summation of the right-hand side of the equation are generally different as well. The most likely linkage phase is then identified by the largest likelihood.
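
The idea of picking the phase combination with the largest likelihood can be sketched in Python for the fully informative multiallelic case only. The sketch assumes that every transmitted gamete can be identified, codes each gamete as a pair of allele indices (0 or 1) at loci A and B, and uses one gender-average recombination frequency restricted to the range 0 to 0.5. It does not reproduce the ki and qi of Table 9, and all function names are hypothetical.

```python
import math
from itertools import product


def xlogy(x, y):
    """Return x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def count_recombinants(gametes, phase):
    """Count recombinant gametes from one parent.

    phase 0: parental haplotypes are (0, 0) and (1, 1);
    phase 1: parental haplotypes are (0, 1) and (1, 0).
    """
    return sum(1 for a, b in gametes if (a != b) == (phase == 0))


def profile_loglik(n_rec, n_total):
    """Log-likelihood maximized over theta in [0, 0.5] (constant dropped)."""
    theta = min(n_rec / n_total, 0.5)
    return xlogy(n_rec, theta) + xlogy(n_total - n_rec, 1.0 - theta), theta


def rank_phases(offspring):
    """offspring: list of (gamete from parent 1, gamete from parent 2).

    Returns the most likely (phase1, phase2), its theta estimate, and the
    log10 likelihood ratio between the two largest likelihoods.
    """
    results = []
    for p1, p2 in product((0, 1), repeat=2):
        rec = (count_recombinants([o[0] for o in offspring], p1)
               + count_recombinants([o[1] for o in offspring], p2))
        loglik, theta = profile_loglik(rec, 2 * len(offspring))
        results.append((loglik, (p1, p2), theta))
    results.sort(reverse=True)
    best, second = results[0], results[1]
    return best[1], best[2], (best[0] - second[0]) / math.log(10)


if __name__ == "__main__":
    # Made-up gametes: parent 1 in phase 0, parent 2 in phase 1, 10% recombinants.
    parent1 = [(0, 0)] * 9 + [(1, 1)] * 9 + [(0, 1), (1, 0)]
    parent2 = [(0, 1)] * 9 + [(1, 0)] * 9 + [(0, 0), (1, 1)]
    print(rank_phases(list(zip(parent1, parent2))))
```

The restriction of the recombination frequency to at most 0.5 matters in this sketch: without it, reversing the phase of both parents and replacing the frequency by its complement would give the same likelihood, and the phases could not be separated.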


    Footnotes
 
1 This research is supported in part by the Agricultural Experiment Station (project MN-16-043) and grant-in-aid of the University of Minnesota, and by funding from Cargill and NRICGP/USDA (grant #03275). The authors wish to thank two anonymous reviewers for helpful comments.

Received for publication November 11, 2001. Accepted for publication May 28, 2002.


    Literature Cited
 


Allard, R. W. 1956. Formulas and tables to facilitate the calculation of recombination values in heredity. Hilgardia 24:235–278.

Ajmone-Marsan, P., A. Valentini, M. Cassandro, G. Vecchiotti-Antaldi, G. Bertoni, and M. Kuiper. 1997. AFLP™ markers for DNA fingerprinting in cattle. Anim. Genet. 28:418–426.

Cushwa, W. T., and J. F. Medrano. 1996. Applications of the random amplified polymorphic DNA (RAPD) assay for genetic analysis of livestock species. Anim. Biotechnol. 7:11–31.

Da, Y., and H. A. Lewin. 1995. Linkage information content and efficiency of full-sib and half-sib designs for gene mapping. Theor. Appl. Genet. 90:699–706.

Gomez-Raya, L. 2001. Biased estimation of the recombination fraction using half-sib families and informative offspring. Genetics 157:1357–1367.

Green, E. L. 1981. Genetics and Probability in Animal Breeding Experiments. Oxford University Press, New York.

Green, P., K. Falls, and S. Crooks. 1990. Documentation for CRI-MAP. version 2.4. Washington University School of Medicine, St. Louis.

Knapp, S. J., J. L. Holloway, W. C. Bridges, and B-H. Liu. 1995. Mapping dominant markers using F2 matings. Theor. Appl. Genet. 91:74–81.

Knorr, C., H. H. Cheng, and J. B. Dodgson. 1999. Application of AFLP markers to genome mapping in poultry. Anim. Genet. 30:28–36.

Lander, E. S., and D. Botstein. 1989. Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185–199.

Lathrop, G. M., J. M. Lalouel, C. Julier, and J. Ott. 1984. Strategies for multilocus linkage analysis in humans. Proc. Natl. Acad. Sci. USA 81:3443–3446.

London, N., J. Xu, J. Garbe, and Y. Da. 2002. Linkage analysis for the hypothesized interaction between the polled and scurred traits in cattle. In: Proc. 7th World Cong. Genet. Appl. Livest. Prod., Montpellier, France 29:485–488.

Ott, J. 1999. Analysis of Human Genetic Linkage. 3rd ed. The Johns Hopkins University Press, Baltimore and London.

Smith, C.A.B. 1957. Counting methods in genetical statistics. Ann. Hum. Genet. 21:254–276.

Xu, J., N. London, J. Garbe, and Y. Da. 2002. Bias in linkage analysis due to ignoring epistasis effects. In: Proc. 7th World Cong. Genet. Appl. Livest. Prod., Montpellier, France 32:633–636.


