J. Anim Sci. 2008. 86:500-507. doi:10.2527/jas.2007-0536
© 2008 American Society of Animal Science
A comparison of methods of fitting several models to nutritional response data
D. Vedenov*,1 and
G. M. Pesti
,2
* Department of Agricultural and Applied Economics, and and
Department of Poultry Science, The University of Georgia, Athens 30602-2772
 |
Abstract
|
|---|
A variety of models have been proposed to fit nutritional input-output response data. The models are typically nonlinear; therefore, fitting the models usually requires sophisticated statistical software and training to use it. An alternative tool for fitting nutritional response models was developed by using widely available and easier-to-use Microsoft Excel software. The tool, implemented as an Excel workbook (NRM.xls), allows simultaneous fitting and side-by-side comparisons of several popular models. This study compared the results produced by the tool we developed and PROC NLIN of SAS. The models compared were the broken line (ascending linear and quadratic segments), saturation kinetics, 4-parameter logistics, sigmoidal, and exponential models. The NRM.xls workbook provided results nearly identical to those of PROC NLIN. Furthermore, the workbook successfully fit several models that failed to converge in PROC NLIN. Two data sets were used as examples to compare fits by the different models. The results suggest that no particular nonlinear model is necessarily best for all nutritional response data.
Key Words: nutritional requirement mathematical modeling
 |
INTRODUCTION
|
|---|
Nutritional response experiments result in a series of ordered pairs of data relating inputs to outputs. The inputs are various levels of a nutrient, from suboptimal to superoptimal levels. The outputs are typical response parameters such as growth, feed utilization efficiency, body composition, including various blood and other tissue parameters, and parameters such as bone strength and immune competency.
Nutritionists have fitted such data to a plethora of statistical interpretations and mathematical models (Almquist, 1953
, 1954
; Duncan, 1955
; Fisher et al., 1973
; Curnow, 1973
; Morgan et al., 1975
; Mercer et al., 1978
; Robbins et al., 1979
; Ware et al., 1980
; Mercer, 1982
, 1992
; Fisher, 1985
; Robbins, 1986
; Gahl et al., 1991
; Shearer, 2000
). Some models are used simply because they fit the data well (Almquist, 1953
; Fisher et al., 1973
; Robbins et al., 1979
), whereas others are used because there are logical interpretations of the various parameters (Curnow, 1973
; Morgan et al., 1975
; Ware et al., 1980
).
Because the majority of models are nonlinear, fitting any of them usually requires the availability of sophisticated statistical software and training in its use. Nonlinear regression analysis ultimately reduces to minimization of a sum of squared errors (SSE), although no general closed-form solution to the optimization problem can be obtained. Numerical optimization methods are built into nonlinear regression procedures available in statistical software packages such as SAS (SAS Institute, 2003
). Microsoft Excel software (Microsoft Corporation, Redmond, WA) is more commonly available and is easier to use, but has limited statistical capabilities. However, Excel has a built-in module called Solver, which also implements numerical optimization algorithms. Using this optimization module, we developed a Microsoft Excel workbook to fit nutritional response data to several models that have been described and used in the scientific literature (Table 1
). The advantages of the developed workbook are the wider availability of and familiarity with the underlying software, convenience of data manipulation, and better visualization capabilities provided by Microsoft Excel. The workbook is accompanied by a tutorial in the form of a Microsoft PowerPoint presentation.
The objective of the study reported here was to compare parameter estimate results from the Nutritional Response Model Excel workbook (NRM.xls) with results from the NLIN procedure (SAS Institute, 2003
). The workbook that was developed was also used to compare the performance of several nutrition models in fitting results of 2 nutrition response experiments.
 |
MATERIALS AND METHODS
|
|---|
The estimation of a nonlinear regression model,
consists of finding a vector of parameters β that minimizes the SSE,
for a given set of data; that is,
 | [1] |
The estimated values of the parameter vector have statistical properties similar to those in the conventional linear models, including consistency and asymptotic normality. In addition, the usual statistical parameters can be calculated for nonlinear regression models, including SE, confidence intervals, and goodness-of-fit measures, for example, R2 (Green, 2003
).
Because the function, f(.), is nonlinear, the minimization problem in Eq. [1] generally does not have a closed-form solution. Estimation procedures implemented in statistical packages use various iterative methods to calculate the (approximate) solutions. The procedures typically require the user to specify the functional form to be estimated and also to provide initial (or beginning) values for the iteration process (SAS Manual, The NLIN Procedure, SAS Inst. Inc., Cary, NC). Although statistical software packages provide the user with the flexibility of estimating virtually any nonlinear regression model, this flexibility comes at the price of a steep learning curve and limitations of software availability.
Microsoft Excel does not allow for explicit estimation of nonlinear models. However, it includes a numerical optimization module, Solver, which can minimize or maximize the value in a given cell by changing the values in other user-identified cells. The numerical optimization methods used by Solver (generalized reduced gradient method) are similar to the ones used by nonlinear regression procedures such as PROC NLIN of SAS. Therefore, Microsoft Excel with the Solver module provides a convenient platform for development of a nonlinear regression estimation tool for the case in which the same set of known models is typically estimated.
The NRM.xls was produced with Microsoft Office Excel 2003 with the Solver module (Solver.xla) installed. The NRM.xls and a tutorial on its use are available free of charge (http://pubs.caes.uga.edu/caespubs/ES-pubs/RB440/RB-440.htm; last accessed Nov. 5, 2007).
A separate spreadsheet was created for each of the 6 models implemented in the workbook (Table 1
). Each spreadsheet allows for up to 100 pairs of observations to be entered manually or copied from another program or data file. (One hundred data points is certainly not a limitation of the program and can be easily expanded to any desired number of observations.) Additional functionality provided in the workbook also allows users to copy the same data set simultaneously to all 6 model spreadsheets. Each spreadsheet includes cells with the parameters of the model being estimated. The current values of the parameters are used to calculate the error terms in Eq. [1] for all observations by using built-in Excel functions. The error terms are then combined in the SSE. The cell with SSE serves as a target cell for the Solver module, which attempts to minimize its value by changing the values of the parameter cells. The numerical optimization algorithms tend to be sensitive to the beginning values (i.e., the initial guesses). Therefore, each spreadsheet also contains a graph showing a scatter plot of the observations as well as the fitted model corresponding to the current values of the parameters. The user can modify the initial values of the parameters manually before running the optimization module to achieve a "visual fit" of the regression model to the data. Although subjective, such a visual fit tends to result in good beginning values for the numerical optimization algorithm and speed up its convergence.
In addition to the parameter estimates, each spreadsheet also calculates the SE and 95% confidence intervals for the estimated parameters, as well as the R2 of the fitted model. Because of the nonlinearity of the models, the interpretation of the R2 is somewhat complicated, because it is not necessarily contained between 0 and 1 (Green, 2003
). However, it still provides a useful measure of the relative performance of the various models.
Data from 2 experiments were fitted to the response models by using NRM.xls and Excel, and PROC NLIN of SAS (version 9.1, TS Level 1M3 XP-PRO platform, SAS Inst. Inc.; Figure 1
). One data set consisted of 24 points from a chick feeding trial in which 8 levels of Lys were fed to triplicate pens of 8 chicks each from 9 to 18 d of age (Figure 2
); the other consisted of 100 points randomly generated around the saturation kinetics model with parameters from a classic feeding trial (Morgan et al., 1975
) with graded levels of nitrogen fed to mice (Bosshardt et al., 1946
; Figure 3
). More specifically, one hundred random "inputs" were drawn from a uniform distribution between 0 and 4%. The exact value of the saturation kinetics model was calculated at each input point and a random shock (noise) was then added to generate an "observation." The random shocks were also drawn from a uniform distribution between 0.05 x maximum and 0.05 x maximum data included with NRM.xls. Both the modified Gauss-Newton method and the Marquardt method were used with SAS 9.1 (SAS Institute, 2003
). Results from the 2 procedures were nearly identical, so only results from the Marquardt method are reported here.

View larger version (10K):
[in this window]
[in a new window]
|
Figure 2. A comparison of spline models and the saturation kinetics model fitted to data from an experiment with broiler chickens fed graded concentrations of Lys.
|
|

View larger version (18K):
[in this window]
[in a new window]
|
Figure 3. A comparison of nonlinear models fitted to 100 data points based on an experiment with mice fed graded concentrations of protein.
|
|
Initial parameter estimates were guessed by using NRM.xls until the predicted line grossly approximated the observed points. The Solver add-in of Excel was then invoked to find the final parameter estimates and descriptive statistics. The final parameter estimates of NRM.xls were then used as initial parameter estimates with PROC NLIN of SAS. When PROC NLIN failed to converge on a solution for several models, a grid search was conducted to find appropriate initial parameter estimates. In this manner, convergence was achieved for one additional model by using SAS.
 |
RESULTS
|
|---|
The NRM.xls found solutions for all models for both data sets (Tables 2
and 3
). The SAS program failed to converge for several models with the smaller chick data set (points in Figure 2
), and with the broken line with ascending quadratic model for the larger mouse data set (points in Figure 3
). The NRM.xls workbook provided results nearly identical to those in PROC NLIN when SAS converged on a solution.
View this table:
[in this window]
[in a new window]
|
Table 2. A comparison of parameter estimates using the Nutrient Response Model Workbook (NRM.xls) and SAS version 9.1 using the data set in Figure 2
|
|
View this table:
[in this window]
[in a new window]
|
Table 3. A comparison of parameter estimates using the Nutrient Response Model Workbook (NRM.xls) and SAS version 9.1 using the data set in Figure 3
|
|
 |
DISCUSSION
|
|---|
Both methods of fitting the nonlinear models described here require good initial parameter estimates to find the global minimum of SSE and thus the best model fit and final parameter estimates. The process of finding good initial parameter estimates is much easier with NRM.xls because it contains graphs that immediately show how each estimated model fits the data. Despite having initial parameter estimates from NRM.xls that fitted the data well, SAS failed to converge for several models. The sums of the squared residuals were very similar for the 2 fitting methods for the smaller chick data set, suggesting no practical differences. However, for the larger mouse data set, the sums of the residuals squared were consistently smaller for the NRM.xls Excel workbook, indicating that Excels Solver found better parameter estimates. The overall conclusion was that the developed Excel workbook provided estimates of nonlinear regression models comparable in accuracy with those of PROC NLIN of SAS.
The NRM.xls program calculates goodness of fit statistics (R2) for each of the models. These 2 very different data sets illustrate how difficult it is to declare that one model is advantageous over any other. Because of the nonlinearity of the models, the R2 values for the various models did not differ very much. For the larger mouse data, the R2 values for the saturation kinetics (Morgan et al., 1975
; Mercer, 1982
, 1992
), 4-parameter logistics (Gahl et al., 1991
), and sigmoidal-exponential (Robbins et al., 1979
) models were 99.95, 99.94, and 99.94%, respectively. For the smaller chick data, the R2 values for the saturation kinetics (Morgan et al., 1975
; Mercer, 1982
, 1992
), 4-parameter logistics (Gahl et al., 1991
), and sigmoidal-exponential (Green, 2003
) models were 99.58%. Even the poorest fit broken line with linear ascending segment model (Robbins, 1986
) had an R2 value of 99.49%. Although some authors have found better fits for one model or the other (Ware et al., 1980
; Gahl et al., 1991
), the 2 data sets used here as examples suggest that no particular nonlinear model is necessarily best for all nutritional response data. The choice of a model to evaluate data should depend on the objectives of the experiment (Baker, 1986
). Clearly, some form of nonlinear regression model should be used in preference to multiple range tests (Lowry, 1992
).
Microsoft Excel, with its Solver add-in is practically ubiquitous. It is available on the vast majority of microcomputers worldwide. Although lacking the flexibility of nonlinear regression procedures of the statistical packages, the NRM.xls workbook is a convenient tool for estimating a standard set of nutrition response models available to practically anyone with Internet access. The Microsoft Office PowerPoint tutorial makes the NRM.xls easy to use for anyone with even a basic knowledge of Microsoft Excel.
 |
Footnotes
|
|---|
1 Present address: Department of Agricultural Economics, Texas A&M University, 2124 TAMU, College Station 77843–2124. 
2 Corresponding author: gpesti{at}uga.edu
Received for publication August 24, 2007.
Accepted for publication October 12, 2007.
 |
LITERATURE CITED
|
|---|
Almquist, H. J. 1953. Interpretation of amino acid requirement data according to the law of diminishing returns. Arch. Biochem. Biophys. 44:245–247.[CrossRef][Medline]
Almquist, H. J. 1954. A fallacy in the analysis of nutritional requirement data. Arch. Biochem. Biophys. 50:503.[CrossRef][Medline]
Baker, D. H. 1986. Problems and pitfalls in animal experiments designed to establish dietary requirements for essential nutrients. J. Nutr. 116:2339–2349.[Abstract/Free Full Text]
Bosshardt, D. K., L. C. Ydse, M. M. Ayres, and R. H. Barnes. 1946. The use of mice for the measurement of the growth promoting quality of proteins. J. Nutr. 31:23–33.[Abstract/Free Full Text]
Curnow, R. N. 1973. A smooth population response curve based on abrupt threshold and plateau model for individuals. Biometrics 29:1–10.[CrossRef]
Duncan, D. B. 1955. Multiple range and multiple F tests. Biometrics 11:1–42.[Medline]
Fisher, C. 1985. The prediction of responses to nutrients. Pages 51–73 in Proc. Thirteenth Int. Congr. Nutr. T. J. Taylor and N. K. Jenkins, ed. Libbey, London, UK.
Fisher, C., T. R. Morris, and R. C. Jennings. 1973. A model for the description and prediction of the response of laying hens to amino acid intake. Br. Poult. Sci. 14:469–484.[Medline]
Gahl, M. J., M. D. Finke, T. D. Crenshaw, and N. J. Benevenga. 1991. Use of a four parameter logistic equation to evaluate the response of growing rats to ten levels of each indispensable amino acid. J. Nutr. 121:1720–1729.[Abstract/Free Full Text]
Green, W. H. 2003. Econometric Analysis. Pearson Education, Upper Saddle River, NJ.
Lowry, S. R. 1992. Use and misuse of multiple comparisons in animal experiments. J. Anim. Sci. 70:1971–1977.[Abstract]
Morgan, P. H., L. P. Mercer, and N. W. Flodin. 1975. General model for nutritional responses in higher animals. Proc. Natl. Acad. Sci. USA 72:4327–4331.[Abstract/Free Full Text]
Mercer, L. P. 1982. The quantitative nutrient-response relationship. J. Nutr. 112:560–566.[Abstract/Free Full Text]
Mercer, L. P. 1992. The determination of nutritional requirements: Mathematical modeling of nutrient-response curves. J. Nutr. 122:706–708.[Abstract/Free Full Text]
Mercer, L. P., N. W. Flodin, and P. H. Morgan. 1978. New methods for comparing the biological efficiency of alternative nutrient sources. J. Nutr. 108:1244–1249.[Abstract/Free Full Text]
Robbins, K. R. 1986. A Method, SAS Program, and Examples for Fitting the Broken Line to Growth Data. Univ. Tennessee Res. Report 86–09. Univ. Tennessee Agric. Exp. Stn., Knoxville.
Robbins, K. R., H. W. Norton, and D. H. Baker. 1979. Estimation of nutrient requirements from growth data. J. Nutr. 109:1710–1714.[Abstract/Free Full Text]
SAS Institute. 2003. SAS Users Guide: Statistics, Release 9.1. SAS Inst. Inc., Cary, NC.
Shearer, K. D. 2000. Experimental design, statistical analysis and modeling of dietary nutrient requirement studies for fish: A critical review. Aquacult. Nutr. 6:91–102.[CrossRef]
Ware, G. O., R. D. Phillips, R. S. Parrish, and L. C. Moon. 1980. A comparison of two nonlinear models for describing intake-response relationships in higher organisms. J. Nutr. 110:765–770.[Abstract/Free Full Text]