* Institute of Agricultural and Environmental Engineering (IMAG), Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands;
Animal Welfare Centre, Faculty of Veterinary Medicine, University of Utrecht, Yalelaan 17, 3584 CL, Utrecht, The Netherlands; and
Department of Animal Sciences, Ethology Group, Wageningen University and Research Centre, 6700 AH, Wageningen, The Netherlands
2 Correspondence: Inst. of Animal Science and Health (ID-Lelystad), Wageningen Univ. and Res. Ctr., P.O. Box 65, 8200 AB Lelystad, The Netherlands (phone: +31-(0)320-238205; fax: +31-(0)320-238050; E-mail: m.b.m.bracke{at}id.wag-ur.nl).
| Abstract |
|---|
Key Words: Animal Welfare, Housing, Indexes, Management, Pigs
| Introduction |
|---|
Although several causal models with relevance for welfare have been constructed (e.g., Wiepkema, 1987; Hughes and Duncan, 1988; Moberg, 2000), relatively few models have been developed specifically for overall welfare assessment (reviewed in Bracke et al., 1999a). These models either provide a rough theoretical outline (e.g., Mellor and Reid, 1994) or, like the so-called Tiergerechtheitsindex (cf. van den Weghe, 1998; Bartussek, 1999a), directly assign points to easily identifiable attributes of the housing system. These models, however, are in need of a transparent scientific basis (e.g., Sundrum, 1997).
A step forward would be to formalize the reasoning involved in welfare assessment, showing how the available knowledge can be used to select attributes and to weigh them into an overall judgment (Sandøe and Simonsen, 1992; Bracke et al., 1999c). To develop a model for welfare assessment at the housing-system level, we chose the pregnant sow as a case because sows are farmed under a wide range of housing conditions, including intensive ones that have raised public concern and have resulted in many detailed studies examining their welfare (cf. Scientific Veterinary Committee, 1997).
| Methods |
|---|
To facilitate the handling of large amounts of rather complex information, the model is embedded in a decision support system, that is, a computer-based information system (Turban, 1995). We developed this system mainly to identify the underlying procedure (i.e., the reasoning steps involved in overall welfare assessment). For the development of the decision support system we used the so-called Evolutionary Prototyping Method (Turban, 1995; Bracke et al., 2001a,b): an initial prototype was constructed and then improved through repeated updates, based in part on interviews with experts to fill in gaps in our knowledge (Bracke et al., 1999b, d). This led to the present "final" version. The term evolutionary here means that the decision support system and the welfare model are designed to be flexible, in that they allow the incorporation of new information and insights when these become available.
The objective of this paper is to describe the decision support system, the welfare model for pregnant sows, and the procedure to select and weight attributes of housing and management systems on the basis of available scientific knowledge. Validation using expert opinion will be the topic of a subsequent paper (Bracke et al., 2002).
| Components in the Decision Support System |
|---|
The decision support system is implemented in Microsoft Access 97, a relational database that stores information in tables that are linked to each other (Date, 1995; Bracke et al., 1999a). Calculations are performed on the information in these tables using operators and so-called queries, which combine the information from the tables by selecting specified data sets from them. The five primary tables in the decision support system contain scientific statements (1), a list of needs (2), attributes (3), weighting categories (4), and housing systems described by their attributes (5), respectively. Two secondary tables contain, respectively, links between attributes and needs and links between attribute levels, weighting-category levels, types, and scientific statements (Figure 1). The welfare model is constructed from the information contained in the first four primary tables. It consists of attributes, their levels, attribute scores, and weighting factors. When the attributes of a housing system are described in the fifth primary table, the model assigns attribute scores and weighting factors and calculates a welfare score as a weighted average of the attribute scores.
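As a rough illustration of the relational layout described above, the following sketch uses Python's built-in `sqlite3` module in place of Microsoft Access. All table and column names are assumptions made for illustration only; the text names the contents of the tables, not their schema. Only one of the two secondary (link) tables is shown, populated with the "feeding level" to "ingestion" link mentioned later in the text.

```python
import sqlite3

# In-memory database standing in for the Access file (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Five primary tables (all names are hypothetical)
    CREATE TABLE scientific_statement (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE need (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE weighting_category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE housing_system (id INTEGER PRIMARY KEY, name TEXT);
    -- One secondary (link) table: attributes <-> needs
    CREATE TABLE attribute_need (
        attribute_id INTEGER REFERENCES attribute(id),
        need_id INTEGER REFERENCES need(id)
    );
    """
)
conn.execute("INSERT INTO attribute VALUES (1, 'feeding level')")
conn.execute("INSERT INTO need VALUES (1, 'ingestion')")
conn.execute("INSERT INTO attribute_need VALUES (1, 1)")

# A query combines the tables by following the links between them.
links = conn.execute(
    """
    SELECT a.name, n.name
    FROM attribute AS a
    JOIN attribute_need AS an ON an.attribute_id = a.id
    JOIN need AS n ON n.id = an.need_id
    """
).fetchall()
print(links)
```

The join plays the role of the "queries" in the text: it selects a specified data set by combining rows from linked tables.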
Modeling of Attributes
Attributes are descriptors of housing and management systems. Attributes have two or more levels that specify the properties of the housing system. The model contains 37 attributes that together determine the overall degree of need satisfaction and frustration of the animals (Table 1). Some model attributes are environment-based (e.g., space per pen) and others are animal-based (e.g., health and hygiene status) or management-related (e.g., mixing management). The model attributes have between two and eight levels (3.9 on average), which are mutually exclusive, discrete classes that describe the (welfare-relevant) properties of housing systems. For instance, the attribute "space per pen" has eight levels that range from 1 to 1.5 m² to > 6,250 m² per enclosure.
The first subprocedure involved designing the levels of each attribute to be mutually exclusive and together exhaustive, so that they cover the model's domain, which includes the wide range of farm types suited for agricultural production. As a result, every housing system in the domain can be described with exactly one level of each attribute, never more and never fewer. This also ensures that a generic calculation rule (i.e., calculating welfare as a weighted average score) can be used such that any welfare advantage ascribed to a housing system accrues to all the systems with the same descriptive property, and only to them.
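The "exactly one level per attribute" requirement can be expressed as a simple consistency check. The sketch below is an assumption about how such a check might look, not the authors' implementation. The level labels for "rooting substrate" are abbreviated from the example given later in the text, and the second attribute is purely hypothetical.

```python
def check_description(description, attribute_levels):
    """Return True if every attribute has exactly one valid level.

    description maps attribute name -> chosen level (one per attribute);
    attribute_levels maps attribute name -> list of allowed levels.
    Raises ValueError if an attribute is missing, unknown, or has an
    invalid level.
    """
    missing = set(attribute_levels) - set(description)
    unknown = set(description) - set(attribute_levels)
    if missing or unknown:
        raise ValueError(f"missing: {sorted(missing)}, unknown: {sorted(unknown)}")
    for attribute, level in description.items():
        if level not in attribute_levels[attribute]:
            raise ValueError(f"{level!r} is not a level of {attribute!r}")
    return True


# Allowed levels, listed from best to worst (labels are illustrative).
levels = {
    "rooting substrate": ["> 5 cm", "< 5 cm", "no rooting substrate", "nose rings"],
    "hypothetical attribute": ["high", "low"],
}
system = {"rooting substrate": "> 5 cm", "hypothetical attribute": "low"}
print(check_description(system, levels))
```

Storing a description as a mapping from attribute to level rules out more than one level per attribute by construction; the check adds the "never fewer" half of the requirement.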
Not only are the levels of each attribute mutually exclusive, but the model attributes are also as much as possible defined to be mutually exclusive in order to avoid double-counting. For example, although the attributes "space per pen" and "movement comfort" may appear to overlap, they have been defined in the model with specific reference to space for locomotion and grip provided by the floor, respectively. The attributes and levels necessarily derive from the current state of housing for pregnant sows and the current state of science. Because the model's domain also includes novel housing systems with attributes that cannot be described at present, the levels of each attribute can be interpreted as equivalence (as-if) classes. For example, space per pen has > 6,250 m² as its best level. Equivalent conditions may be achieved with a novel housing concept, such as regular exercise on a treadmill. Although research will first have to confirm this before it can be incorporated formally into the science-based model, the model can presently be used to evaluate the impact of such research on the overall welfare status by allowing attribute levels to be interpreted as equivalence classes (i.e., when the treadmill is regarded as equivalent to > 6,250 m² of pen space).
The second subprocedure involved the linking of each attribute to at least one of the 11 needs, which we had previously formulated as the main aspects of welfare as perceived from the animal's point of view (Bracke et al., 1999d; Figure 2). For example, the attribute "feeding level" is linked to the "ingestion" need, because it covers the energy level of the feed and the body condition of the sows to indicate how hungry the sows are. The linking of attributes to needs ensures that each attribute in the model is relevant for welfare from the animal's point of view. It also ensures that welfare is covered overall, because all needs have at least one attribute, and it ensures a proportional distribution of attributes across the needs, because there is no major over- or under-representation of any single need as judged by the number and weights of the attributes assigned to it (cf. Figure 2; method derived from Streiner and Norman, 1995, p 21). Furthermore, the attribute-need links help to define the attributes and to minimize the overlap between them to avoid double-counting. Note that the model calculates welfare directly from the attributes. Therefore, overlap of needs, as shown in Figure 2, is not a problem, but overlap of attributes would result in double-counting. For example, the attributes "space per pen" and "space per sow" appear to overlap. However, "space per pen" is defined as space per pen for the needs "movement" and "exploration," whereas "space per sow" is defined as space for the "social contact" need (cf. Figure 2). In this way, linking attributes to different needs helps to reduce the overlap between them. This example also illustrates that an attribute such as "space per pen" may be multifunctional, that is, linked to more than one need, but only when a conflict between the different functions (needs) cannot arise. The two requirements, no overlap and no conflict of functions, resulted in the disqualification of certain physical attributes such as "straw" and "soil" as attributes in our model.
In the third subprocedure the model attributes (through their levels) were linked to the scientific statements in the database. These statements had previously been selected from the literature as being truly scientific (i.e., referring directly to empirical observations and being relevant to distinguish between housing systems on welfare grounds). The procedure requires that all statements be linked to at least one attribute level. This provides a scientific basis to the model, because it specifies the meaning of the attributes (1), weights them (2), and identifies the attribute levels (3).
To specify the meaning of the attributes, on average 4.9 scientific statements are used per attribute. These statements specify the influencing factors that affect the model attribute and its levels. For example, the level "ad lib water availability" (attribute 15) is specified by scientific statements about the amount drunk and the drinking speed of sows under ad libitum conditions (between 17 and 21.5 L/d; 3 L/min; Hill and Sainsbury, 1995, p 235). In Table 1 short descriptions are given of the attributes and their best and worst level. The full description of all 37 attributes in the model adds up to 7,500 words, on average over 200 words per attribute. Such descriptions also include a specification of other attributes in the model from which the described attribute must be distinguished (to avoid overlap).
For attribute weighting, on average 25 scientific statements were used per attribute. Scientific statements can be used to determine the weight of an attribute, because they provide evidence of relationships between the attribute and welfare performance criteria, which we have classified into weighting categories. For example, "pigs will work for access to earth (Hutson and Haskell, 1990)" (Scientific Veterinary Committee, 1997, p 69) is a scientific statement that says that (but not how much) pigs will work for rooting substrate. This gives some weight to the attribute. Further weight is "added" by the statement that "Matthews and Ladewig (1994) produced demand curves for access to bedding material and found that they [the demand curves for bedding, MB] were only second to food in extent of demand" (Scientific Veterinary Committee, 1997, p 69; cf. bottom of Table 3; the weighting procedure will be described in more detail in the next sections).
Because each scientific statement that has been collected is linked to at least one attribute or attribute level, this third subprocedure ensures that all concepts encountered in the scientific statements are incorporated into the set of 37 model attributes. Both the linking of the attributes to needs and to scientific statements, therefore, contribute to ensuring that the model assesses welfare overall, rather than only partially.
Weighting Categories and Types
The procedure described above shows how the attributes have been modeled for overall welfare assessment based on the domain of housing systems, the list of needs, and the scientific statements. We will now further explain the scoring and weighting procedures, in particular how the responses of the animals, as measured by science and described in scientific statements, were used for weighting the attributes using so-called weighting categories and types.
If we succeeded in attribute modeling, then we can regard each attribute as an additive component of welfare with its own scale. The levels of an attribute identify the points on this attribute scale, where each level receives an attribute score (AS). In our model the worst level of each attribute received a score of 0, the best level received a score of 1, and any intermediate levels received intermediate attribute scores in direct proportion to their ranks. As a result, an attribute with three levels receives attribute scores of 0, 0.5, and 1. An attribute with four levels has 0, 0.33, 0.67, and 1 as its attribute scores.
For weighting of the attribute scores across attributes we used a list of 12 weighting categories that were linked to the attribute levels and their scientific statements. The weighting categories classify welfare performance criteria, which have been measured in the various welfare disciplines, namely veterinary science (with the weighting categories "pain" and "illness"), evolutionary biology ("survival" and "fitness"), stress physiology ("HPA," i.e., hypothalamic-pituitary-adrenocortical axis, and "SAM," i.e., sympathetic-adrenal-medullary activation), and ethology ("aggression," "abnormal behavior," "frustration and avoidance," "natural behavior," "preferences," and "demand"; see also Table 2 for a brief explanation of each weighting category). Weighting categories may be regarded as the dependent variables of empirical research, and the attributes in the model are the independent variables. Scientific statements describe a relationship between the two kinds of variables. By virtue of these relationships, weighting categories can be used to weight the attributes: the more scientific evidence is available showing the positive and negative welfare consequences of an attribute, the higher its weighting factor (which will be defined more formally below). Most weighting categories define a negative contribution to welfare. Positive contributions are made by the weighting categories of "natural behavior," "preferences," and "demand."
One of the four weighting-category levels conveys a very high weighting score (set at 10,000 in the decision support system). When this level is assigned to an attribute (based on a scientific statement), this attribute turns into a minimum requirement for welfare (i.e., the welfare status is low, no matter what else is true about the housing system). The weighting score of 10,000 results in an overall welfare score for the housing system as a whole of less than 0.5 on a scale from 0 to 10.
The three remaining levels per weighting category assign weighting scores of either 1, 2, and 3 points, or of 1, 3, and 5 points, which have been assigned based on the dimensions of intensity, duration, and incidence (Willeberg, 1991). With these dimensions we judged "pain," "illness," and "HPA" to be the most negative weighting categories. The most positive one was "demand" (see Table 2). It follows that when we compare the two positive weighting categories "demand" (with weighting scores 1, 3, and 5) and "preferences" (with weighting scores 1, 2, and 3), we find two pairs of levels with equal weights, namely those with a weighting score of 1, and those with a weighting score of 3. The basis for this normative judgment lies in the above-mentioned dimensions of intensity, duration, and incidence (cf. also Anonymous, 2001).
For weighting, we determined not only which weighting-category levels, and associated weighting scores (WS), were predicated by the scientific statements about each attribute level; we also registered the specific type (T) of the weighting category. Types identify the quality or nature of the scientific measurement. They provide a further differentiation of the welfare performance criteria that have been classified in each weighting category. Examples of types are "duration" of the pain weighting category, "a measure of cortisol" of the HPA category, and "stereotypic behavior" (vs, for example, "abnormal sexual behavior") of the weighting category "abnormal behavior" (see also Table 3). As will be explained in the next section, the number of unique types per weighting category is functional for weighting, because it represents the different types of scientific argument for weighting within each weighting category.
| Calculation Rules |
|---|
For the best level and for the worst level of each attribute (ALbest and ALworst, respectively) the maximum weighting score and the total number of distinct types are determined per weighting category.
Table 3 illustrates the procedure for the best level, "> 5 cm," of attribute 10, "rooting substrate." This attribute has four levels. The best level specifies more than 5 cm substrate depth that is available for at least 4 h per day. Two intermediate levels specify "less than 5 cm at least 2 h per day," and "no rooting substrate," respectively. Its worst level is "nose rings" in the presence of rooting substrate.
To determine the "weight" of ALbest, first the maximum weighting scores per weighting category ("natural behavior," "preferences," and "demand") are determined (3, 3, and 3, respectively, marked with superscript a in Table 3). Second, the number of unique types per weighting category is counted (1, 5, and 1, respectively, marked with superscript b in Table 3). Third, per weighting category the maximum weighting score is increased by 0.2 times its number of unique types (resulting in 3.2, 4, and 3.2, respectively). The factor of 0.2 is, somewhat arbitrarily, chosen to reduce the contribution of the number of distinct types. Five distinct types (as for the weighting category "preferences" in Table 3) with a factor of 0.2 add 1 point to the weighting score (WS) of the maximum weighting-category level (3 + 1 = 4). Fourth, the weighting points are summed over the weighting categories (3.2 + 4 + 3.2). This results in a "weight" of 10.4 for the level "> 5 cm" (ALbest) of attribute "rooting substrate."
The rationale for the procedure is that the impact of any novel scientific statement mainly derives from its ability to identify a new weighting category (establishing a new welfare performance criterion) or from its ability to increase the weighting score of a weighting category (e.g., the work by Matthews and Ladewig [1994] described in S13 increased the WS of "demand" from WS = 1 to WS = 3). Statements that fail to do either of these two things do not contribute to the "weight" of the attribute level, unless they identify a novel type. For example, statement S6 in Table 3 "adds" the type "anatomy and cognition" (pigs have a sensitive snout). By itself this statement conveys a "preference" weighting score of 1, which does not contribute to this attribute level's "weight" because a weighting score of 3 is already established by S12 (pigs have a strong preference to root). However, it does "add" a novel type ("anatomy and cognition") to the "weight" of the attribute level "> 5 cm" substrate, although it is only counted in a moderated way, namely by a factor of 0.2.
The "weight" of the best level (ALbest) of an attribute expresses the degree to which it is (relatively) positive for welfare, based on available scientific knowledge. The "weight" of the worst level (ALworst) expresses the degree to which that level is (relatively) negative for welfare. Therefore, the same procedure that was used for ALbest is used for ALworst, except that the minima of the (largely negative) weighting scores are used, and that the (moderated) number of unique types is subtracted rather than added. In the decision support system we obtained a "weight" of -5.0 for the worst level "nose rings" of the attribute "rooting substrate."
The weighting factor (WF) for an attribute as a whole is the difference between the "weights" of its best and worst level. The weighting factor for the attribute "rooting substrate" is 15.4 (= 10.4 - (-5.0)). In our model weighting factors for the 37 attributes range between 2.4 and 25.8 (Table 1, column WF).
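The weighting steps for the rooting-substrate example can be traced in a few lines of Python. This is a minimal sketch under stated assumptions, not the decision support system's actual code: the function name and data layout are invented for illustration, only the maximum scores and type counts quoted in the text are used for the best level, and the worst-level "weight" of -5.0 is taken as reported because its breakdown is not given here.

```python
def level_weight(categories, best=True, type_factor=0.2):
    """Weight of one attribute level (hypothetical helper).

    categories maps a weighting-category name to (scores, n_types), where
    scores are the weighting scores linked to the level and n_types is the
    number of unique types in that category.
    Best level:  sum over categories of max(scores) + type_factor * n_types.
    Worst level: sum over categories of min(scores) - type_factor * n_types.
    """
    total = 0.0
    for scores, n_types in categories.values():
        if best:
            total += max(scores) + type_factor * n_types
        else:
            total += min(scores) - type_factor * n_types
    return total


# Best level "> 5 cm" of "rooting substrate". Only the maxima (3, 3, 3) and
# the type counts (1, 5, 1) are taken from the text; the lower scores listed
# are those mentioned for S6 and for "demand" before S13.
best_level = {
    "natural behavior": ([3], 1),
    "preferences": ([1, 3], 5),
    "demand": ([1, 3], 1),
}

weight_best = level_weight(best_level, best=True)
weight_worst = -5.0  # reported value for "nose rings"; breakdown not given
weighting_factor = weight_best - weight_worst
print(round(weight_best, 1), round(weighting_factor, 1))
```

Lower scores within a category do not change the result, which mirrors the point that a statement adds "weight" only by raising the maximum score, opening a new category, or adding a new type.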
More formally, the weighting factor (WFi) of the i-th attribute in the model is calculated as
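$$WF_i = \sum_{wc}\left[\max_{wcl}\left(WS_{wcl}\right) + 0.2\,NT_{wc}\right]_{AL_{i,best}} - \sum_{wc}\left[\min_{wcl}\left(WS_{wcl}\right) - 0.2\,NT_{wc}\right]_{AL_{i,worst}} \qquad [1]$$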
where ALi,worst is the worst level and ALi,best is the best level of the i-th attribute; WSwcl is the weighting score assigned to the attribute level based on a scientific statement; wc identifies the weighting categories linked to the attribute level; wcl identifies the weighting-category levels (to which WS have been assigned) within one weighting category; and NTwc is the number of unique types per weighting category assigned to the attribute level.
Attribute scores are calculated for each level of an attribute on a scale from 0 to 1 proportional to their rank as
$$AS_{i,j} = \frac{NL_i - RL_{i,j}}{NL_i - 1} \qquad [2]$$
where ASi,j is the attribute score of the j-th level of the i-th attribute in the model; NLi is the total number of levels of attribute i; RLi,j is the rank number of the j-th level of the i-th attribute, where levels are ranked for welfare from 1 for ALi,best to NLi for ALi,worst; RLi,j ∈ [1, NLi]. For example, an attribute with five levels has AS of 1, 0.75, 0.5, 0.25, and 0 for its levels ranked 1 to 5.
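The rank-proportional scoring rule can be checked against the examples given in the text. The sketch below assumes the rule AS = (NL - RL) / (NL - 1), which follows from the definitions above (rank 1 is best and scores 1, rank NL is worst and scores 0); the function name is hypothetical.

```python
def attribute_scores(n_levels):
    """Scores for ranks 1 (best) to n_levels (worst), scaled to 1..0."""
    if n_levels < 2:
        raise ValueError("an attribute needs at least two levels")
    return [(n_levels - rank) / (n_levels - 1) for rank in range(1, n_levels + 1)]


for n in (3, 4, 5):
    print(n, [round(score, 2) for score in attribute_scores(n)])
```

For three, four, and five levels this gives the score sets quoted in the text (0, 0.5, 1; 0, 0.33, 0.67, 1; and 0, 0.25, 0.5, 0.75, 1), listed here from best to worst.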
The "absolute" welfare score for a housing system (AWh, on scale 0 to 1) is calculated in our model as the weighted average of the attribute scores, that is, as the sum of the attribute scores (AS) multiplied by the weighting factor (WF) of each attribute in the model, divided by the total sum of WF:
$$AW_h = \frac{\sum_{i=1}^{m} AS_{i,h} \cdot WF_i}{\sum_{i=1}^{m} WF_i} \qquad [3]$$
where ASi,h is the attribute score (between 0 and 1) of the level of attribute i that represents the (welfare-relevant) property of the housing system (h); WFi is the weighting factor of the i-th attribute; m is the total number of attributes in the model, i.e., 37; i ∈ [1, m].
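The weighted average can be sketched as follows. The function name is an assumption, and the three-attribute input is purely hypothetical: only the weighting factor of 15.4 for "rooting substrate" and the range limits 2.4 and 25.8 come from the text, while the attribute scores and the other two attribute names are invented to keep the example small (the real model uses 37 attributes).

```python
def absolute_welfare(scores, weighting_factors):
    """Weighted average of attribute scores (scale 0 to 1).

    scores and weighting_factors are dicts keyed by attribute name.
    """
    if scores.keys() != weighting_factors.keys():
        raise ValueError("scores and weighting factors must cover the same attributes")
    total_weight = sum(weighting_factors.values())
    weighted_sum = sum(scores[name] * weighting_factors[name] for name in scores)
    return weighted_sum / total_weight


# Hypothetical three-attribute housing system (illustration only).
wf = {"rooting substrate": 15.4, "attribute A": 2.4, "attribute B": 25.8}
as_h = {"rooting substrate": 1.0, "attribute A": 0.5, "attribute B": 0.0}
aw = absolute_welfare(as_h, wf)
print(round(aw, 3))
```

Because the weights are normalized by their sum, a system at the best level of every attribute scores 1 and a system at the worst level of every attribute scores 0, whatever the weighting factors are.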
The "absolute" scores (AWh) represent the ratio of all positive and negative aspects of a housing system. AWh covers the domain of logically possible housing systems, which is much wider than that encountered in reality. Therefore, we transform the scores (linearly) to a scale that covers the actual systems. In this paper we use the seven main housing systems for pregnant sows as a benchmark to define the (relative) welfare scores (Wh) on a scale from 0 to 10. These systems include tethered housing, individual housing in stalls, electronic sow feeding (ESF), group housing with free-access stalls, Biofix (i.e., trickle feeding), outdoor housing with huts on pasture, and the Family Pen system. A brief description and a reference for each system are given in Table 4. Together these systems cover most of the variation between attributes in the domain of current housing systems (Bracke et al., 1999b). Score transformation results in the worst of these seven systems (tethered system with an AWh = 0.39 = AWmin) receiving a score of 0 and the best system (the Family Pen with AWh = 0.74 = AWmax) receiving a score of 10. For score transformation the following formula is used:
$$W_h = 10 \cdot \frac{AW_h - AW_{min}}{AW_{max} - AW_{min}} \qquad [4]$$
where Wh is the relative welfare score for a housing system (h, on a scale from 0 to 10); AWh is its "absolute" welfare score (scale 0 to 1); and AWmin and AWmax are the "absolute" scores for the worst and best reference systems, respectively.
Although most housing and management systems are expected to fall within the range of 0 to 10, in theory systems can get a much higher and a much lower score. Housing systems with AWh = 0 and AWh = 1 would receive a Wh of -11.1 and +17.4, respectively. These are logically possible systems with only negative or only positive aspects, respectively. In reality, however, housing systems tend to have both positive and negative attributes. This is also true for the best (Family Pen) and worst (tethered) housing system in our set of seven reference systems, but the proportion of positive attributes is larger for the Family Pen system.
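The linear rescaling and the two theoretical extremes can be verified numerically. The sketch below assumes the transformation Wh = 10 (AWh - AWmin) / (AWmax - AWmin) with the reference values 0.39 and 0.74 given in the text; the function name and default arguments are illustrative.

```python
def relative_welfare(aw, aw_min=0.39, aw_max=0.74):
    """Map an absolute score (0 to 1) to the benchmark scale where the
    worst reference system scores 0 and the best scores 10."""
    return 10 * (aw - aw_min) / (aw_max - aw_min)


for aw in (0.0, 0.39, 0.74, 1.0):
    print(aw, round(relative_welfare(aw), 1))
```

The reference systems map to 0 and 10, and the logically possible extremes AWh = 0 and AWh = 1 map to about -11.1 and +17.4, in line with the values reported above.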
Besides providing a benchmark to define the range of the relative scale (Wh), the seven main systems for pregnant sows also provide reference points to interpret welfare scores calculated for other housing systems.
| Some Model Calculations |
|---|
We calculated the welfare impacts of several compound attributes, such as stockmanship, behavioral restrictions, and outdoor access, to serve as examples.
Stockmanship was the most important compound attribute, with a welfare impact of 9.4 points. To calculate this impact, stockmanship was defined to include specific management-related attributes (e.g., food rations, social stability, and mixing management) as well as regimens for providing basic needs such as food, water, thermal comfort, and an adequate health status. Other important compound attributes were behavioral restrictions (7.7 welfare points) and space quality (6.3). The separation of functional areas and outdoor access per se were relatively unimportant, with 1.6 and 0.8 welfare points, respectively. However, it should be noted that we used a narrow definition of "outdoor access per se," that is, we used only its impact on the model attributes "air quality," "visually isolated areas," and "light," and we disregarded the extra space that is usually provided with it. This shows that the welfare impact of a compound attribute may be greatly affected by the way it is defined in terms of the attributes in the model.
The model also allows the calculation of dose-response relationships between the separate levels of a compound attribute and welfare, expressed in welfare points. The welfare points are calculated from the welfare "weights" of its levels. Because different "amounts" of scientific evidence are available for the "weights" of the different levels of a compound attribute, its dose-response curve will often not be linear. Figure 3 shows the dose-response relationships for the compound attributes "group size," "space per pen," and "substrate quantity." These attributes have welfare impacts of 2.3, 4.5, and 3.3, respectively. "Substrate quantity" shows a substantial increase when little substrate is provided; with larger amounts of substrate the contribution to welfare tails off. "Space per pen" likewise shows a sharp increase at low space allowances, and the curve tails off at larger space allowances. This compound attribute is defined in relation to the model attribute with the same short descriptor "space per pen" (which has an effect of only 1.7 welfare points, Table 1), but it also relates to other model attributes such as "space per sow" and separate functional areas (involving the attributes 6, 16, 27, 36, and 37). For "group size" we presumed a constant space allowance of 2 m² per sow and feeding stalls. Its curve shows an optimum value for small groups. Very large groups have a suboptimal value for welfare, mainly because pigs have a natural tendency to live in small groups (weighting category "natural behavior"). The optimum value for small groups remains below the 0 line in Figure 3, because we presumed 2 m² per sow and normal husbandry conditions as regards mixing frequency in relation to group size. As a result, small groups received positive points for social contact (attribute 8) but negative points for relatively small pens (at a space allowance of 2 m² per sow) and for the mixing frequency of once per pregnancy (attribute 7). The model contains more negative weightings than positive ones, because more knowledge is available about negative welfare performance criteria than about positive ones (cf. column WS in Table 2).
Comparison with Other Models
We compared the results of our model with results from three models for welfare assessment in pregnant sows derived from the literature. These are Fraser's Behavioural Deprivation Index (BDI, Fraser, 1983) and two versions of the Tiergerechtheitsindex (TGI), one version by Walter and Postler (published in Sundrum et al., 1994) and one version by Bartussek (1999b). The models were interpreted to apply at the housing-system level in order to allow comparison with our model.
The output of the different models is shown in Figure 4. The Spearman rank correlation coefficients (Rho) between each pair of models were all larger than 0.85 (P < 0.05). Kendall's coefficient of concordance for the four models is 0.94 (P = 0.001).
Flexibility and Some "Sensitivity Analysis"
The model is embedded in a computer-based decision support system, which is designed to be flexible, that is, to be adaptable when new knowledge about welfare becomes available. The flexibility of the decision support system covers a range of aspects. An experienced user may want to modify (add, delete, or change) scientific statements, the calculation procedure, the attributes in the model, descriptions of housing systems, weighting categories, or the list of needs. He or she may also want to produce a number of different versions of the model and compare the output. The user may also construct his or her own model, or insert a model derived from the literature, as we have done for the three models described above. Such flexibility allows a quantitative assessment of criticism. We will illustrate this point with reference to Table 3, which illustrates the weighting procedure.
An objection may be that it is arbitrary to weight the number of types by a factor of 0.2. With the decision support system we can calculate the effect of changing this factor. We compared the present results with model variants in which this factor was set at 0 and 1, respectively. The largest deviation in overall welfare scores for any of the seven housing systems was only 0.23 points on the 10-point scale.
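The effect of the type factor can be illustrated on a single attribute level. The sketch below recomputes the "weight" of the best level of "rooting substrate" (maximum scores 3, 3, 3 and type counts 1, 5, 1, as quoted earlier) for the three factor values compared here. It does not reproduce the 0.23-point result, which concerns overall welfare scores across all 37 attributes and seven housing systems; the function name is hypothetical.

```python
def best_level_weight(max_scores_and_types, type_factor):
    """Sum of (maximum weighting score + type_factor * number of types)."""
    return sum(ws + type_factor * nt for ws, nt in max_scores_and_types)


# (maximum weighting score, number of unique types) per weighting category
rooting_best = [(3, 1), (3, 5), (3, 1)]

weights = {f: best_level_weight(rooting_best, f) for f in (0.0, 0.2, 1.0)}
for factor, weight in weights.items():
    print(factor, round(weight, 1))
```

At the level of a single "weight" the factor matters considerably (9.0, 10.4, and 16.0 for factors 0, 0.2, and 1). One plausible reason the overall scores move much less is that the factor shifts many weighting factors in the same direction and the welfare score is a normalized weighted average.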
Would it, then, be legitimate to exclude this factor from the calculation procedure altogether, for instance to reduce its complexity? We may even ask whether the whole weighting procedure may be redundant, because the correlation between the weighted and unweighted versions of the model is relatively high (Spearman's Rho is 0.89, P < 0.01). Theoretically, the correlation may be as high as 0.99, using an equation from Gulliksen (1950). However, even if weighting were found to be empirically redundant (Dalkey, 1975; Wainer, 1976), it certainly is not redundant from a conceptual point of view (Streiner and Norman, 1995, p 86). Because our goal was to assess welfare as much as possible in accordance with knowledge of the biology of the animals, we cannot assess welfare without a weighting procedure: we must obviously distinguish between the more and the less important attributes for welfare (cf. luxuries and necessities, e.g., Dawkins, 1990). Similarly, we cannot discard the weighting of the number of types in the calculation procedure, because new knowledge that "adds" a new type to an attribute level conceptually gives it a somewhat higher weight. In other words, setting the 0.2 factor at 0 is illegitimate.
Another objection to Table 3 is that the weighting categories ("natural behavior," "preferences," and "demand") largely overlap. Maybe they should be regarded as different levels of one weighting category. We identified these categories as three different weighting categories, because they involve three different paradigms of welfare research. With little effort (less than 15 min) we produced a version of the model that summarized the three weighting categories into one. The maximum effect was a (downward) deviation of only 0.41 points for the system "outdoor huts." Larger effects were found when these specifically behavioral weighting categories were deleted completely from the weighting procedure. This resulted in a maximum deviation of 1.8 points (for "outdoor huts").
These calculations illustrate the flexibility of the decision support system and its use in evaluating points of criticism in a quantitative way. Especially when large effects are found, closer evaluation and possible upgrading of the model are warranted.
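A comparison of this kind can be sketched as follows. The function name and all scores are hypothetical, and apart from "outdoor huts" the system names are placeholders; the sketch only shows how the largest deviation between a baseline version and a modified version of a model might be located.

```python
def max_deviation(baseline, variant):
    """Return (system, deviation) for the largest absolute change in score."""
    deviations = {name: variant[name] - baseline[name] for name in baseline}
    worst = max(deviations, key=lambda name: abs(deviations[name]))
    return worst, deviations[worst]


# Hypothetical scores on the 0-10 scale (not the paper's data).
baseline = {"outdoor huts": 8.0, "system B": 6.0, "system C": 3.0}
variant = {"outdoor huts": 7.5, "system B": 5.75, "system C": 3.25}
system, change = max_deviation(baseline, variant)
```

A negative `change` corresponds to a downward deviation relative to the baseline model.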
| Discussion |
|---|
|
|
|---|
This paper shows that welfare can be assessed without involving any ethical questions, although it does involve normative issues (cf. Fraser, 1995; Bracke et al., 1999c), especially in dealing with uncertainties in knowledge. For integrated welfare assessment many decisions were needed to identify the 352 scientific statements, 145 attribute levels, 11 needs, 12 weighting categories, and the many links between them. In doing so, many hidden assumptions had to be made explicit; for example, the assignment of weighting scores made explicit how we weighted the weighting-category levels. Most of these decisions were somewhat arbitrary (cf. Hurnik, 1988), but we showed how our model can be used to determine the impact of such arbitrary choices by doing a sensitivity analysis. Our technical solution is not the most elegant, but it works: we can calculate welfare scores that correspond for the most part with other models as well as with previously obtained expert opinion (Bracke et al., 1999d). This means the model is ready for a more explicit validation test, even though the model can always be improved further. The decision support system, in which the model is embedded, was developed to accommodate this course of evolution. The value of such a system is that it may help to increase objectivity and intersubjectivity in welfare assessment. Welfare assessment on the basis of the biological needs of animals and the findings from empirical research guides the way to an assessment of animal welfare as perceived from the animals' point of view, even though such assessment must necessarily remain an assessment performed by humans (Bekoff et al., 1992).
Model Construction
For model construction it is important to demarcate the scope of the model. Our model concerns assessment of the welfare status of pregnant sows in relation to their housing and management system based on available scientific knowledge. The model's domain includes the wide range of present and future housing and management systems for pregnant sows. The model is designed primarily to distinguish between different types of housing systems, rather than between individual farms within systems.
A list of hierarchically organized needs was used to break up the concept of overall welfare into manageable chunks. Further functional decomposition resulted in the formulation of a list of weighted attributes that identify the welfare-relevant properties of housing systems through the relationship between the attributes and welfare performance criteria (weighting categories) as specified in the scientific statements. These attributes are the proverbial apples and oranges that are "added" into the mixed fruit basket of overall welfare.
We carefully selected attributes covering all aspects of welfare while minimizing the (conceptual) overlap between them. We did not minimize (empirical) correlations between attributes. The average interattribute correlation was considerable (Cronbach's alpha was 0.79 for the 37 attributes and the seven reference housing systems; see, e.g., Nunnally, 1970). As a result, when we change only one attribute of a housing system there will often be multiple effects in the model. This was illustrated above for influencing factors and compound attributes. Empirical correlations between the model attributes may contribute to the rather stable model performance when weighting factors are changed. For example, when all weighting factors are set at 1 the Spearman rank correlation coefficient with the original model is still 0.89 (P < 0.01).
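For readers unfamiliar with the statistic, Cronbach's alpha can be computed as in the sketch below, which uses the standard variance-based formula. We assume here that each row is one case (in our setting, a reference housing system) and each column one item (an attribute); the function name and the small example table are hypothetical and are not the data that produced the value of 0.79.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per case, one column per item."""
    k = len(rows[0])  # number of items (columns)
    # Sample variance of each item across the cases.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the per-case totals.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical table: three cases scored on two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Alpha approaches 1 when the items vary together across cases and falls when they vary independently.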
Attribute scores for the levels of an attribute were determined in proportion to their welfare rank. An objection may be that the model transforms ordinal scales to interval scales. For setting the relative distance between the attribute levels we could have used a procedure similar to the one we used for the weighting of attributes as a whole. Instead, we used the simplifying assumption of proportionality, which is expected to have only minor effects on the overall scores because it concerns only a fraction of the attribute effect shown in Table 1.
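The proportionality assumption can be illustrated with a minimal sketch. The function name is hypothetical, and scaling the worst level to 0 and the best level to 1 is an assumption made here for illustration; the essential point is only that ranked levels are spaced at equal distances.

```python
def proportional_scores(n_levels):
    """Equally spaced scores from 0.0 (worst level) to 1.0 (best level)."""
    if n_levels < 2:
        raise ValueError("an attribute needs at least two levels")
    return [rank / (n_levels - 1) for rank in range(n_levels)]


# An attribute with three ranked levels gets the scores 0.0, 0.5 and 1.0.
scores = proportional_scores(3)
```

Replacing this function with one that returns unequal distances would be one way to examine how much the ordinal-to-interval transformation affects the overall scores.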
Minimum-requirement levels identify those properties of a housing system below which welfare is poor no matter what. Because we used discrete levels, these minimum requirements have relatively sharp cut-off points. In reality, the cut-off points are rather fuzzy. Techniques such as fuzzy logic (e.g., Bardossy and Duckstein, 1995) with membership curves that "fuzzify" the cut-off points may be used to improve this aspect of the model.
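As an illustration of what such "fuzzification" could look like, the sketch below replaces a sharp cut-off with a linear ramp, one of the simplest membership curves. The function name and the two thresholds are hypothetical and are not values from our model.

```python
def ramp_membership(value, lower, upper):
    """Degree (0.0 to 1.0) to which `value` meets a minimum requirement.

    Below `lower` the requirement is not met at all, above `upper` it is
    fully met, and in between the degree rises linearly.
    """
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    if value <= lower:
        return 0.0
    if value >= upper:
        return 1.0
    return (value - lower) / (upper - lower)


# Hypothetical thresholds of 1.0 and 2.0 units for some attribute.
degree = ramp_membership(1.5, lower=1.0, upper=2.0)
```

A sharp cut-off is the limiting case in which the two thresholds coincide.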
For weighting beyond minimum requirements we identified a number of weighting categories. The correlation between the weighting factor of an attribute and this attribute having a minimum-requirement level attached to it is low but significant (Spearman rank correlation is 0.38, P < 0.05; data from Table 1). The effect is mitigated by the fact that some attributes with a high weighting factor have their minimum-requirement level covered by another attribute, for instance "space per sow" in the case of "space per pen." Another reason why an attribute may have a high weighting factor without a minimum-requirement level is that it may affect welfare positively rather than negatively, as is the case for "social contact," which is defined to exclude the aspect of agonism.
The weighting categories identify the various welfare performance criteria that have been measured in the various welfare-science disciplines. Some of these disciplines have been disputed, such as preference testing (Dawkins, 1983) and stress physiology (Rushen, 1991). Our aim was not to settle these disputes, but rather to find a way that allowed using the relevant findings from each of these disciplines. The attributes in the model often specify environment-based design criteria. Our model is the first to spell out how environment-based attributes can be used for overall welfare assessment, namely by virtue of their effects on welfare performance criteria as described in the scientific statements. Previous approaches have either focused predominantly on identifying and measuring performance criteria (e.g., Broom and Johnson, 1993, and many others), or they have focused on modeling environment-based attributes without explicitly showing how already available knowledge is taken into account (e.g., TGI indexes mentioned above). Our approach may provide a bridge between the comprehensiveness of the latter approach and the scientific credibility of the former.
To calculate welfare scores we used an additive calculation rule. As a result, the contribution of an attribute to welfare is constant and independent of the state of the other attributes. However, interactions such as between space quantity and quality and between feeding level and thermal comfort would seem to be relevant. For the most part we reduced the impact of interactions by carefully describing the attributes. For example, the attribute "exposure to cold" has included in its description an influencing factor "feeding level." We presumed that the effects of interactions, which remained despite careful description, are negligible. This assumption may be false, but incorporating interactions in the model would have further increased its complexity. The basic assumption of additive calculation seems warranted, as it is commonly used (cf. Keeny and Raiffa, 1976; Huirne and Hardaker, 1998). Models of stress often use multiple stressors that are supposed to act in an additive way. Furthermore, empirical evidence is available that stressors may act additively (Webster, 1995, p 120) and that compensation is possible. For example, studies with chicks (McFarlane et al., 1989; McKee and Harrison, 1995) and with growing pigs (Hyun et al., 1998a,b) have demonstrated additive effects with multiple stressors. It has also been shown that stress can be compensated by reward (e.g. van den Berg et al., 1999, 2000; Spruijt et al., 2001). Pedersen et al. (1998) showed that positive handling of pregnant gilts could reduce the negative stress-physiological consequences of tethered housing. A biological basis for summation and compensation may lie in the capacity of animals to adapt. This capacity is substantial but has its limits (e.g. Broom and Johnson, 1993; Barnard and Hurst, 1996). Our welfare model incorporates a two-tiered approach, using minimum requirements to set limits, but otherwise allowing compensation between welfare-relevant attributes.
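The two-tiered idea can be summarized in a minimal sketch. All names, weights, and level scores below are hypothetical; normalizing by the sum of the weights to reach a 0-10 scale is an assumption made for illustration and is not a description of our actual calculation procedure. Unmet minimum requirements are reported separately rather than folded into the score.

```python
def welfare_score(levels, weights, minimum_levels):
    """Additive score on a 0-10 scale plus the unmet minimum requirements.

    levels:         attribute -> level score between 0.0 and 1.0
    weights:        attribute -> non-negative weighting factor
    minimum_levels: attribute -> lowest acceptable level score
    """
    total_weight = sum(weights.values())
    # Additive rule: each attribute contributes independently of the others.
    weighted_sum = sum(weights[name] * levels[name] for name in weights)
    score = 10 * weighted_sum / total_weight
    # First tier: attributes that fall below their minimum requirement.
    unmet = sorted(name for name, minimum in minimum_levels.items()
                   if levels[name] < minimum)
    return score, unmet


# Hypothetical system described by three attributes.
levels = {"space per sow": 0.0, "social contact": 1.0, "exposure to cold": 0.5}
weights = {"space per sow": 2, "social contact": 1, "exposure to cold": 1}
minimum_levels = {"space per sow": 0.25}
score, unmet = welfare_score(levels, weights, minimum_levels)
```

In this example the high level for "social contact" partly compensates the low level for "space per sow" in the additive score, while the unmet minimum requirement is still flagged.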
Model Application and Utility
When the properties of a housing system from within the model's domain are known, the user may be able to determine attribute levels and calculate a welfare score in as little as 5 min. However, at present, our decision support system requires an experienced user. The user needs to understand the scientific statements and the model's weighting procedure. Because the attributes were designed to avoid overlap, the definitions of attributes, weighting categories, and needs are somewhat technical. The decision support system also requires expertise in using the software (MS Access). However, the decision support system includes assessments of the seven main housing systems for pregnant sows, which provide points of reference and so facilitate application of the model. Further research will be needed to make the model suitable for lay use.
At this point we cannot determine what difference in welfare scores would represent a definite difference in welfare. We estimate that this may be a difference of as much as 2 points on the scale from 0 to 10; a retest showed an average absolute difference of 0.71 points with a range of 0.1 to 1.7. Comparison with other models (Figure 4) showed even larger differences, especially for the mid-welfare systems, but the present data do not support an evaluation of which is the better model.
Validation can take essentially three forms: (sensitivity) analysis, expert opinion, and empirical research. In this paper we have shown some results of analytical sensitivity analysis, comparison with other models, and previous interviews with experts. An associated paper will discuss an extended validation using expert opinion. Empirical validation will be difficult because it requires extensive data collection on farms to allow comparison at the housing-system level. Further validation of the model will result from using it in practice and from upgrading it periodically.
The main strength of our work is that it shows how pregnant-sow welfare can be quantified using a systematic and transparent procedure covering all reasoning steps from basic assumptions, specifications of the model's domain, and available welfare knowledge, all the way to the interpretation of the welfare scores.
According to our study the main steps of a procedure to assess animal welfare are:
With this procedure an operational decision support system to assess the welfare status of pregnant sows in relation to their housing and management system based on available scientific knowledge has been developed. The overall welfare status is expressed as a score between 0 and 10, based on what is known about the biological needs of the animals. The decision support system shows how farm animal welfare can be assessed in an explicit and transparent way, with the flexibility to accommodate new insights about welfare assessment when these become available.
| Implications |
|---|
|
|
|---|
| Footnotes |
|---|
Received for publication February 1, 2001. Accepted for publication January 29, 2002.
| Literature Cited |
|---|
|
|
|---|