Interbull CoP - APPENDIX IV - Description of weighting factors and examples
Weighting factors for the international genetic evaluation
The following is a procedure to compute new weighting factors for the international genetic evaluation of Interbull, and has to be implemented separately by each individual organisation participating in Interbull evaluations. The procedure is based on information used in the national genetic evaluation in each country, and how a country considers such information depends on the genetic evaluation model.
The procedure consists of two steps. In the first step, for each animal with own performance records used in the national genetic evaluation, the reliability due to its own performance, R(o), is estimated using selection index methodology. R(o) is computed separately per trait (milk, fat or protein yield) and information from the other traits in a multiple trait national genetic evaluation is ignored. The second step combines the R(o) from the daughter and her dam, expressed in effective daughter contributions (EDC), which are subsequently accumulated over all daughters of a sire.
Step 1: Estimation of reliability based on own performance (R(o)):
For each animal with own performance records in the national genetic evaluation, Ri(o) is estimated. Estimation of Ri(o) depends on the genetic evaluation model.
a) Single trait (repeatability) model for the national genetic evaluation (where individual lactations are considered as the same trait):
b) Multiple trait model for the genetic evaluation (where each lactation, part of lactation or test day observation are treated as different traits):
Let k’EBV be the estimated breeding value or transmitting ability of the bull for the trait of interest (milk, fat or protein yield), i.e. the one submitted in the 010 file to Interbull, where EBV is a vector with multiple trait (lactation, part-lactation, test day) estimates of breeding value or transmitting ability, and k a vector with weights given to each estimate.
Note:
- heritability is the heritability of a single observation, e.g. test day yield
- non-genetic parameters correspond to all non-genetic effects, which may include permanent environment effects (i.e. E = PE + e)
- the non-genetic correlation depends on the assumptions of the evaluation model, i.e. whether PE and/or e are considered to be correlated between lactations
- if genetic and environmental correlations and heritability are not constant over the lactation (e.g. in random regression models), an average value over the lactation should be used
- for missing traits, the numerator in (2) can be computed in two ways: 1) set the rows and columns corresponding to the missing trait to zero in the P matrix, or 2) remove the corresponding rows and columns in the P matrix and the corresponding rows in the G matrix. Both ways should give the same results. However, the first method is recommended because it is less ambiguous (i.e. the G matrix is not affected and the P matrix has the same dimension under all circumstances). The denominator in (2) is the same for all animals, whether observations are missing or not.
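The two ways of handling a missing trait can be compared with a small Python sketch. It assumes the usual selection-index form for (2), i.e. numerator k'G P⁻ G k and denominator k'Gk, with a generalised inverse used when rows and columns of P are zeroed. The three-trait parameter values and the function names are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical three-trait parameters (NOT taken from any national evaluation).
G = np.array([[90.0, 70.0, 60.0],
              [70.0, 100.0, 80.0],
              [60.0, 80.0, 110.0]])        # genetic (co)variances
P_STAR = np.array([[300.0, 120.0, 100.0],
                   [120.0, 320.0, 130.0],
                   [100.0, 130.0, 340.0]])  # phenotypic (co)variances
K = np.array([1.0, 1.0, 1.0]) / 3.0         # weights given to each trait

def numerator_zeroing(G, P_star, k, observed):
    """Option 1: zero rows/columns of P for missing traits, use a generalised inverse."""
    P = P_star.copy()
    missing = ~np.asarray(observed, dtype=bool)
    P[missing, :] = 0.0
    P[:, missing] = 0.0
    Gk = G @ k
    return float(Gk @ np.linalg.pinv(P) @ Gk)

def numerator_removing(G, P_star, k, observed):
    """Option 2: drop rows/columns of P and the matching rows of G."""
    obs = np.asarray(observed, dtype=bool)
    P = P_star[np.ix_(obs, obs)]
    Gk = G[obs, :] @ k
    return float(Gk @ np.linalg.solve(P, Gk))

def reliability_own(G, P_star, k, observed):
    """Numerator depends on the animal; the denominator k'Gk does not."""
    return numerator_zeroing(G, P_star, k, observed) / float(k @ G @ k)

observed = [True, False, True]  # second trait missing
print(numerator_zeroing(G, P_STAR, K, observed))
print(numerator_removing(G, P_STAR, K, observed))
print(reliability_own(G, P_STAR, K, observed))
```

In this sketch the G matrix and the dimension of P stay unchanged under option 1, which is the reason given above for preferring it.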
Step 2: Combine sources of information
a) Once Ri(o) is computed for all animals with own records in the genetic evaluation, information from the cow and her dam is combined as follows:
This new weight, resulting from Equation (4), will be added to the trait-group files according to the revised format and sent to Interbull.
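The Step 2 computations can be written out as below. This is a reconstruction inferred from the worked examples later in this appendix, which these forms reproduce to three decimals; it is not copied from the official equations. Here k is the scalar variance ratio used in Step 2 (not the vector of weights k from Step 1), h² is the heritability of the trait submitted to Interbull, and Rdam(o) is taken as zero when the dam is unknown.

```latex
\begin{align}
k &= \frac{4 - h^2}{h^2} \notag \\
\mathrm{EDC}_{i(o+d)} &= \frac{k \, R_{i(o)}}{4 - R_{i(o)} \left( 1 + R_{dam(o)} \right)} \tag{3} \\
w_s &= \sum_{i \,\in\, \text{daughters of } s} \mathrm{EDC}_{i(o+d)} \tag{4}
\end{align}
```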
References
Fikse, W.F. and Banos, G., 1999. Weighting factors of daughter information in international genetic evaluation for milk production traits: effect on (co)variance components. J. Dairy Sci., 82 (suppl. 1): 72 (Abstr.).
Fikse, W.F. and Banos, G., 1999. Weighting factors in international genetic evaluations: effects on international breeding value and reliability estimates. Interbull Bulletin, 22: 38 - 43.
Fikse, W.F. and Banos, G., 2001. Weighting factors of sire daughter information in international genetic evaluations. J. Dairy Sci. 84:1759-1767.
VanRaden, P.M. and Wiggans, G.R., 1991. Derivation, calculation, and use of national animal model information. J. Dairy Sci., 74: 2737 - 2746.
VanVleck, L.D., 1993. Selection index and introduction to mixed model methods. pp 481. CRC Press Inc., Boca Raton, FL.
Weighting factors for the international genetic evaluation of longevity
The following describes the procedure to compute weighting factors for the international genetic evaluations for longevity traits, and has to be implemented by each country participating in Interbull evaluations for longevity traits. Procedures currently in use for genetic evaluation of longevity traits can be divided into two main categories: survival analysis and analysis of culling rate, stayability or length of life with mixed linear models. Due to the different nature of both types of analysis, procedures to compute weighting factors are described separately.
Survival analysis²
The number of culled daughters is to be submitted as weighting factor. Heritability should be computed as:
Mixed linear model analysis
If longevity data are analysed with mixed linear models, assuming normality of breeding values and residuals, then the procedure described above under "Weighting factors for the international genetic evaluation" applies.
References
Ducrocq, V., Delaunay, I., Boichard, D. and Mattalia, S., 2003. A general approach for international genetic evaluations robust to inconsistencies of genetic trends in national evaluations. Interbull Bulletin, 30: 101 - 111.
² A more elaborate definition of a weighting factor for survival analysis can be found in Ducrocq et al. (2003). The weighting factor proposed there was observed to be perfectly correlated with the number of culled daughters (Ducrocq, 2003, pers. comm.), which is why the latter definition of weighting factors will be used for international evaluations.
Weighting factors, example
Weighting factors for the international genetic evaluation: Example
This document is a supplement to Appendix IV (which will be referred to as the “parent” document) and contains a set of examples to illustrate the calculation of the new weighting factors. For a hypothetical data set, Step 1 of the procedure will be illustrated for two different types of national genetic evaluation models. Computations for Step 2 are then outlined for a given set of results from Step 1.
Step 1: Computation of reliability based on own performance (R(o)):
Consider the following hypothetical data set:
Animal | Sire | Lactation 1: CG | Lactation 1: Weight¹ | Lactation 2: CG | Lactation 2: Weight¹
1 | S1 | A1 | 1 | B1 | 1
2 | S1 | A1 | 1 | B1 | 1
3 | S1 | A1 | 1 | B1 | 1
4 | S2 | A1 | 1 | B1 | 1
5 | S2 | A1 | 1 | B1 | 0.9
6 | S2 | A1 | 1 | B2 | 1
7 | S1 | A1 | 0.85 | - | -
8 | S1 | A1 | 0.7 | - | -
9 | S1 | A2 | 1 | B2 | 1
10 | S2 | A2 | 1 | B2 | 1
11 | S2 | A2 | 1 | B2 | 1
12 | S2 | A2 | 1 | B2 | 1
13 | S1 | A2 | 1 | B2 | 1
14 | S1 | A2 | 1 | B2 | 0.75
15 | S1 | A2 | 0.8 | - | -
¹ Weight in the national genetic evaluation.
Computation of Ri(o) depends on the genetic evaluation model and will, for this example data set, be illustrated for two classes of models.
a) Single trait (repeatability) model for the national genetic evaluation (where individual lactations are considered as the same trait):
Assume the following parameters from the national genetic evaluation: h² = 0.30 and r = 0.50
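For animals 1-8 in lactation 1 (contemporary group A1), the weights of all records in the contemporary group sum to 1 + 1 + 1 + 1 + 1 + 1 + 0.85 + 0.7 = 7.55.
Likewise, for animals 9-15 in lactation 1, animals 1-5 in lactation 2, and animals 6 and 9-14 in lactation 2 the summation evaluates to 6.8, 4.9, and 6.75, respectively.
For animals 1-3 and 7-8 in lactation 1 (daughters of sire S1 in contemporary group A1), the weights of the records of the same sire in the same contemporary group sum to 1 + 1 + 1 + 0.85 + 0.7 = 4.55.
This summation evaluates to:
- 3.0 for animals 4-6 in lactation 1
- 3.8 for animals 9 and 13-15 in lactation 1
- 3.0 for animals 10-12 in lactation 1
- 3.0 for animals 1-3 in lactation 2
- 1.9 for animals 4-5 in lactation 2
- 4.0 for animals 6 and 10-12 in lactation 2
- 2.75 for animals 9 and 13-14 in lactation 2
Then for animal 1: w11 = 1 × (1 − 4.55 / 7.55) = 0.397, w12 = 1 × (1 − 3.0 / 4.9) = 0.388, m = w11 + w12 = 0.785, and R1(o) = m h² / (1 + (m − 1) r) = (0.785 × 0.30) / (1 + (0.785 − 1) × 0.50) = 0.264.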
For animals 1 - 15 the results are:
Animal | wi1 | wi2 | m | Ri(o)
1 | 0.397 | 0.388 | 0.785 | 0.264
2 | 0.397 | 0.388 | 0.785 | 0.264
3 | 0.397 | 0.388 | 0.785 | 0.264
4 | 0.603 | 0.612 | 1.215 | 0.329
5 | 0.603 | 0.551 | 1.154 | 0.321
6 | 0.603 | 0.407 | 1.010 | 0.302
7 | 0.338 | - | 0.338 | 0.152
8 | 0.278 | - | 0.278 | 0.131
9 | 0.441 | 0.593 | 1.034 | 0.305
10 | 0.559 | 0.407 | 0.966 | 0.295
11 | 0.559 | 0.407 | 0.966 | 0.295
12 | 0.559 | 0.407 | 0.966 | 0.295
13 | 0.441 | 0.593 | 1.034 | 0.305
14 | 0.441 | 0.444 | 0.886 | 0.282
15 | 0.353 | - | 0.353 | 0.157
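The results for case a) can be reproduced with a short Python sketch. It is not the official implementation: it assumes that wij is the record weight multiplied by one minus the ratio of two sums of weights (records of the same sire in the same contemporary group, over all records in the contemporary group), and that Ri(o) = m h² / (1 + (m − 1) r). Both forms are inferred from the tabulated values, which they match to three decimals. The names `adjusted_weights` and `reliability` are illustrative.

```python
from collections import defaultdict

# Example data set: animal -> (sire, [(contemporary group, weight) per lactation]).
# None marks a missing second lactation.
DATA = {
    1: ("S1", [("A1", 1.0), ("B1", 1.0)]),
    2: ("S1", [("A1", 1.0), ("B1", 1.0)]),
    3: ("S1", [("A1", 1.0), ("B1", 1.0)]),
    4: ("S2", [("A1", 1.0), ("B1", 1.0)]),
    5: ("S2", [("A1", 1.0), ("B1", 0.9)]),
    6: ("S2", [("A1", 1.0), ("B2", 1.0)]),
    7: ("S1", [("A1", 0.85), None]),
    8: ("S1", [("A1", 0.7), None]),
    9: ("S1", [("A2", 1.0), ("B2", 1.0)]),
    10: ("S2", [("A2", 1.0), ("B2", 1.0)]),
    11: ("S2", [("A2", 1.0), ("B2", 1.0)]),
    12: ("S2", [("A2", 1.0), ("B2", 1.0)]),
    13: ("S1", [("A2", 1.0), ("B2", 1.0)]),
    14: ("S1", [("A2", 1.0), ("B2", 0.75)]),
    15: ("S1", [("A2", 0.8), None]),
}
H2, REP = 0.30, 0.50  # heritability and repeatability from the example

def adjusted_weights(data):
    """Return animal -> [w_i1, w_i2], with None for a missing lactation."""
    cg_total = defaultdict(float)       # sum of weights per contemporary group
    sire_cg_total = defaultdict(float)  # sum of weights per (sire, contemporary group)
    for sire, lactations in data.values():
        for rec in lactations:
            if rec is not None:
                cg, weight = rec
                cg_total[cg] += weight
                sire_cg_total[(sire, cg)] += weight
    result = {}
    for animal, (sire, lactations) in data.items():
        result[animal] = [
            None if rec is None
            else rec[1] * (1.0 - sire_cg_total[(sire, rec[0])] / cg_total[rec[0]])
            for rec in lactations
        ]
    return result

def reliability(m, h2, r):
    """Reliability from own performance under a repeatability model (assumed form)."""
    return m * h2 / (1.0 + (m - 1.0) * r)

for animal, w in adjusted_weights(DATA).items():
    m = sum(x for x in w if x is not None)
    print(animal, round(m, 3), round(reliability(m, H2, REP), 3))
```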
b) Multiple trait model for the national genetic evaluation (where each lactation, part lactation or test day observation is treated as a different trait):
The same data structure as before is used in this case; however, the two lactations are considered genetically distinct traits. The example calculations below assume only one observation per trait; mj for animal i is therefore equal to wij as computed above. The complete matrix of phenotypic (co)variances used in the national genetic evaluation will be denoted P* for the remainder of this document. The matrix specific to each animal (with appropriate elements set to zero) will be denoted P. Missing observations (animal 7) are illustrated with the first implementation option, i.e. zeroing rows and columns, since this method is less ambiguous (i.e. the G matrix is not affected and the P matrix has the same dimension under all circumstances).
Assume the following parameters from the national genetic evaluation:
For animals 1 - 15 the results are:
Animal | m1 | m2 | numerator | Ri(o)
1 | 0.397 | 0.388 | 21.673 | 0.170
2 | 0.397 | 0.388 | 21.673 | 0.170
3 | 0.397 | 0.388 | 21.673 | 0.170
4 | 0.603 | 0.612 | 30.954 | 0.242
5 | 0.603 | 0.551 | 29.423 | 0.230
6 | 0.603 | 0.407 | 25.952 | 0.203
7 | 0.338 | - | 9.487 | 0.074
8 | 0.278 | - | 7.813 | 0.061
9 | 0.441 | 0.593 | 28.176 | 0.221
10 | 0.559 | 0.407 | 25.143 | 0.197
11 | 0.559 | 0.407 | 25.143 | 0.197
12 | 0.559 | 0.407 | 25.143 | 0.197
13 | 0.441 | 0.593 | 28.176 | 0.221
14 | 0.441 | 0.444 | 24.025 | 0.188
15 | 0.353 | - | 9.914 | 0.078
In case multiple observations are recorded on genetically distinct traits, the P11 element in the P matrix is obtained as follows: Let two observations be recorded on the first genetically distinct trait, with the repeatability of observations on the same trait being r1 = 0.40. Assume that wi1, computed in the same way as illustrated above, is 0.7 for observation 1 and 0.8 for observation 2. Then m1 = 0.7 + 0.8 = 1.5, and
Step 2: Combine sources of information
First, the reliability contributed by the dam of the animal is added to the animal's reliability based on own performance, and expressed in effective daughter contributions (EDC) (formula (3) in the “parent” document). Once that is done for all animals, information from all daughters of a sire is accumulated into one single weighting factor for each sire. Ri(o) values from case b) of Step 1 will be taken for these example computations, and hypothetical values for Rdam(o) are used. Parameters (G, P*, k) will also be those used in case b) of Step 1.
Note that Rdam(o) is always zero in case the national genetic evaluation applies a sire model!
The heritability of the trait submitted in the 010 file is the heritability needed in Step 2, and is for this example computed as:
For animals 1 - 15 the EDCi(o+d), given the assumed values for Rdam(o) for dams, are:
Animal | Sire | Ri(o) | Rdam(o) | EDCi(o+d)
1 | S1 | 0.170 | 0.29 | 0.486
2 | S1 | 0.170 | 0.30 | 0.486
3 | S1 | 0.170 | - | 0.479
4 | S2 | 0.242 | 0.31 | 0.712
5 | S2 | 0.230 | 0.31 | 0.674
6 | S2 | 0.203 | 0.29 | 0.588
7 | S1 | 0.074 | 0.28 | 0.206
8 | S1 | 0.061 | 0.26 | 0.169
9 | S1 | 0.221 | 0.24 | 0.640
10 | S2 | 0.197 | 0.23 | 0.567
11 | S2 | 0.197 | 0.23 | 0.567
12 | S2 | 0.197 | 0.23 | 0.567
13 | S1 | 0.221 | 0.24 | 0.640
14 | S1 | 0.188 | 0.25 | 0.541
15 | S1 | 0.078 | 0.27 | 0.215
Finally, the weighting factors ws for both sires in this example data set are computed as follows:
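A minimal Python sketch of this accumulation, assuming that the weighting factor of a sire is the plain sum of EDCi(o+d) over his daughters, as the text describes. The EDC values are the tabulated (rounded) ones, and the name `sire_weights` is illustrative.

```python
from collections import defaultdict

# (sire, EDC_i(o+d)) for animals 1-15, copied from the table above.
DAUGHTER_EDC = [
    ("S1", 0.486), ("S1", 0.486), ("S1", 0.479), ("S2", 0.712), ("S2", 0.674),
    ("S2", 0.588), ("S1", 0.206), ("S1", 0.169), ("S1", 0.640), ("S2", 0.567),
    ("S2", 0.567), ("S2", 0.567), ("S1", 0.640), ("S1", 0.541), ("S1", 0.215),
]

def sire_weights(daughter_edc):
    """Accumulate daughter EDC into one weighting factor per sire."""
    totals = defaultdict(float)
    for sire, edc in daughter_edc:
        totals[sire] += edc
    return dict(totals)

for sire, weight in sorted(sire_weights(DAUGHTER_EDC).items()):
    print(sire, round(weight, 3))  # S1 3.862, S2 3.675
```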
Weighting factors for the international genetic evaluation of longevity:
Examples for mixed linear models
This document contains a set of examples to illustrate the calculation of the weighting factors for genetic evaluations for longevity using mixed linear models. The same procedure will apply as the one for production traits (outlined in Appendix IV). It consists of two steps:
1) computation of reliability based on own performance, and
2) combining information from daughters and their dams, to be expressed in effective daughter contributions (EDC).
For a hypothetical data set, computations for Step 1 will be illustrated for two different national genetic evaluation models based on mixed linear models. Computations for Step 2 are then outlined for a given set of results from Step 1.
Step 1: Computation of reliability based on own performance (R(o)):
Consider the following hypothetical data set:
Animal | Sire | CG | Length of life analysis: Weight¹ | Length of life analysis: Productive life (months) | Binomial trait analysis: Survived 2nd lactation (0=no, 1=yes)
1 | S1 | A1 | 1 | 23 | 0
2 | S1 | A1 | 1 | 29 | 1
3 | S1 | A1 | 1 | 24 | 1
4 | S2 | A1 | 1 | 23 | 0
5 | S2 | A1 | 1 | 28 | 1
6 | S2 | A1 | 1 | 24 | 1
7 | S1 | A1 | 0.67 | 20 | 0
8 | S1 | A1 | 0.80 | 26 | 1
9 | S1 | A2 | 1 | 28 | 1
10 | S2 | A2 | 1 | 18 | 0
11 | S2 | A2 | 1 | 26 | 1
12 | S2 | A2 | 1 | 23 | 0
13 | S1 | A2 | 1 | 28 | 1
14 | S1 | A2 | 1 | 24 | 1
15 | S1 | A2 | 0.81 | 27 | 1
¹ Weight in the national genetic evaluation.
Computation of Ri(o) depends on the genetic evaluation model and will, for this example data set, be illustrated for two types of linear model analysis. For the first case, longevity is regarded as a binomial trait indicating whether or not an animal survived the first and second lactation. For the second case, length of productive life is recorded. The example mimics a situation in which some records receive a lower weight (e.g. because they are predicted rather than observed records).
a) Binomial trait analysis
Assume the following parameters from the national genetic evaluation: h² = 0.02
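For animals 1-8 (contemporary group A1), where every record has weight 1 in the binomial analysis, the weights of all records in the contemporary group sum to 8.
Likewise, for animals 9-15 the summation evaluates to 7.
Then for animal 1, one of five daughters of sire S1 in contemporary group A1: wi1 = 1 × (1 − 5 / 8) = 0.375 and R1(o) = 0.375 × 0.02 = 0.0075.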
For animals 1 - 15 the results are:
Animal | m = wi1 | Ri(o)
1 | 0.375 | 0.0075
2 | 0.375 | 0.0075
3 | 0.375 | 0.0075
4 | 0.625 | 0.0125
5 | 0.625 | 0.0125
6 | 0.625 | 0.0125
7 | 0.375 | 0.0075
8 | 0.375 | 0.0075
9 | 0.429 | 0.0086
10 | 0.571 | 0.0114
11 | 0.571 | 0.0114
12 | 0.571 | 0.0114
13 | 0.429 | 0.0086
14 | 0.429 | 0.0086
15 | 0.429 | 0.0086
b) Length of life analysis
The same data structure as before is used here as well; however, the trait being analysed is length of productive life. In addition, a situation with differential weights for observations is illustrated.
Assume the following parameters from the national genetic evaluation: h² = 0.10
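For animals 1-8 (contemporary group A1), the weights of all records in the contemporary group sum to 1 + 1 + 1 + 1 + 1 + 1 + 0.67 + 0.80 = 7.47.
Likewise, for animals 9-15 the summation evaluates to 6.81.
Then for animal 1, with the weights of the daughters of sire S1 in contemporary group A1 summing to 1 + 1 + 1 + 0.67 + 0.80 = 4.47: m1 = 1 × (1 − 4.47 / 7.47) = 0.402 and R1(o) = 0.402 × 0.10 = 0.040.
For animal 7: m1 = 0.67 × (1 − 4.47 / 7.47) = 0.269 and R7(o) = 0.269 × 0.10 = 0.027.
For animals 1 - 15 the results are: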
Animal | m1 | Ri(o)
1 | 0.402 | 0.040
2 | 0.402 | 0.040
3 | 0.402 | 0.040
4 | 0.598 | 0.060
5 | 0.598 | 0.060
6 | 0.598 | 0.060
7 | 0.269 | 0.027
8 | 0.321 | 0.032
9 | 0.441 | 0.044
10 | 0.560 | 0.056
11 | 0.560 | 0.056
12 | 0.560 | 0.056
13 | 0.441 | 0.044
14 | 0.441 | 0.044
15 | 0.357 | 0.036
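Both longevity cases of Step 1 can be sketched in Python as follows. The sketch assumes the same adjustment of weights for contemporary group and sire as in the production example, and Ri(o) = m × h² for a single record. Both are inferred from the tabulated values, not taken from the official equations. The binomial case uses a weight of 1 for every record, and the function name is illustrative.

```python
from collections import defaultdict

# animal -> (sire, contemporary group, weight in the length-of-life analysis)
COWS = {
    1: ("S1", "A1", 1.0), 2: ("S1", "A1", 1.0), 3: ("S1", "A1", 1.0),
    4: ("S2", "A1", 1.0), 5: ("S2", "A1", 1.0), 6: ("S2", "A1", 1.0),
    7: ("S1", "A1", 0.67), 8: ("S1", "A1", 0.80),
    9: ("S1", "A2", 1.0), 10: ("S2", "A2", 1.0), 11: ("S2", "A2", 1.0),
    12: ("S2", "A2", 1.0), 13: ("S1", "A2", 1.0), 14: ("S1", "A2", 1.0),
    15: ("S1", "A2", 0.81),
}

def adjusted_weights(cows, unit_weights=False):
    """Return animal -> m; unit_weights=True mimics the binomial analysis."""
    cg_total = defaultdict(float)       # sum of weights per contemporary group
    sire_cg_total = defaultdict(float)  # sum of weights per (sire, contemporary group)
    for sire, cg, weight in cows.values():
        w = 1.0 if unit_weights else weight
        cg_total[cg] += w
        sire_cg_total[(sire, cg)] += w
    result = {}
    for animal, (sire, cg, weight) in cows.items():
        w = 1.0 if unit_weights else weight
        result[animal] = w * (1.0 - sire_cg_total[(sire, cg)] / cg_total[cg])
    return result

m_binomial = adjusted_weights(COWS, unit_weights=True)
m_length = adjusted_weights(COWS)
r_binomial = {a: m * 0.02 for a, m in m_binomial.items()}  # h2 = 0.02
r_length = {a: m * 0.10 for a, m in m_length.items()}      # h2 = 0.10

for animal in COWS:
    print(animal, round(m_binomial[animal], 3), round(r_binomial[animal], 4),
          round(m_length[animal], 3), round(r_length[animal], 3))
```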
Step 2: Combine sources of information
First, the reliability contributed by the dam of the animal is added to the animal's reliability based on own performance, and expressed in effective daughter contributions (EDC). Once that is done for all animals, information from all daughters of a sire is accumulated into one single weighting factor for each sire. Ri(o) values from case b) of Step 1 will be taken for these example computations, and hypothetical values for Rdam(o) are used.
Note that Rdam(o) is always zero in case the national genetic evaluation applies a sire model!
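Assume the following parameters from the national genetic evaluation: h² = 0.10, and k follows as: k = (4 − h²) / h² = (4 − 0.10) / 0.10 = 39.
For animal 1: R1(o) = 0.040, Rdam(o) = 0.059, and EDC1(o+d) = 39 × 0.040 / (4 − 0.040 × (1 + 0.059)) = 0.394.
And for animal 3, which is assumed to have an unknown dam: R3(o) = 0.040, Rdam(o) = 0, and EDC3(o+d) = 39 × 0.040 / (4 − 0.040) = 0.394.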
For animals 1 - 15 the EDCi(o+d), given the assumed values for Rdam(o) for dams, are:
Animal | Sire | Ri(o) | Rdam(o) | EDCi(o+d)
1 | S1 | 0.040 | 0.059 | 0.394
2 | S1 | 0.040 | 0.060 | 0.394
3 | S1 | 0.040 | - | 0.394
4 | S2 | 0.060 | 0.062 | 0.594
5 | S2 | 0.060 | 0.062 | 0.594
6 | S2 | 0.060 | 0.058 | 0.594
7 | S1 | 0.027 | 0.056 | 0.265
8 | S1 | 0.032 | 0.052 | 0.315
9 | S1 | 0.044 | 0.048 | 0.434
10 | S2 | 0.056 | 0.046 | 0.554
11 | S2 | 0.056 | 0.046 | 0.554
12 | S2 | 0.056 | 0.046 | 0.554
13 | S1 | 0.044 | 0.048 | 0.434
14 | S1 | 0.044 | 0.050 | 0.434
15 | S1 | 0.036 | 0.054 | 0.354
Finally, the weighting factors ws for both sires in this example data set are computed as follows:
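A minimal Python sketch of Step 2 for this longevity example. It assumes k = (4 − h²) / h², EDCi(o+d) = k × Ri(o) / (4 − Ri(o) × (1 + Rdam(o))), and a sire weighting factor equal to the sum of EDC over his daughters. These forms are inferred from the tabulated values and reproduce them to three decimals. Inputs are the rounded Ri(o) and Rdam(o) values from the table, with an unknown dam entered as 0; the function names are illustrative.

```python
from collections import defaultdict

H2 = 0.10
K = (4.0 - H2) / H2  # variance ratio, 39 for h2 = 0.10

# animal -> (sire, R_i(o), R_dam(o)); unknown dam entered as 0.0
DAUGHTERS = {
    1: ("S1", 0.040, 0.059), 2: ("S1", 0.040, 0.060), 3: ("S1", 0.040, 0.0),
    4: ("S2", 0.060, 0.062), 5: ("S2", 0.060, 0.062), 6: ("S2", 0.060, 0.058),
    7: ("S1", 0.027, 0.056), 8: ("S1", 0.032, 0.052), 9: ("S1", 0.044, 0.048),
    10: ("S2", 0.056, 0.046), 11: ("S2", 0.056, 0.046), 12: ("S2", 0.056, 0.046),
    13: ("S1", 0.044, 0.048), 14: ("S1", 0.044, 0.050), 15: ("S1", 0.036, 0.054),
}

def edc(r_own, r_dam, k):
    """Effective daughter contribution from own and dam reliability (assumed form)."""
    return k * r_own / (4.0 - r_own * (1.0 + r_dam))

def sire_weights(daughters, k):
    """Sum the EDC of all daughters of each sire."""
    totals = defaultdict(float)
    for sire, r_own, r_dam in daughters.values():
        totals[sire] += edc(r_own, r_dam, k)
    return dict(totals)

for sire, weight in sorted(sire_weights(DAUGHTERS, K).items()):
    print(sire, round(weight, 2))
```

Because the sums are taken over unrounded EDC values, they can differ in the third decimal from sums of the rounded table entries.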