Interbull CoP - APPENDIX IV - Description of weighting factors and examples

Weighting factors for the international genetic evaluation

The following procedure computes new weighting factors for the international genetic evaluation of Interbull. It has to be implemented separately by each organisation participating in Interbull evaluations. The procedure is based on the information used in the national genetic evaluation in each country; how that information is treated depends on the national genetic evaluation model.

The procedure consists of two steps. In the first step, for each animal with own performance records used in the national genetic evaluation, the reliability due to its own performance, R(o), is estimated using selection index methodology. R(o) is computed separately per trait (milk, fat or protein yield) and information from the other traits in a multiple trait national genetic evaluation is ignored. The second step combines the R(o) from the daughter and her dam, expressed in effective daughter contributions (EDC), which are subsequently accumulated over all daughters of a sire.

Step 1: Estimation of reliability based on own performance (R(o)):

For each animal with own performance records in the national genetic evaluation, R_i(o) is estimated. Estimation of R_i(o) depends on the genetic evaluation model.

a) Single trait (repeatability) model for the national genetic evaluation (where individual lactations are considered as the same trait):
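
The formula for this case is not reproduced in this copy of the appendix. The form below is a reconstruction, not the official text: it is inferred from the worked example later in this appendix (h² = 0.30, r = 0.50), whose tabulated w_ij, m and R_i(o) values it reproduces to within rounding. The symbols n_CG and n_{s,CG} (sums of record weights in the contemporary group, over all animals and over daughters of the animal's sire respectively) are notation assumed here.

```latex
w_{ij} = \mathrm{weight}_{ij}\left(1 - \frac{n_{s,CG}}{n_{CG}}\right), \qquad
m_i = \sum_j w_{ij}, \qquad
R_i(o) = \frac{m_i\,h^2}{1 + (m_i - 1)\,r}
```

Here h² is the heritability and r the repeatability used in the national genetic evaluation, and j runs over the lactations recorded for animal i.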

b) Multiple trait model for the genetic evaluation (where each lactation, part of lactation or test day observation is treated as a different trait):

Let k’EBV be the estimated breeding value or transmitting ability of the bull for the trait of interest (milk, fat or protein yield), i.e. the one submitted in the 010 file to Interbull, where EBV is a vector with multiple trait (lactation, part-lactation, test day) estimates of breeding value or transmitting ability, and k a vector with weights given to each estimate.

Note:

- heritability is the heritability of a single observation, e.g. test day yield
- non-genetic parameters correspond to all non-genetic effects, which may include permanent environment effects (i.e. E = PE + e)
- the non-genetic correlation depends on the assumptions of the evaluation model, i.e. whether PE and/or e are considered to be correlated between lactations
- if genetic and environmental correlations and heritability are not constant over the lactation (e.g. in random regression models), an average value over the lactation should be used
- for missing traits, the numerator in (2) can be computed in two ways: 1) set the rows and columns corresponding to the missing trait to zero in the P matrix, or 2) remove the corresponding rows and columns in the P matrix and the corresponding rows in the G matrix. Either way should give the same result. However, the first method is recommended since it is less ambiguous (i.e. the G matrix is not affected and the P matrix has the same dimension under all circumstances). The denominator in (2) is the same for all animals, whether or not observations are missing.

Step 2: Combine sources of information

a) Once R_i(o) is computed for all animals with own records in the genetic evaluation, information from the cow and her dam is combined as follows:
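
Equations (3) and (4) are missing from this copy. As a hedged reconstruction, the first line below reproduces every EDC_i(o+d) value tabulated in the examples of this appendix (for instance with h² = 0.10, hence k = 39, in the longevity example). The second line, a plain sum over the daughters of sire s, is an assumption based on the verbal description of accumulating daughter contributions; it is not a formula visible in this copy. Note that k is a scalar here, distinct from the weight vector k used in Step 1.

```latex
\begin{aligned}
\mathrm{EDC}_{i(o+d)} &= \frac{k\,R_i(o)}{4 - R_i(o)\,\bigl(1 + R_{dam}(o)\bigr)},
\qquad k = \frac{4 - h^2}{h^2} \\
w_s &= \sum_{i \,\in\, \mathrm{daughters}(s)} \mathrm{EDC}_{i(o+d)}
\end{aligned}
```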

This new weight, resulting from Equation (4), will be added to the trait-group files according to the revised format and sent to Interbull.

References

Fikse, W.F. and Banos, G., 1999. Weighting factors of daughter information in international genetic evaluation for milk production traits: effect on (co)variance components. J. Dairy Sci., 82 (suppl. 1): 72 (Abstr.).

Fikse, W.F. and Banos, G., 1999. Weighting factors in international genetic evaluations: effects on international breeding value and reliability estimates. Interbull Bulletin, 22: 38 - 43.

Fikse, W.F. and Banos, G., 2001. Weighting factors of sire daughter information in international genetic evaluations. J. Dairy Sci. 84:1759-1767.

VanRaden, P.M. and Wiggans, G.R., 1991. Derivation, calculation, and use of national animal model information. J. Dairy Sci., 74: 2737 - 2746.

Van Vleck, L.D., 1993. Selection index and introduction to mixed model methods. CRC Press Inc., Boca Raton, FL. 481 pp.

Weighting factors for the international genetic evaluation of longevity

The following describes the procedure to compute weighting factors for the international genetic evaluation of longevity traits. It has to be implemented by each country participating in Interbull evaluations for longevity traits. Procedures currently in use for the genetic evaluation of longevity traits fall into two main categories: survival analysis, and analysis of culling rate, stayability or length of life with mixed linear models. Because the two types of analysis differ in nature, the procedures to compute weighting factors are described separately.

Survival analysis²

The number of culled daughters is to be submitted as weighting factor. Heritability should be computed as:

Mixed linear model analysis

If longevity data are analysed with mixed linear models, assuming normality of breeding values and residuals, then the procedure described above under “New weighting factors for the international genetic evaluation” applies.

References

Ducrocq, V., Delaunay, I., Boichard, D. and Mattalia, S., 2003. A general approach for international genetic evaluations robust to inconsistencies of genetic trends in national evaluations. Interbull Bulletin, 30: 101-111.

²A more elaborate definition of a weighting factor for survival analysis can be found in Ducrocq et al. (2003). The weighting factor proposed there was observed to have a perfect correlation with the number of culled daughters (Ducrocq, 2003, pers. comm.), which is why the number of culled daughters is used as the weighting factor for international evaluations.

Weighting factors for the international genetic evaluation: Example

This document is a supplement to Appendix IV (which will be referred to as the “parent” document) and contains a set of examples to illustrate the calculation of the new weighting factors. For a hypothetical data set, Step 1 of the procedure will be illustrated for two different types of national genetic evaluation models. Computations for Step 2 are then outlined for a given set of results from Step 1.

Step 1: Computation of reliability based on own performance (R(o)):

Consider the following hypothetical data set:

Animal	Sire	Lactation 1		Lactation 2
		CG	Weight¹	CG	Weight¹
1	S1	A1	1	B1	1
2	S1	A1	1	B1	1
3	S1	A1	1	B1	1
4	S2	A1	1	B1	1
5	S2	A1	1	B1	0.9
6	S2	A1	1	B2	1
7	S1	A1	0.85	-	-
8	S1	A1	0.7	-	-
9	S1	A2	1	B2	1
10	S2	A2	1	B2	1
11	S2	A2	1	B2	1
12	S2	A2	1	B2	1
13	S1	A2	1	B2	1
14	S1	A2	1	B2	0.75
15	S1	A2	0.8	-	-

¹ Weight in the national genetic evaluation.

Computation of R_i(o) depends on the genetic evaluation model, and is illustrated for this example data set for two classes of models.

a) Single trait (repeatability) model for the national genetic evaluation (where individual lactations are considered as the same trait):

Assume the following parameters from the national genetic evaluation: h² = 0.30 (heritability) and r = 0.50 (repeatability).

For animals 1-8 in lactation 1:

Likewise, for animals 9-15 in lactation 1, animals 1-5 in lactation 2, and animals 6 and 9-14 in lactation 2 the summation evaluates to 6.8, 4.9, and 6.75, respectively.

For animals 1-3 and 7-8 in lactation 1:

This summation evaluates to:

- 3.0 for animals 4-6 in lactation 1
- 3.8 for animals 9 and 13-15 in lactation 1
- 3.0 for animals 10-12 in lactation 1
- 3.0 for animals 1-3 in lactation 2
- 1.9 for animals 4-5 in lactation 2
- 4.0 for animals 6 and 10-12 in lactation 2
- 2.75 for animals 9 and 13-14 in lactation 2

Then for animal 1:
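
The worked computation for animal 1 is not reproduced in this copy. The lines below are a reconstruction that matches the tabulated results. The sums 7.55 (all records in contemporary group A1) and 4.55 (records of S1 daughters in A1) are computed here from the data table, and the form of the formula is inferred from the table rather than quoted from the parent document.

```latex
\begin{aligned}
w_{11} &= 1 \times \left(1 - \frac{4.55}{7.55}\right) = 0.397 \\
w_{12} &= 1 \times \left(1 - \frac{3.0}{4.9}\right) = 0.388 \\
m &= 0.397 + 0.388 = 0.785 \\
R_1(o) &= \frac{0.785 \times 0.30}{1 + (0.785 - 1) \times 0.50} = 0.264
\end{aligned}
```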

For animals 1 - 15 the results are:

Animal	w_i1	w_i2	m	R_i(o)
1	0.397	0.388	0.785	0.264
2	0.397	0.388	0.785	0.264
3	0.397	0.388	0.785	0.264
4	0.603	0.612	1.215	0.329
5	0.603	0.551	1.154	0.321
6	0.603	0.407	1.010	0.302
7	0.338	-	0.338	0.152
8	0.278	-	0.278	0.131
9	0.441	0.593	1.034	0.305
10	0.559	0.407	0.966	0.295
11	0.559	0.407	0.966	0.295
12	0.559	0.407	0.966	0.295
13	0.441	0.593	1.034	0.305
14	0.441	0.444	0.886	0.282
15	0.353	-	0.353	0.157
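
The table above can be checked with a short script. This is a minimal sketch, assuming the formula inferred from the tabulated values (w_ij = weight × (1 − sire sum / contemporary group sum), R = m·h² / (1 + (m − 1)·r)); the names `DATA`, `own_reliability` and `RESULTS` are illustrative and do not come from the Interbull documents.

```python
from collections import defaultdict

# Example data set: animal, sire, then (CG, weight) per lactation (None = no record).
DATA = [
    (1, "S1", ("A1", 1.0), ("B1", 1.0)),
    (2, "S1", ("A1", 1.0), ("B1", 1.0)),
    (3, "S1", ("A1", 1.0), ("B1", 1.0)),
    (4, "S2", ("A1", 1.0), ("B1", 1.0)),
    (5, "S2", ("A1", 1.0), ("B1", 0.9)),
    (6, "S2", ("A1", 1.0), ("B2", 1.0)),
    (7, "S1", ("A1", 0.85), None),
    (8, "S1", ("A1", 0.7), None),
    (9, "S1", ("A2", 1.0), ("B2", 1.0)),
    (10, "S2", ("A2", 1.0), ("B2", 1.0)),
    (11, "S2", ("A2", 1.0), ("B2", 1.0)),
    (12, "S2", ("A2", 1.0), ("B2", 1.0)),
    (13, "S1", ("A2", 1.0), ("B2", 1.0)),
    (14, "S1", ("A2", 1.0), ("B2", 0.75)),
    (15, "S1", ("A2", 0.8), None),
]

H2, REP = 0.30, 0.50  # heritability and repeatability from the example


def own_reliability(data, h2, rep):
    """Return {animal: (list of w_ij, m, R_i(o))} under the assumed formula."""
    cg_total = defaultdict(float)    # sum of weights per contemporary group
    sire_total = defaultdict(float)  # sum of weights per (sire, contemporary group)
    for _animal, sire, *lactations in data:
        for rec in lactations:
            if rec is not None:
                cg, weight = rec
                cg_total[cg] += weight
                sire_total[(sire, cg)] += weight

    result = {}
    for animal, sire, *lactations in data:
        records = [rec for rec in lactations if rec is not None]
        w = [weight * (1.0 - sire_total[(sire, cg)] / cg_total[cg])
             for cg, weight in records]
        m = sum(w)
        result[animal] = (w, m, m * h2 / (1.0 + (m - 1.0) * rep))
    return result


RESULTS = own_reliability(DATA, H2, REP)
for animal, (w, m, r) in RESULTS.items():
    print(animal, [round(x, 3) for x in w], round(m, 3), round(r, 3))
```

Computed without intermediate rounding, the values agree with the table to within about 0.001; small differences in the last digit (for example for animal 7) suggest that the table was produced from rounded intermediate values.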

b) Multiple trait model for the national genetic evaluation (where each lactation, part lactation or test day observation is treated as a different trait):

The same data structure as before is used in this case. However, the two lactations are considered as genetically distinct traits. The example calculations below assume only one observation per trait; m_j for animal i is therefore equal to w_ij as computed above. The complete matrix of phenotypic (co)variances that is used in the national genetic evaluation will be denoted as P^* for the remainder of this document. The matrix specific to each animal (with appropriate elements set to zero) will be denoted as P. Missing observations (animal 7) are illustrated for the first implementation option, i.e. zeroing rows and columns, since this method is less ambiguous (i.e. the G matrix is not affected and the P matrix has the same dimension under all circumstances).

Assume the following parameters from the national genetic evaluation:

For animals 1 - 15 the results are:

Animal	m₁	m₂	numerator	R_i(o)
1	0.397	0.388	21.673	0.170
2	0.397	0.388	21.673	0.170
3	0.397	0.388	21.673	0.170
4	0.603	0.612	30.954	0.242
5	0.603	0.551	29.423	0.230
6	0.603	0.407	25.952	0.203
7	0.338	-	9.487	0.074
8	0.278	-	7.813	0.061
9	0.441	0.593	28.176	0.221
10	0.559	0.407	25.143	0.197
11	0.559	0.407	25.143	0.197
12	0.559	0.407	25.143	0.197
13	0.441	0.593	28.176	0.221
14	0.441	0.444	24.025	0.188
15	0.353	-	9.914	0.078

In case multiple observations are recorded on genetically distinct traits, the P₁₁ element in the P matrix is obtained as follows. Let two observations be recorded on the first genetically distinct trait, with a repeatability of observations on the same trait of r₁ = 0.40. Assume that w_i1, computed in the same way as illustrated above, is 0.7 for observation 1 and 0.8 for observation 2. Then m₁ = 0.7 + 0.8 = 1.5, and

Step 2: Combine sources of information

First, the reliability contributed by the dam of the animal is added to the animal's reliability based on own performance, and expressed in effective daughter contributions (EDC) (formula (3) in the “parent” document). Once that is done for all animals, information from all daughters of a sire is accumulated into one single weighting factor for each sire. R_i(o) values from case b) of Step 1 are taken for these example computations, and hypothetical values for R_dam(o) are used. Parameters (G, P^*, k) are also those used in case b) of Step 1.

Note that R_dam(o) is always zero in case the national genetic evaluation applies a sire model!

The heritability of the trait submitted in the 010 file is the heritability needed in Step 2, and is for this example computed as:

For animals 1 - 15 the EDC_i(o+d), given the assumed values for R_dam(o) for dams, are:

Animal	Sire	R_i(o)	R_dam(o)	EDC_i(o+d)
1	S1	0.170	0.29	0.486
2	S1	0.170	0.30	0.486
3	S1	0.170	-	0.479
4	S2	0.242	0.31	0.712
5	S2	0.230	0.31	0.674
6	S2	0.203	0.29	0.588
7	S1	0.074	0.28	0.206
8	S1	0.061	0.26	0.169
9	S1	0.221	0.24	0.640
10	S2	0.197	0.23	0.567
11	S2	0.197	0.23	0.567
12	S2	0.197	0.23	0.567
13	S1	0.221	0.24	0.640
14	S1	0.188	0.25	0.541
15	S1	0.078	0.27	0.215

Finally, the weighting factors w_s for both sires in this example data set are computed as follows:
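
The computation itself is not reproduced in this copy. Assuming that Equation (4) of the parent document is a plain sum of EDC_i(o+d) over the daughters of each sire (an assumption based on the verbal description of accumulation, not a formula visible here), summing the tabulated values gives:

```latex
\begin{aligned}
w_{S1} &= 0.486 + 0.486 + 0.479 + 0.206 + 0.169 + 0.640 + 0.640 + 0.541 + 0.215 = 3.862 \\
w_{S2} &= 0.712 + 0.674 + 0.588 + 0.567 + 0.567 + 0.567 = 3.675
\end{aligned}
```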

Weighting factors for the international genetic evaluation of longevity:

Examples for mixed linear models

This document contains a set of examples to illustrate the calculation of the weighting factors for genetic evaluations for longevity using mixed linear models. The same procedure will apply as the one for production traits (outlined in Appendix IV). It consists of two steps:

1) computation of reliability based on own performance, and

2) combining information from daughters and their dams, to be expressed in effective daughter contributions (EDC).

For a hypothetical data set, computations for Step 1 will be illustrated for two different national genetic evaluation models based on mixed linear models. Computations for Step 2 are then outlined for a given set of results from Step 1.

Step 1: Computation of reliability based on own performance (R(o)):

Consider the following hypothetical data set:

Animal	Sire	CG	Length of life analysis		Binomial trait analysis
			Weight¹	Productive life (months)	Survived 2nd lactation (0=no, 1=yes)
1	S1	A1	1	23	0
2	S1	A1	1	29	1
3	S1	A1	1	24	1
4	S2	A1	1	23	0
5	S2	A1	1	28	1
6	S2	A1	1	24	1
7	S1	A1	0.67	20	0
8	S1	A1	0.80	26	1
9	S1	A2	1	28	1
10	S2	A2	1	18	0
11	S2	A2	1	26	1
12	S2	A2	1	23	0
13	S1	A2	1	28	1
14	S1	A2	1	24	1
15	S1	A2	0.81	27	1

¹ Weight in the national genetic evaluation.

Computation of R_i(o) depends on the genetic evaluation model, and is illustrated for this example data set for two types of linear model analysis. In the first case, longevity is regarded as a binomial trait indicating whether or not an animal survived the first and second lactation. In the second case, length of productive life is recorded. The example mimics a situation in which some records receive a lower weight (e.g. because they are predicted rather than observed records).

a) Binomial trait analysis

Assume the following parameters from the national genetic evaluation: h² = 0.02

For animals 1-8:

Likewise, for animals 9-15 the summation evaluates to 7.

Then for animal 1:
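
The worked computation is not reproduced in this copy. The reconstruction below matches the table that follows. It assumes that all records carry weight 1 in the binomial analysis (consistent with the sum of 7 quoted for animals 9-15, and with animals 7 and 8 receiving the same m as animals 1-3), so that contemporary group A1 contains 8 records, 5 of them from daughters of S1. With a single record per animal, the tabulated reliability equals m·h².

```latex
\begin{aligned}
m = w_{11} &= 1 \times \left(1 - \frac{5}{8}\right) = 0.375 \\
R_1(o) &= m\,h^2 = 0.375 \times 0.02 = 0.0075
\end{aligned}
```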

For animals 1 - 15 the results are:

Animal	m = w_i1	R_i(o)
1	0.375	0.0075
2	0.375	0.0075
3	0.375	0.0075
4	0.625	0.0125
5	0.625	0.0125
6	0.625	0.0125
7	0.375	0.0075
8	0.375	0.0075
9	0.429	0.0086
10	0.571	0.0114
11	0.571	0.0114
12	0.571	0.0114
13	0.429	0.0086
14	0.429	0.0086
15	0.429	0.0086

b) Length of life analysis

The same data structure as before is used here as well. However, the trait being analysed is length of productive life. In addition, a situation with differential weights for observations is illustrated.

Assume the following parameters from the national genetic evaluation: h² = 0.10

For animals 1-8:

Likewise, for animals 9-15 the summation evaluates to 6.81.

Then for animal 1:

For animal 7:
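
The worked computations for animals 1 and 7 are not reproduced in this copy. The reconstruction below matches the table that follows; the sums 7.47 (all records in contemporary group A1) and 4.47 (records of S1 daughters in A1) are computed here from the weights in the data table, and the formula is inferred from the tabulated values.

```latex
\begin{aligned}
m_1 &= 1 \times \left(1 - \frac{4.47}{7.47}\right) = 0.402, &
R_1(o) &= 0.402 \times 0.10 = 0.040 \\
m_1 &= 0.67 \times \left(1 - \frac{4.47}{7.47}\right) = 0.269, &
R_7(o) &= 0.269 \times 0.10 = 0.027
\end{aligned}
```

The first line refers to animal 1 (weight 1) and the second to animal 7 (weight 0.67).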

For animals 1 - 15 the results are:

Animal	m₁	R_i(o)
1	0.402	0.040
2	0.402	0.040
3	0.402	0.040
4	0.598	0.060
5	0.598	0.060
6	0.598	0.060
7	0.269	0.027
8	0.321	0.032
9	0.441	0.044
10	0.560	0.056
11	0.560	0.056
12	0.560	0.056
13	0.441	0.044
14	0.441	0.044
15	0.357	0.036
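
Both longevity tables can be checked with the same few lines. This is a minimal sketch under the assumption, inferred from the tabulated values, that with one record per animal m = weight × (1 − sire sum / contemporary group sum) and R_i(o) = m·h². The names `RECORDS`, `single_record_reliability`, `LENGTH_OF_LIFE` and `BINOMIAL` are illustrative only.

```python
from collections import defaultdict

# Example data set: (animal, sire, contemporary group, weight in length of life analysis).
RECORDS = [
    (1, "S1", "A1", 1.0), (2, "S1", "A1", 1.0), (3, "S1", "A1", 1.0),
    (4, "S2", "A1", 1.0), (5, "S2", "A1", 1.0), (6, "S2", "A1", 1.0),
    (7, "S1", "A1", 0.67), (8, "S1", "A1", 0.80),
    (9, "S1", "A2", 1.0), (10, "S2", "A2", 1.0), (11, "S2", "A2", 1.0),
    (12, "S2", "A2", 1.0), (13, "S1", "A2", 1.0), (14, "S1", "A2", 1.0),
    (15, "S1", "A2", 0.81),
]


def single_record_reliability(records, h2):
    """Return {animal: (m, R_i(o))} for one record per animal (assumed formula)."""
    cg_total = defaultdict(float)    # sum of weights per contemporary group
    sire_total = defaultdict(float)  # sum of weights per (sire, contemporary group)
    for _animal, sire, cg, weight in records:
        cg_total[cg] += weight
        sire_total[(sire, cg)] += weight

    result = {}
    for animal, sire, cg, weight in records:
        m = weight * (1.0 - sire_total[(sire, cg)] / cg_total[cg])
        result[animal] = (m, m * h2)
    return result


# Case b): length of life analysis, differential weights, h2 = 0.10.
LENGTH_OF_LIFE = single_record_reliability(RECORDS, 0.10)

# Case a): binomial trait analysis, assumed to use weight 1 for every record, h2 = 0.02.
UNIT_WEIGHTS = [(a, s, cg, 1.0) for a, s, cg, _w in RECORDS]
BINOMIAL = single_record_reliability(UNIT_WEIGHTS, 0.02)

for animal in sorted(LENGTH_OF_LIFE):
    m, r = LENGTH_OF_LIFE[animal]
    print(animal, round(m, 3), round(r, 3))
```

Without intermediate rounding the results agree with both tables to within about 0.001 for m; the last digit can differ where the tables appear to have been rounded at an intermediate step.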

Step 2: Combine sources of information

First, the reliability contributed by the dam of the animal is added to the animal's reliability based on own performance, and expressed in effective daughter contributions (EDC). Once that is done for all animals, information from all daughters of a sire is accumulated into one single weighting factor for each sire. R_i(o) values from case b) of Step 1 are taken for these example computations, and hypothetical values for R_dam(o) are used.

Note that R_dam(o) is always zero in case the national genetic evaluation applies a sire model!

Assume the following parameter from the national genetic evaluation: h² = 0.10. The constant k then follows as:

For animal 1:

And for animal 3, which is assumed to have an unknown dam:

R₃(o) = 0.040

R_dam(o) = 0
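
The equations for k and for animals 1 and 3 are not reproduced in this copy. The lines below are a reconstruction whose formula is inferred from the tabulated EDC_i(o+d) values, which it reproduces; it should be read as a consistent sketch rather than as the text of the parent document.

```latex
\begin{aligned}
k &= \frac{4 - h^2}{h^2} = \frac{4 - 0.10}{0.10} = 39 \\
\mathrm{EDC}_{1(o+d)} &= \frac{39 \times 0.040}{4 - 0.040 \times (1 + 0.059)} = \frac{1.56}{3.958} = 0.394 \\
\mathrm{EDC}_{3(o+d)} &= \frac{39 \times 0.040}{4 - 0.040 \times (1 + 0)} = \frac{1.56}{3.96} = 0.394
\end{aligned}
```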

For animals 1 - 15 the EDC_i(o+d), given the assumed values for R_dam(o) for dams, are:

Animal	Sire	R_i(o)	R_dam(o)	EDC_i(o+d)
1	S1	0.040	0.059	0.394
2	S1	0.040	0.060	0.394
3	S1	0.040	-	0.394
4	S2	0.060	0.062	0.594
5	S2	0.060	0.062	0.594
6	S2	0.060	0.058	0.594
7	S1	0.027	0.056	0.265
8	S1	0.032	0.052	0.315
9	S1	0.044	0.048	0.434
10	S2	0.056	0.046	0.554
11	S2	0.056	0.046	0.554
12	S2	0.056	0.046	0.554
13	S1	0.044	0.048	0.434
14	S1	0.044	0.050	0.434
15	S1	0.036	0.054	0.354

Finally, the weighting factors w_s for both sires in this example data set are computed as follows:
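
The final computation is not reproduced in this copy. The sketch below recomputes EDC_i(o+d) from the tabulated R_i(o) and R_dam(o) and then sums per sire. Two assumptions apply: the EDC formula is the one inferred from the table (k·R / (4 − R·(1 + R_dam)) with k = (4 − h²)/h²), and the sire weighting factor is taken to be the plain sum of its daughters' EDC. The names `DAUGHTERS`, `edc_own_plus_dam`, `EDC` and `W` are illustrative.

```python
from collections import defaultdict

H2 = 0.10
K = (4.0 - H2) / H2  # 39 for h2 = 0.10

# (animal, sire, R_i(o), R_dam(o)) from the table above; unknown dam -> R_dam(o) = 0.
DAUGHTERS = [
    (1, "S1", 0.040, 0.059), (2, "S1", 0.040, 0.060), (3, "S1", 0.040, 0.0),
    (4, "S2", 0.060, 0.062), (5, "S2", 0.060, 0.062), (6, "S2", 0.060, 0.058),
    (7, "S1", 0.027, 0.056), (8, "S1", 0.032, 0.052), (9, "S1", 0.044, 0.048),
    (10, "S2", 0.056, 0.046), (11, "S2", 0.056, 0.046), (12, "S2", 0.056, 0.046),
    (13, "S1", 0.044, 0.048), (14, "S1", 0.044, 0.050), (15, "S1", 0.036, 0.054),
]


def edc_own_plus_dam(r_own, r_dam, k):
    """Effective daughter contribution from own and dam information (assumed form)."""
    return k * r_own / (4.0 - r_own * (1.0 + r_dam))


# Per-daughter contributions.
EDC = {animal: edc_own_plus_dam(r_own, r_dam, K)
       for animal, _sire, r_own, r_dam in DAUGHTERS}

# Accumulate over all daughters of each sire (assumed to be a plain sum).
W = defaultdict(float)
for animal, sire, _r_own, _r_dam in DAUGHTERS:
    W[sire] += EDC[animal]

for sire in sorted(W):
    print(sire, round(W[sire], 2))
```

Under these assumptions the sire totals come out at roughly 3.42 for S1 and 3.45 for S2; summing the three-decimal values in the table instead gives 3.418 and 3.444.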