| Size: 90870 Comment:  | Size: 76130 Comment:  | 
| Line 19: | Line 19: | 
| For each trait group, five data sets will be necessary for the '''''GEBV test''''': two '''''full data sets''''', two '''''reduced data sets''''' and one additional file with '''''genotyping information on test bulls'''''. --(In order to facilitate data preparation and reading, Interbull file formats 01X will be used.)-- | Data formats are described at [[https://wiki.interbull.org/public/GEBVtest_software?action=print|GEBVtest Software]]. | 
| Line 22: | Line 22: | 
| The full data sets include all animals present in the latest international evaluation before validation data is prepared (CURRENT), without edits. They are of two types, one containing national official genetic values (EBVs) and another containing either de-regressed predicted genetic merits (D_PGMs) or daughter deviations (DDs). | The full data sets include all animals present in the most recent Interbull MACE evaluation. They are of two types, one containing national official genetic merit values (EBVs) and another containing either de-regressed predicted genetic merits (D_PGMs) or daughter deviations (DDs). | 
| Line 25: | Line 25: | 
| The files sent by the NGEC as input for the latest Interbull routine run (formats 010, 115, 015, 016, 017, 018, 019, 020) will be used by the Interbull Centre, and the NGEC does not have to provide these files again. | The files sent by the NGEC as input for the most recent Interbull MACE evaluation will be used to identify the candidate bulls, estimate selection intensity and check each bull's birth year and type of proof. | 
| Line 28: | Line 28: | 
| The NGEC needs to prepare either DD or D_PGM for the same animals included in 3.1.1. These values represent the currently estimated performance of the animals and will be used as the dependent variable in the validation procedure. EDC and reliability estimates should be exactly the same as the 01X file in 3.1.1. | The NGEC needs to prepare either DD or D_PGM for the same animals included in fileCxxxf. These values represent the currently estimated performance of the animals and will be used as the dependent variable in the validation procedure. EDC and reliability estimates should be exactly the same as in fileCxxxf. | 
| Line 34: | Line 34: | 
| The NGEC should carry out a conventional genetic evaluation using truncated data (only phenotypes recorded up to 4 years before the date of analysis) but including in the analysis all animals present in the current official evaluations (fileCxxxf). All animals in the C01Xf file must be included in the C01Xr file sent to Interbull. | The NGEC should carry out a conventional genetic evaluation using truncated data (only phenotypes recorded up to 4 years before the date of analysis) but including in the analysis all animals present in the current official evaluations (fileCxxxf). | 
| Line 37: | Line 37: | 
| Similarly, new genomic evaluations should be carried out using exactly the same model being validated (current) but excluding phenotypic information from the most recent four years (truncated data, fileCxxxr). All bulls that did not have a progeny test 4 years ago and that currently have at least 20 daughter-equivalents in the national genetic evaluation (test bulls) need to have a genomically enhanced EBV (GEBVr) estimated and included in the output. All animals included in fileCxxxf must be included in the fileGxxxr file sent to Interbull. | Similarly, new genomic evaluations should be carried out using exactly the same model being validated (current) but excluding phenotypic information from the most recent four years (truncated data, fileCxxxr). All bulls that did not have a progeny test 4 years ago and that currently have at least 20 daughter-equivalents in the national genetic evaluation (test bulls) need to have a genomically enhanced EBV (GEBVr) estimated and included in the output. | 
| Line 39: | Line 39: | 
| If a significant number of foreign animals are included in the reference population and estimation of genomic prediction equations uses de-regressed MACE values for these animals as input, the reduced genomic evaluation can be achieved in two ways: | If a significant number of foreign animals are included in the reference population and estimation of genomic prediction equations uses de-regressed MACE values for these animals as input, the reduced genomic evaluation can be achieved in two ways: | 
| Line 41: | Line 41: | 
| a. the Interbull Centre can make historical files available upon request (e.g. information used four years ago) containing past MACE results and the corresponding national EDCs, as well as the heritabilities and genetic correlations used in the respective evaluations – these data can then be used to estimate four-year-old de-regressed values; OR b. the genomic prediction equations for the truncated data (only bulls with EDCr > 0) are obtained using current de-regressed MACE values. This constitutes an exception and should only be used when the other option is not practical. | a. the Interbull Centre can make historical files available upon request (e.g. information used four years ago) containing past MACE results and the corresponding national EDCs, as well as the heritabilities and genetic correlations used in the respective evaluations – these data can then be used to estimate four-year-old de-regressed values; OR a. the genomic prediction equations for the truncated data (only bulls with EDCr > 0) are obtained using current de-regressed MACE values. This constitutes an exception and should only be used when the standard procedure is not practical. | 
| Line 46: | Line 47: | 
| ||<tablestyle="margin-left:-5.3pt;border-collapse:collapse;border:none;mso-border-alt:solid windowtext .5pt;mso-yfti-tbllook:160;mso-padding-alt:0cm 5.4pt 0cm 5.4pt;mso-border-insideh:.5pt solid windowtext;mso-border-insidev:.5pt solid windowtext" tableclass="MsoNormalTable"rowstyle="mso-yfti-irow:0;mso-yfti-firstrow:yes"#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center" |2>Test Data ''' ''' ||<#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center" |2>Type of information ''' ''' ||<#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center" |2>File types and formats ''' ''' ||||||<#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center;vertical-align:top">Specific variables^a^ '''(equivalent field in the 01X file)''' ''' ''' || | ||<tablestyle="margin-left:-5.3pt;border-collapse:collapse;border:none;mso-border-alt:solid windowtext .5pt;mso-yfti-tbllook:160;mso-padding-alt:0cm 5.4pt 0cm 5.4pt;mso-border-insideh:.5pt solid windowtext;mso-border-insidev:.5pt solid windowtext" tableclass="MsoNormalTable"rowstyle="mso-yfti-irow:0;mso-yfti-firstrow:yes"#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center" |2>Test Data ''' ''' ||<#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center" |2>Type of information ''' 
''' ||<#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center" |2>File types and format^a^ ''' ''' ||||||<#D9D9D9 height="6.75pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;text-align:center;vertical-align:top">Specific variables^b^ '''(equivalent field in the fileCxxxf)''' ''' ''' || | 
| Line 55: | Line 55: | 
| . ^a^All other variables should be the same as in the 01X files. __3.3 Genotyping information on test bulls (file format 732)__ In order to estimate the expectation of the regression coefficient of GEBVr on D_PGM (E[b1]), it is necessary to know which test bulls (EDC≥20 and EDCr = 0) have been genotyped. The file consists of a simple list of test bulls with a flag indicating whether each bull has been genotyped. The file format is given in Appendix II. __3.4 Specific instructions for data preparation:__ | |
| Line 57: | Line 56: | 
| * The domestic bulls (type of proof ≠ 21 or 22) that have EDC≥20 and EDCr = 0 are called '''test bulls'''. Interbull recommends that the number of test bulls be about 0.25 * (number of bulls used as reference population). i. If the number of bulls the country includes in the genomic evaluation is too small, then the accuracy of the GEBVs calculated using the truncated data becomes significantly smaller than with the full data. In that case, the country can use '''n''' < 4 years as the time difference between the '''''full''''' and '''''reduced data''''' sets. ii. If the number of test bulls is too small (ntb < 50), the country may choose to consider foreign bulls (type of proof = 21 or 22) that have EDC≥20 and EDCr = 0 also as test bulls. iii. In both exceptions above, the Interbull Centre must be informed in detail of the criteria adopted to define the test bulls. b. GEBV is the genomically enhanced breeding value. Correspondingly, the GEDC is a genomically enhanced EDC that combines the EDC from national non-genomic evaluation with the gain from genomic evaluation. This means that GEDC should be larger than EDC and GEDCr should be larger than EDCr. c. Include all the bulls having GEBVr in the data without data edits based on EDC, EDCr, GEDC or GEDCr. d. If GEDCr is not available, then GEDCr = λ * r^2^,,GEBVr,, / (1 - r^2^,,GEBVr,,). | |
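The test-bull rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Bull` record and its field names are hypothetical and are not part of any Interbull file format, and the foreign-bull fallback is modelled as an explicit opt-in because the text leaves that choice to the country.

```python
from dataclasses import dataclass

FOREIGN_PROOF_TYPES = {21, 22}  # types of proof treated as foreign in the text
MIN_EDC = 20                    # minimum EDC in the full data set
MIN_TEST_BULLS = 50             # below this, foreign bulls may be considered


@dataclass
class Bull:
    """Hypothetical record; field names are illustrative only."""
    bull_id: str
    type_of_proof: int
    edc: float    # EDC in the full data set
    edc_r: float  # EDC in the reduced data set (EDCr)


def is_candidate(bull: Bull) -> bool:
    """EDC >= 20 now and no progeny information in the reduced data."""
    return bull.edc >= MIN_EDC and bull.edc_r == 0


def select_test_bulls(bulls, allow_foreign_fallback=False):
    """Return domestic test bulls; optionally add foreign ones if too few."""
    domestic = [b for b in bulls
                if is_candidate(b) and b.type_of_proof not in FOREIGN_PROOF_TYPES]
    if allow_foreign_fallback and len(domestic) < MIN_TEST_BULLS:
        # Exception: fewer than 50 domestic test bulls, so foreign bulls
        # meeting the same EDC criteria are also accepted.
        return [b for b in bulls if is_candidate(b)]
    return domestic
```

If the fallback is used, the criteria still have to be reported to the Interbull Centre, as the text requires; the sketch does not cover that step.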
| Line 63: | Line 57: | 
| e. The method of estimation of GEDCr (and/or r^2^,,GEBVr,,) has to be reported in the Interbull GENO form. f. The GEBVr prediction equations also have to be based on the truncated data. If the GEBVr combines information of DGV and EBV (i.e. PA), the EBV (PA) information has to be also from the truncated data. g. Bulls with EBV in the full data sets that have no progeny information four years ago (EDCr=0), should be included in the reduced data set. ''' ''' | ^a^The GEBVtest software (gebvtest.py) uses a trait-independent format (File300). Users can either prepare data in the new format or use the program gtconvert.py to convert the current format into the File300 format.<<BR>> ^b^All other variables should be the same as in the Cxxxf files. | 
| Line 65: | Line 59: | 
| . h. If the EBVs from evaluations published four years ago are available, the country can use these values for the '''''reduced data sets'''''. However, if the evaluation model, trait definitions, etc. have changed between the estimation of EBVs in the '''''reduced data sets''''' and the estimation of EBVs in the '''''full data sets''''', the GEBVr can be expected to have lower accuracy than GEBV. In this case, the country should report the expected correlation between the old ''(reduced)'' and the new ''(full)'' data EBVs (see Interbull Testing Method 3). | === Specific instructions for data preparation: === a. The domestic bulls (type of proof ≠ 21 or 22) that have EDC≥20 and EDCr = 0 are called '''test bulls'''. Interbull recommends that the number of test bulls be about 0.25 * (number of bulls used as reference population). i. If the number of bulls the country includes in the genomic evaluation is too small, then the accuracy of the GEBVs calculated using the truncated data becomes significantly smaller than with the full data. In that case, the country can use n < 4 years as the time difference between full and reduced data sets. i. If the number of test bulls is too small (ntb < 50), the country may choose to consider foreign bulls (type of proof = 21 or 22) that have EDC≥20 and EDCr = 0 also as test bulls. i. In both exceptions above, the Interbull Centre must be informed in detail of the criteria adopted to define the test bulls. a. Appropriate time windows (birth years of bulls) may vary depending on the trait to be validated, the speed of the progeny test program and other factors. Shifting the time window by one year will give a different set of bulls that qualify for the test. It is desirable to choose a time window that gives the largest possible number of qualifying bulls. It is required that bulls do not have any progeny for the evaluation in (YYYY - 4), and EDC≥20 in YYYY. 
The standard adopted for the GEBV test is to include four years of candidate/test bulls, which corresponds to an age cutoff of (YYYY-8). a. GEBV is the genomically enhanced breeding value. Correspondingly, the GEDC is a genomically enhanced EDC that combines the EDC from national non-genomic evaluation with the gain from genomic evaluation. This means that GEDC should be larger than EDC and GEDCr should be larger than EDCr. a. Include all the bulls having GEBVr in the data without data edits based on EDC, EDCr, GEDC or GEDCr. a. If GEDCr is not available, then GEDCr = λ * r^2^,,GEBVr,, / (1- r^2^,,GEBVr,,) a. The method of estimation of GEDCr (and/or r^2^,,GEBVr,,) has to be reported in the Interbull GENO form. a. The GEBVr prediction equations also have to be based on the truncated data. If the GEBVr combines information of DGV and EBV (i.e. PA), the EBV (PA) information has to be also from the truncated data. a. Bulls with EBV in the full data sets that have no progeny information four years ago (EDCr=0), should be included in the reduced data set. a. If the EBVs from evaluations published four years ago are available, the country can use these values for the reduced data sets. However, if the evaluation model, trait definitions, etc. have changed from the estimation of EBVs in the reduced data sets and the estimation of EBVs in the full data sets, the GEBVr can be expected to have lower accuracy than GEBV. In this case, the country should report the expected correlation between the old (reduced) and the new (full) data EBVs (see Interbull Testing Method 3). a. In order to remove any change in scale of proof expression, EBVr and GEBVr should be rescaled to the same scale as EBVs, using bulls already proven in the reduced data sets. | 
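As a minimal sketch of the fallback GEDCr = λ * r^2^ / (1 - r^2^) given above, the snippet below computes λ from heritability using the Appendix I definition λ = (4 - h^2^)/h^2^. The function names are illustrative assumptions, not part of the GEBVtest software.

```python
def lambda_from_heritability(h2: float) -> float:
    """lambda = (4 - h2) / h2, as defined in Appendix I."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    return (4.0 - h2) / h2


def gedc_from_reliability(r2: float, h2: float) -> float:
    """Approximate GEDCr from the reliability of GEBVr.

    Implements GEDCr = lambda * r2 / (1 - r2); r2 must be below 1
    so that the denominator stays positive.
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    return lambda_from_heritability(h2) * r2 / (1.0 - r2)
```

The formula is the inverse of the weight expression EDC/(EDC+λ) used elsewhere in the text, so converting a GEDCr obtained this way back to a reliability returns the original r^2^.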
| Line 67: | Line 74: | 
| * In order to remove any change in scale of proof expression, EBVr and GEBVr should be rescaled to the same scale as EBVs, using bulls already proven in the '''''reduced data sets'''''. __3.4. File names__ Submitted files should have the following name: '''file{FILETYPE}{FILEFORMAT}{DATASET}{BRD}.{COU}.yyyymmdd''' where file = prefix FILETYPE = C (conventional), G (genomic) or D (De-regressed/DD) (upper-case) FILEFORMAT = 010, 015, 115, 016, 017, 018, 019, 020 DATASET = f (full) or r (reduced) (lower-case) BRD = all, bsw, gue, hol, jer, rdc, sim (lower-case) COU = ARG, AUS, … (upper-case) yyyymmdd = year|month|day of the file creation Example: ‘''fileG010rhol.AUS.20110509''’ would be the genomic reduced data set for production traits from the Australian Holstein population, prepared on 9 May 2011. | === File names === Submitted files should have the following name: '''''file{FILETYPE}{FILEFORMAT}{DATASET}{BRD}.{COU}.yyyymmdd''''' , | 
| Line 69: | Line 77: | 
| * Test description __4.1. Testing for bias__ ''' ''' The bias in the national genomic evaluations will be tested using a regression model: ''' '''φ,,i,,= b,,0,, + b,,1,,*GEBVr,,i,, + e,,i,,'' '' [1] ''' '''where φ,,i,, is the D_PGM (or DD, if available) from the bulls that have EDC≥20 and EDCr =0. The EDCs from the '''''full data set''''' can be used as weights in the model if DDs are supplied, otherwise the accuracy of the D_PGMs (w,,i,,=EDC/(EDC+λ)) will be used as weights. ''' ''' | where | 
| Line 71: | Line 79: | 
| * This model is used to estimate b,,1,, to compare with the expectation of b,,1,, (H,,0,,: b,,1,, = E(b,,1,,)) and therefore test the bias on GEBVr. Item 4.3 describes how the expectation of b,,1,, can be derived considering the impact of selective genotyping among test bulls. i. The statistical significance will be tested using a t-test against H,,0,, (C.I. = 0.95). ii. For larger populations the estimated standard error might become very small and then the t-test may become too restrictive. In those cases, a “biological significance” will be adopted to test H,,0,, [P((E(b,,1,,) - 0.1) ≤ b,,1,, ≤ (E(b,,1,,) + 0.1))]. iii. Following ITC discussions in Cork in 2012, a country-trait-breed combination will pass the test if the b,,1,, value is greater than the lower endpoint of the 95% confidence interval or its biological equivalent. b. The accuracy of GEBVr will be estimated from the R^2^ of the model (accuracy of the model after selection for genotyping). This validation accuracy R^2^,,validation,, = R^2^ / w̄, where w̄ is the average weight of all the test bulls. It will be expected that the mean of published bull r^2^,,GEBVr,, is in agreement with R^2^,,validation,,. __4.2. Testing the improvement from conventional evaluation__ The improvement of the added daughter information to the parental information will be estimated by comparing the R^2^ from model [1] with the R^2^ from model [2]: φ,,i,, = b,,0,, + b,,1,,*EBVr,,i,, + e,,i,, [2] where φ,,i,, and the corresponding weight w,,i,, are the same as in model [1]. The R^2^ from model [1] must be higher than the R^2^ from model [2]. 
__4.3. Estimating the effect of selective genotyping on E(b,,1,,)__ The expected value of b,,1,, is 1.0 only if the genotyped test bulls are a representative sample of the bulls in the corresponding age classes. The selection based on EBVs before genotyping will reduce the value of b,,1,, and also the value of R^2^ for model [1]. The level of selective genotyping can be approximated from the difference between the mean EBV of the genotyped test bulls, µ,,EBVg,,, and the mean EBV of all potential test bulls (i.e. bulls with EDC≥20 and EDC,,r,,=0, genotyped or not), µ,,EBVall,,, and the standard deviation of EBV of all potential test bulls (σ,,EBVall,,). . i = (µ,,EBVg,, - µ,,EBVall,,) / σ,,EBVall,, [3] Using tables from quantitative genetics books (e.g. page 379 from Falconer, D. S. & Mackay, T. F. C. ''Introduction to Quantitative Genetics'', Longman, 4^th^ ed. 1996), the proportion of selected (genotyped) individuals (p) can be obtained for the selection differential (i), together with the corresponding truncation point x that divides the standard normal density into the selected proportion p and the non-selected proportion (1-p). Having the proportion of the selected individuals, the expected value of b,,1,, (E(b,,1,,)) and the effect of the selection on R^2^ of the test model can be estimated by approximation of the effect of selection on the variance of the selected trait and on the covariance between the independent (GEBVr) and the dependent (φ) variables. 
Having (i) as the mean deviation of the selected individuals from the total population, in units of the standard deviation of the total population, and (x) as the selection truncation point relative to the overall mean: k = i(i - x) [4] v,,1,, = 1 – k [5] Calculating R^2^ before selection (R,,b,,^2^) from R^2^ after selection (R,,a,,^2^, which is the R^2^ for model [1]): R,,b,,^2^ = R,,a,,^2^ / (v,,1,, + kR,,a,,^2^) [6] v,,2,, = 1 – kR,,b,,^2^ [7] E(b,,1,,) = v,,1,, / v,,2,, [8] '''Example:''' Assuming that µ,,EBVg,, = 16.00, µ,,EBVall,, = 11.76, σ,,EBVall,, = 10.00 and R,,a,,^2^ = 0.555: the selection differential (i) for the genotyped bulls equals 0.424 standard deviations of EBVs (equation [3]), the proportion of genotyped bulls (p) is 75 percent and the deviation of the truncation point from the overall mean (x) equals -0.674 (from the reference table). Applying equations [4], [5] and [6] gives R,,b,,^2^ = 0.70. Using equations [7] and [8], E(b,,1,,) = 0.793. '''Table 2 –''' Examples of expected regression coefficients (E(b,,1,,)) as functions of the selection intensity (i) and the coefficient of determination before selection (R,,b,,^2^). | * file = prefix * FILETYPE = C (conventional), G (genomic) or D (De-regressed/DD) (upper-case) * FILEFORMAT = 010, 015, 115, 016, 017, 018, 019, 020 * DATASET = f (full) or r (reduced) (lower-case) * BRD = all, bsw, gue, hol, jer, rdc, sim (lower-case) * COU = ARG, AUS, … (upper-case) * yyyymmdd = year|month|day of the file creation | 
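The file naming rule listed above can be checked mechanically. The following is a rough sketch, not part of the GEBVtest software: the helper names are hypothetical, and the country code is assumed to be exactly three upper-case letters because the text only lists "ARG, AUS, …" without giving the full set.

```python
import re
from datetime import date, datetime

FILE_FORMATS = {"010", "015", "115", "016", "017", "018", "019", "020"}
BREEDS = {"all", "bsw", "gue", "hol", "jer", "rdc", "sim"}

# file{FILETYPE}{FILEFORMAT}{DATASET}{BRD}.{COU}.yyyymmdd
_NAME = re.compile(
    r"^file(?P<filetype>[CGD])(?P<fileformat>\d{3})(?P<dataset>[fr])"
    r"(?P<brd>[a-z]{3})\.(?P<cou>[A-Z]{3})\.(?P<created>\d{8})$"
)


def parse_file_name(name: str) -> dict:
    """Split a submitted file name into its parts, or raise ValueError."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid file name: {name!r}")
    parts = match.groupdict()
    if parts["fileformat"] not in FILE_FORMATS:
        raise ValueError(f"unknown file format: {parts['fileformat']}")
    if parts["brd"] not in BREEDS:
        raise ValueError(f"unknown breed code: {parts['brd']}")
    # strptime raises ValueError for impossible dates such as 20111345.
    datetime.strptime(parts["created"], "%Y%m%d")
    return parts


def build_file_name(filetype, fileformat, dataset, brd, cou, created: date) -> str:
    """Assemble a name from its parts and validate it by parsing it back."""
    name = f"file{filetype}{fileformat}{dataset}{brd}.{cou}.{created:%Y%m%d}"
    parse_file_name(name)
    return name
```

Building the name and then parsing it back keeps the validation rules in one place, so a wrong letter case or an unknown breed code is rejected before a file is submitted.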
| Line 79: | Line 86: | 
| Example: ‘''fileG010rhol.AUS.20110509''’ would be the genomic reduced data set for production traits from the Australian Holstein population, prepared on 9 May 2011. == Test description == === Testing for bias === The bias in the national genomic evaluations will be tested using a regression model: φ,,i,, = b,,0,, + b,,1,,*GEBVr,,i,, + e,,i,, '''[1]''', where φ,,i,, is the D_PGM (or DD, if available) of the bulls that have EDC≥20 and EDCr = 0. The EDCs from the '''''full data set''''' can be used as weights in the model if DDs are supplied; otherwise the accuracy of the D_PGMs (u,,i,, = EDC/(EDC+λ)) will be used as weights. a. This model is used to estimate b,,1,, to compare with the expectation of b,,1,, (H,,0,,: b,,1,, = E(b,,1,,)) and therefore test the bias on GEBVr. Item 4.3 describes how the expectation of b,,1,, can be derived considering the impact of selective genotyping among test bulls. i. The statistical significance will be tested using a t-test against H,,0,, (C.I. = 0.95). i. For larger populations the estimated standard error might become very small and then the t-test may become too restrictive. In those cases, a “biological significance” will be adopted to test H,,0,, [P((E(b,,1,,) - 0.1) ≤ b,,1,, ≤ (E(b,,1,,) + 0.1))]. i. Following ITC discussions in Cork in 2012, a country-trait-breed combination will pass the test if the b,,1,, value is greater than the lower endpoint of the 95% confidence interval or its biological equivalent. a. The accuracy of GEBVr will be estimated from the R^2^ of the model (accuracy of the model after selection for genotyping). This validation accuracy R^2^,,validation,, = R^2^ / ū, where ū is the average weight of all the test bulls. It will be expected that the mean of published bull r^2^,,GEBVr,, is in agreement with R^2^,,validation,,. 
''' ''' ''' ''' === Testing the improvement from conventional evaluation === The improvement of the added daughter information to the parental information will be estimated by comparing the R^2^ from model [1] with the R^2^ from model [2]: ''' '''φ,,i,, = b,,0,, + b,,1,,*EBVr,,i,,+ e,,i,, '''[2]''' , ''' '''where φ,,i,, and the corresponding weight u,,i,, are the same as in model [1]. The R^2^ from model [1] must be higher than the R^2^ from model [2]. === Estimating the effect of selective genotyping on E(b1) === ''' '''The expected value of b,,1 ,,is 1.0 only if the genotyped test bulls are a representative sample of the bulls in the corresponding age classes. The selection based on EBVs before genotyping will reduce the value of b,,1,, and also the value of R^2^ for model [1]. ''' '''The level of selective genotyping can be approximated from the difference between the mean EBV of the genotyped test bulls, µ,,EBVg,,, and the mean EBV of all potential test bulls (i.e. bulls with EDC≥20 and EDC,,r,,=0, genotyped or not), µ,,EBVall,,, and the standard deviation of EBV of all potential test bulls (σ,,EBVall,,). ''' ''' . i = (µ,,EBVg,, - µ,,EBVall,,)/ σ,,EBVall ,,'''[3]'''. ''' ''' Using tables from quantitative genetics books, (e.g. page 379 from Falconer, D. S. & Mackay, T. F. C. ''Introduction to Quantitative Genetics'', Longman, 4^th^ ed. 1996) the proportion of selected (genotyped) individuals (p) can be obtained for the selection differential (i) and the corresponding truncation point x that divides the standard normal density into selected proportion p and non-selected (1-p). ''' '''<<BR>> Having the proportion of the selected individuals, the expected value of the b,,1,, (E(b,,1,,)) and the effect of the selection on R^2^ of the test model can be estimated by approximation of the effect of selection on the variance of the selected trait and on the covariance between the independent (GEBVr) and the dependent (φ) variables. 
Having (i) as the mean deviation of the selected individuals from the total population, in units of the standard deviation of the total population, and (x) as the selection truncation point relative to the overall mean: k = i(i - x) '''[4]''' v,,1,, = 1 – k '''[5]''' Calculating R^2^ before selection (R,,b,,^2^) from R^2^ after selection (R,,a,,^2^, which is the R^2^ for model [1]): R,,b,,^2^ = R,,a,,^2^ / (v,,1,, + kR,,a,,^2^) '''[6]''' v,,2,, = 1 – kR,,b,,^2^ '''[7]''' E(b,,1,,) = v,,1,, / v,,2,, '''[8]''' '''Example:''' Assuming that µ,,EBVg,, = 16.00, µ,,EBVall,, = 11.76, σ,,EBVall,, = 10.00 and R,,a,,^2^ = 0.555: the selection differential (i) for the genotyped bulls equals 0.424 standard deviations of EBVs (equation [3]), the proportion of genotyped bulls (p) is 75 percent and the deviation of the truncation point from the overall mean (x) equals -0.674 (from the reference table). Applying equations [4], [5] and [6] gives R,,b,,^2^ = 0.70. Using equations [7] and [8], E(b,,1,,) = 0.793. '''Table 2 –''' Examples of expected regression coefficients (E(b,,1,,)) as functions of the selection intensity (i) and the coefficient of determination before selection (R,,b,,^2^). | |
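Equations [3] to [8] can be traced numerically with the short Python sketch below. It rests on one substitution that is not in the text: instead of reading p and x from a printed table, it solves i = φ(x) / (1 - Φ(x)) for x by bisection on the standard normal distribution, which is the relationship such tables tabulate. Function and key names are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def selection_intensity(x: float) -> float:
    """Mean of the selected group (in SD units) for truncation point x."""
    p = 1.0 - _STD_NORMAL.cdf(x)
    return _STD_NORMAL.pdf(x) / p


def truncation_point(i: float, lo: float = -6.0, hi: float = 6.0) -> float:
    """Invert selection_intensity by bisection (it increases with x)."""
    if not selection_intensity(lo) <= i <= selection_intensity(hi):
        raise ValueError("selection differential outside the supported range")
    while hi - lo > 1e-10:
        mid = (lo + hi) / 2.0
        if selection_intensity(mid) < i:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def expected_b1(mu_genotyped, mu_all, sd_all, r2_after):
    """Apply equations [3] to [8]; r2_after is R_a^2 from model [1]."""
    i = (mu_genotyped - mu_all) / sd_all              # [3]
    x = truncation_point(i)
    p = 1.0 - _STD_NORMAL.cdf(x)                      # proportion genotyped
    k = i * (i - x)                                   # [4]
    v1 = 1.0 - k                                      # [5]
    r2_before = r2_after / (v1 + k * r2_after)        # [6]
    v2 = 1.0 - k * r2_before                          # [7]
    return {"i": i, "x": x, "p": p, "k": k,
            "r2_before": r2_before, "e_b1": v1 / v2}  # [8]


result = expected_b1(16.00, 11.76, 10.00, 0.555)
```

With the inputs of the worked example, `result` agrees with the quoted figures to rounding (p close to 0.75, x close to -0.674, R,,b,,^2^ close to 0.70 and E(b,,1,,) close to 0.793). A selection differential of zero or below falls outside the bisection range and raises an error, since the approximation assumes that genotyped bulls were selected upward on EBV.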
| Line 92: | Line 146: | 
| 5. Documents to be submitted by participating NGEC __5.1. Interbull GENO form__ The methodology for estimation of GEBV and its accuracy (r^2^,,GEBV,,) has to be reported by the NGEC in the Interbull GENO form (Appendix IV). __5.2. ''GEBV test'' estimates (File format 731)__ The NGEC submitting genomic data for validation is also required to provide Form 731 (Appendix III), which contains the results from the validation test obtained by the applicant, as well as descriptive statistics needed for the correct treatment of the submitted data. Any discrepancies between values in file 731 and estimates obtained by ITBC will be discussed with the NGEC submitting data prior to publication of results. APPENDIX I – Definitions: · EBV – Estimated Breeding Value (conventional national evaluations of the trait, free of genomic information, which are submitted to Interbull to be used in MACE evaluations) · DGV – Direct Estimated Genomic Value (genomic evaluations based on SNP prediction equations) · GEBV – Genomically Enhanced Estimated Breeding Value (evaluations that combine EBV and DGV) · EDC – Effective Daughter Contribution · GEDC – Genomically Enhanced Effective Daughter Contribution (EDC plus the genomic contribution) · GMACE – Multiple Trait Across Country Genomic Evaluation · PA – Parent Average · D_PGM – De-regressed Predicted Genetic Merit · DD – Daughter Deviation · NGEC – National Genetic Evaluation Centre · λ = (4-h^2^)/h^2^ · r^2^ – Reliability of the bull’s evaluation · R^2^ – Accuracy of the test model APPENDIX II – FILE FORMAT 732 File format for genotyping information on test bulls. 
''' ''' ||<tablewidth="756px" tablestyle="margin-left:.1pt;border-collapse:collapse;border:none;mso-border-alt:solid windowtext .5pt;mso-yfti-tbllook:160;mso-padding-alt:0cm 5.4pt 0cm 5.4pt;mso-border-insideh:.5pt solid windowtext;mso-border-insidev:.5pt solid windowtext" tableclass="MsoNormalTable"rowstyle="mso-yfti-irow:0;mso-yfti-firstrow:yes;mso-height-rule:exactly"#D9D9D9 width="93px" height="34.0pt" style="border:solid windowtext 1.0pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly">Starting position ||<#D9D9D9 width="307px" height="34.0pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly">Field description ||<#D9D9D9 width="178px" height="34.0pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly">Format ||<#D9D9D9 width="179px" height="34.0pt" style="border:solid windowtext 1.0pt;border-left:none;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly">Example || ||<rowstyle="mso-yfti-irow:1;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">1 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Record type ||<width="178px" height="14.2pt" 
style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 3 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">732 || ||<rowstyle="mso-yfti-irow:2;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">4 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Breed of evaluation ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 3 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">HOL 
|| ||<rowstyle="mso-yfti-irow:3;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">7 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Breed of bull ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 3 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">HOL || ||<rowstyle="mso-yfti-irow:4;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">10 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Country of first registration of 
full ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 3 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top"> || ||<rowstyle="mso-yfti-irow:5;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">13 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Sex ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 1 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 
5.4pt;mso-height-rule:exactly;vertical-align:top">M || ||<rowstyle="mso-yfti-irow:6;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">14 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">ID number of bull ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 12 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">000000A12345 || ||<rowstyle="mso-yfti-irow:7;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">26 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 
5.4pt;mso-height-rule:exactly;vertical-align:top">Name of bull ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 30 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top"> || ||<rowstyle="mso-yfti-irow:8;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">56 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Country sending this information ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 3 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext 
.5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top"> || ||<rowstyle="mso-yfti-irow:9;mso-yfti-lastrow:yes;mso-height-rule:exactly"width="93px" height="14.2pt" style="border:solid windowtext 1.0pt;border-top:none;mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">59 ||<width="307px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Flag if bull has been genotyped ||<width="178px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Character 1 ||<width="179px" height="14.2pt" style="border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;mso-border-top-alt:solid windowtext .5pt;mso-border-left-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;padding:0cm 5.4pt 0cm 5.4pt;mso-height-rule:exactly;vertical-align:top">Y=yes; N=no || | |
Appendix X - Interbull validation test for genomic evaluations – GEBV test
Document based on Mäntysaari, E., Liu, Z. and VanRaden, P. 2011. Interbull Validation Test for Genomic Evaluations. Interbull Bulletin 41, p. 17-21.
Motivation
The inclusion of genomic information in international comparisons for dairy breeds requires that national genomic breeding values (GEBVs) be validated by Interbull in the same way that conventional EBVs are validated, as a precondition for participation in the MACE evaluations.
The GEBV test will be applied to validate the national models used to compute the GEBVs that the national genetic evaluation centres (NGEC) publish and will eventually submit to Interbull for international genetic evaluations including genomic information. The GEBV test can also be considered a quality assurance assessment for national genomic evaluations. GEBVs from models that have been tested can be referred to as breeding value estimates with appropriate reliability, and can be converted to breeding values on other country scales using conversion equations derived by Interbull.
Rationale
The GEBV test evaluates:
- the unbiasedness of the genomic evaluations, through the evaluation of
  - the consistency of the genetic trend captured by GEBV, and
  - the consistency of the variation of GEBVs and EBVs;
- the improvement in accuracy from the use of GEBV instead of EBV.
The test for bias is done by verifying the ability of a model that includes only data from 4 years ago to predict current performances. The NGEC has to exclude the last 4 years of data and re-run the analyses with the reduced data, using the same model that is being tested. However, in some cases not all bulls of the generation available for validation have been genotyped. Thus, there are bulls that have more than 20 daughters in the full data but no GEBVs. This is called selective genotyping, and it leads to systematic bias in the validation bull group. In the test, this bias needs to be corrected by accounting for the selection differential between the mean national EBV (current, conventional) of the genotyped bulls and the overall mean national EBV of all potential candidates. This selection differential can be used to derive the expected regression coefficient, which would equal unity if no selective genotyping had taken place.
Testing the improvement in accuracy is done by comparing the coefficient of determination (R2) of the reduced genomic model with that of the equivalent reduced conventional model (from 4 years ago), when each is used to predict current performances. The R2 from the model including genomic information must be higher than the R2 from the model including only parent average information.
Test data sets
Data formats are described at GEBVtest Software.
Full data sets
The full data sets include all animals present in the most recent Interbull MACE evaluation. They are of two types, one containing national official genetic merit values (EBVs) and another containing either de-regressed predicted genetic merits (D_PGMs) or daughter deviations (DDs).
National official genetic evaluation file (fileCxxxf)
The files sent by the NGEC as input for the most recent Interbull MACE evaluation will be used to identify the candidate bulls, estimate selection intensity and check the bulls' birth year and type of proof.
Daughter deviation file (fileDxxxf)
The NGEC needs to prepare either DD or D_PGM for the same animals included in fileCxxxf. These values represent the currently estimated performance of the animals and will be used as the dependent variable in the validation procedure. EDC and reliability estimates should be exactly the same as in fileCxxxf.
Reduced data sets
The reduced data sets should be prepared by truncating the phenotypes used as input for both the conventional and the genomic evaluations. The NGEC must exclude phenotypic information from the past 4 years and re-run the current models of genetic/genomic evaluation for the traits of interest, keeping the animals without progeny information after truncation (test bulls) in the data in order to obtain genetic merit estimates based solely on parent averages (EBVr) or on parent averages plus genomic prediction equations (GEBVr).
Reduced conventional genetic evaluation file (fileCxxxr)
The NGEC should carry out a conventional genetic evaluation using truncated data (only phenotypes up to 4 years prior to the date of analysis) but including in the analysis all animals present in the current official evaluations (fileCxxxf).
Reduced genomic evaluation file (fileGxxxr)
Similarly, new genomic evaluations should be carried out using exactly the same model being validated (current) but excluding phenotypic information from the last four years (truncated data, as for fileCxxxr). All bulls that did not have a progeny test 4 years ago and that currently have at least 20 daughter-equivalents in the national genetic evaluation (test bulls) need to have a genomically enhanced EBV (GEBVr) estimated and included in the output.
If a significant number of foreign animals are included in the reference population and estimation of genomic prediction equations uses de-regressed MACE values for these animals as input, the reduced genomic evaluation can be achieved in two ways:
- the Interbull Centre can make historical files available upon request (e.g. information used four years ago) containing past MACE results and the corresponding national EDCs, as well as the heritabilities and genetic correlations used in the respective evaluations; these data can then be used to estimate four-year-old de-regressed values; OR
- the genomic prediction equations for the truncated data (only bulls with EDCr > 0) are obtained using current de-regressed MACE values. This constitutes an exception and should only be used when the standard procedure is not practical. 
Table 1 presents a comparison between the several types of data and the notation used to identify variables from different files.
Table 1 – Comparative specification of the data files needed for the GEBV test.
| Test data | Type of information | File types and format (a) | EDC (b) | Reliability (b) | EBV (b) |
| Full data sets | Conventional genetic data | C010f, C115f, C015f, C016f, C017f, C018f, C019f, C020f | EDC | r2EBV | EBV |
| Full data sets | Daughter deviation data | D010f, D115f, D015f, D016f, D017f, D018f, D019f, D020f | EDC | r2EBV | D_PGM (or DD, if available) |
| Reduced data sets | Conventional genetic data | C010r, C115r, C015r, C016r, C017r, C018r, C019r, C020r | EDCr | r2EBVr | EBVr |
| Reduced data sets | Genomic data | G010r, G115r, G015r, G016r, G017r, G018r, G019r, G020r | GEDCr | r2GEBVr | GEBVr |
(a) The GEBVtest software (gebvtest.py) uses a trait-independent format (File300). Users can either prepare data in the new format or use the program gtconvert.py to convert the current format into the File300 format.
(b) Specific variables (equivalent field in fileCxxxf). All other variables should be the same as in the Cxxxf files.
Specific instructions for data preparation:
- Domestic bulls (type of proof ≠ 21 or 22) that have EDC≥20 and EDCr = 0 are called test bulls. Interbull recommends that the number of test bulls be about 0.25 * (number of bulls used as reference population).
  - If the number of bulls the country includes in the genomic evaluation is too small, the accuracy of the GEBVs calculated using the truncated data becomes significantly lower than with the full data. In that case, the country can use n < 4 years as the time difference between full and reduced data sets.
  - If the number of test bulls is too small (ntb < 50), the country may choose to also consider foreign bulls (type of proof = 21 or 22) that have EDC≥20 and EDCr = 0 as test bulls.
  - In both exceptions above, the Interbull Centre must be informed in detail of the criteria adopted to define the test bulls.
- Appropriate time windows (birth years of bulls) may vary depending on the trait to be validated, the speed of the progeny test program and other factors. Shifting the time window by one year will give a different set of bulls that qualify for the test. It is desirable to choose a time window that gives the largest possible number of qualifying bulls. It is required that bulls have no progeny in the evaluation in (YYYY - 4) and have EDC≥20 in YYYY. The standard adopted for the GEBV test is to include four years of candidate/test bulls, which corresponds to an age cutoff of (YYYY-8).
- GEBV is the genomically enhanced breeding value. Correspondingly, the GEDC is a genomically enhanced EDC that combines the EDC from national non-genomic evaluation with the gain from genomic evaluation. This means that GEDC should be larger than EDC and GEDCr should be larger than EDCr.
- Include all the bulls having GEBVr in the data without data edits based on EDC, EDCr, GEDC or GEDCr.
- If GEDCr is not available, then GEDCr = λ * r2GEBVr / (1- r2GEBVr) 
- The method of estimation of GEDCr (and/or r2GEBVr) has to be reported in the Interbull GENO form. 
- The GEBVr prediction equations also have to be based on the truncated data. If the GEBVr combines information from DGV and EBV (i.e. PA), the EBV (PA) information must also come from the truncated data.
- Bulls with EBV in the full data sets that had no progeny information four years ago (EDCr=0) should be included in the reduced data set.
- If the EBVs from evaluations published four years ago are available, the country can use these values for the reduced data sets. However, if the evaluation model, trait definitions, etc. have changed between the estimation of EBVs in the reduced data sets and the estimation of EBVs in the full data sets, the GEBVr can be expected to have lower accuracy than GEBV. In this case, the country should report the expected correlation between the old (reduced) and the new (full) data EBVs (see Interbull Testing Method 3).
- In order to remove any change in scale of proof expression, EBVr and GEBVr should be rescaled to the same scale as EBVs, using bulls already proven in the reduced data sets.
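The selection rule for test bulls and the fallback formula for GEDCr given in the list above can be sketched in Python as follows. This is a minimal illustration only: the `BullRecord` container and its field names are hypothetical and do not correspond to any Interbull file format, and heritability is passed in explicitly so that λ = (4-h2)/h2 can be computed as defined in Appendix I.

```python
from dataclasses import dataclass


@dataclass
class BullRecord:
    """Hypothetical in-memory record; field names are illustrative only."""
    bull_id: str
    type_of_proof: int  # 21 or 22 identifies a foreign bull in this document
    edc: float          # EDC from the full data set
    edc_r: float        # EDCr from the reduced data set


def lambda_from_h2(h2: float) -> float:
    """lambda = (4 - h2) / h2, as defined in Appendix I."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in the interval (0, 1]")
    return (4.0 - h2) / h2


def gedc_r_from_reliability(r2_gebv_r: float, h2: float) -> float:
    """GEDCr = lambda * r2GEBVr / (1 - r2GEBVr), for use when GEDCr is not available."""
    if not 0.0 <= r2_gebv_r < 1.0:
        raise ValueError("r2GEBVr must be in the interval [0, 1)")
    return lambda_from_h2(h2) * r2_gebv_r / (1.0 - r2_gebv_r)


def is_test_bull(bull: BullRecord, include_foreign: bool = False) -> bool:
    """EDC >= 20 and EDCr = 0; foreign bulls only under the stated exception."""
    is_foreign = bull.type_of_proof in (21, 22)
    if is_foreign and not include_foreign:
        return False
    return bull.edc >= 20 and bull.edc_r == 0
```

The `include_foreign` switch mirrors the exception for countries with fewer than 50 test bulls; whether to use it remains a decision to be reported to the Interbull Centre.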
File names
Submitted files should have the following name: file{FILETYPE}{FILEFORMAT}{DATASET}{BRD}.{COU}.yyyymmdd ,
where
- file = prefix
- FILETYPE = C (conventional), G (genomic) or D (De-regressed/DD) (upper-case)
- FILEFORMAT = 010, 015, 115, 016, 017, 018, 019, 020
- DATASET = f (full) or r (reduced) (lower-case)
- BRD = all, bsw, gue, hol, jer, rdc, sim (lower-case)
- COU = ARG, AUS, … (upper-case)
- yyyymmdd = year|month|day of the file creation
Example: ‘fileG010rhol.AUS.20110509’ would be the genomic reduced data set for production traits from the Australian Holstein population, prepared on 9 May 2011.
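The naming convention above can be checked mechanically. The following Python sketch builds and parses such names; the function names are hypothetical, and the assumption that COU is always a three-letter upper-case code is inferred from the examples (ARG, AUS) rather than stated in the text.

```python
import re
from datetime import date

FILE_TYPES = {"C", "G", "D"}
FILE_FORMATS = {"010", "015", "115", "016", "017", "018", "019", "020"}
DATA_SETS = {"f", "r"}
BREEDS = {"all", "bsw", "gue", "hol", "jer", "rdc", "sim"}

# Assumption: COU is three upper-case letters, as in the examples ARG and AUS.
_NAME_RE = re.compile(
    r"^file(?P<filetype>[CGD])(?P<fileformat>\d{3})(?P<dataset>[fr])"
    r"(?P<brd>[a-z]{3})\.(?P<cou>[A-Z]{3})\.(?P<created>\d{8})$"
)


def build_file_name(filetype, fileformat, dataset, brd, cou, created: date) -> str:
    """Assemble file{FILETYPE}{FILEFORMAT}{DATASET}{BRD}.{COU}.yyyymmdd."""
    if filetype not in FILE_TYPES:
        raise ValueError(f"unknown FILETYPE: {filetype!r}")
    if fileformat not in FILE_FORMATS:
        raise ValueError(f"unknown FILEFORMAT: {fileformat!r}")
    if dataset not in DATA_SETS:
        raise ValueError(f"unknown DATASET: {dataset!r}")
    if brd not in BREEDS:
        raise ValueError(f"unknown BRD: {brd!r}")
    return f"file{filetype}{fileformat}{dataset}{brd}.{cou}.{created:%Y%m%d}"


def parse_file_name(name: str) -> dict:
    """Split a submitted file name into its components, or raise ValueError."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    parts = match.groupdict()
    if parts["fileformat"] not in FILE_FORMATS or parts["brd"] not in BREEDS:
        raise ValueError(f"unknown file format or breed in: {name!r}")
    return parts
```

For instance, `build_file_name("G", "010", "r", "hol", "AUS", date(2011, 5, 9))` reproduces the example name given above.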
Test description
Testing for bias
The bias in the national genomic evaluations will be tested using a regression model:
$$\varphi_i = b_0 + b_1\,\mathrm{GEBVr}_i + e_i, \qquad [1]$$
where φi is the D_PGM (or DD, if available) from the bulls that have EDC≥20 and EDCr =0. The EDCs from the full data set can be used as weights in the model if DDs are supplied, otherwise the accuracy of the D_PGMs (ui=EDC/(EDC+λ)) will be used as weights.
- This model is used to estimate b1 and compare it with its expectation (H0: b1 = E(b1)), and therefore to test the bias of GEBVr. Item 4.3 (Estimating the effect of selective genotyping on E(b1)) describes how the expectation of b1 can be derived considering the impact of selective genotyping among test bulls.
  - The statistical significance will be tested using a t-test against H0 (C.I. = 0.95).
  - For larger populations the estimated standard error may become very small, making the t-test too restrictive. In those cases, a “biological significance” criterion will be adopted to test H0: E(b1) - 0.1 ≤ b1 ≤ E(b1) + 0.1.
  - Following ITC discussions in Cork in 2012, a country-trait-breed combination will pass the test if its b1 value is greater than the lower endpoint of the 95% confidence interval or its biological equivalent.
- The accuracy of GEBVr will be estimated from the R2 of the model (accuracy of the model after selection for genotyping). The validation accuracy is R2validation = R2 / ū, where ū is the average weight of all the test bulls. The mean of the published bull r2GEBVr is expected to agree with R2validation.
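As an illustration of the bias test, the sketch below fits model [1] by weighted least squares in plain Python and derives the quantities named above. It is not the gebvtest.py implementation; all function names are hypothetical, the standard error uses the usual weighted residual variance with n - 2 degrees of freedom (an assumption), and only the "biological significance" range is checked because the Python standard library provides no t-distribution.

```python
import math


def dpgm_weights(edc, lam):
    """Weights u_i = EDC / (EDC + lambda), used when D_PGMs are supplied."""
    return [e / (e + lam) for e in edc]


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1, se_b1, r2)."""
    n = len(x)
    if n < 3 or not (len(y) == len(w) == n):
        raise ValueError("x, y and w need equal length and at least 3 records")
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Weighted residual sum of squares and the resulting standard error of b1.
    sse = sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    r2 = 1.0 - sse / syy
    return b0, b1, se_b1, r2


def validation_accuracy(r2, weights):
    """R2validation = R2 / u_bar, with u_bar the average weight of the test bulls."""
    return r2 / (sum(weights) / len(weights))


def within_biological_range(b1, expected_b1, tolerance=0.1):
    """True if E(b1) - tolerance <= b1 <= E(b1) + tolerance."""
    return expected_b1 - tolerance <= b1 <= expected_b1 + tolerance
```

Here `x` would hold the GEBVr values and `y` the D_PGM (or DD) values of the test bulls; when DDs are supplied, the EDCs themselves would be passed as `w` instead of the output of `dpgm_weights`.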
Testing the improvement from conventional evaluation
The improvement from adding genomic information to the parental information will be estimated by comparing the R2 from model [1] with the R2 from model [2]:
$$\varphi_i = b_0 + b_1\,\mathrm{EBVr}_i + e_i, \qquad [2]$$
where φi and the corresponding weight ui are the same as in model [1]. The R2 from model [1] must be higher than the R2 from model [2].
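The comparison of the two models can be sketched as follows. For a regression with a single predictor, the weighted R2 equals the squared weighted correlation between predictor and response, so no explicit model fit is needed. The function names are hypothetical and the sketch is only meant to make the pass condition concrete.

```python
def weighted_r2(x, y, w):
    """Squared weighted correlation, equal to the R2 of a weighted simple regression."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    return sxy * sxy / (sxx * syy)


def genomic_model_improves(phi, gebv_r, ebv_r, w):
    """Return (passed, R2 of model [1], R2 of model [2]) for the same bulls and weights."""
    r2_genomic = weighted_r2(gebv_r, phi, w)
    r2_conventional = weighted_r2(ebv_r, phi, w)
    return r2_genomic > r2_conventional, r2_genomic, r2_conventional
```

Both R2 values must be computed on the same set of test bulls with the same weights, otherwise the comparison is not meaningful.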
Estimating the effect of selective genotyping on E(b1)
The expected value of b1 is 1.0 only if the genotyped test bulls are a representative sample of the bulls in the corresponding age classes. The selection based on EBVs before genotyping will reduce the value of b1 and also the value of R2 for model [1]. The level of selective genotyping can be approximated from the difference between the mean EBV of the genotyped test bulls, µEBVg, and the mean EBV of all potential test bulls (i.e. bulls with EDC≥20 and EDCr=0, genotyped or not), µEBVall, and the standard deviation of EBV of all potential test bulls (σEBVall).
$$i = \frac{\mu_{EBVg} - \mu_{EBVall}}{\sigma_{EBVall}} \qquad [3]$$
Using tables from quantitative genetics books (e.g. page 379 of Falconer, D. S. & Mackay, T. F. C. Introduction to Quantitative Genetics, Longman, 4th ed. 1996), the proportion of selected (genotyped) individuals (p) can be obtained for the selection differential (i), together with the corresponding truncation point x that divides the standard normal density into the selected proportion p and the non-selected proportion (1-p).
Having the proportion of the selected individuals, the expected value of the b1 (E(b1)) and the effect of the selection on R2 of the test model can be estimated by approximation of the effect of selection on the variance of the selected trait and on the covariance between the independent (GEBVr) and the dependent (φ) variables. Having (i) as the mean deviation of the selected individuals from the total population in terms of standard deviation from the total population, and (x) as a selection truncation point from the overall mean:
$$k = i\,(i - x) \qquad [4]$$
$$v_1 = 1 - k \qquad [5]$$
The R2 before selection (Rb2) for model [1] is calculated from the R2 after selection (Ra2):
$$R_b^2 = \frac{R_a^2}{v_1 + k\,R_a^2} \qquad [6]$$
$$v_2 = 1 - k\,R_b^2 \qquad [7]$$
$$E(b_1) = \frac{v_1}{v_2} \qquad [8]$$
Example: Assume that µEBVg = 16.00, µEBVall = 11.76, σEBVall = 10.00 and Ra2 = 0.555. The selection differential (i) for the genotyped bulls is then 0.424 standard deviations of EBV (equation [3]), the proportion of genotyped bulls (p) is 75 percent, and the deviation of the truncation point from the overall mean (x) is -0.674 (from the reference table). Applying equations [4], [5] and [6] gives Rb2 = 0.70, and equations [7] and [8] give E(b1) = 0.793.
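The worked example can be reproduced numerically. The Python sketch below implements equations [3] to [8]; instead of a printed reference table it solves for the truncation point x by bisection on the standard normal relation i = pdf(x) / P(Z > x), which is a substitution made for this illustration and not part of the described procedure. All function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def selection_intensity(x):
    """Mean of a standard normal truncated below at x: pdf(x) / P(Z > x)."""
    return _STD_NORMAL.pdf(x) / (1.0 - _STD_NORMAL.cdf(x))


def truncation_point(i, lo=-8.0, hi=6.0, iterations=100):
    """Bisection for x with selection_intensity(x) = i; the intensity increases with x."""
    if i <= 0.0 or i >= selection_intensity(hi):
        raise ValueError("i is outside the range supported by this sketch")
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if selection_intensity(mid) < i:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def expected_b1(i, x, r2_after):
    """Equations [4] to [8]; returns (E(b1), R2 before selection)."""
    k = i * (i - x)                                  # [4]
    v1 = 1.0 - k                                     # [5]
    r2_before = r2_after / (v1 + k * r2_after)       # [6]
    v2 = 1.0 - k * r2_before                         # [7]
    return v1 / v2, r2_before                        # [8]


# Inputs of the worked example above.
i = (16.00 - 11.76) / 10.00                          # [3]
x = truncation_point(i)
p = 1.0 - _STD_NORMAL.cdf(x)
e_b1, r2_before = expected_b1(i, x, r2_after=0.555)
```

With these inputs the computed x, p, Rb2 and E(b1) agree with the values quoted in the example (-0.674, 75 percent, 0.70 and 0.793) to the precision shown there.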
Table 2 – Examples of expected regression coefficients (E(b1)) as functions of the selection intensity (i) and the coefficient of determination before selection (Rb2).
| i | p (%) | x | E(b1), Rb2 = 0.50 | E(b1), Rb2 = 0.55 | E(b1), Rb2 = 0.60 | E(b1), Rb2 = 0.65 | E(b1), Rb2 = 0.70 |
| 0.644 | 60 | -0.253 | 0.594 | 0.619 | 0.646 | 0.676 | 0.709 |
| 0.570 | 65 | -0.385 | 0.626 | 0.650 | 0.677 | 0.705 | 0.736 |
| 0.497 | 70 | -0.524 | 0.660 | 0.683 | 0.708 | 0.735 | 0.764 |
| 0.424 | 75 | -0.674 | 0.697 | 0.718 | 0.742 | 0.766 | 0.793 |
| 0.350 | 80 | -0.842 | 0.736 | 0.756 | 0.777 | 0.800 | 0.823 |
| 0.274 | 85 | -1.036 | 0.781 | 0.799 | 0.817 | 0.836 | 0.856 |
| 0.195 | 90 | -1.282 | 0.832 | 0.846 | 0.861 | 0.876 | 0.892 |
| 0.109 | 95 | -1.645 | 0.894 | 0.904 | 0.914 | 0.924 | 0.934 |
| 0.000 | 100 | | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Documents to be submitted by participating NGEC
Interbull GENO form
The methodology for estimation of GEBV and its accuracy (r2GEBV) has to be reported by the NGEC in the Interbull GENO form (Appendix IV).
GEBV test estimates (File format 731)
The NGEC submitting genomic data for validation is also required to provide Form 731 (Appendix III), which contains the results of the validation test obtained by the applicant, as well as descriptive statistics needed for the correct treatment of the submitted data. Any discrepancies between values in file 731 and estimates obtained by ITBC will be discussed with the NGEC submitting the data prior to publication of results.
APPENDIX I – Definitions:
- EBV – Estimated Breeding Value (conventional national evaluations of the trait, free of genomic information, which are submitted to Interbull to be used in MACE evaluations)
- DGV - Direct Estimated Genomic Value (genomic evaluations based on SNP prediction equations)
- GEBV – Genomically Enhanced Estimated Breeding Value (evaluations that combine EBV and DGV)
- EDC – Effective Daughter Contribution
- GEDC – Genomically Enhanced Effective Daughter Contribution (EDC plus the genomic contribution)
- GMACE - Multiple Trait Across Country Genomic Evaluation
- PA – Parent Average
- D_PGM – De-regressed Predicted Genetic Merit
- DD – Daughter Deviation
- NGEC - National Genetic Evaluation Centre
- λ = (4-h2)/h2 
- r2 – Reliability of the bull’s evaluation 
- R2 – Accuracy of the test model 
APPENDIX III - FILE FORMAT 731
File format for description of the GEBV test results.
| Starting position | Field description | Format | Example | 
| 1 | Record type | Character 3 | 731 | 
| 4 | Country sending this information | Character 3 | |
| 7 | Breed of evaluation | Character 3 | HOL | 
| 11 | Date | Integer 8 | 20100215 | 
| 20 | Trait | Character 3 | pr | 
| 25 | Model* | Integer 2 | 1 | 
| 30 | Mean of dependent variable | Real F15.7 | 12.3456789 | 
| 45 | SD of dependent variable | Real F15.7 | 4.9876543 | 
| 60 | Type of dependent variable used | Character 3 | DD = daughter deviation; GM = de-regressed predicted genetic merit | 
| 63 | Mean of independent variable | Real F15.7 | 11.9876543 | 
| 78 | SD of independent variable | Real F15.7 | 4.6549871 | 
| 93 | b0 | Real F15.7 | 0.1234567 | 
| 108 | Standard error of b0 | Real F15.7 | 0.0000123 | 
| 123 | b1 | Real F15.7 | 1.1234567 | 
| 138 | Standard error of b1 | Real F15.7 | 0.0000987 | 
| 153 | Selection intensity (i) | Real F15.7 | 0.4240000 | 
| 168 | Expected value of slope (b1) | Real F15.7 | 0.7700000 | 
| 183 | Number of test bulls | Integer 6 | 1500 | 
| 198 | R2 of the model | Integer 3 | 99 | 
(*) Model [1] or [2], as proposed in the validation test description.
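A record in format 731 can be assembled by placing each field at its starting position from the table above. The Python sketch below is one possible way of doing so; the field keys are hypothetical, and the padding rules (character fields left-justified, numeric fields right-justified, gaps filled with spaces) are assumptions, since the table specifies only positions and widths.

```python
# (start position, hypothetical key, kind, width), taken from the table above.
LAYOUT_731 = [
    (1, "record_type", "char", 3),
    (4, "country", "char", 3),
    (7, "breed", "char", 3),
    (11, "date", "int", 8),
    (20, "trait", "char", 3),
    (25, "model", "int", 2),
    (30, "mean_dependent", "real", 15),
    (45, "sd_dependent", "real", 15),
    (60, "dependent_type", "char", 3),
    (63, "mean_independent", "real", 15),
    (78, "sd_independent", "real", 15),
    (93, "b0", "real", 15),
    (108, "se_b0", "real", 15),
    (123, "b1", "real", 15),
    (138, "se_b1", "real", 15),
    (153, "selection_intensity", "real", 15),
    (168, "expected_b1", "real", 15),
    (183, "n_test_bulls", "int", 6),
    (198, "r2_model", "int", 3),
]


def format_field(value, kind, width):
    """Render one field; the justification rules are assumptions of this sketch."""
    if kind == "char":
        text = str(value).ljust(width)
    elif kind == "int":
        text = str(int(value)).rjust(width)
    elif kind == "real":
        text = f"{float(value):{width}.7f}"  # Real F15.7
    else:
        raise ValueError(f"unknown field kind: {kind!r}")
    if len(text) > width:
        raise ValueError(f"value {value!r} does not fit in {width} characters")
    return text


def build_record_731(values: dict) -> str:
    """Place every field at its 1-based starting position, padding gaps with spaces."""
    pieces = []
    cursor = 1
    for start, key, kind, width in LAYOUT_731:
        pieces.append(" " * (start - cursor))
        pieces.append(format_field(values[key], kind, width))
        cursor = start + width
    return "".join(pieces)
```

Reading a record back is the reverse operation: slice `line[start - 1:start - 1 + width]` for each entry of `LAYOUT_731` and convert according to its kind.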
APPENDIX IV – FORM GENO
- Standard Interbull form for description of national genomic evaluation systems. Available at
